On 10/13/19 9:42 AM, Dmitrii Dolgov wrote:
> To trace io_uring activity one can get an information from workqueue and
> io trace events, but looks like some parts could be hard to identify via
> this approach. Making what happens inside io_uring more transparent is
> important to be able to reason about many aspects of it, hence introduce
> the set of tracing events.
>
> All such events could be roughly divided into two categories:
>
> * those, that are helping to understand correctness (from both kernel
> and an application point of view). E.g. a ring creation, file
> registration, or waiting for available CQE. Proposed approach is to
> get a pointer to an original structure of interest (ring context, or
> request), and then find relevant events. io_uring_queue_async_work
> also exposes a pointer to work_struct, to be able to track down
> corresponding workqueue events.
>
> * those, that provide performance related information. Mostly it's about
> events that change the flow of requests, e.g. whether an async work
> was queued, or delayed due to some dependencies. Another important
> case is how io_uring optimizations (e.g. registered files) are
> utilized.
I like this in general, a few questions below.
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index bfbb7ab3c4e..730f7182b2a 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -71,6 +71,9 @@
> #include <linux/sizes.h>
> #include <linux/hugetlb.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/io_uring.h>
> +
> #include <uapi/linux/io_uring.h>
>
> #include "internal.h"
> @@ -483,6 +486,7 @@ static inline void io_queue_async_work(struct io_ring_ctx
> *ctx,
> }
> }
>
> + trace_io_uring_queue_async_work(ctx, rw, req, &req->work, req->flags);
> queue_work(ctx->sqo_wq[rw], &req->work);
> }
>
> @@ -707,6 +711,7 @@ static void io_fail_links(struct io_kiocb *req)
> {
> struct io_kiocb *link;
>
> + trace_io_uring_fail_links(req);
> while (!list_empty(&req->link_list)) {
> link = list_first_entry(&req->link_list, struct io_kiocb, list);
> list_del(&link->list);
Doesn't look like you have completion events, which makes it hard to
tell which dependants got killed when failing the links. Maybe a good
thing to add?
> @@ -1292,6 +1297,7 @@ static ssize_t io_import_iovec(struct io_ring_ctx *ctx,
> int rw,
> iovec, iter);
> #endif
>
> + trace_io_uring_import_iovec(ctx, buf);
> return import_iovec(rw, buf, sqe_len, UIO_FASTIOV, iovec, iter);
> }
>
Not sure I see much merrit in this trace event.
> @@ -2021,6 +2027,7 @@ static int io_req_defer(struct io_ring_ctx *ctx, struct
> io_kiocb *req,
> req->submit.sqe = sqe_copy;
>
> INIT_WORK(&req->work, io_sq_wq_submit_work);
> + trace_io_uring_defer(ctx, req, false);
> list_add_tail(&req->list, &ctx->defer_list);
> spin_unlock_irq(&ctx->completion_lock);
> return -EIOCBQUEUED;
> @@ -2327,6 +2334,7 @@ static int io_req_set_file(struct io_ring_ctx *ctx,
> const struct sqe_submit *s,
> } else {
> if (s->needs_fixed_file)
> return -EBADF;
> + trace_io_uring_file_get(ctx, fd);
> req->file = io_file_get(state, fd);
> if (unlikely(!req->file))
> return -EBADF;
> @@ -2357,6 +2365,8 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx,
> struct io_kiocb *req,
> INIT_WORK(&req->work, io_sq_wq_submit_work);
> io_queue_async_work(ctx, req);
> }
> + else
> + trace_io_uring_add_to_prev(ctx, req);
>
> /*
> * Queued up for async execution, worker will release
Maybe put this one in io_add_to_prev_work()? Probably just using the
'ret' as part of the trace, to avoid a branch for this?
Failing that, the style is off a bit, should be:
} else {
trace_io_uring_add_to_prev(ctx, req);
}
> @@ -4194,6 +4210,9 @@ SYSCALL_DEFINE4(io_uring_register, unsigned int, fd,
> unsigned int, opcode,
> mutex_lock(&ctx->uring_lock);
> ret = __io_uring_register(ctx, opcode, arg, nr_args);
> mutex_unlock(&ctx->uring_lock);
> + if (ret >= 0)
> + trace_io_uring_register(ctx, opcode, ctx->nr_user_files,
> +
> ctx->nr_user_bufs, ctx->cq_ev_fd != NULL);
Just trace 'ret' as well?
> + * io_uring_add_to_prev - called after a request was added into a previously
> + * submitted work
> + *
> + * @ctx: pointer to a ring context structure
> + * @req: pointer to a request, added to a previous
> + *
> + * Allows to track merged work, to figure out how oftern requests are piggy
often
--
Jens Axboe