On 2026/6/28 19:39, Masashi Honma wrote:
> Hello, I am re-posting this mail because I forgot to add [RFC].
>
> The bpf_xdp_link_attach_failed tracepoint (added in commit bf4ea1d0b2cb
> "xdp: Add tracepoint for xdp attaching failure") exposes the netlink
> extack message produced when attaching an XDP program via BPF_LINK_CREATE
> fails. This is useful because, unlike the netlink attach path, the
> bpf_link attach path does not return the extack to userspace -- the caller
> only gets an errno (e.g. EINVAL/ERANGE).

I really appreciate that the XDP tracepoint helped someone.

>
> We would like to use this in Cilium [1][2]: when attaching the XDP
> datapath program fails, surface the kernel's reason (e.g. "single-buffer
> XDP requires MTU less than ...") in the agent logs instead of an opaque
> errno, so operators don't have to inspect dmesg on the host.
>
> The limitation we hit is that the tracepoint only carries the message
> string, so a consumer cannot tell which device a failure belongs to.
> This matters for two reasons:
>
> 1. Correlation: with only the message, a consumer cannot reliably
>    attribute a failure to a specific attach, particularly if multiple
>    XDP attaches happen concurrently.
> 2. Scoping: a consumer watching this tracepoint sees XDP attach
>    failures system-wide and cannot limit them to the devices it
>    manages.
>
> At the call site (bpf_xdp_link_attach() in net/core/dev.c) the net_device
> is in scope, so exposing it looks straightforward:
>
> TRACE_EVENT(bpf_xdp_link_attach_failed,
>     TP_PROTO(const char *msg, const struct net_device *dev),
>     TP_ARGS(msg, dev),
>     TP_STRUCT__entry(
>         __string(msg, msg)
>         __field(int, ifindex)
>     ),
>     TP_fast_assign(
>         __assign_str(msg);
>         __entry->ifindex = dev->ifindex;
>     ),
>     TP_printk("ifindex=%d errmsg=%s", __entry->ifindex, __get_str(msg))
> );
>
> - trace_bpf_xdp_link_attach_failed(extack._msg);
> + trace_bpf_xdp_link_attach_failed(extack._msg, dev);
>
> Before sending a formal patch I'd appreciate guidance on a few points:
>
> - Should the tracepoint take const struct net_device *dev (consistent
>   with the other tracepoints in this file, and lets TP_printk show the
>   device), or just the ifindex as an int (simpler for raw_tp BPF
>   consumers, which otherwise read dev->ifindex via CO-RE)?
>
> - For raw_tp consumers the argument order is effectively ABI: prepending
>   dev would shift the existing msg argument. I've appended dev above to
>   keep msg at args[0]. Is preserving the existing argument position the
>   right call, or is reordering acceptable given how new and rarely
>   consumed this tracepoint is?
>
Good concerns. I'm not sure about these parts.
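
For what it's worth, the positional concern can be illustrated with a small
userspace model. This is only a sketch, not kernel or BPF code: struct
fake_net_device, struct fake_raw_tp_args and the emit_*/consumer functions
below are hypothetical stand-ins for how a raw_tp program indexes its
positional arguments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct net_device; only the field we need. */
struct fake_net_device {
	int ifindex;
};

/* Hypothetical stand-in for a raw tracepoint context: positional u64 args. */
struct fake_raw_tp_args {
	uint64_t args[2];
};

/* Tracepoint with dev appended: (msg, dev). msg stays at args[0]. */
static struct fake_raw_tp_args emit_appended(const char *msg,
					     const struct fake_net_device *dev)
{
	struct fake_raw_tp_args ctx = {
		.args = { (uint64_t)(uintptr_t)msg, (uint64_t)(uintptr_t)dev },
	};
	return ctx;
}

/* Tracepoint with dev prepended: (dev, msg). msg moves to args[1]. */
static struct fake_raw_tp_args emit_prepended(const char *msg,
					      const struct fake_net_device *dev)
{
	struct fake_raw_tp_args ctx = {
		.args = { (uint64_t)(uintptr_t)dev, (uint64_t)(uintptr_t)msg },
	};
	return ctx;
}

/* Consumer written against today's one-argument tracepoint. */
static const char *old_consumer_msg(const struct fake_raw_tp_args *ctx)
{
	return (const char *)(uintptr_t)ctx->args[0];
}

/* Consumer that knows about the appended dev at args[1]. */
static int new_consumer_ifindex(const struct fake_raw_tp_args *ctx)
{
	const struct fake_net_device *dev =
		(const struct fake_net_device *)(uintptr_t)ctx->args[1];
	return dev->ifindex;
}
```

With the appended layout, old_consumer_msg() still returns the message
pointer; with the prepended layout it silently returns the device pointer
reinterpreted as a string, which is the breakage the question is about.
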
> - Is extending the existing tracepoint preferred, or would you rather
>   keep it as-is and expose the device context some other way?
>
I'm planning to retire this tracepoint, but I don't think I can if any
user space application relies on it.

I'm planning to add BPF syscall common attributes support for
BPF_LINK_CREATE, including XDP links. That way, the kernel will be able
to propagate 'extack._msg' back to user space when creating an XDP link
fails, and the user space library will then be able to get the error
message alongside the errno.

Thanks,
Leon
> This would be my first XDP/BPF tracepoint change, so any direction is
> welcome. I'm happy to send a proper patch once the shape is agreed.
>
> Regards,
> Masashi Honma
>
> [1] https://github.com/cilium/cilium/issues/40777
> [2] https://github.com/cilium/cilium/pull/46546