On Tue, Apr 21, 2026 at 5:19 PM Petr Mladek <[email protected]> wrote:
>
> On Sun 2026-04-19 11:19:19, Yafang Shao wrote:
> > On Fri, Apr 17, 2026 at 11:52 PM Song Liu <[email protected]> wrote:
> > >
> > > On Fri, Apr 17, 2026 at 6:20 AM Petr Mladek <[email protected]> wrote:
> > > >
> > > > On Thu 2026-04-16 09:32:46, Song Liu wrote:
> > > [...]
> > > > Let's use the code from this patch:
> > > >
> > > > static int __init livepatch_bpf_init(void)
> > > > {
> > > >         int ret;
> > > >
> > > >         ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
> > > >                                         &klp_bpf_kfunc_set);
> > > >         ret = ret ?: register_bpf_struct_ops(&bpf_klp_bpf_cmdline_ops,
> > > >                                              klp_bpf_cmdline_ops);
> > > >         if (ret)
> > > >                 return ret;
> > > >
> > > > --->    /*
> > > > --->     * We would need to wait here until the BPF program gets loaded.
> > > > --->     * for the new bpf_struct_ops in this new livepatch.
> > > > --->     */
> >
> > No waiting is necessary. If the BPF program is not attached, the
> > default logic can be executed instead.
>
> But it means a regression. I guess that you need the BPF program
> for a reason. The default logic is not good enough indeed.
>
> > Consider Song's test case: we can handle it as follows.
> >
> > static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
> > {
> >         struct klp_bpf_cmdline_ops *ops = READ_ONCE(active_ops);
> >
> >         if (ops && ops->set_cmdline)
> >                 return ops->set_cmdline(m);
> >
> >         // If no BPF program is attached, the default kernel function runs.
> >         return cmdline_proc_show(m, v);
> > }
> >
> > However, as Song explained below, if we want atomic replace to work,
> > we may need to wait for the new BPF program here. But that would make
> > the combination of livepatch and BPF more complex.
> >
> > Currently, on our production servers, we handle this through a user
> > script, such as:
> >
> > stop_traffic_relying_on_livepatch_bpf
> > kpatch load new-livepatch-bpf-module.ko
> > reattach_the_bpf_program
> > start_the_traffic_again
> >
> > Although this approach requires restarting the affected traffic, other
> > services running on the same server remain unaffected.
>
> We put a lot of effort into making livepatching as non-disruptive
> as possible. The atomic replace is supposed to work without
> any disruption.
>
> > > >         return klp_enable_patch(&patch);
> > > > }
> > >
> > > Yes, something in this direction is needed to make atomic replace work.
> > > We have no plan to use this in production. I will let Yafang figure out
> > > his plan.
> > >
> > > > Or maybe, the bpf_struct_ops can be _allocated dynamically_ and
> > > > the pointer might be _passed via shadow variables_.
> > > >
> > > > One problem is that shadow variables would add more overhead
> > > > and might not be suitable for hot paths.
> > > >
> > > >
> > > > Anyway, I think that I have similar feelings as Miroslav.
> > > > The combination of livepatches and BPF programs increases
> > > > the complexity for all involved parties: core kernel maintainers,
> > > > livepatch and BPF program authors, and system maintainers.
> > > >
> > > > Do we really want to propagate it?
> > > > Is there any significant advantage in combining these two, please?
> > > > Is it significantly easier to write a BPF program than a livepatch?
> > > > Is it significantly easier to update BPF programs than livepatches?
> >
> > This is an important feature for avoiding server restarts,
> > particularly in a VM host environment. Since only one VM on the host
> > may be affected by this feature, we can deploy it rapidly without
> > impacting other VMs on the same host.
>
> This does not answer the question of why you need the combination
> of livepatch + BPF. Why is a livepatch not enough?

Consider this recent use case from our production servers:

  https://lore.kernel.org/live-patching/caloahbdnnba_w_nwh3-s9gaxw0+vkulth1gy5hy9yqgeo4c...@mail.gmail.com/

In one of our clusters, we needed to route BGP traffic through
specific NICs based on destination IP addresses. To achieve this
without service interruption, we applied a livepatch to
bond_xmit_3ad_xor_slave_get() to introduce a new hook,
bond_get_slave_hook(). We then attached a BPF program to this hook to
select the outgoing NIC by parsing the SKB. Because the destination
IPs must be adjusted on demand, a static livepatch alone was
insufficient; the BPF integration provided the necessary dynamic
flexibility.
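
To make the split concrete, here is a rough user-space model of the
idea, not the actual patch. Every name in it (route_table,
route_update, lookup_hook, select_slave, active_hook) is hypothetical
and stands in for the real pieces: select_slave() plays the role of
the livepatched bond_xmit_3ad_xor_slave_get(), lookup_hook() the
attached BPF program, and route_table a BPF map. The fallback when no
hook is attached mirrors the default-logic pattern quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MAX_ROUTES 8

struct route_entry {
	uint32_t dst_ip;	/* destination IPv4 address */
	int slave_idx;		/* index of the NIC to transmit on */
};

/* Stand-in for a BPF map: entries change at runtime, no re-patching. */
static struct route_entry route_table[MAX_ROUTES];
static size_t route_count;

/* Hook signature: returns a slave index, or -1 for "no opinion". */
typedef int (*slave_hook_fn)(uint32_t dst_ip);

/* NULL models "no BPF program attached". */
static slave_hook_fn active_hook;

/* Insert or overwrite the NIC choice for one destination IP. */
static int route_update(uint32_t dst_ip, int slave_idx)
{
	for (size_t i = 0; i < route_count; i++) {
		if (route_table[i].dst_ip == dst_ip) {
			route_table[i].slave_idx = slave_idx;
			return 0;
		}
	}
	if (route_count == MAX_ROUTES)
		return -1;	/* table full */
	route_table[route_count].dst_ip = dst_ip;
	route_table[route_count].slave_idx = slave_idx;
	route_count++;
	return 0;
}

/* Models the attached program: look the destination up in the table. */
static int lookup_hook(uint32_t dst_ip)
{
	for (size_t i = 0; i < route_count; i++) {
		if (route_table[i].dst_ip == dst_ip)
			return route_table[i].slave_idx;
	}
	return -1;
}

/* Models the unpatched behaviour: pick a slave from the flow hash. */
static int default_slave(uint32_t hash, int nslaves)
{
	assert(nslaves > 0);
	return (int)(hash % (uint32_t)nslaves);
}

/* Models the livepatched function: ask the hook first, else fall back. */
static int select_slave(uint32_t dst_ip, uint32_t hash, int nslaves)
{
	slave_hook_fn hook = active_hook;

	if (hook) {
		int idx = hook(dst_ip);

		if (idx >= 0 && idx < nslaves)
			return idx;
	}
	return default_slave(hash, nslaves);
}
```

The point of the model: changing a destination's NIC is a
route_update() call (a map update in the real setup), while the
patched function itself stays untouched. Doing the same with a
livepatch alone would mean building and loading a new patch for every
change of the IP list.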
>
> Let me repeat the questions:
>
> Is it significantly easier to write a BPF program than a livepatch?
> Is it significantly easier to install BPF programs than livepatches?
>
> > > > Would the support of different replace tags help?
> > > > They would allow to replace only livepatches with the same tag.
> >
> > Right, it will help.
>
> Would this make a rapid update of livepatches easy enough so that
> you won't need the BPF part?

As explained above, we cannot rely solely on livepatching to handle
destination IP changes, as these require real-time updates.

--
Regards
Yafang