On Sat, Jun 27, 2020 at 12:10:24PM +0200, Martin Pieuchot wrote:
> On 27/06/20(Sat) 00:35, Vitaliy Makkoveev wrote:
> > On Fri, Jun 26, 2020 at 09:12:16PM +0200, Martin Pieuchot wrote:
> > > On 26/06/20(Fri) 16:56, Vitaliy Makkoveev wrote:
> > > > if_clone_create() has the races caused by context switch.
> > > 
> > > Can you share a backtrace of such race?  Where does the kernel panic?
> > >
> > 
> > This diff was inspired by thread [1]. As I explained [2] here is 3
> > issues that cause panics produced by command below:
> > 
> > ---- cut begin ----
> > for i in 1 2 3; do while true; do ifconfig bridge0 create& \
> >     ifconfig bridge0 destroy& done& done
> > ---- cut end ----
> 
> Thanks, I couldn't reproduce it on any of the machines I tried.  Did you
> managed to reproduce it with other pseudo-devices or just with bridge0?
> 

In thread [1] you talked about bridge(4), tun(4) and vether(4). A first
I fixed races in if_clone_destroy() and I caught the races with
if_clone_create() while I run your initial comman but with vether(4)

---- cut begin ----
for i in 0 1 2 3 4 5 6 7; do while true; \
        do cat /dev/vether0& ifconfig vether0 destroy& done& done
---- cut end ----

It's hard to reproduce this issue. The best chances for me is bare metal
8 cores, fully unloaded system, no X, no active processes, test started
at console and all output redirected to /dev/null. And it can take
*hours* to catch. I can't reproduce this on 2 cores. I can't reproduce
this at 4 cores under kvm but it's reproducible under virtual box under
osx. The hosts has 8 cores. I can reproduce this on bare metal with 4
cores, but also it takes time.

Routine called by `ifc_create' within if_clone_attach() is very specific
to each pseudo interface. if_attach() is the only common point to sleep
for them, but you also can sleep in any point of sleep before
`ifc_create' will call if_attach(), For exmaple you will alloc software
context with `M_WAITOK'.

bridge(4) is just the best way to reproduce to me.

You have all `ifnet's linked to `if_list'. ifunit() does linear search
in this list by compare `ifp->if_xname' and given `name'. So if you
inserted many `ifnet's to this list ifunit() will return you first. but
if_get(9) doesn't work with this list. So if you have the case I talk
above if_get(9) and ifname() are inconsistent. Some times in the stack
you use if_get(9) sometimes you use ifunit() so you work every time with
diffetrent `ifnet's with the same `if_xname'. You can't predict where
`ifnet' will be corrupted.

> > My system was stable with the last diff I did for thread [1]. But since
> > this final diff [3] which include fixes for tun(4) is quick and dirty
> > and not for commit I decided to make the diff to fix the races caused by
> > if_clone_create() at first.
> > 
> > I included screenshot with panic.
> 
> Thanks, interesting that the corruption happens on a list that should be
> initialized.  Does that mean the context switch on Thread 1 is happening
> before if_attach_common() is called?
> 

I don't know where it was. if_attach() doesn't checks if `ifnet' with
the name in `if_xname' already linked. You will insert passed `ifnet' in
any cases. If you have more then one `ifnet' with identical `if_xname'
you have broken ifunit() and if_get() logic.

Look at if_attach():

---- cut begin ----
if_attach(struct ifnet *ifp)
{
        if_attach_common(ifp);
        NET_LOCK();
        TAILQ_INSERT_TAIL(&ifnet, ifp, if_list); /* (1) */
        if_attachsetup(ifp);
        NET_UNLOCK();
}

You link `ifp' at (1). And it's still your `ifp' before and after context
switch ot without context switch. You will brake it later. The reason is
pseudo driver received the same `unit' more than once. And it created
two or more software context with identical `unit'. And internal pseudo
driver's logic is broken. Also ifunit() and if_get(9) are inconsistent
now. You can break memory everythere.

---- cut end ----
> You said your previous email that there's a context switch.  Do you know
> when it happens?  You could see that in ddb by looking at the backtrace
> of the other CPU.
> 
> Is the context switch leading to the race common to all pseudo-drivers
> or is it in the bridge(4) driver?

ddb(4) is useless. The panic occured while we are trying to if_detach()
already broken `ifnet'. There is no reces here. But the rases was
*before* and we inserted two or more `ifnet's with the same name to
`if_list'. This insertion is no panic condition.

The first time I caught this races while I connected to you [1] thread.
I inserted ifunit() call to if_attach() as below and received panic so
I'am shure about the reason:

---- cut begin ----
if_attach(struct ifnet *ifp)
{
        if_attach_common(ifp);
        NET_LOCK();
        KASSERT(ifunit(ifp->if_xname));
        TAILQ_INSERT_TAIL(&ifnet, ifp, if_list);
        if_attachsetup(ifp);
        NET_UNLOCK();
}
---- cut end ----

But in thread [1] you said these races with pseudo interfaces are very
old well know issue, so I didn't take photos of panics caused by races
of if_clone_{create,destroy}().

> 
> Regarding your solution, do I understand correctly that the goal is to
> serialize all if_clone_create()?  Is it really needed to remember which
> unit is being currently created or can't we just serialize all of them?
> 

No. There is no serialization required. And you can do simultaneous
if_clone_create() but with different units:
if_clone_create("ifname0");
if_clone_create("ifname1");
if_clone_create("ifname2");
....
if_clone_create("ifnameN");

The goal is to be shure that obtained `unit' will not be used until we
release it. This denies to initialize software context with the same
`units' and you can't insert duplicate `if_xname' to `if_list'. This
makes if_get(9) and ifunit() happy.

And yes it's really needed to store already used units. Or pseudo
driver should store it's units and check it. Since you obtained the
`unit' you should *deny* to use it to anyone else until you release it.
Pesudo driver trusts you that `unit' passed to `ifc_create' is unique.
And it should be unique elsewhere you should duplicate the same check
logic in *every* pseudo-driver.

> The fact that a lock is not held over the cloning operation is imho
> positive.
> 

Also I guess I fixed the races you discussed in thread [1]. So I decided
to rework my diff [2] to many diffs to be ready to commit.

Diff I posted to this thread is for races in if_clone_create() only. But
also you have use-after-free issues caused by races in
if_clone_destroy(). And panics caused by these races are easy to catch.

Also pppx(4) and pipex(4) had fully identical issues and I fixed them
already :)

1. https://marc.info/?t=159289590100001&r=1&w=2
2. https://marc.info/?l=openbsd-tech&m=159308633126243&w=2

Reply via email to