On Sat, Jun 27, 2020 at 12:10:24PM +0200, Martin Pieuchot wrote:
> On 27/06/20(Sat) 00:35, Vitaliy Makkoveev wrote:
> > On Fri, Jun 26, 2020 at 09:12:16PM +0200, Martin Pieuchot wrote:
> > > On 26/06/20(Fri) 16:56, Vitaliy Makkoveev wrote:
> > > > if_clone_create() has races caused by context switches.
> > >
> > > Can you share a backtrace of such a race? Where does the kernel panic?
> >
> > This diff was inspired by thread [1]. As I explained in [2], there are
> > 3 issues that cause the panics produced by the command below:
> >
> > ---- cut begin ----
> > for i in 1 2 3; do while true; do ifconfig bridge0 create& \
> >     ifconfig bridge0 destroy& done& done
> > ---- cut end ----
>
> Thanks, I couldn't reproduce it on any of the machines I tried. Did you
> manage to reproduce it with other pseudo-devices or just with bridge0?
In thread [1] you talked about bridge(4), tun(4) and vether(4). At first I
fixed the races in if_clone_destroy(), and then I caught the races in
if_clone_create() while running your initial command, but with vether(4):

---- cut begin ----
for i in 0 1 2 3 4 5 6 7; do while true; \
    do cat /dev/vether0& ifconfig vether0 destroy& done& done
---- cut end ----

It's hard to reproduce this issue. The best chances for me are bare metal
with 8 cores, a fully unloaded system, no X, no active processes, the test
started at the console with all output redirected to /dev/null. Even then
it can take *hours* to catch. I can't reproduce it with 2 cores. I can't
reproduce it with 4 cores under kvm, but it is reproducible under
VirtualBox under OS X (the host has 8 cores). I can also reproduce it on
bare metal with 4 cores, but that takes time too.

The routine called through `ifc_create' within if_clone_attach() is
specific to each pseudo interface. if_attach() is the only common sleep
point for all of them, but a driver can also sleep at any point before
`ifc_create' calls if_attach(), for example while allocating its software
context with `M_WAITOK'. bridge(4) is just the easiest way for me to
reproduce it.

All `ifnet's are linked to `if_list'. ifunit() does a linear search in
this list, comparing `ifp->if_xname' against the given `name', so if you
insert several `ifnet's with the same name, ifunit() always returns the
first one. if_get(9), however, doesn't use this list. So in the case
described above, if_get(9) and ifunit() become inconsistent: some places
in the stack use if_get(9) and some use ifunit(), so they can operate on
different `ifnet's with the same `if_xname'. You can't predict where the
`ifnet' will be corrupted.

> > My system was stable with the last diff I did for thread [1]. But
> > since this final diff [3], which includes fixes for tun(4), is quick
> > and dirty and not for commit, I decided to first make a diff that
> > fixes the races caused by if_clone_create().
> >
> > I included a screenshot with the panic.
> Thanks, interesting that the corruption happens on a list that should be
> initialized. Does that mean the context switch on Thread 1 is happening
> before if_attach_common() is called?

I don't know where it was. if_attach() doesn't check whether an `ifnet'
with the name in `if_xname' is already linked: it inserts the passed
`ifnet' unconditionally. Once there is more than one `ifnet' with an
identical `if_xname', the ifunit() and if_get(9) logic is broken. Look at
if_attach():

---- cut begin ----
if_attach(struct ifnet *ifp)
{
	if_attach_common(ifp);
	NET_LOCK();
	TAILQ_INSERT_TAIL(&ifnet, ifp, if_list);	/* (1) */
	if_attachsetup(ifp);
	NET_UNLOCK();
}
---- cut end ----

You link `ifp' at (1), and it is still your `ifp' before and after a
context switch, or without one. The breakage comes later: the pseudo
driver received the same `unit' more than once, so it created two or more
software contexts with an identical `unit', and the driver's internal
logic is broken. Also ifunit() and if_get(9) are now inconsistent, so
memory can be corrupted anywhere.

> You said in your previous email that there's a context switch. Do you
> know when it happens? You could see that in ddb by looking at the
> backtrace of the other CPU.
>
> Is the context switch leading to the race common to all pseudo-drivers
> or is it in the bridge(4) driver?

ddb(4) is useless here. The panic occurs while we try to if_detach() an
already broken `ifnet'. There is no race at that point; the races happened
*before*, when we inserted two or more `ifnet's with the same name into
`if_list'. The insertion itself is not a panic condition. The first time I
caught these races was while I was following your thread [1].
I inserted an ifunit() call into if_attach() as below and received the
panic, so I am sure about the reason:

---- cut begin ----
if_attach(struct ifnet *ifp)
{
	if_attach_common(ifp);
	NET_LOCK();
	KASSERT(ifunit(ifp->if_xname) == NULL);
	TAILQ_INSERT_TAIL(&ifnet, ifp, if_list);
	if_attachsetup(ifp);
	NET_UNLOCK();
}
---- cut end ----

But in thread [1] you said these races with pseudo interfaces are a very
old, well-known issue, so I didn't take photos of the panics caused by the
if_clone_{create,destroy}() races.

> Regarding your solution, do I understand correctly that the goal is to
> serialize all if_clone_create()? Is it really needed to remember which
> unit is being currently created or can't we just serialize all of them?

No, no serialization is required. You can do simultaneous
if_clone_create() calls, just with different units:

	if_clone_create("ifname0");
	if_clone_create("ifname1");
	if_clone_create("ifname2");
	...
	if_clone_create("ifnameN");

The goal is to be sure that an obtained `unit' will not be reused until we
release it. This prevents initializing software contexts with the same
`unit', so you can't insert a duplicate `if_xname' into `if_list'. This
makes if_get(9) and ifunit() happy.

And yes, it's really necessary to store the already used units; otherwise
every pseudo driver would have to store its units and check them itself.
Once you have obtained a `unit' you must *deny* it to anyone else until
you release it. The pseudo driver trusts that the `unit' passed to
`ifc_create' is unique, and it should be unique; otherwise you would have
to duplicate the same check logic in *every* pseudo-driver.

> The fact that a lock is not held over the cloning operation is imho
> positive.

Also, I guess I fixed the races you discussed in thread [1], so I decided
to rework my diff [2] into several diffs to be ready to commit. The diff I
posted to this thread is for the races in if_clone_create() only. But
there are also use-after-free issues caused by the races in
if_clone_destroy(), and the panics caused by those races are easy to
catch.
Also pppx(4) and pipex(4) had fully identical issues, and I fixed them
already :)

1. https://marc.info/?t=159289590100001&r=1&w=2
2. https://marc.info/?l=openbsd-tech&m=159308633126243&w=2