On Wed, Apr 08, 2020 at 09:51:45AM +0200, Martin Pieuchot wrote: > On 07/04/20(Tue) 19:58, Vitaliy Makkoveev wrote: > > > > > > > On 7 Apr 2020, at 17:43, Martin Pieuchot <m...@openbsd.org> wrote: > > > > > > On 07/04/20(Tue) 17:14, Vitaliy Makkoveev wrote: > > >> As Claudio Jeker noticed, NET_LOCK() can release KERNEL_LOCK(). pppx(4) > > >> code has some NET_LOCK() dances which make it unsafe. [...] > > > > > > The easiest way to fix that is to move if_detach() out of > > > pppx_if_destroy(). > > > > pppxioctl() is under KERNEL_LOCK(). pppxioctl() holds NET_LOCK() it’s whole > > runtime. So your suggestion is (pseudo code): > > > > KERNEL_LOCK(); > > > > pppx_ioctl() > > { > > NET_LOCK(); > > > > switch() { > > case PIPEXDSESSION: > > NET_UNLOCK(); > > /* > > * KERNEL_LOCK() can be released here and concurrent > > * thread with PIPEXDSESSION case will enter here > > */ > > if_detach(); > > NET_LOCK(); > > > > pppx_if_destroy(); > > break; > > } > > > > NET_UNLOCK(); > > } > > > > KERNEL_UNLOCK(); > > > > I understood well? > > That or the other way around. The question is who can grab a reference > on `ifp'? What is currently serializing this? Is the NET_LOCK() at all > necessary in pppx_ioctl() what is it protecting? All the pipex(4) code > seems to be running under KERNEL_LOCK() anyway. > > I'm wondering if it isn't simpler to answer those questions and fix the > current code once pppx(4) is using pipex_add_session().
As I see (pseudo code): KERNEL_LOCK(); pppx_ioctl() { NET_LOCK(); switch() { case PIPEXASESSION: /* pppx_add_session() */ pxi->pxi_ready = 0; rw_enter_write(&pppx_ifs_lk); RBT_INSERT(pxi) rw_exit_write(&pppx_ifs_lk); NET_UNLOCK(); /* (1) release KERNEL_LOCK() */ if_attach(); NET_LOCK(); rw_enter_write(&pppx_ifs_lk); pxi->pxi_ready = 1; rw_exit_write(&pppx_ifs_lk); break; case PIPEXDSESSION: /* pppx_del_session() */ /* * rw_enter_read(&pppx_ifs_lk); * find and return pxi with 'pxi_ready != 0' * rw_exit_read(&pppx_ifs_lk); */ pxi = pppx_if_find(); /* (3) unlocked pxi gap. pxi can be grabbed */ NET_UNLOCK(); /* (2) release KERNEL_LOCK() */ if_detach(); NET_LOCK(); /* kill pxi */ break; case PIPEXCOMMAND: /* * pppx_config_session() * pppx_get_stat() * pppx_get_closed() * pppx_set_session_descr() */ /* * rw_enter_read(&pppx_ifs_lk); * find and return pxi with 'pxi_ready != 0' * rw_exit_read(&pppx_ifs_lk); */ pxi = pppx_if_find(); break; } NET_UNLOCK(); } KERNEL_UNLOCK(); 1. Since pppx_if_find() can't return half initialized pppx_if there is no concurency with pppx_add_session() and others. 2. Except pppx_del_session(), various pppx commands don't play with NET_LOCK() but the can receive cpu if someone is in (1) or (2). They can't grab half-initilaized pppx_if, but can grab half destroyed pppx_if. 3. pppx_del_session() can grab half destroyed pppx_if 4. Half destroyed pppx_if is still in pppx_if three and queue, so it can be grabbed (3) my diff prevents (3) (pseudo code): KERNEL_LOCK(); pppx_ioctl() { NET_LOCK(); switch() { case PIPEXDSESSION: /* pppx_del_session() */ rw_enter_write(&pppx_ifs_lk); /* * find pxi with 'pxi_ready != 0' * remove it from tree and list */ rw_exit_write(&pppx_ifs_lk); /* * nobody can grab this pxi here we are the only holder * unlocked gap (3) gone */ NET_UNLOCK(); /* (2) release KERNEL_LOCK() */ /* nobody enters here with our pxi */ if_detach(); NET_LOCK(); /* kill pxi */ break; case PIPEXCOMMAND: /* * pppx_config_session() * pppx_get_stat() * pppx_get_closed() * pppx_set_session_descr() */ /* * rw_enter_read(&pppx_ifs_lk); * find and return pxi with 'pxi_ready != 0' * rw_exit_read(&pppx_ifs_lk); */ pxi = pppx_if_find(); break; } NET_UNLOCK(); } KERNEL_UNLOCK(); So, I fixed this issue :)