On Sat, Jul 11, 2020 at 10:11:03AM +0200, Martin Pieuchot wrote: > On 10/07/20(Fri) 14:38, Vitaliy Makkoveev wrote: > > On Fri, Jul 10, 2020 at 01:22:40PM +0200, Martin Pieuchot wrote: > > > On 10/07/20(Fri) 14:07, Vitaliy Makkoveev wrote: > > > > We have some races in pppac(4) > > > > 1. malloc(9) can sleep so we must check `sc' presence after malloc(9) > > > > > > Makes sense. > > > > > > > 2. we can sleep between `sc' insertion to `sc_entry' list and > > > > `sc_pipex_iface' initialization. Concurrent pppacioctl() can touch > > > > this incomplete `sc'. > > > > > > Why not insert the descriptor at the end? Shouldn't the order of > > > operations be: > > > > > > pipex_iface_init(); > > > if_attach(); > > > LIST_INSERT_HEAD() > > > > > > This way there's no need for a `ready' flag since the descriptor is only > > > added to global data structures once it is completely initialized. > > > > > > Using a `sc_ready' or `sc_dead' approach is something that require > > > touching all drivers whereas serializing insertions to global data > > > structures can be done at once for all the kernel. > > > > No, because we introduce the races with if_attach(). The similar races > > are in if_clone_attach(). We can do multiple `ifp' attachment with the > > same name. > > Yes that's the same problem. It is also present in other parts of the > userland/network stack boundary. That's why I'm arguing that the best > approach is to use a lock and document which data structures it > protects. > > We should concentrate on protecting access to data structures and not > code paths. >
Let's look what we should protect in pppac(4) layer. We allocate software context in pppacopen() and destroy it in pppacclose(). We should have only one `sc' allocated for each device entry. We don't attach out `sc' to device entry, we link it to list or three and we do search of this `sc' while we access it through device entry. The criteria for search is device minor number often called as `unit'. So we should have entries in list with *unique* units elsewhere we have out device list inconsistent. We protect *unit*, not the list. Also we use `pppac_devs' list to store units which are already in use. We insert incompete `sc' to `pppac_devs' list to prevent double allocation of `sc' referenced by unique unit. Also we should keep this unit busy before we fully destroy this `sc'. pppx(4) goes this way for units used by `pppx_if' and corresponding `ifnet's. We have pppac{open,close,else}() serialized by KERNEL_LOCK() but we sleep within pppac{open,close,else}(). That means what in fact they are *not* serialized. Also than means that software context initialization and destruction are *not* atomic. The way you suggest to go is to introduce rwlock and serialize pppacopen() and pppacclose(). This is bad idea because we will sleep while we are holding rwlock. Also this is bad idea because you should prevent access to `sc' which is being destroyed because you can grab it by concurrent thread. You must serialize *all* access to this `sc' elsewhere your "protection" is useless. pppx(4) had no problems with unit protection. Also it had no problems to access incomplete `pxi'. Now pppx(4) has fixed access to `pxi' which is being destroyed. And this is the way to go in pppac(4) layer too. We have pppx_dev2pxd() to obtain `pxd'. While we adding extra check to pppx_dev2pxd() this is not system wide. Also pppac(4) already has `sc_dead' to prevent concurrent pppac_ioctl() access to dying `sc'. You suggest to serialize pppac_ioctl() too?