On Wed, Aug 19, 2020 at 11:50:55AM -0700, Linus Torvalds wrote: > On Wed, Aug 19, 2020 at 12:22 AM <pet...@infradead.org> wrote: > > > > That is, the external serialization comes from the non-atomic > > test-and-set they both have. This works nicely when there is external > > state that already serializes things, but totally comes apart (and > > causes trivial list corruption) when you get it wrong. > > Quite often, there just isn't any *need* for serialization, because > there is only ever one op active. > > That can be either because the csd is fundamentally a single thing ("I > will transfer this object to another CPU"), or it can be because the > CSD is already per-cpu (ie smp_call_function_single).
Note that smp_call_function_single() doesn't use _async(), and while it does use per-cpu CSDs, the reqular .wait=false case then potentially waits before (instead of after) in the unlikely case the local CSD was already taken. This makes that we cannot use it with IRQs disabled. > You seem to make that common situation much worse. Probably -- like I said, I'm not really happy with this either. > Not only do you add that expensive atomic op, you add that expensive > "use irq_work queues" for something that doesn't _need_ to use them. I'm not sure I get the "expensive irq_work queues" argument, I fully agree with you that adding the atomic op is fairly crap. The remote irq_work is exactly same llist and IPI vector as remote smp_call_function_single(). And the new irq_work_queue_remote_static() is exactly as cheap as smp_call_function_single_async() for not having an atomic op, queues to the same llist, sends the same IPI, but uses the embedded data structure instead of pass a data pointer pattern. Because most of the async users end up having another data structure to pass arguments around anyway. > I have to say, I'm not a fan. What are the real advantages? Your > listed disadvantages are very very questionable. > > IOW, what are the actual examples of "totally comes apart" that justifies > this? > > If the example is theoretical ("if you use csd's wrong") then I think > they are worthless. Well, I did use the CSD's wrong. I forgot the gotcha and made a mess of things and stuff crashed, the wreckage is in the git history :/ I fixed it, but I got burned. I then went and looked at a bunch of other _async users and it wasn't immediately obvious that they were doing it right (they were). I wanted to improve things, but I'm willing to admit to just making it worse in these last few patches. Anyway, the pattern I wanted is more easily expressed with the new irq_work_queue_remote(), even though I don't need it anymore. I can throw away all the patches after that without loosing too much sleep.