[snip]
>> Hmmm, UML is UP and does not support PREEMPT, so all spinlocks
>> should be a no-op.
>
> In that case, if I understand correctly what is going on, there are a
> couple of places - free_irqs(), activate_fd and the sigio handler
> itself - where it should be a mutex, not a spinlock. It is there to
> ensure that you cannot use it in an interrupt context while it is
> being modified.
>
> If the spinlock is a NOP it fails to perform this duty. The code should
> also be different - it should return on a failed try_lock so it does not
> deadlock, so spin_lock_irqsave is the wrong locking primitive as it does
> not have this functionality.
>
> That is an issue both with this patch and with the original poll-based
> controller - there free_irq, add_fd and reactivate_fd can all
> theoretically produce a race if you are adding/removing devices while
> under high IO load.
We, however, cannot use a mutex here as this runs in interrupt context. I
tried spin_trylock and finally got the correct behaviour. It throws an
occasional warning here and there while inserting/removing devices, but it
works correctly with either config. No more BUGs.

A bare (non-try) spinlock, with UP and PREEMPT set as they are in UML, does
not effectively guard anything - it is a NOP. The try form is an exception -
if you look at spinlock.h it is actually "viable" even on UP. It will,
however, throw a warning that it was taken in an inappropriate context if it
hits an already-held lock.

In theory the code in signal.c should guard against nested interrupt
invocation; I am still struggling to understand why that fails to work in
practice.

This also leaves open the question of how to add/remove interrupts. If the
spinlock does not actually guard the irq data structures, modifying them in
a safe manner becomes a very interesting exercise.

I have it working with the try form, but that throws the occasional warning.
I am going to clean it up and re-submit so we have a "working version" which
people can comment on. A rough sketch of the trylock scheme is at the bottom
of this mail, below the quoted text.

A.

>
> A.
>
>> Do you have lock debugging enabled?
>>
>> In this case I'd start gdb and inspect the memory. Maybe a stack
>> corruption.
>>
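For illustration only, here is a minimal sketch of the trylock scheme
described above. It is not the actual arch/um/kernel/irq.c code - the
structure and function names (irq_entry, active_fds, add_irq_entry,
sigio_walk_fds) are made up, and the real handler does more than walk a
list - but it shows the writer side taking the lock with irqsave while the
interrupt-context reader backs off via spin_trylock instead of spinning:

/*
 * Sketch only: illustrative names, not the real UML IRQ controller code.
 */
#include <linux/spinlock.h>
#include <linux/list.h>

struct irq_entry {
	struct list_head list;
	int fd;
	int irq;
};

static LIST_HEAD(active_fds);
static DEFINE_SPINLOCK(irq_lock);

/* Writer side (the free_irq/activate_fd paths): plain irqsave locking. */
static void add_irq_entry(struct irq_entry *entry)
{
	unsigned long flags;

	spin_lock_irqsave(&irq_lock, flags);
	list_add_tail(&entry->list, &active_fds);
	spin_unlock_irqrestore(&irq_lock, flags);
}

/*
 * Reader side, called from the SIGIO handler in interrupt context: bail
 * out instead of spinning so we cannot deadlock against a writer on UP.
 * The walk is simply retried on the next SIGIO.
 */
static void sigio_walk_fds(void)
{
	struct irq_entry *entry;

	if (!spin_trylock(&irq_lock))
		return;	/* list is being modified, try again later */

	list_for_each_entry(entry, &active_fds, list) {
		/* dispatch the IRQ for entry->fd here */
	}

	spin_unlock(&irq_lock);
}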