On 12/28/2014 03:17 PM, Davidlohr Bueso wrote:
> On Sat, 2014-12-27 at 10:52 -0500, Sasha Levin wrote:
>> There's a chance that lock->owner would change, but how would you explain
>> it changing to 'current'?
>
> So yeah, the above only deals with the weird printk values, not the
> actual issue that triggers the BUG_ON. Let's sort this out first and at
> least get correct data.
Is there an issue with weird printk values? I haven't seen a report of
something like that, nor have I seen it myself.

>> That is, what race condition specifically creates the
>> 'lock->owner == current' situation in the debug check?
>
> Why do you suspect a race as opposed to a legitimate recursion issue?
> Although after staring at the code for a while, I cannot see foul play
> in sched_rr_get_interval.
>
> Given that all reports show a bogus contending CPU and .owner_cpu, I do
> wonder if this is actually a symptom of the BUG_ON, where something fishy
> is going on... although I have no evidence to support that. I also ran
> into https://lkml.org/lkml/2014/11/7/762 which shows the same bogus
> values yet a totally different stack.
>
> Sasha, I ran trinity with CONFIG_DEBUG_SPINLOCK=y all night without
> triggering anything. How are you hitting this?

I don't have a reliable way of reproducing it. The only two things I can
think of are:

 - Try running as root in a disposable VM.
 - Try running with a really high load (I use ~800 children on 16-vcpu
   guests).

Thanks,
Sasha