Re: [ufs]: [scsi]: BUG: spinlock recursion on CPU#4
On Mon, 2017-06-05 at 12:16 +0530, Asutosh Das (asd) wrote:
> It's on 4.4 and it's an Android kernel.
>
> No - I haven't tried it out yet. I could get some clues from the
> call-stack itself, like I explained before. I can try these configs
> though. While I do that, I'd like to know your thoughts on my analysis.
> Do you think with the current data, it makes sense?

Hello Asutosh,

If your analysis is correct then I think the easiest solution will be to
switch to scsi-mq. The scsi-mq .queue_rq function is called without the
host lock held and hence there is no need to unlock the host lock from
inside the queue_rq function.

Bart.
Re: [ufs]: [scsi]: BUG: spinlock recursion on CPU#4
On 6/1/2017 7:32 PM, Bart Van Assche wrote:
> On Thu, 2017-06-01 at 12:28 +0530, Asutosh Das (asd) wrote:
>> Please can you check if this is actually a bug and my understanding is
>> correct.
>
> Hello Asutosh,
>
> Spinlock recursion is always a bug. With what kernel version did you
> encounter this? Was it with a kernel from kernel.org, a distro kernel or
> an Android kernel? Have you already tried to reproduce this with kernel
> debugging enabled? Enabling CONFIG_DEBUG_SPINLOCK and CONFIG_PROVE_LOCKING
> should provide more detailed information.
>
> Bart.

Hello Bart,

Thanks.

It's on 4.4 and it's an Android kernel.

No - I haven't tried it out yet. I could get some clues from the
call-stack itself, like I explained before. I can try these configs
though. While I do that, I'd like to know your thoughts on my analysis.
Do you think with the current data, it makes sense?

--
Asutosh.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Re: [ufs]: [scsi]: BUG: spinlock recursion on CPU#4
On Thu, 2017-06-01 at 12:28 +0530, Asutosh Das (asd) wrote:
> Please can you check if this is actually a bug and my understanding is
> correct.

Hello Asutosh,

Spinlock recursion is always a bug. With what kernel version did you
encounter this? Was it with a kernel from kernel.org, a distro kernel or
an Android kernel? Have you already tried to reproduce this with kernel
debugging enabled? Enabling CONFIG_DEBUG_SPINLOCK and CONFIG_PROVE_LOCKING
should provide more detailed information.

Bart.
[ufs]: [scsi]: BUG: spinlock recursion on CPU#4
Hi All,

Recently, I came across an issue with the below call stack.

-000|arch_counter_get_cntvct(inline)
-000|__delay()
-001|__const_udelay(?)
-002|msm_trigger_wdog_bite()
-003|spin_dump(inline)
-003|spin_bug(lock = ?, ?)
-004|current_thread_info(inline)
-004|debug_spin_lock_before(inline)
-004|do_raw_spin_lock()
-005|raw_spin_lock_irqsave(lock = ?)
-006|blk_end_bidi_request(inline)
-006|blk_end_request_all(rq = ?, error = 0) <-- this tries to acquire the lock acquired by blk_delay_work (-024) and spinlock recursion occurs
-007|dm_end_request(clone = ?, error = 0)
-008|dm_done(inline)
-008|dm_softirq_done()
-009|blk_done_softirq()
-010|__read_once_size(inline)
-010|static_key_count(inline)
-010|static_key_false(inline)
-010|trace_softirq_exit(inline)
-010|__do_softirq()
-011|do_softirq_own_stack(inline)
-011|invoke_softirq(inline) <-- softirq is triggered because scsi_request_fn (-016) enabled interrupts on this cpu
-011|irq_exit()
-012|handle_IPI()
-013|gic_handle_irq()
-014|el1_irq(asm)
 -->|exception
-015|__raw_spin_unlock_irq(inline)
-015|raw_spin_unlock_irq(lock = ?)
-016|scsi_request_fn() <-- unlocks the queue using spin_unlock_irq, doesn't restore the flags, thus enabling the interrupts
-017|__blk_run_queue_uncond(inline)
-017|__blk_run_queue(q = ?)
-018|__elv_add_request()
-019|blk_insert_cloned_request() <-- acquires the queue lock & saves the flags
-020|dm_dispatch_clone_request(clone = ?, rq = ?)
-021|map_request()
-022|dm_request_fn()
-023|__blk_run_queue_uncond(inline)
-023|__blk_run_queue
-024|spin_unlock_irq(inline)
-024|blk_delay_work(?) <-- also acquires a queue lock, but this is a different queue; blk_end_request_all will reference this queue
-025|__read_once_size(inline)
-025|static_key_count(inline)
-025|static_key_false(inline)
-025|trace_workqueue_execute_end(inline)
-025|process_one_work()
-026|worker_thread()
-027|kthread()
-028|ret_from_fork(asm)
 ---|end of frame

Please can you check if this is actually a bug and my understanding is
correct.
If so, I can put up a patch for the same.

--
Asutosh Das (asd)
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project