On 11/07/2016 04:15 PM, Chris Leech wrote: > Hi, > > I'm kicking this old thread because I don't think this ever got > resolved. I wish I had more info, but it seems to involve target > specific behavior that hasn't come up in our test labs.
Thanks very much for reopening this thread! We have the Or's patch reverted in multiple distros, so the issue is not likely to happen on customer's from IBM, but upstream kernel never saw a proper fix for this. > > So what can I do at this point to help resolve this? > > On Sun, Nov 15, 2015 at 12:10:48PM +0200, Or Gerlitz wrote: >> Mike, Chris >> >> After the locking change, adding a task to any of the connection >> mgmtqueue, cmdqueue, or requeue lists is under the session forward lock. >> >> Removing tasks from any of these lists in iscsi_data_xmit is under >> the session forward lock and **before** calling down to the transport >> to handle the task. >> >> The iscsi_complete_task helper was added by Mike's commit >> 3bbaaad95fd38ded "[SCSI] libiscsi: handle cleanup task races" >> and is indeed typically called under the backward lock && has this section >> >> + if (!list_empty(&task->running)) >> + list_del_init(&task->running); >> >> which per my reading of the code never comes into play, can you comment? >> >> Lets address this area before we move to the others claims made on the patch. > > This bit in particular is where I see a cause for concern. If that > list_del_init call ever races against other list operations, there's a > potential corruption. It's presumably there for a reason, and Mike > explained a case where some targets have been known to send a check > condition at unexpected times that would hit it. > > I don't like having known list locking violations hanging around, based > on an expectation that we'll never hit that path with well behaved > targets. > > If we can get a fix worked up for the list locking here, can we get any > testing on it from the original reports at IBM? That was very helpful > in testing a full reversion patch. Sure! Count on us to test any patches. I guess the first step is to reproduce on upstream right? We haven't tested specifically this scenario for long time. Will try to reproduce on 4.9-rc4 and update here. Cheers, Guilherme > > - Chris Leech > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html