On Fri, 2017-05-19 at 09:36 +0000, Dashi DS1 Cao wrote:
> It seems there is a race of multiple "fc_starget_delete" of the same rport,
> thus of the same SCSI host. The race leads to the race of scsi_remove_target
> and it cannot be prevented by the code snippet alone, even of the most recent
> version:
>         spin_lock_irqsave(shost->host_lock, flags);
>         list_for_each_entry(starget, &shost->__targets, siblings) {
>                 if (starget->state == STARGET_DEL ||
>                     starget->state == STARGET_REMOVE)
>                         continue;
> If there is a possibility that the starget is under deletion(state ==
> STARGET_DEL), it should be possible that list_next_entry(starget, siblings)
> could cause a read access violation.

Hello Dashi,

Something else must be going on. From scsi_remove_target():

restart:
        spin_lock_irqsave(shost->host_lock, flags);
        list_for_each_entry(starget, &shost->__targets, siblings) {
                if (starget->state == STARGET_DEL ||
                    starget->state == STARGET_REMOVE)
                        continue;
                if (starget->dev.parent == dev || &starget->dev == dev) {
                        kref_get(&starget->reap_ref);
                        starget->state = STARGET_REMOVE;
                        spin_unlock_irqrestore(shost->host_lock, flags);
                        __scsi_remove_target(starget);
                        scsi_target_reap(starget);
                        goto restart;
                }
        }
        spin_unlock_irqrestore(shost->host_lock, flags);

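In other words, before scsi_remove_target() calls __scsi_remove_target(),
it sets the target state to STARGET_REMOVE while holding the host lock.
Any concurrent caller that walks the list afterwards skips that target,
so __scsi_remove_target() won't be called twice for the same target. The
kref_get() on reap_ref keeps starget allocated while the lock is dropped,
and the "goto restart" means that list_next_entry(starget, siblings) is
never evaluated on an starget that may have been freed in the meantime.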
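
To make the pattern easier to see in isolation, here is a minimal user-space
sketch of "claim under the lock, drop the lock, restart the walk". It is not
the kernel code: struct host, struct target, teardown(), put_target() and
remove_targets() are hypothetical names, a pthread mutex stands in for the
host lock, a plain counter stands in for the kref, and targets are never
unlinked or freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scsi_target states. */
enum tstate { T_RUNNING, T_REMOVE, T_DEL };

struct target {
	struct target *next;	/* stands in for the siblings list */
	enum tstate state;
	int parent_id;		/* stands in for dev.parent */
	int refs;		/* stands in for reap_ref */
	int teardown_calls;	/* bookkeeping for this sketch only */
};

struct host {
	pthread_mutex_t lock;	/* stands in for host_lock */
	struct target *targets;
};

/* Runs without the lock held, like __scsi_remove_target(). */
static void teardown(struct host *h, struct target *t)
{
	t->teardown_calls++;
	pthread_mutex_lock(&h->lock);
	t->state = T_DEL;
	pthread_mutex_unlock(&h->lock);
}

/* Drops the reference taken by the walker; nothing is freed here. */
static void put_target(struct target *t)
{
	t->refs--;
}

/* Returns how many targets this call claimed and tore down. */
static int remove_targets(struct host *h, int parent_id)
{
	struct target *t;
	int removed = 0;

restart:
	pthread_mutex_lock(&h->lock);
	for (t = h->targets; t; t = t->next) {
		/* Already claimed or torn down by someone: skip it. */
		if (t->state == T_DEL || t->state == T_REMOVE)
			continue;
		if (t->parent_id == parent_id) {
			t->refs++;		/* pin before unlocking */
			t->state = T_REMOVE;	/* claim under the lock */
			pthread_mutex_unlock(&h->lock);
			teardown(h, t);
			put_target(t);
			removed++;
			/* Never follow t->next after unlocking. */
			goto restart;
		}
	}
	pthread_mutex_unlock(&h->lock);
	return removed;
}
```

A second call to remove_targets() for the same parent finds only targets
that are already marked and does nothing, which is the property described
above.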

Bart.
