On 07/10/2017 09:06 AM, Yijing Wang wrote:
> The disco mutex was introduced to prevent domain rediscovery from competing
> with ATA error handling (87c8331). If sas_revalidate_domain() already holds
> the lock and executes the probe synchronously, a deadlock results, because
> sas_probe_sata() also needs to take disco_mutex. Since the disco mutex is
> used to prevent domain revalidation from happening during ATA error
> handling, it should be safe to release it during the sync probe, because
> no new revalidate-domain event will be processed until the sync probe
> returns and the current sas_revalidate_domain() finishes.
> 
> Signed-off-by: Yijing Wang <wangyij...@huawei.com>
> CC: John Garry <john.ga...@huawei.com>
> CC: Johannes Thumshirn <jthumsh...@suse.de>
> CC: Ewan Milne <emi...@redhat.com>
> CC: Christoph Hellwig <h...@lst.de>
> CC: Tomas Henzl <the...@redhat.com>
> CC: Dan Williams <dan.j.willi...@intel.com>
> ---
>  drivers/scsi/libsas/sas_expander.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
> index 9d26c28..077024e 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -776,6 +776,7 @@ static struct domain_device *sas_ex_discover_end_dev(
>       struct ex_phy *phy = &parent_ex->ex_phy[phy_id];
>       struct domain_device *child = NULL;
>       struct sas_rphy *rphy;
> +     bool prev_lock;
>       int res;
>  
>       if (phy->attached_sata_host || phy->attached_sata_ps)
> @@ -803,6 +804,7 @@ static struct domain_device *sas_ex_discover_end_dev(
>       sas_ex_get_linkrate(parent, child, phy);
>       sas_device_set_phy(child, phy->port);
>  
> +     prev_lock = mutex_is_locked(&child->port->ha->disco_mutex);
>  #ifdef CONFIG_SCSI_SAS_ATA
>       if ((phy->attached_tproto & SAS_PROTOCOL_STP) || phy->attached_sata_dev) {
>               res = sas_get_ata_info(child, phy);
> @@ -832,7 +834,11 @@ static struct domain_device *sas_ex_discover_end_dev(
>                                   SAS_ADDR(parent->sas_addr), phy_id, res);
>                       goto out_list_del;
>               }
> +             if (prev_lock)
> +                     mutex_unlock(&child->port->ha->disco_mutex);
>               sas_disc_wait_completion(child->port, DISCE_PROBE);
> +             if (prev_lock)
> +                     mutex_lock(&child->port->ha->disco_mutex);
>  
>       } else
>  #endif
> @@ -861,7 +867,11 @@ static struct domain_device *sas_ex_discover_end_dev(
>                                   SAS_ADDR(parent->sas_addr), phy_id, res);
>                       goto out_list_del;
>               }
> +             if (prev_lock)
> +                     mutex_unlock(&child->port->ha->disco_mutex);
>               sas_disc_wait_completion(child->port, DISCE_PROBE);
> +             if (prev_lock)
> +                     mutex_lock(&child->port->ha->disco_mutex);
>       } else {
>               SAS_DPRINTK("target proto 0x%x at %016llx:0x%x not handled\n",
>                           phy->attached_tproto, SAS_ADDR(parent->sas_addr),
> 
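The unlock/wait/relock pattern in the hunks above can be sketched in userspace with pthreads. This is only an illustration under stated assumptions, not the libsas code: disco_mutex, probe_worker(), discover_and_wait() and revalidate_domain() are hypothetical stand-ins, and pthread_join() plays the role of sas_disc_wait_completion(). One deliberate difference: the sketch passes an explicit caller_holds_lock flag, because mutex_is_locked() only reports that some task holds the mutex, not that the calling task does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for ha->disco_mutex. */
static pthread_mutex_t disco_mutex = PTHREAD_MUTEX_INITIALIZER;
static int probed_devices;	/* protected by disco_mutex */

/* Stand-in for the probe work, which takes disco_mutex itself. */
static void *probe_worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&disco_mutex);
	probed_devices++;
	pthread_mutex_unlock(&disco_mutex);
	return NULL;
}

/*
 * Start the probe and wait for it synchronously.
 * caller_holds_lock states whether the calling thread owns disco_mutex.
 * If it does, the lock is dropped around the wait, otherwise the worker
 * could never take it and the join would block forever.
 * Returns 0 on success or a pthread error code.
 */
static int discover_and_wait(bool caller_holds_lock)
{
	pthread_t worker;
	int res;

	res = pthread_create(&worker, NULL, probe_worker, NULL);
	if (res)
		return res;

	if (caller_holds_lock)
		pthread_mutex_unlock(&disco_mutex);

	res = pthread_join(worker, NULL);	/* the "wait for completion" step */

	if (caller_holds_lock)
		pthread_mutex_lock(&disco_mutex);

	return res;
}

/*
 * Stand-in for the revalidate path: holds the lock across discovery.
 * Returns the number of devices probed so far, or -1 on error.
 */
static int revalidate_domain(void)
{
	int count;

	pthread_mutex_lock(&disco_mutex);
	if (discover_and_wait(true)) {
		pthread_mutex_unlock(&disco_mutex);
		return -1;
	}
	count = probed_devices;
	assert(count > 0);	/* the probe finished before the wait returned */
	pthread_mutex_unlock(&disco_mutex);
	return count;
}
```

Calling discover_and_wait(true) without the unlock step would hang in pthread_join(), which mirrors the deadlock described in the commit message. Whether another thread can slip in during the unlocked window is exactly the question the analysis would have to answer.
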
I would rather see an analysis of whether this really cannot happen;
'should be safe' is rather vague. But seeing that it _is_ quite complex:

Reviewed-by: Hannes Reinecke <h...@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Teamlead Storage & Networking
h...@suse.de                                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
