-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

"Matthew R. Ochs" <mro...@linux.vnet.ibm.com> writes:


> The process_sense() routine can perform a read capacity which
> can take some time to complete. If an EEH occurs while waiting
> on the read capacity, the EEH handler is unable to obtain the
> context's mutex in order to put the context in an error state.
> The EEH handler will sit and wait until the context is free,
> but this wait can last longer than the EEH handler tolerates,
> leading to a failed recovery.

I'm not quite clear on what you mean by the EEH handler timing
out. AFAIK there's nothing in eehd and the EEH core that times out if a
driver doesn't respond - indeed, it's pretty easy to hang eehd with a
misbehaving driver.

Are you referring to your own internal timeouts?
cxlflash_wait_for_pci_err_recovery and anything else that uses
CXLFLASH_PCI_ERROR_RECOVERY_TIMEOUT?

Regards,
Daniel

>
> To address this issue, make the context unavailable to new,
> non-system owned threads and release the context while calling
> into process_sense(). After returning from process_sense() the
> context mutex is reacquired and the context is made available
> again. The context can be safely moved to the error state if
> needed during the unavailable window as no other threads will
> hold its reference.
>
> Signed-off-by: Matthew R. Ochs <mro...@linux.vnet.ibm.com>
> Signed-off-by: Manoj N. Kumar <ma...@linux.vnet.ibm.com>
> Reviewed-by: Brian King <brk...@linux.vnet.ibm.com>
> ---
>  drivers/scsi/cxlflash/superpipe.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/scsi/cxlflash/superpipe.c 
> b/drivers/scsi/cxlflash/superpipe.c
> index a6316f5..7283e83 100644
> --- a/drivers/scsi/cxlflash/superpipe.c
> +++ b/drivers/scsi/cxlflash/superpipe.c
> @@ -1787,12 +1787,21 @@ static int cxlflash_disk_verify(struct scsi_device 
> *sdev,
>        * inquiry (i.e. the Unit attention is due to the WWN changing).
>        */
>       if (verify->hint & DK_CXLFLASH_VERIFY_HINT_SENSE) {
> +             /* Can't hold mutex across process_sense/read_cap16,
> +              * since we could have an intervening EEH event.
> +              */
> +             ctxi->unavail = true;
> +             mutex_unlock(&ctxi->mutex);
>               rc = process_sense(sdev, verify);
>               if (unlikely(rc)) {
>                       dev_err(dev, "%s: Failed to validate sense data (%d)\n",
>                               __func__, rc);
> +                     mutex_lock(&ctxi->mutex);
> +                     ctxi->unavail = false;
>                       goto out;
>               }
> +             mutex_lock(&ctxi->mutex);
> +             ctxi->unavail = false;
>       }
>  
>       switch (gli->mode) {
> -- 
> 2.1.0
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: GPGTools - https://gpgtools.org

iQIcBAEBCgAGBQJWCeieAAoJEPC3R3P2I92F+hMP/1OdLQCin+kKbOb9qxf952bH
DUAkmEhc0oD7xZFQI8HgDmHRxpes5HHxXtwXFsLgsr8QYG+aOIV568GXIZtTbrl0
aCFMqtKXZ6jVqv5L60r1tgzcWxmWdshMLd1op6t3BwA67nUc5Edcr94ePUyDDLj1
at335wCnxuGxn0kdB0Ud/lbPzTsgDPcuV6tCLy0o4J15KFOyFt9hCjO4nmL/wcIt
kmjyn5SHbdgje+73uaRQnXkli4wDA9x7x6/8wFgLspnOxgMEJgnHmm+HYbOXnHyX
nFFHw9+X2ETUcucVWuKNaFzW1vH+WJDteEZbjS7t7liJIkmIiZSFHyUTtVGdBkl1
FsWswA0pkzuGq94Wsb0nGtNHbsMw+WeWTcTlNN46DMG/wqz75iO3yMGK9MZuddSX
9jUokiM0kQvvfwAoujmvpMCVB4b2oseRRG4/yJ0lKSCcC8kETQTXgVHbT8oLmCdk
rUA0hxbbKzVQsDzw8s5HqYZjqHdLp3sDPeyukPeJl2CNhysrmnyHXpq8XgcLi3op
kbuuiR3z8UH3MW4BDpplnjhZ+5Wyw9cSI57vRF2Kr80NnU+5hBvftNh4rBneeny2
0gCDlPHDvB7Ks9HkcxkK9MW78FTgj50ePofS/dUUod4M9ohDd4MSwRKjpwQ+H3By
jmxnzfvWO/oTlL1D9+2W
=3BcU
-----END PGP SIGNATURE-----
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Reply via email to