Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-16 Thread James Bottomley
On Thu, 2017-03-16 at 13:56 -0700, Bart Van Assche wrote:
> scsi_target_unblock() must unblock SCSI devices even if this function
> is called after unloading of the LLD that created these devices has
> started. This is necessary to prevent that __scsi_remove_device() 
> hangs on the SYNCHRONIZE CACHE command issued by the sd driver during
> shutdown.

Your special get function misses the try_module_get().  But this is all
really a bit ugly. Since the only problem is the SYNC CACHE triggered
by device_del, isn't a better solution a new state: SDEV_CANCEL_BLOCK. 
 This will make the device visible to scsi_get_device() and we can take
it back from CANCEL_BLOCKED->CANCEL when the queue is unblocked.  I
suspect we could also simply throw away the sync cache command when the
device is blocked (the cache should destage naturally in the time it
takes for the device to be unblocked).

James



Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-16 Thread Bart Van Assche
On Thu, 2017-03-16 at 15:53 -0700, James Bottomley wrote:
> On Thu, 2017-03-16 at 13:56 -0700, Bart Van Assche wrote:
> > scsi_target_unblock() must unblock SCSI devices even if this function
> > is called after unloading of the LLD that created these devices has
> > started. This is necessary to prevent that __scsi_remove_device() 
> > hangs on the SYNCHRONIZE CACHE command issued by the sd driver during
> > shutdown.
> 
> Your special get function misses the try_module_get().  But this is all
> really a bit ugly. Since the only problem is the SYNC CACHE triggered
> by device_del, isn't a better solution a new state: SDEV_CANCEL_BLOCK. 
>  This will make the device visible to scsi_get_device() and we can take
> it back from CANCEL_BLOCKED->CANCEL when the queue is unblocked.  I
> suspect we could also simply throw away the sync cache command when the
> device is blocked (the cache should destage naturally in the time it
> takes for the device to be unblocked).

Hello James,

The purpose of this patch series is to make sure that unblock also occurs
after module unloading has started. My understanding of try_module_get() is
that it fails once module unloading has started. In other words, it is on
purpose that try_module_get() is not called. From the kernel module code:

bool try_module_get(struct module *module)
{
bool ret = true;

if (module) {
preempt_disable();
/* Note: here, we can fail to get a reference */
if (likely(module_is_live(module) &&
   atomic_inc_not_zero(&module->refcnt) != 0))
trace_module_get(module, _RET_IP_);
else
ret = false;
preempt_enable();
}
return ret;
}

static inline int module_is_live(struct module *mod)
{
return mod->state != MODULE_STATE_GOING;
}

Regarding introducing a new device state: this is something I would like to
avoid. Any code that manipulates the SCSI device state is unnecessarily hard
to modify because multiple types of state information have been mixed up in
a single state variable (blocked / not blocked; created / running / cancel /
offline). Additionally, the SCSI device state is visible to user space.
Adding a new SCSI device state could break existing user space applications.

There is another problem with the introduction of the SDEV_CANCEL_BLOCKED
state: we do not want open() calls to succeed for devices that are in the
SDEV_DEL, SDEV_CANCEL nor for devices that are in the SDEV_CANCEL_BLOCKED
state. scsi_disk_get() in the sd driver relies on scsi_device_get() to check
the SCSI device state. If scsi_device_get() would succeed for devices in the
SDEV_CANCEL_BLOCKED state then an explicit check for that state would have
to be added in several users of scsi_device_get(). In other words, I think
adding the SDEV_CANCEL_BLOCKED state would result in a much more complex and
also harder to test patch.

Thanks,

Bart.

Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-17 Thread James Bottomley
On Thu, 2017-03-16 at 23:19 +, Bart Van Assche wrote:
> On Thu, 2017-03-16 at 15:53 -0700, James Bottomley wrote:
> > On Thu, 2017-03-16 at 13:56 -0700, Bart Van Assche wrote:
> > > scsi_target_unblock() must unblock SCSI devices even if this
> > > function
> > > is called after unloading of the LLD that created these devices
> > > has
> > > started. This is necessary to prevent that __scsi_remove_device()
> > > hangs on the SYNCHRONIZE CACHE command issued by the sd driver
> > > during
> > > shutdown.
> > 
> > Your special get function misses the try_module_get().  But this is
> > all
> > really a bit ugly. Since the only problem is the SYNC CACHE 
> > triggered by device_del, isn't a better solution a new state:
> > SDEV_CANCEL_BLOCK.  This will make the device visible to 
> > scsi_get_device() and we can take it back from CANCEL_BLOCKED
> > ->CANCEL when the queue is unblocked.  I suspect we could also 
> > simply throw away the sync cache command when the device is blocked 
> > (the cache should destage naturally in the time it takes for the
> > device to be unblocked).
> 
> Hello James,
> 
> The purpose of this patch series is to make sure that unblock also 
> occurs after module unloading has started. My understanding of
> try_module_get() is that it fails once module unloading has started.
>  In other words, it is on purpose that try_module_get() is not
> called. From the kernel module
> code:
> 
> bool try_module_get(struct module *module)
> {
>   bool ret = true;
> 
>   if (module) {
>   preempt_disable();
>   /* Note: here, we can fail to get a reference */
>   if (likely(module_is_live(module) &&
>  atomic_inc_not_zero(&module->refcnt) != 0))
>   trace_module_get(module, _RET_IP_);
>   else
>   ret = false;
>   preempt_enable();
>   }
>   return ret;
> }
> 
> static inline int module_is_live(struct module *mod)
> {
>   return mod->state != MODULE_STATE_GOING;
> }

So it's better to use the module without a reference in place and take
the risk that it may exit and release its code area while we're calling
it?

> Regarding introducing a new device state: this is something I would 
> like to avoid. Any code that manipulates the SCSI device state is
> unnecessarily hard to modify because multiple types of state 
> information have been mixed up in a single state variable (blocked / 
> not blocked; created / running / cancel / offline). Additionally, the 
> SCSI device state is visible to user space.
> Adding a new SCSI device state could break existing user space
> applications.

I'm not sure that's a real concern for a new cancel state, but it's
addressable by not exposing the state to user space (it can just show
up as blocked)

> There is another problem with the introduction of the 
> SDEV_CANCEL_BLOCKED state: we do not want open() calls to succeed for 
> devices that are in the SDEV_DEL, SDEV_CANCEL nor for devices that 
> are in the SDEV_CANCEL_BLOCKED state. scsi_disk_get() in the sd 
> driver relies on scsi_device_get() to check the SCSI device state. If 
> scsi_device_get() would succeed for devices in the
> SDEV_CANCEL_BLOCKED state then an explicit check for that state would 
> have to be added in several users of scsi_device_get().

That really doesn't matter: getting a reference via open is a race. 
 Currently if you do it just before SDEV_CANCEL you end up in the same
position: a properly refcounted open device that can't send any
commands, so this doesn't add any new problems.

>  In other words, I think adding the SDEV_CANCEL_BLOCKED state would
> result in a much more complex and also harder to test patch.

Fine, here's what I thing it would look like; it seems a lot shorter
and simpler to me, but if you want to pursue your approach fixing the
race with module exit is a requirement.

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ba22866..b952a6a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1248,6 +1248,13 @@ scsi_prep_state_check(struct scsi_device *sdev, struct 
request *req)
"rejecting I/O to dead device\n");
ret = BLKPREP_KILL;
break;
+   case SDEV_CANCEL_BLOCK:
+   /*
+* silently kill the I/O: the only allowed thing
+* is a ULD remove one (like SYNC CACHE)
+*/
+   ret = BLKPREP_KILL;
+   break;
case SDEV_BLOCK:
case SDEV_CREATED_BLOCK:
ret = BLKPREP_DEFER;
@@ -2604,6 +2611,15 @@ scsi_device_set_state(struct scsi_device *sdev, enum 
scsi_device_state state)
}
break;
 
+   case SDEV_CANCEL_BLOCK:
+   switch (oldstate) {
+   case SDEV_CAN

Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-17 Thread Bart Van Assche
On Fri, 2017-03-17 at 05:54 -0700, James Bottomley wrote:
> So it's better to use the module without a reference in place and take
> the risk that it may exit and release its code area while we're calling
> it?

Hello James,

My understanding of scsi_device_get() / scsi_device_put() is that the reason
why these manipulate the module reference count is to avoid that a SCSI LLD
module can be unloaded while a SCSI device is being used from a context that
is not notified about SCSI LLD unloading (e.g. a file handle controlled by
the sd driver or a SCSI ALUA device handler worker thread).

Does your comment mean that you think there is a scenario in which
scsi_target_block() or scsi_target_unblock() can be called while the text
area of a SCSI LLD is being released? I have reviewed all the callers of
these functions but I have not found such a scenario. scsi_target_block()
and scsi_target_unblock() are either called from a SCSI transport layer
implementation (FC, iSCSI, SRP) or from a SCSI LLD kernel module (snic_disc).
All these kernel modules only call scsi_target_*block() for resources (rport
or SCSI target respectively) that are removed before the code area of these
modules is released. This is why I think that calling scsi_target_*block()
without increasing the SCSI LLD module reference count is safe.

> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 82dfe07..fd1ba1d 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -39,6 +39,7 @@ static const struct {
>   { SDEV_TRANSPORT_OFFLINE, "transport-offline" },
>   { SDEV_BLOCK,   "blocked" },
>   { SDEV_CREATED_BLOCK, "created-blocked" },
> + { SDEV_CANCEL_BLOCK, "blocked" },
>  };

The multipathd function path_offline() translates "blocked" into PATH_PENDING.
Shouldn't SDEV_CANCEL_BLOCK be translated by multipathd into PATH_DOWN? There
might be other user space applications that interpret the SCSI device state
and that I am not aware of.

Additionally, your patch does not modify scsi_device_get() and hence will
cause scsi_device_get() to succeed for devices that are in state
SDEV_CANCEL_BLOCK. I think that's a subtle behavior change.

Thanks,

Bart.

Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-18 Thread James Bottomley
On Fri, 2017-03-17 at 16:40 +, Bart Van Assche wrote:
> On Fri, 2017-03-17 at 05:54 -0700, James Bottomley wrote:
> > So it's better to use the module without a reference in place and 
> > take the risk that it may exit and release its code area while 
> > we're calling it?
> 
> Hello James,
> 
> My understanding of scsi_device_get() / scsi_device_put() is that the 
> reason why these manipulate the module reference count is to avoid 
> that a SCSI LLD module can be unloaded while a SCSI device is being 
> used from a context that is not notified about SCSI LLD unloading 
> (e.g. a file handle controlled by the sd driver or a SCSI ALUA device
> handler worker thread).

Not just that: it's so the underlying module is pinned for every
potential user as well.  the unblock code is called in places outside
the actual hba driver module, so it needs that protection.

> Does your comment mean that you think there is a scenario in which
> scsi_target_block() or scsi_target_unblock() can be called while the 
> text area of a SCSI LLD is being released? I have reviewed all the 
> callers of these functions but I have not found such a scenario.
> scsi_target_block() and scsi_target_unblock() are either called from 
> a SCSI transport layer implementation (FC, iSCSI, SRP) or from a SCSI 
> LLD kernel module (snic_disc). All these kernel modules only call 
> scsi_target_*block() for resources (rport or SCSI target
> respectively) that are removed before the code area of
> these modules is released. This is why I think that calling
> scsi_target_*block() without increasing the SCSI LLD module reference
> count is safe.

The transport code is above the HBA module code and in that code
unblock could be racing with module removal.  The original premise was
that once the dev/target/host goes into DEL, nothing can call into
queuecommand or get a reference to the device, so nothing halts removal
after that, but you changed that with your code, which is why it's now
unsafe.

> > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > index 82dfe07..fd1ba1d 100644
> > --- a/drivers/scsi/scsi_sysfs.c
> > +++ b/drivers/scsi/scsi_sysfs.c
> > @@ -39,6 +39,7 @@ static const struct {
> > { SDEV_TRANSPORT_OFFLINE, "transport-offline" },
> > { SDEV_BLOCK,   "blocked" },
> > { SDEV_CREATED_BLOCK, "created-blocked" },
> > +   { SDEV_CANCEL_BLOCK, "blocked" },
> >  };
> 
> The multipathd function path_offline() translates "blocked" into 
> PATH_PENDING. Shouldn't SDEV_CANCEL_BLOCK be translated by multipathd 
> into PATH_DOWN? There might be other user space applications that 
> interpret the SCSI device state and that I am not aware of.

Given it's very short lived, I don't think it much matters, but if you
think it does, that can become cancel.

> Additionally, your patch does not modify scsi_device_get() and hence
> will cause scsi_device_get() to succeed for devices that are in state
> SDEV_CANCEL_BLOCK. I think that's a subtle behavior change.

Yes, otherwise device unblock wouldn't work (on the off chance the
device is unblocked before we get to the sync cache).  Again, it's like
the open race: you could have got the reference just before the device
went into cancel and you're in the same position so there's actually no
substantive behaviour change at all, it just elongates the window where
you get a reference to a device you can't send commands to.

James


> Thanks,
> 
> Bart.



Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-03-18 Thread Bart Van Assche
On Sat, 2017-03-18 at 07:44 -0500, James Bottomley wrote:
> On Fri, 2017-03-17 at 16:40 +, Bart Van Assche wrote:
> > Does your comment mean that you think there is a scenario in which
> > scsi_target_block() or scsi_target_unblock() can be called while the 
> > text area of a SCSI LLD is being released? I have reviewed all the 
> > callers of these functions but I have not found such a scenario.
> > scsi_target_block() and scsi_target_unblock() are either called from 
> > a SCSI transport layer implementation (FC, iSCSI, SRP) or from a SCSI 
> > LLD kernel module (snic_disc). All these kernel modules only call 
> > scsi_target_*block() for resources (rport or SCSI target
> > respectively) that are removed before the code area of
> > these modules is released. This is why I think that calling
> > scsi_target_*block() without increasing the SCSI LLD module reference
> > count is safe.
> 
> The transport code is above the HBA module code and in that code
> unblock could be racing with module removal.  The original premise was
> that once the dev/target/host goes into DEL, nothing can call into
> queuecommand or get a reference to the device, so nothing halts removal
> after that, but you changed that with your code, which is why it's now
> unsafe.

Hello James,

Thank you for having provided more background information about the design
goals of the SCSI code. However, regarding scsi_target_*block(), I think it
is safe even for transport modules to call these functions without obtaining
an additional reference on the SCSI LLD kernel module. The transport
implementation won't attempt to block or unblock a SCSI target anymore once
unloading of the LLD text has started. SCSI LLDs release the attached
transport before unloading of the SCSI LLD kernel module text starts and
SCSI transport modules guarantee that scsi_target_*block() won't be called
anymore after the transport module has been released.
  
If you do not agree with the above please provide a call sequence for an
existing SCSI LLD or transport module that illustrates the race you
described.

Thanks,

Bart.

Re: [PATCH 0/3] Unblock SCSI devices even if the LLD is being unloaded

2017-04-10 Thread Bart Van Assche
On Fri, 2017-03-17 at 05:54 -0700, James Bottomley wrote:
> but if you want to pursue your approach fixing the
> race with module exit is a requirement.

Hello James,

Sorry that it took so long but I finally found the time to implement and
test an alternative. I will post the patches that implement that new approach.

Bart.