Re: [usb-storage] UAS hangs khubd on USB disconnect

2013-12-13 Thread James Bottomley
On Fri, 2013-12-13 at 13:33 -0500, Tejun Heo wrote:
> Hello, guys.
> 
> (cc'ing Greg)
> 
> On Fri, Dec 13, 2013 at 01:19:36PM -0500, Alan Stern wrote:
> > On Fri, 13 Dec 2013, Sarah Sharp wrote:
> > 
> > > > Given the way things work now, I suspect these warnings are truly 
> > > > harmless.  We could simply get rid of the WARN in sysfs_remove_group.
> > > > 
> > > > The alternative is to call device_del for SCSI targets earlier on, such 
> > > > as when their hosts are unregistered.  I don't know how James would 
> > > > feel about this approach.  It would be difficult because targets use 
> > > > their own reference counts instead of relying on the usual device 
> > > > refcounting mechanism.
> > > 
> > > Thanks for looking into this.  I think just getting rid of the WARN
> > > would be sufficient.  Can you make a patch for that?
> > 
> > Easily.  The downside is that there would no longer be any warning 
> > when someone tries to remove a wrong subdirectory by mistake.
> > 
> > > The patch still won't help with the UAS issues with
> > > scsi_init_shared_tag_map though.
> > 
> > I wasn't clear on the reason for that problem.  Does it also arise from 
> > late device_del for scsi_target?  I could try to change the way that 
> > works, if anybody (Hans?) would like to test it.
> 
> While the recent sysfs changes made this issue more visible, Greg
> wants to make sure that devices are removed from leaf up in all cases
> and keep the warning to ensure that.  Would there be a way fix SCSI
> removal ordering?

Could someone analyse the actual problem?  We're quite careful even on
host remove to iterate and remove all the devices, then targets, then
host (and allied transport objects).  Which removal is inverted?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [usb-storage] UAS hangs khubd on USB disconnect

2013-12-13 Thread James Bottomley
On Fri, 2013-12-13 at 11:18 -0800, James Bottomley wrote:
> On Fri, 2013-12-13 at 13:33 -0500, Tejun Heo wrote:
> > Hello, guys.
> > 
> > (cc'ing Greg)
> > 
> > On Fri, Dec 13, 2013 at 01:19:36PM -0500, Alan Stern wrote:
> > > On Fri, 13 Dec 2013, Sarah Sharp wrote:
> > > 
> > > > > Given the way things work now, I suspect these warnings are truly 
> > > > > harmless.  We could simply get rid of the WARN in sysfs_remove_group.
> > > > > 
> > > > > The alternative is to call device_del for SCSI targets earlier on, 
> > > > > such 
> > > > > as when their hosts are unregistered.  I don't know how James would 
> > > > > feel about this approach.  It would be difficult because targets use 
> > > > > their own reference counts instead of relying on the usual device 
> > > > > refcounting mechanism.
> > > > 
> > > > Thanks for looking into this.  I think just getting rid of the WARN
> > > > would be sufficient.  Can you make a patch for that?
> > > 
> > > Easily.  The downside is that there would no longer be any warning 
> > > when someone tries to remove a wrong subdirectory by mistake.
> > > 
> > > > The patch still won't help with the UAS issues with
> > > > scsi_init_shared_tag_map though.
> > > 
> > > I wasn't clear on the reason for that problem.  Does it also arise from 
> > > late device_del for scsi_target?  I could try to change the way that 
> > > works, if anybody (Hans?) would like to test it.
> > 
> > While the recent sysfs changes made this issue more visible, Greg
> > wants to make sure that devices are removed from leaf up in all cases
> > and keep the warning to ensure that.  Would there be a way fix SCSI
> > removal ordering?
> 
> Could someone analyse the actual problem?  We're quite careful even on
> host remove to iterate and remove all the devices, then targets, then
> host (and allied transport objects).  Which removal is inverted?

Actually, I think I have this figured out.  There's a thinko in one of
the scsi_target_reap() cases.  The original (and still existing) problem
with targets is that nothing creates them and nothing destroys them, so,
while we could rely on the refcounting of the device model to preserve
the actual target object, we had no idea when to remove it from
visibility.  That was the job of the reap reference, to track
visibility.  It looks like the reap on device last put is occurring too
late.  I think we should reap immediately after doing the sdev
device_del, so does this fix the warn on? (I'm not sure because no-one
has actually posted a backtrace, but it sounds like this is the
problem).

James

---

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 8ff62c2..98d4eb3 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -399,8 +399,6 @@ static void scsi_device_dev_release_usercontext(struct 
work_struct *work)
/* NULL queue means the device can't be used */
sdev->request_queue = NULL;
 
-   scsi_target_reap(scsi_target(sdev));
-
kfree(sdev->inquiry);
kfree(sdev);
 
@@ -1044,6 +1042,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
} else
put_device(&sdev->sdev_dev);
 
+   scsi_target_reap(scsi_target(sdev));
+
/*
 * Stop accepting new requests and wait until all queuecommand() and
 * scsi_run_queue() invocations have finished before tearing down the


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [usb-storage] UAS hangs khubd on USB disconnect

2013-12-13 Thread James Bottomley
On Fri, 2013-12-13 at 16:06 -0500, Alan Stern wrote:
> On Fri, 13 Dec 2013, James Bottomley wrote:
> 
> > Actually, I think I have this figured out.  There's a thinko in one of
> > the scsi_target_reap() cases.  The original (and still existing) problem
> > with targets is that nothing creates them and nothing destroys them, so,
> > while we could rely on the refcounting of the device model to preserve
> > the actual target object, we had no idea when to remove it from
> > visibility.  That was the job of the reap reference, to track
> > visibility.  It looks like the reap on device last put is occurring too
> > late.  I think we should reap immediately after doing the sdev
> > device_del, so does this fix the warn on? (I'm not sure because no-one
> > has actually posted a backtrace, but it sounds like this is the
> > problem).
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > index 8ff62c2..98d4eb3 100644
> > --- a/drivers/scsi/scsi_sysfs.c
> > +++ b/drivers/scsi/scsi_sysfs.c
> > @@ -399,8 +399,6 @@ static void scsi_device_dev_release_usercontext(struct 
> > work_struct *work)
> > /* NULL queue means the device can't be used */
> > sdev->request_queue = NULL;
> >  
> > -   scsi_target_reap(scsi_target(sdev));
> > -
> > kfree(sdev->inquiry);
> > kfree(sdev);
> >  
> > @@ -1044,6 +1042,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
> > } else
> > put_device(&sdev->sdev_dev);
> >  
> > +   scsi_target_reap(scsi_target(sdev));
> > +
> > /*
> >  * Stop accepting new requests and wait until all queuecommand() and
> >  * scsi_run_queue() invocations have finished before tearing down the
> 
> This is not right.  The problem is that you don't keep track explicitly 
> of the number of references to a target; you rely implicitly on 
> starget->devices being non-empty.  starget->reap_ref is only a count of 
> local operations that should block removal.

No, it was supposed explicitly to be a visibility counter to answer the
question when can we delete the target.  It's incremented every time we
add a device to the target (and when we do an operation that may remove
one to keep an atomic context before we blow it away) and decremented
every time we remove one.

> Consider, for example, what would happen if there is more than one LUN.  
> What if one of them is removed while the other remains?

Then the reap reference remains above zero and the target stays.

> A more invasive change is needed.

I think you might be right in that we need to kill the list_empty check,
but I think that should be it.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [usb-storage] UAS hangs khubd on USB disconnect

2013-12-13 Thread James Bottomley
On Fri, 2013-12-13 at 19:48 -0500, Alan Stern wrote:
> On Fri, 13 Dec 2013, James Bottomley wrote:
> 
> > On Fri, 2013-12-13 at 16:06 -0500, Alan Stern wrote:
> > > On Fri, 13 Dec 2013, James Bottomley wrote:
> > > 
> > > > Actually, I think I have this figured out.  There's a thinko in one of
> > > > the scsi_target_reap() cases.  The original (and still existing) problem
> > > > with targets is that nothing creates them and nothing destroys them, so,
> > > > while we could rely on the refcounting of the device model to preserve
> > > > the actual target object, we had no idea when to remove it from
> > > > visibility.  That was the job of the reap reference, to track
> > > > visibility.  It looks like the reap on device last put is occurring too
> > > > late.  I think we should reap immediately after doing the sdev
> > > > device_del, so does this fix the warn on? (I'm not sure because no-one
> > > > has actually posted a backtrace, but it sounds like this is the
> > > > problem).
> > > > 
> > > > James
> > > > 
> > > > ---
> > > > 
> > > > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> > > > index 8ff62c2..98d4eb3 100644
> > > > --- a/drivers/scsi/scsi_sysfs.c
> > > > +++ b/drivers/scsi/scsi_sysfs.c
> > > > @@ -399,8 +399,6 @@ static void 
> > > > scsi_device_dev_release_usercontext(struct work_struct *work)
> > > > /* NULL queue means the device can't be used */
> > > > sdev->request_queue = NULL;
> > > >  
> > > > -   scsi_target_reap(scsi_target(sdev));
> > > > -
> > > > kfree(sdev->inquiry);
> > > > kfree(sdev);
> > > >  
> > > > @@ -1044,6 +1042,8 @@ void __scsi_remove_device(struct scsi_device 
> > > > *sdev)
> > > > } else
> > > > put_device(&sdev->sdev_dev);
> > > >  
> > > > +   scsi_target_reap(scsi_target(sdev));
> > > > +
> > > > /*
> > > >  * Stop accepting new requests and wait until all 
> > > > queuecommand() and
> > > >  * scsi_run_queue() invocations have finished before tearing 
> > > > down the
> > > 
> > > This is not right.  The problem is that you don't keep track explicitly 
> > > of the number of references to a target; you rely implicitly on 
> > > starget->devices being non-empty.  starget->reap_ref is only a count of 
> > > local operations that should block removal.
> > 
> > No, it was supposed explicitly to be a visibility counter to answer the
> > question when can we delete the target.  It's incremented every time we
> > add a device to the target (and when we do an operation that may remove
> > one to keep an atomic context before we blow it away) and decremented
> > every time we remove one.
> 
> Sorry, but you're wrong.  starget->reap_ref is _not_ incremented every 
> time we add a device to the target.  That's one of the things we need to 
> fix.

Well, then we would have a pretty astonishing cockup in the code.  The
found case of scsi_alloc_target increments the reference each time it's
called, so scsi_add_device() definitely behaves like this.  I suppose
it's possible the list_empty() check is covering a miscount in some of
the other probing routines, but that would mean we have stale targets
for a lot of our use cases.  I'll audit the code.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC] fix our current target reap infrastructure.

2013-12-13 Thread James Bottomley
This patch eliminates the reap_ref and replaces it with a proper kref.
On last put of this kref, the target is removed from visibility in
sysfs.  The final call to scsi_target_reap() for the device is done from
__scsi_remove_device() and only if the device was made visible.  This
ensures that the target disappears as soon as the last device is gone
rather than waiting until final release of the device (which is often
too long).

---

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 307a811..d966e36 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -371,6 +371,31 @@ static struct scsi_target *__scsi_find_target(struct 
device *parent,
 }
 
 /**
+ * scsi_target_reap_ref_release - remove target from visibility
+ * @kref: the reap_ref in the target being released
+ *
+ * Called on last put of reap_ref, which is the indication that no device
+ * under this target is visible anymore, so render the target invisible in
+ * sysfs.  Note: we have to be in user context here because the target reaps
+ * should be done in places where the scsi device visibility is being removed.
+ */
+static void scsi_target_reap_ref_release(struct kref *kref)
+{
+   struct scsi_target *starget
+   = container_of(kref, struct scsi_target, reap_ref);
+
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   starget->state = STARGET_DEL;
+   scsi_target_destroy(starget);
+}
+
+static void scsi_target_reap_ref_put(struct scsi_target *starget)
+{
+   kref_put(&starget->reap_ref, scsi_target_reap_ref_release);
+}
+
+/**
  * scsi_alloc_target - allocate a new or find an existing target
  * @parent:parent of the target (need not be a scsi host)
  * @channel:   target channel number (zero if no channels)
@@ -401,7 +426,7 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
}
dev = &starget->dev;
device_initialize(dev);
-   starget->reap_ref = 1;
+   kref_init(&starget->reap_ref);
dev->parent = get_device(parent);
dev_set_name(dev, "target%d:%d:%d", shost->host_no, channel, id);
dev->bus = &scsi_bus_type;
@@ -441,29 +466,26 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
return starget;
 
  found:
-   found_target->reap_ref++;
+   kref_get(&found_target->reap_ref);
spin_unlock_irqrestore(shost->host_lock, flags);
if (found_target->state != STARGET_DEL) {
put_device(dev);
return found_target;
}
-   /* Unfortunately, we found a dying target; need to
-* wait until it's dead before we can get a new one */
+   /*
+* Unfortunately, we found a dying target; need to wait until it's
+* dead before we can get a new one.  There is an anomaly here.  We
+* *should* call scsi_target_reap() to balance the kref_get() of the
+* reap_ref above.  However, since the target is in state STARGET_DEL,
+* it's already invisible and the reap_ref is irrelevant.  If we call
+* scsi_target_reap() we might spuriously do another device_del() on
+* an already invisible target.
+*/
put_device(&found_target->dev);
flush_scheduled_work();
goto retry;
 }
 
-static void scsi_target_reap_usercontext(struct work_struct *work)
-{
-   struct scsi_target *starget =
-   container_of(work, struct scsi_target, ew.work);
-
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   scsi_target_destroy(starget);
-}
-
 /**
  * scsi_target_reap - check to see if target is in use and destroy if not
  * @starget: target to be checked
@@ -474,28 +496,11 @@ static void scsi_target_reap_usercontext(struct 
work_struct *work)
  */
 void scsi_target_reap(struct scsi_target *starget)
 {
-   struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
-   unsigned long flags;
-   enum scsi_target_state state;
-   int empty = 0;
-
-   spin_lock_irqsave(shost->host_lock, flags);
-   state = starget->state;
-   if (--starget->reap_ref == 0 && list_empty(&starget->devices)) {
-   empty = 1;
-   starget->state = STARGET_DEL;
-   }
-   spin_unlock_irqrestore(shost->host_lock, flags);
-
-   if (!empty)
-   return;
-
-   BUG_ON(state == STARGET_DEL);
-   if (state == STARGET_CREATED)
+   BUG_ON(starget->state == STARGET_DEL);
+   if (starget->state == STARGET_CREATED)
scsi_target_destroy(starget);
else
-   execute_in_process_context(scsi_target_reap_usercontext,
-  &starget->ew);
+   scsi_target_reap_ref_put(starget);
 }
 
 /**
@@ -1532,6 +1537,10 @@ struct scsi_device *__scsi_add_device(struct Scsi_Host 
*shost, uint channel,
}
mutex_unlock(&shost->scan_mutex);
scsi_autopm_put_target(sta

Re: [RFC] fix our current target reap infrastructure.

2013-12-14 Thread James Bottomley
On Fri, 2013-12-13 at 22:32 -0500, Alan Stern wrote:
> On Fri, 13 Dec 2013, James Bottomley wrote:
> 
> > This patch eliminates the reap_ref and replaces it with a proper kref.
> > On last put of this kref, the target is removed from visibility in
> > sysfs.  The final call to scsi_target_reap() for the device is done from
> > __scsi_remove_device() and only if the device was made visible.  This
> > ensures that the target disappears as soon as the last device is gone
> > rather than waiting until final release of the device (which is often
> > too long).
> 
> 
> > @@ -474,28 +496,11 @@ static void scsi_target_reap_usercontext(struct 
> > work_struct *work)
> >   */
> >  void scsi_target_reap(struct scsi_target *starget)
> >  {
> > -   struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
> > -   unsigned long flags;
> > -   enum scsi_target_state state;
> > -   int empty = 0;
> > -
> > -   spin_lock_irqsave(shost->host_lock, flags);
> > -   state = starget->state;
> > -   if (--starget->reap_ref == 0 && list_empty(&starget->devices)) {
> > -   empty = 1;
> > -   starget->state = STARGET_DEL;
> > -   }
> > -   spin_unlock_irqrestore(shost->host_lock, flags);
> > -
> > -   if (!empty)
> > -   return;
> > -
> > -   BUG_ON(state == STARGET_DEL);
> > -   if (state == STARGET_CREATED)
> > +   BUG_ON(starget->state == STARGET_DEL);
> > +   if (starget->state == STARGET_CREATED)
> > scsi_target_destroy(starget);
> > else
> > -   execute_in_process_context(scsi_target_reap_usercontext,
> > -  &starget->ew);
> > +   scsi_target_reap_ref_put(starget);
> 
> The refcount test and state change race with scsi_alloc_target().  
> Maybe the race won't occur in practice, but to be safe you should hold
> shost->host_lock throughout that time interval, as the original code
> here does.

You mean the fact that using a state model to indicate whether we should
destroy a target without bothering to refcount isn't robust against two
threads of execution running through a scan on the same target?  Yes, it
could be construed as a bug, but it's a bug in the old code as well.

> This means the kref approach won't work so easily.  You might as well
> leave reap_ref as an ordinary int.

Actually, no, we can better fix it using krefs.  We just force
everything through the kref put instead of special casing the not made
visible destruction case.  We can then case the release routine to fix
this, like below.  I suppose since this is a separate bug I'll keep it
as a separate patch.

> > @@ -393,7 +393,6 @@ static void scsi_device_dev_release_usercontext(struct 
> > work_struct *work)
> > starget = to_scsi_target(parent);
> >  
> > spin_lock_irqsave(sdev->host->host_lock, flags);
> > -   starget->reap_ref++;
> > list_del(&sdev->siblings);
> > list_del(&sdev->same_target_siblings);
> > list_del(&sdev->starved_entry);
> 
> starget is now an unused local variable.  It can be eliminated.

True, dumped them, thanks.

James

---

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index d966e36..327c0e92 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -320,6 +320,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
struct Scsi_Host *shost = dev_to_shost(dev->parent);
unsigned long flags;
 
+   starget->state = STARGET_DEL;
transport_destroy_device(dev);
spin_lock_irqsave(shost->host_lock, flags);
if (shost->hostt->target_destroy)
@@ -384,9 +385,15 @@ static void scsi_target_reap_ref_release(struct kref *kref)
struct scsi_target *starget
= container_of(kref, struct scsi_target, reap_ref);
 
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   starget->state = STARGET_DEL;
+   /*
+* if we get here and the target is still in the CREATED state that
+* means it was allocated but never made visible (because a scan
+* turned up no LUNs), so don't call device_del() on it.
+*/
+   if (starget->state == STARGET_RUNNING) {
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   }
scsi_target_destroy(starget);
 }
 
@@ -496,11 +503,13 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
  */
 void scsi_target_reap(struct scsi_target *starget)
 {
+   /*
+* serious problem if this triggers: STAR

Re: [RFC] fix our current target reap infrastructure.

2013-12-15 Thread James Bottomley
On Sun, 2013-12-15 at 16:32 -0500, Alan Stern wrote:
> On Sat, 14 Dec 2013, James Bottomley wrote:
> 
> > > The refcount test and state change race with scsi_alloc_target().  
> > > Maybe the race won't occur in practice, but to be safe you should hold
> > > shost->host_lock throughout that time interval, as the original code
> > > here does.
> > 
> > You mean the fact that using a state model to indicate whether we should
> > destroy a target without bothering to refcount isn't robust against two
> > threads of execution running through a scan on the same target?
> 
> I meant that the patch you posted suffers from a race when one thread
> is adding a device to a target and another thread is removing an
> existing device below that target at the same time.  Suppose the
> target's reap_ref count is initially equal to 1:
> 
>   Thread 0Thread 1
>   
>   In scsi_alloc_target(): In scsi_target_reap():
>   lock host_lock  scsi_target_reap_ref_put():
>   find existing starget   starget->reap_ref drops to 0
>   incr starget->reap_ref  In scsi_target_reap_ref_release():
>   unlock host_lockdevice_del(&starget->dev);
>   starget->state == STARGET_DEL?
>   No => okay to use starget   set starget->state = STARGET_DEL
> 
> Result: We end up using starget _and_ removing it.  The only way to
> avoid this race would be to guarantee that we never add and remove
> devices below the same target at the same time.  In theory this is 
> feasible, but I don't know if you want to do it.
> 
> This doesn't seem to be what you are talking about above.  In any case, 
> it is a bug.

No, I was thinking of the two thread scan bug (i.e. two scan threads)
not one scan and one remove, which is a bug in the old code.  This is a
race between put and get when the kref is incremented from zero (an
illegal operation which triggers a warn on).

The way to mediate this is to check for the kref already being zero
condition, like below.

James

---
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 327c0e92..303d471 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -473,7 +473,13 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
return starget;
 
  found:
-   kref_get(&found_target->reap_ref);
+   if (!kref_get_unless_zero(&found_target->reap_ref))
+   /*
+* release routine already fired.  Target is dead, but
+* STARGET_DEL may not yet be set (set in the release
+* routine), so set here as well, just in case
+*/
+   found_target->state = STARGET_DEL;
spin_unlock_irqrestore(shost->host_lock, flags);
if (found_target->state != STARGET_DEL) {
put_device(dev);



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] fix our current target reap infrastructure.

2013-12-15 Thread James Bottomley
On Sun, 2013-12-15 at 21:44 -0500, Alan Stern wrote:
> On Sun, 15 Dec 2013, James Bottomley wrote:
> 
> > No, I was thinking of the two thread scan bug (i.e. two scan threads)
> > not one scan and one remove, which is a bug in the old code.  This is a
> > race between put and get when the kref is incremented from zero (an
> > illegal operation which triggers a warn on).
> > 
> > The way to mediate this is to check for the kref already being zero
> > condition, like below.
> 
> Yes, that seems reasonable.  Consider now: Having done this, to what
> extent do starget->reap_ref and starget->state really need to be
> protected by the host_lock?  Maybe only the linked lists require
> protection.  (I haven't checked.)

Yes, I think so, but that can be done as an enhancement patch after the
fact.

> Can you post a single, combined patch incorporating all your proposed
> changes?  It's little hard to review them in pieces...

Sure, I'll repost what I have.

> Alan Stern
> 
> P.S.: Would you agree that the phrase "pretty astonishing cockup" did
> indeed turn out to be appropriate?  :-)

Objection, m'lud, my learned friend is leading the witness ...

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] fix our current target reap infrastructure.

2013-12-15 Thread James Bottomley
On Sun, 2013-12-15 at 21:49 -0500, Alan Stern wrote:
> On Sun, 15 Dec 2013, James Bottomley wrote:
> 
> > No, I was thinking of the two thread scan bug (i.e. two scan threads)
> > not one scan and one remove, which is a bug in the old code.
> 
> By the way, the existing code doesn't allow two threads to scan a
> target at the same time.  They would both have to hold the host's
> scan_mutex.

I thought of that, but it's dropped too early.  That makes the race
almost untriggerable because the racing thread starts way behind, but
it's not theoretically impossible.

> On the other hand, as far as I can see there's nothing to prevent two
> threads from removing a device at the same time.  But that wouldn't 
> cause any problems.

Right.

Thanks,

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-16 Thread James Bottomley
This set should fix our target problems with USB by making the target
visibility properly reference counted.  Since it's a major change to the
infrastructure, we'll incubate upstream first before backporting to
stable.

James

---

James Bottomley (2):
  [SCSI] fix our current target reap infrastructure.
  [SCSI] dual scan thread bug fix

 drivers/scsi/scsi_scan.c   | 102 -
 drivers/scsi/scsi_sysfs.c  |  15 ---
 include/scsi/scsi_device.h |   3 +-
 3 files changed, 74 insertions(+), 46 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 1/2] fix our current target reap infrastructure.

2013-12-16 Thread James Bottomley
This patch eliminates the reap_ref and replaces it with a proper kref.
On last put of this kref, the target is removed from visibility in
sysfs.  The final call to scsi_target_reap() for the device is done from
__scsi_remove_device() and only if the device was made visible.  This
ensures that the target disappears as soon as the last device is gone
rather than waiting until final release of the device (which is often
too long).

---
 drivers/scsi/scsi_scan.c   | 89 +++---
 drivers/scsi/scsi_sysfs.c  | 15 
 include/scsi/scsi_device.h |  3 +-
 3 files changed, 63 insertions(+), 44 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 307a811..39e5c85 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -371,6 +371,31 @@ static struct scsi_target *__scsi_find_target(struct 
device *parent,
 }
 
 /**
+ * scsi_target_reap_ref_release - remove target from visibility
+ * @kref: the reap_ref in the target being released
+ *
+ * Called on last put of reap_ref, which is the indication that no device
+ * under this target is visible anymore, so render the target invisible in
+ * sysfs.  Note: we have to be in user context here because the target reaps
+ * should be done in places where the scsi device visibility is being removed.
+ */
+static void scsi_target_reap_ref_release(struct kref *kref)
+{
+   struct scsi_target *starget
+   = container_of(kref, struct scsi_target, reap_ref);
+
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   starget->state = STARGET_DEL;
+   scsi_target_destroy(starget);
+}
+
+static void scsi_target_reap_ref_put(struct scsi_target *starget)
+{
+   kref_put(&starget->reap_ref, scsi_target_reap_ref_release);
+}
+
+/**
  * scsi_alloc_target - allocate a new or find an existing target
  * @parent:parent of the target (need not be a scsi host)
  * @channel:   target channel number (zero if no channels)
@@ -401,7 +426,7 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
}
dev = &starget->dev;
device_initialize(dev);
-   starget->reap_ref = 1;
+   kref_init(&starget->reap_ref);
dev->parent = get_device(parent);
dev_set_name(dev, "target%d:%d:%d", shost->host_no, channel, id);
dev->bus = &scsi_bus_type;
@@ -441,29 +466,32 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
return starget;
 
  found:
-   found_target->reap_ref++;
+   if (!kref_get_unless_zero(&found_target->reap_ref))
+   /*
+* release routine already fired.  Target is dead, but
+* STARGET_DEL may not yet be set (set in the release
+* routine), so set here as well, just in case
+*/
+   found_target->state = STARGET_DEL;
spin_unlock_irqrestore(shost->host_lock, flags);
if (found_target->state != STARGET_DEL) {
put_device(dev);
return found_target;
}
-   /* Unfortunately, we found a dying target; need to
-* wait until it's dead before we can get a new one */
+   /*
+* Unfortunately, we found a dying target; need to wait until it's
+* dead before we can get a new one.  There is an anomaly here.  We
+* *should* call scsi_target_reap() to balance the kref_get() of the
+* reap_ref above.  However, since the target is in state STARGET_DEL,
+* it's already invisible and the reap_ref is irrelevant.  If we call
+* scsi_target_reap() we might spuriously do another device_del() on
+* an already invisible target.
+*/
put_device(&found_target->dev);
flush_scheduled_work();
goto retry;
 }
 
-static void scsi_target_reap_usercontext(struct work_struct *work)
-{
-   struct scsi_target *starget =
-   container_of(work, struct scsi_target, ew.work);
-
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   scsi_target_destroy(starget);
-}
-
 /**
  * scsi_target_reap - check to see if target is in use and destroy if not
  * @starget: target to be checked
@@ -474,28 +502,11 @@ static void scsi_target_reap_usercontext(struct 
work_struct *work)
  */
 void scsi_target_reap(struct scsi_target *starget)
 {
-   struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
-   unsigned long flags;
-   enum scsi_target_state state;
-   int empty = 0;
-
-   spin_lock_irqsave(shost->host_lock, flags);
-   state = starget->state;
-   if (--starget->reap_ref == 0 && list_empty(&starget->devices)) {
-   empty = 1;
-   starget->state = STARGET_DEL;
-   }
-   spin_unlock_irqrestore(shost->host_lock, flags);
-
-   if (!empty)
-   return;
-
-   BUG_ON(state == STARGET_DEL);
-   if (state == STARGET_CREATED)
+   BUG

[RFC 2/2] dual scan thread bug fix

2013-12-16 Thread James Bottomley
In the highly unusual case where two threads are running concurrently through
the scanning code scanning the same target, we run into the situation where
one may allocate the target while the other is still using it.  In this case,
because the reap checks for STARGET_CREATED and kills the target without
reference counting, the second thread will do the wrong thing on reap.

Fix this by reference counting even creates and doing the STARGET_CREATED
check in the final put.

---
 drivers/scsi/scsi_scan.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 39e5c85..2d7aafa 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -320,6 +320,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
struct Scsi_Host *shost = dev_to_shost(dev->parent);
unsigned long flags;
 
+   starget->state = STARGET_DEL;
transport_destroy_device(dev);
spin_lock_irqsave(shost->host_lock, flags);
if (shost->hostt->target_destroy)
@@ -384,9 +385,15 @@ static void scsi_target_reap_ref_release(struct kref *kref)
struct scsi_target *starget
= container_of(kref, struct scsi_target, reap_ref);
 
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   starget->state = STARGET_DEL;
+   /*
+* if we get here and the target is still in the CREATED state that
+* means it was allocated but never made visible (because a scan
+* turned up no LUNs), so don't call device_del() on it.
+*/
+   if (starget->state == STARGET_RUNNING) {
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   }
scsi_target_destroy(starget);
 }
 
@@ -502,11 +509,13 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
  */
 void scsi_target_reap(struct scsi_target *starget)
 {
+   /*
+* serious problem if this triggers: STARGET_DEL is only set in the
+* kref release routine, so we're doing another final put on an
+* already released kref
+*/
BUG_ON(starget->state == STARGET_DEL);
-   if (starget->state == STARGET_CREATED)
-   scsi_target_destroy(starget);
-   else
-   scsi_target_reap_ref_put(starget);
+   scsi_target_reap_ref_put(starget);
 }
 
 /**
-- 
1.8.4.3



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 1/2] fix our current target reap infrastructure.

2013-12-17 Thread James Bottomley

On Mon, 2013-12-16 at 10:57 -0500, Alan Stern wrote:
> On Mon, 16 Dec 2013, James Bottomley wrote:
> 
> > This patch eliminates the reap_ref and replaces it with a proper kref.
> > On last put of this kref, the target is removed from visibility in
> > sysfs.  The final call to scsi_target_reap() for the device is done from
> > __scsi_remove_device() and only if the device was made visible.  This
> > ensures that the target disappears as soon as the last device is gone
> > rather than waiting until final release of the device (which is often
> > too long).
> 
> Reviewed-by: Alan Stern 
> 
> Two small suggested changes:
> 
> > @@ -441,29 +466,32 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> > return starget;
> >  
> >   found:
> > -   found_target->reap_ref++;
> > +   if (!kref_get_unless_zero(&found_target->reap_ref))
> > +   /*
> > +* release routine already fired.  Target is dead, but
> > +* STARGET_DEL may not yet be set (set in the release
> > +* routine), so set here as well, just in case
> > +*/
> > +   found_target->state = STARGET_DEL;
> > spin_unlock_irqrestore(shost->host_lock, flags);
> > if (found_target->state != STARGET_DEL) {
> > put_device(dev);
> > return found_target;
> > }
> > -   /* Unfortunately, we found a dying target; need to
> > -* wait until it's dead before we can get a new one */
> > +   /*
> > +* Unfortunately, we found a dying target; need to wait until it's
> > +* dead before we can get a new one.  There is an anomaly here.  We
> > +* *should* call scsi_target_reap() to balance the kref_get() of the
> > +* reap_ref above.  However, since the target is in state STARGET_DEL,
> > +* it's already invisible and the reap_ref is irrelevant.  If we call
> > +* scsi_target_reap() we might spuriously do another device_del() on
> > +* an already invisible target.
> > +*/
> > put_device(&found_target->dev);
> > flush_scheduled_work();
> > goto retry;
> 
> Since scsi_target_reap_usercontext() is now gone, scsi_target_destroy()
> doesn't rely on a work queue any more.  Therefore something like
> msleep(1) would be more appropriate than flush_scheduled_work().

I suppose so.  In theory with the removal of the work queue, going from
final release to list del should be deterministic, so we should run into
this less.  I quite like the flush_scheduled_work, because it pushes out
all the pending work for devices on other targets (so accelerates host
remove).  However, I suppose it does now look wrong in this context.

> > --- a/include/scsi/scsi_device.h
> > +++ b/include/scsi/scsi_device.h
> > @@ -257,7 +257,7 @@ struct scsi_target {
> > struct list_headsiblings;
> > struct list_headdevices;
> > struct device   dev;
> > -   unsigned intreap_ref; /* protected by the host lock */
> > +   struct kref reap_ref; /* last put renders device invisible 
> > */
> 
> s/device/target/

Will update, thanks.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-18 Thread James Bottomley
On Wed, 2013-12-18 at 16:50 -0500, Alan Stern wrote:
> On Wed, 18 Dec 2013, Sarah Sharp wrote:
> 
> > On Mon, Dec 16, 2013 at 07:10:19AM -0800, James Bottomley wrote:
> > > This set should fix our target problems with USB by making the target
> > > visibility properly reference counted.  Since it's a major change to the
> > > infrastructure, we'll incubate upstream first before backporting to
> > > stable.
> > > 
> > > James
> > 
> > I tried these patches, and they cause an oops when a USB mass storage
> > device is plugged in.  Note that this device uses the usb-storage
> > driver, not the uas driver.
> > 
> > [14248.340064] scsi6 : usb-storage 2-2:1.0
> > [14248.341083] usbcore: registered new interface driver usb-storage
> > [14248.346211] usbcore: registered new interface driver uas
> > [14249.339937] scsi 6:0:0:0: Direct-Access LexarJumpDrive
> > 1.00 PQ: 0 ANSI: 6
> > [14249.340988] [ cut here ]
> > [14249.340999] WARNING: CPU: 3 PID: 5578 at lib/kobject.c:227 
> > kobject_add_internal+0x13f/0x350()
> > [14249.341003] kobject_add_internal failed for 6:0:0:0 (error: -2 parent: 
> > target6:0:0)
> > [14249.341005] Modules linked in: uas usb_storage ctr ccm cuse dm_crypt 
> > uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev btusb 
> > x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel aes_x86_64 
> > lrw gf128mul glue_helper ablk_helper cryptd arc4 iwldvm mac80211 microcode 
> > snd_hda_codec_hdmi iwlwifi psmouse snd_hda_codec_realtek serio_raw 
> > snd_hda_intel snd_usb_audio snd_hda_codec thinkpad_acpi cfg80211 joydev 
> > snd_usbmidi_lib nvram snd_hwdep snd_seq_midi snd_pcm snd_seq_midi_event 
> > snd_rawmidi lpc_ich rfcomm bnep snd_seq bluetooth snd_seq_device 
> > snd_page_alloc ehci_pci snd_timer ehci_hcd snd soundcore tpm_tis 
> > binfmt_misc btrfs libcrc32c xor raid6_pq hid_generic usbhid hid i915 ahci 
> > libahci e1000e sdhci_pci sdhci i2c_algo_bit drm_kms_helper ptp pps_core drm 
> > xhci_hcd video
> > [14249.341095] CPU: 3 PID: 5578 Comm: kworker/u16:0 Not tainted 3.13.0-rc1+ 
> > #142
> > [14249.341098] Hardware name: LENOVO 2325AP7/2325AP7, BIOS G2ET82WW (2.02 ) 
> > 09/11/2012
> > [14249.341105] Workqueue: events_unbound async_run_entry_fn
> > [14249.341108]  0009 88003aa9db60 81658a4e 
> > 88003aa9dba8
> > [14249.341115]  88003aa9db98 81048c3d 88006bc551b0 
> > fffe
> > [14249.341121]   8800bec22838 0200 
> > 88003aa9dbf8
> > [14249.341127] Call Trace:
> > [14249.341135]  [] dump_stack+0x4d/0x66
> > [14249.341142]  [] warn_slowpath_common+0x7d/0xa0
> > [14249.341148]  [] warn_slowpath_fmt+0x4c/0x50
> > [14249.341154]  [] ? _raw_spin_unlock+0x27/0x40
> > [14249.341159]  [] kobject_add_internal+0x13f/0x350
> > [14249.341163]  [] kobject_add+0x65/0xb0
> > [14249.341170]  [] ? get_device+0x30/0x30
> > [14249.341175]  [] ? klist_init+0x31/0x40
> > [14249.341181]  [] device_add+0x128/0x660
> > [14249.341186]  [] ? __pm_runtime_resume+0x5c/0x90
> > [14249.341193]  [] scsi_sysfs_add_sdev+0xac/0x340
> > [14249.341199]  [] do_scan_async+0x83/0x180
> > [14249.341204]  [] async_run_entry_fn+0x37/0x130
> > [14249.341210]  [] process_one_work+0x1f4/0x550
> > [14249.341215]  [] ? process_one_work+0x192/0x550
> > [14249.341220]  [] worker_thread+0x121/0x3a0
> > [14249.341225]  [] ? manage_workers.isra.22+0x2a0/0x2a0
> > [14249.341231]  [] kthread+0xfc/0x120
> > [14249.341238]  [] ? kthread_create_on_node+0x230/0x230
> > [14249.341243]  [] ret_from_fork+0x7c/0xb0
> > [14249.341249]  [] ? kthread_create_on_node+0x230/0x230
> > [14249.341253] ---[ end trace 7f1d8a449e6af5aa ]---
> > [14249.341259] scsi 6:0:0:0: failed to add device: -2
> 
> James:
> 
> The problem occurs when scsi_finish_async_scan() calls 
> scsi_sysfs_add_devices().
> 
> During an async scan, the devices get stored up and not made visible as
> they are found (see the end of scsi_add_lun()).  At the end, the target
> gets removed because it has no visible children, of course.  Then when
> the children do get added all at once, when the scan is over, it's too
> late.
> 
> How should this be fixed?  Forget about the en-masse registration and 
> do each device as it is found?

Great, I knew I'd find a reason to hate the async scanning code
eventually.

However, the solution is just to make the kref work for us.  We already
properly refcount everything, so we just take the reap_ref on the tar

Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-19 Thread James Bottomley
On Thu, 2013-12-19 at 10:26 -0800, Sarah Sharp wrote:
> On Wed, Dec 18, 2013 at 04:05:05PM -0800, James Bottomley wrote:
> > On Wed, 2013-12-18 at 16:50 -0500, Alan Stern wrote:
> > > On Wed, 18 Dec 2013, Sarah Sharp wrote:
> > > 
> > > > On Mon, Dec 16, 2013 at 07:10:19AM -0800, James Bottomley wrote:
> > > > > This set should fix our target problems with USB by making the target
> > > > > visibility properly reference counted.  Since it's a major change to 
> > > > > the
> > > > > infrastructure, we'll incubate upstream first before backporting to
> > > > > stable.
> > > > > 
> > > > > James
> > > > 
> > > > I tried these patches, and they cause an oops when a USB mass storage
> > > > device is plugged in.  Note that this device uses the usb-storage
> > > > driver, not the uas driver.
> > > > 
> > > > [14248.340064] scsi6 : usb-storage 2-2:1.0
> > > > [14248.341083] usbcore: registered new interface driver usb-storage
> > > > [14248.346211] usbcore: registered new interface driver uas
> > > > [14249.339937] scsi 6:0:0:0: Direct-Access LexarJumpDrive   
> > > >  1.00 PQ: 0 ANSI: 6
> > > > [14249.340988] [ cut here ]
> > > > [14249.340999] WARNING: CPU: 3 PID: 5578 at lib/kobject.c:227 
> > > > kobject_add_internal+0x13f/0x350()
> > > > [14249.341003] kobject_add_internal failed for 6:0:0:0 (error: -2 
> > > > parent: target6:0:0)
> > > > [14249.341005] Modules linked in: uas usb_storage ctr ccm cuse dm_crypt 
> > > > uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev 
> > > > btusb x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel 
> > > > aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd arc4 iwldvm 
> > > > mac80211 microcode snd_hda_codec_hdmi iwlwifi psmouse 
> > > > snd_hda_codec_realtek serio_raw snd_hda_intel snd_usb_audio 
> > > > snd_hda_codec thinkpad_acpi cfg80211 joydev snd_usbmidi_lib nvram 
> > > > snd_hwdep snd_seq_midi snd_pcm snd_seq_midi_event snd_rawmidi lpc_ich 
> > > > rfcomm bnep snd_seq bluetooth snd_seq_device snd_page_alloc ehci_pci 
> > > > snd_timer ehci_hcd snd soundcore tpm_tis binfmt_misc btrfs libcrc32c 
> > > > xor raid6_pq hid_generic usbhid hid i915 ahci libahci e1000e sdhci_pci 
> > > > sdhci i2c_algo_bit drm_kms_helper ptp pps_core drm xhci_hcd video
> > > > [14249.341095] CPU: 3 PID: 5578 Comm: kworker/u16:0 Not tainted 
> > > > 3.13.0-rc1+ #142
> > > > [14249.341098] Hardware name: LENOVO 2325AP7/2325AP7, BIOS G2ET82WW 
> > > > (2.02 ) 09/11/2012
> > > > [14249.341105] Workqueue: events_unbound async_run_entry_fn
> > > > [14249.341108]  0009 88003aa9db60 81658a4e 
> > > > 88003aa9dba8
> > > > [14249.341115]  88003aa9db98 81048c3d 88006bc551b0 
> > > > fffe
> > > > [14249.341121]   8800bec22838 0200 
> > > > 88003aa9dbf8
> > > > [14249.341127] Call Trace:
> > > > [14249.341135]  [] dump_stack+0x4d/0x66
> > > > [14249.341142]  [] warn_slowpath_common+0x7d/0xa0
> > > > [14249.341148]  [] warn_slowpath_fmt+0x4c/0x50
> > > > [14249.341154]  [] ? _raw_spin_unlock+0x27/0x40
> > > > [14249.341159]  [] kobject_add_internal+0x13f/0x350
> > > > [14249.341163]  [] kobject_add+0x65/0xb0
> > > > [14249.341170]  [] ? get_device+0x30/0x30
> > > > [14249.341175]  [] ? klist_init+0x31/0x40
> > > > [14249.341181]  [] device_add+0x128/0x660
> > > > [14249.341186]  [] ? __pm_runtime_resume+0x5c/0x90
> > > > [14249.341193]  [] scsi_sysfs_add_sdev+0xac/0x340
> > > > [14249.341199]  [] do_scan_async+0x83/0x180
> > > > [14249.341204]  [] async_run_entry_fn+0x37/0x130
> > > > [14249.341210]  [] process_one_work+0x1f4/0x550
> > > > [14249.341215]  [] ? process_one_work+0x192/0x550
> > > > [14249.341220]  [] worker_thread+0x121/0x3a0
> > > > [14249.341225]  [] ? 
> > > > manage_workers.isra.22+0x2a0/0x2a0
> > > > [14249.341231]  [] kthread+0xfc/0x120
> > > > [14249.341238]  [] ? 
> > > > kthread_create_on_node+0x230/0x230
> > > > [14249.341243]  [] ret_from_fork+0x7c/0xb0
> > > > [14249.341249]  [] ? 
> > > > kthread_create_on_node+0x230/0x2

Re: PATCH: usb-storage-set-last-sector-bug-flag.patch

2008-01-23 Thread James Bottomley

On Wed, 2008-01-23 at 10:12 -0800, Greg KH wrote:
> On Sun, Jan 20, 2008 at 04:45:36PM -0500, Alan Stern wrote:
> > On Sun, 20 Jan 2008, Greg KH wrote:
> > 
> > > On Sun, Jan 20, 2008 at 11:27:29AM +0100, Hans de Goede wrote:
> > > > Hi all,
> > > >
> > > > This patch sets the last_sector_bug flag to 1 for all USB disks. This is
> > > > needed to makes the cardreader on various HP multifunction printers 
> > > > work.
> > > >
> > > > Since the performance impact is negible we set this flag for all USB 
> > > > disks 
> > > > to avoid an unusual_devs.h nightmare.
> > > 
> > > Oh great, now my "working just fine" USB devices, which happen to have
> > > data in the last sector, suddenly stop working.
> > > 
> > > That's not acceptable :(
> > 
> > These patches really should not impact existing devices.  If they do
> > then something is definitely wrong.
> > 
> > Can you provide detailed logging information showing your problem?  For 
> > example, a usbmon trace would be good.  Better yet, a usbmon trace 
> > without the patches and a usbmon trace with the patches, for 
> > comparison.
> 
> I don't have a device with such a problem, I just am worried that we are
> now suddenly keeping access from the last sector for devices that
> currently did work just fine.
> 
> If you all guarantee that this will not happen, hey, I'm happy, and I'll
> gladly point all bug reports on to you all :)


Hey, you can trust me ... I'm a doctor ...

Would you like me to add your 'Responsibility-disclamed-by:' to the
patch ...?

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PATCH: usb-storage-set-last-sector-bug-flag.patch

2008-01-25 Thread James Bottomley
On Thu, 2008-01-24 at 09:21 -0800, Greg KH wrote:
> On Thu, Jan 24, 2008 at 06:07:00PM +0100, Stefan Richter wrote:
> > Greg KH wrote:
> > > I just am worried that we are
> > > now suddenly keeping access from the last sector for devices that
> > > currently did work just fine.
> > 
> > This new workaround doesn't prevent access to the last sector.  It only
> > breaks up a multi-sector access which would also reach the last sector
> > into several (two? I'm too lazy to look back in the mail thread)
> > accesses, in order to access the last sector in a dedicated
> > single-sector access.
> > 
> > So that's very differently to the fix-capacity workaround.  The
> > fix-capacity workaround manipulates the READ CAPACITY parameter data.
> > Therefore the fix-capacity workaround is unsafe for non-buggy devices.
> > 
> > The last-sector-(access-)bug workaround _only_ modifies the command
> > stream which is sent to the device.  A dangerous command is replaced by
> > equivalent safe commands.  These commands are luckily safe for _all_
> > devices, buggy and non-buggy ones.  The only cost of this workaround is
> > (1.) the code, (2.) the runtime/ bandwidth/ latency overhead for
> > accesses which reach the last sector.
> 
> Ok, thanks for explaining it better.  I have no objection to this change
> anymore.

So, for forms sake to take this through the SCSI tree I need at least
one USB person to ack it ...

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] 2.6.24-git usb reset problems

2008-01-29 Thread James Bottomley
On Tue, 2008-01-29 at 10:36 -0500, Alan Stern wrote:
> On Tue, 29 Jan 2008, Boaz Harrosh wrote:
> 
> > --- a/drivers/usb/storage/transport.c
> > +++ b/drivers/usb/storage/transport.c
> > @@ -462,18 +462,24 @@ static int usb_stor_bulk_transfer_sglist(struct 
> > us_data *us, unsigned int pipe,
> >   * Common used function. Transfer a complete command
> >   * via usb_stor_bulk_transfer_sglist() above. Set cmnd resid
> >   */
> > -int usb_stor_bulk_srb(struct us_data* us, unsigned int pipe,
> > - struct scsi_cmnd* srb)
> > +int usb_stor_bulk_srb_length(struct us_data* us, unsigned int pipe,
> > + struct scsi_cmnd* srb, unsigned length)
> >  {
> > unsigned int partial;
> > int result = usb_stor_bulk_transfer_sglist(us, pipe, scsi_sglist(srb),
> > - scsi_sg_count(srb), scsi_bufflen(srb),
> > + scsi_sg_count(srb), length,
> >   &partial);
> >  
> > scsi_set_resid(srb, scsi_bufflen(srb) - partial);
> > return result;
> >  }
> >  
> > +int usb_stor_bulk_srb(struct us_data* us, unsigned int pipe,
> > +   struct scsi_cmnd* srb)
> > +{
> > +   return usb_stor_bulk_srb_length(us, pipe, srb, scsi_bufflen(srb));
> > +}
> > +
> 
> I don't like this patch very much.  Why add another layer of 
> indirection when the two subroutines do hardly any work?  Leave 
> usb_stor_bulk_srb() the way it was, and add usb_stor_bulk_srb_length() 
> as a separate routine that simply calls usb_stor_bulk_transfer_sglist() 
> and scsi_set_resid().
> 
> BTW, the standard coding style calls for a blank line after the list of 
> local variables at the start of a function or block.

There's another bug in the transport.c conversion in that the residuals
are updated with bogus data in several error cases, since
usb_stor_bulk_transfer_sglist() only sets the actual length if the urb
is actually sent.

I'm not sure if this is is the solution to the problem at hand, but it
definitely fixes another bug in the code.

James

diff --git a/drivers/usb/storage/transport.c b/drivers/usb/storage/transport.c
index d9f4912..bab0858 100644
--- a/drivers/usb/storage/transport.c
+++ b/drivers/usb/storage/transport.c
@@ -465,7 +465,7 @@ static int usb_stor_bulk_transfer_sglist(struct us_data 
*us, unsigned int pipe,
 int usb_stor_bulk_srb(struct us_data* us, unsigned int pipe,
  struct scsi_cmnd* srb)
 {
-   unsigned int partial;
+   unsigned int partial = scsi_get_resid(srb);
int result = usb_stor_bulk_transfer_sglist(us, pipe, scsi_sglist(srb),
  scsi_sg_count(srb), scsi_bufflen(srb),
  &partial);


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] 2.6.24-git usb reset problems

2008-01-29 Thread James Bottomley
On Tue, 2008-01-29 at 20:58 +0200, Boaz Harrosh wrote:
> > Your new code does
> > 
> > int partial; <- stack uninitialised
> > sb_stor_bulk_transfer_sglist(..., &partial, ...);
> > scsi_set_resid(srb, scsi_bufflen(srb) - partial);
> > 
> > If the function doesn't touch partial, as it doesn't in the error legs,
> > resid now gets set with rubbish.
> > 
> > Actually, my code is still wrong .. we have to set it to
> > scsi_bufflen(srb) - scsi_resid(srb) so that it comes back the same if
> > left untouched.
> > 
> >> I have such a device and I get one reset but then every thing works nice.
> >> This is with debug on. I'll try to make it fail.
> > 
> > James
> > 
> > 
> Sorry I still don't see it.
> 
> original code did sb_stor_bulk_transfer_sg(..., &srp->resid, ...)
> 
> but sb_stor_bulk_transfer_sg does the:
>  int partial; <- stack uninitialised
>  sb_stor_bulk_transfer_sglist(..., &partial, ...);
> 
> and then unconditionally sets *residual = length_left;
> I do not see an "error legs" case in sb_stor_bulk_transfer_sg().

This is really programming 101. This:

static int usb_stor_bulk_transfer_sglist(struct us_data *us, unsigned
int pipe,
struct scatterlist *sg, int num_sg, unsigned int length,
unsigned int *act_len)
{
int result;

/* don't submit s-g requests during abort/disconnect processing */
if (us->flags & ABORTING_OR_DISCONNECTING)
return USB_STOR_XFER_ERROR;

The return USB_STOR_XFER_ERROR; is called an error leg.  It returns
without updating *act_len thus leaving &partial uninitialised.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] 2.6.24-git usb reset problems

2008-01-29 Thread James Bottomley
On Tue, 2008-01-29 at 21:06 +0100, Jens Axboe wrote:
> > I will ... but it will cause an explosion in the bidirectional tree
> > again.  I think the bidi updates also fix this.  However, give me time
> > to rebase and verify.
> 
> OK thanks, just wanted to make sure that we didn't both expect each
> other to handle it :-)

Yes, confirm that; I think this is already fixed in scsi-misc-2.6.
Could you pull from

master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git

and verify with your device?

Thanks,

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] 2.6.24-git usb reset problems

2008-01-29 Thread James Bottomley
On Tue, 2008-01-29 at 11:10 -0800, Matthew Dharm wrote:
> On Tue, Jan 29, 2008 at 07:39:11PM +0100, Jens Axboe wrote:
> > On Tue, Jan 29 2008, Jens Axboe wrote:
> > > On Tue, Jan 29 2008, Oliver Neukum wrote:
> > > > Am Dienstag, 29. Januar 2008 15:11:08 schrieb Jens Axboe:
> > > > > On Tue, Jan 29 2008, Boaz Harrosh wrote:
> > > > > > On Tue, Jan 29 2008 at 15:54 +0200, Jens Axboe <[EMAIL PROTECTED]> 
> > > > > > wrote:
> > > > > > > On Tue, Jan 29 2008, Boaz Harrosh wrote:
> > > > > > >> Greg KH wrote:
> > > >  
> > > > > > > No difference, still just a lot of resets.
> > > > > > > 
> > > > > > Where you able to figure out which usb storage transport is used?
> > > > > > 
> > > > > > in drivers/usb/storage/usb.c you have get_protocol() and 
> > > > > > get_transport()
> > > > > > functions. I'm not sure if these get stored in sysfs perhaps. This 
> > > > > > will
> > > > > > pinpoint better where to look. Let me research a bit. 
> > > > > 
> > > > > Did the quick'n easy and dumped it. Protocol is 'Transparent SCSI' and
> > > > > transport is 'Bulk'
> > > > 
> > > > You can recompile your kernel with CONFIG_USB_DEBUG and 
> > > > CONFIG_STORAGE_DEBUG
> > > > That should tell the reason for the resets.
> > > 
> > > Sure, I'll do that. Will post the results tonight.
> > 
> > OK, fresh boot with CONFIG_USB_DEBUG and CONFIG_STORAGE_DEBUG. Plugged
> > in the device, waited 10 seconds or so and pulled it out. These are the
> > messages.
> > 
> > It all looks good until the MODE_SENSE command, where it only transfers
> > 4 of 192 bytes.
> 
> No, that's not the problem.  This is the problem:
>  
> > usb-storage: *** thread awakened.
> > usb-storage: Command TEST_UNIT_READY (6 bytes)
> > usb-storage:  00 00 00 00 00 00
> > usb-storage: Bulk Command S 0x43425355 T 0xd L 0 F 0 Trg 0 LUN 1 CL 6
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> > usb-storage: Status code 0; transferred 31/31
> > usb-storage: -- transfer complete
> > usb-storage: Bulk command transfer result=0
> > usb-storage: Attempting to get CSW...
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> > usb-storage: Status code 0; transferred 13/13
> > usb-storage: -- transfer complete
> > usb-storage: Bulk status result = 0
> > usb-storage: Bulk Status S 0x53425355 T 0xd R 0 Stat 0x1
> > usb-storage: -- transport indicates command failure
> > usb-storage: Issuing auto-REQUEST_SENSE
> > usb-storage: Bulk Command S 0x43425355 T 0xe L 18 F 128 Trg 0 LUN 1 CL 6
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> > usb-storage: Status code 0; transferred 31/31
> > usb-storage: -- transfer complete
> > usb-storage: Bulk command transfer result=0
> > usb-storage: usb_stor_bulk_transfer_sglist: xfer 18 bytes, 1 entries
> > usb-storage: usb_sg_init returned -22
> > usb-storage: Bulk data transfer result 0x4
> > usb-storage: -- auto-sense failure
> > usb-storage: storage_pre_reset
> > ehci_hcd :00:1d.7: port 1 high speed
> > ehci_hcd :00:1d.7: GetStatus port 1 status 001005 POWER sig=se0 PE 
> > CONNECT
> > usb 5-1: reset high speed USB device using ehci_hcd and address 2
> > ehci_hcd :00:1d.7: port 1 high speed
> > ehci_hcd :00:1d.7: GetStatus port 1 status 001005 POWER sig=se0 PE 
> > CONNECT
> > usb-storage: storage_post_reset
> > usb-storage: usb_reset_composite_device returns 0
> > usb-storage: scsi cmd done, result=0x7
> > usb-storage: *** thread sleeping.
> 
> For some reason, usb_sg_init is boned during auto-sense.

OK, that's implicating the scsi_eh_prep_cmnd() in the auto sense
code ... that was also an update in 2.6.24

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] 2.6.24-git usb reset problems

2008-01-29 Thread James Bottomley

On Tue, 2008-01-29 at 20:27 +0200, Boaz Harrosh wrote:
> On Tue, Jan 29 2008 at 18:34 +0200, James Bottomley <[EMAIL PROTECTED]> wrote:
> > On Tue, 2008-01-29 at 10:36 -0500, Alan Stern wrote:
> >> On Tue, 29 Jan 2008, Boaz Harrosh wrote:
> >>
> >>> --- a/drivers/usb/storage/transport.c
> >>> +++ b/drivers/usb/storage/transport.c
> >>> @@ -462,18 +462,24 @@ static int usb_stor_bulk_transfer_sglist(struct 
> >>> us_data *us, unsigned int pipe,
> >>>   * Common used function. Transfer a complete command
> >>>   * via usb_stor_bulk_transfer_sglist() above. Set cmnd resid
> >>>   */
> >>> -int usb_stor_bulk_srb(struct us_data* us, unsigned int pipe,
> >>> -   struct scsi_cmnd* srb)
> >>> +int usb_stor_bulk_srb_length(struct us_data* us, unsigned int pipe,
> >>> +   struct scsi_cmnd* srb, unsigned length)
> >>>  {
> >>>   unsigned int partial;
> >>>   int result = usb_stor_bulk_transfer_sglist(us, pipe, scsi_sglist(srb),
> >>> -   scsi_sg_count(srb), scsi_bufflen(srb),
> >>> +   scsi_sg_count(srb), length,
> >>> &partial);
> >>>  
> >>>   scsi_set_resid(srb, scsi_bufflen(srb) - partial);
> >>>   return result;
> >>>  }
> >>>  
> >>> +int usb_stor_bulk_srb(struct us_data* us, unsigned int pipe,
> >>> + struct scsi_cmnd* srb)
> >>> +{
> >>> + return usb_stor_bulk_srb_length(us, pipe, srb, scsi_bufflen(srb));
> >>> +}
> >>> +
> >> I don't like this patch very much.  Why add another layer of 
> >> indirection when the two subroutines do hardly any work?  Leave 
> >> usb_stor_bulk_srb() the way it was, and add usb_stor_bulk_srb_length() 
> >> as a separate routine that simply calls usb_stor_bulk_transfer_sglist() 
> >> and scsi_set_resid().
> >>
> >> BTW, the standard coding style calls for a blank line after the list of 
> >> local variables at the start of a function or block.
> > 
> > There's another bug in the transport.c conversion in that the residuals
> > are updated with bogus data in several error cases, since
> > usb_stor_bulk_transfer_sglist() only sets the actual length if the urb
> > is actually sent.
> > 
> > I'm not sure if this is is the solution to the problem at hand, but it
> > definitely fixes another bug in the code.
> > 
> > James
> > 
> > diff --git a/drivers/usb/storage/transport.c 
> > b/drivers/usb/storage/transport.c
> > index d9f4912..bab0858 100644
> > --- a/drivers/usb/storage/transport.c
> > +++ b/drivers/usb/storage/transport.c
> > @@ -465,7 +465,7 @@ static int usb_stor_bulk_transfer_sglist(struct us_data 
> > *us, unsigned int pipe,
> >  int usb_stor_bulk_srb(struct us_data* us, unsigned int pipe,
> >   struct scsi_cmnd* srb)
> >  {
> > -   unsigned int partial;
> > +   unsigned int partial = scsi_get_resid(srb);
> > int result = usb_stor_bulk_transfer_sglist(us, pipe, scsi_sglist(srb),
> >   scsi_sg_count(srb), scsi_bufflen(srb),
> >   &partial);
> > 
> > 
> > -
> But then this is weird because it is not what usb_stor_bulk_transfer_sg() is 
> doing
> which was the one called before.

Um, yes it was.  The original code did this

sb_stor_bulk_transfer_sg(..., &srp->resid, ...)

Which was at liberty not to touch resid, which it chose not to do in the
error legs.

Your new code does

int partial; <- stack uninitialised
sb_stor_bulk_transfer_sglist(..., &partial, ...);
scsi_set_resid(srb, scsi_bufflen(srb) - partial);

If the function doesn't touch partial, as it doesn't in the error legs,
resid now gets set with rubbish.

Actually, my code is still wrong .. we have to set it to
scsi_bufflen(srb) - scsi_resid(srb) so that it comes back the same if
left untouched.

> I have such a device and I get one reset but then every thing works nice.
> This is with debug on. I'll try to make it fail.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] 2.6.24-git usb reset problems

2008-01-29 Thread James Bottomley
On Tue, 2008-01-29 at 21:03 +0100, Jens Axboe wrote:
> On Tue, Jan 29 2008, Boaz Harrosh wrote:
> > On Tue, Jan 29 2008 at 21:45 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote:
> > > On Tue, Jan 29 2008, Jens Axboe wrote:
> > >> On Tue, Jan 29 2008, James Bottomley wrote:
> > >>> On Tue, 2008-01-29 at 11:10 -0800, Matthew Dharm wrote:
> > >>>> On Tue, Jan 29, 2008 at 07:39:11PM +0100, Jens Axboe wrote:
> > >>>>> On Tue, Jan 29 2008, Jens Axboe wrote:
> > >>>>>> On Tue, Jan 29 2008, Oliver Neukum wrote:
> > >>>>>>> Am Dienstag, 29. Januar 2008 15:11:08 schrieb Jens Axboe:
> > >>>>>>>> On Tue, Jan 29 2008, Boaz Harrosh wrote:
> > >>>>>>>>> On Tue, Jan 29 2008 at 15:54 +0200, Jens Axboe <[EMAIL 
> > >>>>>>>>> PROTECTED]> wrote:
> > >>>>>>>>>> On Tue, Jan 29 2008, Boaz Harrosh wrote:
> > >>>>>>>>>>> Greg KH wrote:
> > >>>>>>>  
> > >>>>>>>>>> No difference, still just a lot of resets.
> > >>>>>>>>>>
> > >>>>>>>>> Where you able to figure out which usb storage transport is used?
> > >>>>>>>>>
> > >>>>>>>>> in drivers/usb/storage/usb.c you have get_protocol() and 
> > >>>>>>>>> get_transport()
> > >>>>>>>>> functions. I'm not sure if these get stored in sysfs perhaps. 
> > >>>>>>>>> This will
> > >>>>>>>>> pinpoint better where to look. Let me research a bit. 
> > >>>>>>>> Did the quick'n easy and dumped it. Protocol is 'Transparent SCSI' 
> > >>>>>>>> and
> > >>>>>>>> transport is 'Bulk'
> > >>>>>>> You can recompile your kernel with CONFIG_USB_DEBUG and 
> > >>>>>>> CONFIG_STORAGE_DEBUG
> > >>>>>>> That should tell the reason for the resets.
> > >>>>>> Sure, I'll do that. Will post the results tonight.
> > >>>>> OK, fresh boot with CONFIG_USB_DEBUG and CONFIG_STORAGE_DEBUG. Plugged
> > >>>>> in the device, waited 10 seconds or so and pulled it out. These are 
> > >>>>> the
> > >>>>> messages.
> > >>>>>
> > >>>>> It all looks good until the MODE_SENSE command, where it only 
> > >>>>> transfers
> > >>>>> 4 of 192 bytes.
> > >>>> No, that's not the problem.  This is the problem:
> > >>>>  
> > >>>>> usb-storage: *** thread awakened.
> > >>>>> usb-storage: Command TEST_UNIT_READY (6 bytes)
> > >>>>> usb-storage:  00 00 00 00 00 00
> > >>>>> usb-storage: Bulk Command S 0x43425355 T 0xd L 0 F 0 Trg 0 LUN 1 CL 6
> > >>>>> usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> > >>>>> usb-storage: Status code 0; transferred 31/31
> > >>>>> usb-storage: -- transfer complete
> > >>>>> usb-storage: Bulk command transfer result=0
> > >>>>> usb-storage: Attempting to get CSW...
> > >>>>> usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> > >>>>> usb-storage: Status code 0; transferred 13/13
> > >>>>> usb-storage: -- transfer complete
> > >>>>> usb-storage: Bulk status result = 0
> > >>>>> usb-storage: Bulk Status S 0x53425355 T 0xd R 0 Stat 0x1
> > >>>>> usb-storage: -- transport indicates command failure
> > >>>>> usb-storage: Issuing auto-REQUEST_SENSE
> > >>>>> usb-storage: Bulk Command S 0x43425355 T 0xe L 18 F 128 Trg 0 LUN 1 
> > >>>>> CL 6
> > >>>>> usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> > >>>>> usb-storage: Status code 0; transferred 31/31
> > >>>>> usb-storage: -- transfer complete
> > >>>>> usb-storage: Bulk command transfer result=0
> > >>>>> usb-storage: usb_stor_bulk_transfer_sglist: xfer 18 bytes, 1 entries
> > >>>>> usb-storage: usb_sg_init returned -22
> > >>>>> usb-storage: Bulk data tra

Re: [BUG] 2.6.24-git usb reset problems

2008-01-30 Thread James Bottomley
On Wed, 2008-01-30 at 11:38 +0100, Jens Axboe wrote:
> On Wed, Jan 30 2008, Geert Uytterhoeven wrote:
> > On Tue, 29 Jan 2008, Jens Axboe wrote:
> > > On Tue, Jan 29 2008, Jens Axboe wrote:
> > > > On Tue, Jan 29 2008, James Bottomley wrote:
> > > > > On Tue, 2008-01-29 at 11:10 -0800, Matthew Dharm wrote:
> > > > > > For some reason, usb_sg_init is boned during auto-sense.
> > > > > 
> > > > > OK, that's implicating the scsi_eh_prep_cmnd() in the auto sense
> > > > > code ... that was also an update in 2.6.24
> > > > 
> > > > yeah, already found the bug - it's assuming ->request_buffer holds the
> > > > sglist, oops. Preparing a fix.
> > > 
> > > ok here goes, this saves and restores the sg table correctly. it also
> > > fixes the usb bug for me.
> > 
> > I can confirm this patch fixes the errors I was seeing with current
> > linux-2.6.git for the USB memory card readers in a Dell TFT connected
> > to a PS3.
> 
> James, we need a fix for this pushed asap. So either we should merge the
> below now, or push the bidi patchset that also fixes this. It all
> depends on when you want to merge the bidi patches...

The SCSI patch set (including the bidirectional pieces) is waiting in
scsi-misc ... just for forms sake, could you confirm that it actually
fixes the problem and I'll push it.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] bugfix for an underflow condition in usb storage & isd200.c

2008-02-06 Thread James Bottomley

On Wed, 2008-02-06 at 17:18 -0500, Alan Stern wrote:
> On Wed, 6 Feb 2008, Matthew Dharm wrote:
> 
> > Maybe this is a crazy question, but...
> > 
> > Why is this not in the SCSI core?
> 
> Or even in the block core?
> 
> >  It's hardly USB-specific, and I'm
> > willing to bet that there are other HCDs (at least spb2) which need to do
> > this sort of thing...
> 
> James, do you have any idea?
> 
> What we're talking about is a routine that provides drivers a simple
> way to access the data in a scatter-gather buffer (which may lie in
> highmem or otherwise not be easily reachable).  The idea is that some 
> commands are emulated by the driver rather than carried out by the 
> device, and the driver needs some way to stick the results in the 
> transfer buffer.

Isn't that what scsi_kmap_sg() is designed for ... or do you need
something slightly different?

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] bugfix for an underflow condition in usb storage & isd200.c

2008-02-06 Thread James Bottomley

On Wed, 2008-02-06 at 18:25 -0500, Alan Stern wrote:
> On Wed, 6 Feb 2008, James Bottomley wrote:
> 
> > > What we're talking about is a routine that provides drivers a simple
> > > way to access the data in a scatter-gather buffer (which may lie in
> > > highmem or otherwise not be easily reachable).  The idea is that some 
> > > commands are emulated by the driver rather than carried out by the 
> > > device, and the driver needs some way to stick the results in the 
> > > transfer buffer.
> > 
> > Isn't that what scsi_kmap_sg() is designed for ... or do you need
> > something slightly different?
> 
> I don't know -- I've never heard of scsi_kmap_sg().  And it doesn't 
> appear to exist anywhere in my kernel source.
> 
> Did you mean scsi_kmap_atomic_sg()?

Yes .. I replied from memory rather than looking in the source.

>   That appears to do only part of
> what usb_stor_access_xfer_buf() does.  In fact, all it does is map a
> single page.

Um, it does everything you asked about above.  It's designed as a helper
for SCSI devices that need to modify data for automatically generated
returns (RAID devices and INQUIRY strings for instance).  It's also used
by some of the less well automated devices that might need to do an
effective PIO feed on DMA engine halts.  It allows you to address a sg
list by offset and length (although it will adjust the length if there's
a mapping problem).  Since the only way to guarantee access to highmem
is by kmap_atomic (which has limited slots), any API that does this sort
of thing is necessarily page based and does single page mappings at a
time.  Why don't you tell me what you think is missing rather than me
having to dig around in the usb sources to try to work it out.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RESEND 0/3] scsi: fix internal write cache issue on usb hdd.

2012-07-17 Thread James Bottomley
On Mon, 2012-07-16 at 16:48 -0700, Greg KH wrote:
> On Sat, Jul 07, 2012 at 11:04:45PM -0400, Namjae Jeon wrote:
> > From: Namjae Jeon 
> > 
> > The numbers of USB HDDs(All USB HDD I checked) does not respond
> > correctly to scsi mode sense command for retrieving the write cache
> > page status. Even though write cache is enabled by default, due to
> > scsi driver assume that cache is not enabled which in turn might lead
> > to loss of data since data still will be in cache.
> > This result that all filesystems is not stable on USB HDD when the
> > device is unplugged abruptly, even though these are having journaling
> > feature. Our first trying is that scsi driver send ATA command
> > (ATA Pass through, #85) to USB HDD after failure from normal routine to
> > know write cache enable.
> > We have known it is dangerous after testing several USB HDD. some of
> > HDD is stalled by this command(A-DATA HDD). So we tried to make the
> > patch James Bottomley's suggestion(usb quirk) on version 2 that add
> > product ID and verdor ID of USB HDD to USB quirk list after checking
> > write cache.
> > All filesystem will be stable on USB HDD registered in quirk list.
> > And it will be updated continuously.
> 
> Now applied to the usb-next branch.

It's been in scsi#misc for ten days with no problems.  Lets leave it
there rather than create merge and rebase issues.

Thanks,

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-02 Thread James Bottomley
On Thu, 2014-01-02 at 16:01 -0500, Mark Lord wrote:
> On 14-01-02 02:15 PM, Sarah Sharp wrote:
> > On Tue, Dec 31, 2013 at 12:40:16PM -0800, walt wrote:
> ..
> >> Unfortunately this patch causes a regression when copying large files to my
> >> outboard USB3 drive. (Nothing at all to do with networking.)
> >>
> >> When I try to copy a large (20GB) file to the USB3 drive, the copy dies 
> >> after
> >> about 7GB, the ext4 journal aborts and the drive is remounted read-only.
> >>
> >> This bug is 100% reproducible (always pretty close to 7GB) and reverting 
> >> this
> >> patch completely fixes the problem.
> > 
> > Ok, I had feared that would be a consequence of this patch.  I think the
> > problem is that the usb-storage driver submitted an URB with more
> > scatter-gather entries than would fit on the ring segment, the xHCI
> > driver rejected the URB with -ENOMEM, and the SCSI core eventually gave
> > up on the SCSI command.
> 
> 
> Is there not a block layer / scheduler tunable for max sg entries or 
> something?

Yes, it's blk_queue_max_segments()  You can also add an indirect limit
from userspace using the queue/max_sectors_kb parameter (it doesn't
limit the number of sg elements directly, but it does limit the overall
size of the transfer, and since each segment is 4k or larger except at
the beginning and end, that limits the total number of sg elements as
well).

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-02 Thread James Bottomley
On Sat, 2013-12-21 at 15:51 -0500, Alan Stern wrote:
> On Fri, 20 Dec 2013, Sarah Sharp wrote:
> 
> > On Thu, Dec 19, 2013 at 11:48:47AM -0800, James Bottomley wrote:
> > > It should apply incrementally on top of the previous two.  If it
> > > actually works, I'll fold it into the first patch.
> > 
> > Well, it doesn't oops anymore, but it does generate a pile of warnings:
> 
> This was a simple oversight.  scsi_target_reap() was called at the
> start of __scsi_remove_device(), but it should be called at the end.
> The patch below, applied on top of James's three patches, will fix the
> warnings.

Thanks, I'll do a repost of the combined series

> Alan Stern
> 
> P.S.: The comment isn't entirely correct any more...

Yes, fixed the tense.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-02 Thread James Bottomley
This set should fix our target problems with USB by making the target
visibility properly reference counted.  Since it's a major change to the
infrastructure, we'll incubate upstream first before backporting to
stable.

James Bottomley (2):
  [SCSI] fix our current target reap infrastructure.
  [SCSI] dual scan thread bug fix

 drivers/scsi/scsi_scan.c   | 108 +
 drivers/scsi/scsi_sysfs.c  |  20 ++---
 include/scsi/scsi_device.h |   3 +-
 3 files changed, 84 insertions(+), 47 deletions(-)

James

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 1/2] fix our current target reap infrastructure

2014-01-02 Thread James Bottomley
This patch eliminates the reap_ref and replaces it with a proper kref.
On last put of this kref, the target is removed from visibility in
sysfs.  The final call to scsi_target_reap() for the device is done from
__scsi_remove_device() and only if the device was made visible.  This
ensures that the target disappears as soon as the last device is gone
rather than waiting until final release of the device (which is often
too long).

Reviewed-by: Alan Stern 
Signed-off-by: James Bottomley 
---
 drivers/scsi/scsi_scan.c   | 95 --
 drivers/scsi/scsi_sysfs.c  | 20 +++---
 include/scsi/scsi_device.h |  3 +-
 3 files changed, 73 insertions(+), 45 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 307a811..ef3f958 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -371,6 +371,31 @@ static struct scsi_target *__scsi_find_target(struct 
device *parent,
 }
 
 /**
+ * scsi_target_reap_ref_release - remove target from visibility
+ * @kref: the reap_ref in the target being released
+ *
+ * Called on last put of reap_ref, which is the indication that no device
+ * under this target is visible anymore, so render the target invisible in
+ * sysfs.  Note: we have to be in user context here because the target reaps
+ * should be done in places where the scsi device visibility is being removed.
+ */
+static void scsi_target_reap_ref_release(struct kref *kref)
+{
+   struct scsi_target *starget
+   = container_of(kref, struct scsi_target, reap_ref);
+
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   starget->state = STARGET_DEL;
+   scsi_target_destroy(starget);
+}
+
+static void scsi_target_reap_ref_put(struct scsi_target *starget)
+{
+   kref_put(&starget->reap_ref, scsi_target_reap_ref_release);
+}
+
+/**
  * scsi_alloc_target - allocate a new or find an existing target
  * @parent:parent of the target (need not be a scsi host)
  * @channel:   target channel number (zero if no channels)
@@ -401,7 +426,7 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
}
dev = &starget->dev;
device_initialize(dev);
-   starget->reap_ref = 1;
+   kref_init(&starget->reap_ref);
dev->parent = get_device(parent);
dev_set_name(dev, "target%d:%d:%d", shost->host_no, channel, id);
dev->bus = &scsi_bus_type;
@@ -441,29 +466,36 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
return starget;
 
  found:
-   found_target->reap_ref++;
+   if (!kref_get_unless_zero(&found_target->reap_ref))
+   /*
+* release routine already fired.  Target is dead, but
+* STARGET_DEL may not yet be set (set in the release
+* routine), so set here as well, just in case
+*/
+   found_target->state = STARGET_DEL;
spin_unlock_irqrestore(shost->host_lock, flags);
if (found_target->state != STARGET_DEL) {
put_device(dev);
return found_target;
}
-   /* Unfortunately, we found a dying target; need to
-* wait until it's dead before we can get a new one */
+   /*
+* Unfortunately, we found a dying target; need to wait until it's
+* dead before we can get a new one.  There is an anomaly here.  We
+* *should* call scsi_target_reap() to balance the kref_get() of the
+* reap_ref above.  However, since the target is in state STARGET_DEL,
+* it's already invisible and the reap_ref is irrelevant.  If we call
+* scsi_target_reap() we might spuriously do another device_del() on
+* an already invisible target.
+*/
put_device(&found_target->dev);
-   flush_scheduled_work();
+   /*
+* length of time is irrelevant here, we just want to yield the CPU
+* for a tick to avoid busy waiting for the target to die.
+*/
+   msleep(1);
goto retry;
 }
 
-static void scsi_target_reap_usercontext(struct work_struct *work)
-{
-   struct scsi_target *starget =
-   container_of(work, struct scsi_target, ew.work);
-
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   scsi_target_destroy(starget);
-}
-
 /**
  * scsi_target_reap - check to see if target is in use and destroy if not
  * @starget: target to be checked
@@ -474,28 +506,11 @@ static void scsi_target_reap_usercontext(struct 
work_struct *work)
  */
 void scsi_target_reap(struct scsi_target *starget)
 {
-   struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
-   unsigned long flags;
-   enum scsi_target_state state;
-   int empty = 0;
-
-   spin_lock_irqsave(shost->host_lock, flags);
-

[RFC 2/2] dual scan thread bug fix

2014-01-02 Thread James Bottomley
In the highly unusual case where two threads are running concurrently through
the scanning code scanning the same target, we run into the situation where
one may allocate the target while the other is still using it.  In this case,
because the reap checks for STARGET_CREATED and kills the target without
reference counting, the second thread will do the wrong thing on reap.

Fix this by reference counting even creates and doing the STARGET_CREATED
check in the final put.

Signed-off-by: James Bottomley 
---
 drivers/scsi/scsi_scan.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index ef3f958..82cf902 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -320,6 +320,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
struct Scsi_Host *shost = dev_to_shost(dev->parent);
unsigned long flags;
 
+   starget->state = STARGET_DEL;
transport_destroy_device(dev);
spin_lock_irqsave(shost->host_lock, flags);
if (shost->hostt->target_destroy)
@@ -384,9 +385,15 @@ static void scsi_target_reap_ref_release(struct kref *kref)
struct scsi_target *starget
= container_of(kref, struct scsi_target, reap_ref);
 
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   starget->state = STARGET_DEL;
+   /*
+* if we get here and the target is still in the CREATED state that
+* means it was allocated but never made visible (because a scan
+* turned up no LUNs), so don't call device_del() on it.
+*/
+   if (starget->state == STARGET_RUNNING) {
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   }
scsi_target_destroy(starget);
 }
 
@@ -506,11 +513,13 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
  */
 void scsi_target_reap(struct scsi_target *starget)
 {
+   /*
+* serious problem if this triggers: STARGET_DEL is only set in the
+* kref release routine, so we're doing another final put on an
+* already released kref
+*/
BUG_ON(starget->state == STARGET_DEL);
-   if (starget->state == STARGET_CREATED)
-   scsi_target_destroy(starget);
-   else
-   scsi_target_reap_ref_put(starget);
+   scsi_target_reap_ref_put(starget);
 }
 
 /**
-- 
1.8.4.3



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-02 Thread James Bottomley
On Thu, 2014-01-02 at 16:45 -0800, Sarah Sharp wrote:
> On Sat, Dec 21, 2013 at 03:51:25PM -0500, Alan Stern wrote:
> > On Fri, 20 Dec 2013, Sarah Sharp wrote:
> > 
> > > On Thu, Dec 19, 2013 at 11:48:47AM -0800, James Bottomley wrote:
> > > > It should apply incrementally on top of the previous two.  If it
> > > > actually works, I'll fold it into the first patch.
> > > 
> > > Well, it doesn't oops anymore, but it does generate a pile of warnings:
> > 
> > This was a simple oversight.  scsi_target_reap() was called at the
> > start of __scsi_remove_device(), but it should be called at the end.
> > The patch below, applied on top of James's three patches, will fix the
> > warnings.
> > 
> > Alan Stern
> > 
> > P.S.: The comment isn't entirely correct any more...
> 
> Ok, Alan's additional patch fixed the warnings I was seeing on UAS
> device unplug.  James, can you Cc me on the finished patch when you send
> it in?

A bit late: it's here, but it should be on your usb list as well:

http://marc.info/?l=linux-scsi&m=138870949118304

James

> Hans, I don't want to send the UAS patches off to Greg until James'
> patches get into mainline.  I believe Greg's usb-next tree is frozen at
> this point, so they'll have to wait until 3.15.
> 
> Sarah Sharp
> 
> > Index: usb-3.13/drivers/scsi/scsi_sysfs.c
> > ===
> > --- usb-3.13.orig/drivers/scsi/scsi_sysfs.c
> > +++ usb-3.13/drivers/scsi/scsi_sysfs.c
> > @@ -1028,13 +1028,6 @@ void __scsi_remove_device(struct scsi_de
> >  {
> > struct device *dev = &sdev->sdev_gendev;
> >  
> > -   /*
> > -* Paired with the kref_get() in scsi_sysfs_initialize().  We're
> > -* removing sysfs visibility from the device, so make the target
> > -* invisible if this was the last device underneath it.
> > -*/
> > -   scsi_target_reap(scsi_target(sdev));
> > -
> > if (sdev->is_visible) {
> > if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
> > return;
> > @@ -1059,6 +1052,13 @@ void __scsi_remove_device(struct scsi_de
> > sdev->host->hostt->slave_destroy(sdev);
> > transport_destroy_device(dev);
> >  
> > +   /*
> > +* Paired with the kref_get() in scsi_sysfs_initialize().  We're
> > +* removing sysfs visibility from the device, so make the target
> > +* invisible if this was the last device underneath it.
> > +*/
> > +   scsi_target_reap(scsi_target(sdev));
> > +
> > put_device(dev);
> >  }
> >  
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-03 Thread James Bottomley
On Fri, 2014-01-03 at 09:56 +0100, Hans de Goede wrote:
> James, what is the status of getting the fix for the refcount issue into
> mainline ?

The plan is what I wrote in the patch intro:

"Since it's a major change to the infrastructure, we'll incubate
upstream first before backporting to stable."

That means with the merge window and delayed stable backport to make
sure it's sound.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 2/2] dual scan thread bug fix

2014-01-03 Thread James Bottomley

On Fri, 2014-01-03 at 10:58 -0500, Alan Stern wrote:
> On Thu, 2 Jan 2014, James Bottomley wrote:
> 
> > In the highly unusual case where two threads are running concurrently 
> > through
> > the scanning code scanning the same target, we run into the situation where
> > one may allocate the target while the other is still using it.  In this 
> > case,
> > because the reap checks for STARGET_CREATED and kills the target without
> > reference counting, the second thread will do the wrong thing on reap.
> > 
> > Fix this by reference counting even creates and doing the STARGET_CREATED
> > check in the final put.
> 
> I'm still concerned about one thing.  The previous patch does this in
> scsi_alloc_target():
> 
> >   found:
> > -   found_target->reap_ref++;
> > +   if (!kref_get_unless_zero(&found_target->reap_ref))
> > +   /*
> > +* release routine already fired.  Target is dead, but
> > +* STARGET_DEL may not yet be set (set in the release
> > +* routine), so set here as well, just in case
> > +*/
> > +   found_target->state = STARGET_DEL;
> > spin_unlock_irqrestore(shost->host_lock, flags);
> 
> As a result, the two comments in this patch aren't right:
> 
> > @@ -384,9 +385,15 @@ static void scsi_target_reap_ref_release(struct kref 
> > *kref)
> > struct scsi_target *starget
> > = container_of(kref, struct scsi_target, reap_ref);
> >  
> > -   transport_remove_device(&starget->dev);
> > -   device_del(&starget->dev);
> > -   starget->state = STARGET_DEL;
> > +   /*
> > +* if we get here and the target is still in the CREATED state that
> > +* means it was allocated but never made visible (because a scan
> > +* turned up no LUNs), so don't call device_del() on it.
> > +*/
> > +   if (starget->state == STARGET_RUNNING) {
> > +   transport_remove_device(&starget->dev);
> > +   device_del(&starget->dev);
> > +   }
> 
> Here the state could already be STARGET_DEL, even though the target is
> still visible.

Well, I agree with the theory.  In practise, there are only a few
machine instructions between the kref going to zero and us reaching that
point, because kref_release will jump into this routine next, so the
condition would be very hard to see.  However, I suppose it's easy to
close by checking for != STARGET_CREATED and there's no reason not to do
that, so I'll change it.

> Also, it's a little odd that the comment talks about CREATED but the 
> code really checks for RUNNING.  They should be consistent.

!= STARGET_CREATED should solve this.

> > @@ -506,11 +513,13 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> >   */
> >  void scsi_target_reap(struct scsi_target *starget)
> >  {
> > +   /*
> > +* serious problem if this triggers: STARGET_DEL is only set in the
> > +* kref release routine, so we're doing another final put on an
> > +* already released kref
> > +*/
> > BUG_ON(starget->state == STARGET_DEL);
> 
> Here the code is okay but the comment is wrong: STARGET_DEL is set in 
> _two_ places (but neither of them runs until reap_ref has reached 0).
> 
> Would it be better in scsi_alloc_target() to behave as though the state 
> were STARGET_DEL without actually setting it?

Yes, I'll update the comment to it only goes to DEL after the kref goes
to zero.

How does the attached diff look?

James

---

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 82cf902..2f7de33 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -390,7 +390,7 @@ static void scsi_target_reap_ref_release(struct kref *kref)
 * means it was allocated but never made visible (because a scan
 * turned up no LUNs), so don't call device_del() on it.
 */
-   if (starget->state == STARGET_RUNNING) {
+   if (starget->state != STARGET_CREATED) {
transport_remove_device(&starget->dev);
device_del(&starget->dev);
}
@@ -514,9 +514,9 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
 void scsi_target_reap(struct scsi_target *starget)
 {
/*
-* serious problem if this triggers: STARGET_DEL is only set in the
-* kref release routine, so we're doing another final put on an
-* already released kref
+* serious problem if this triggers: STARGET_DEL is only set in the if
+* the reap_ref drops to zero, so we're trying to do another final put
+* on an already released kref
 */
BUG_ON(starget->state == STARGET_DEL);
scsi_target_reap_ref_put(starget);




--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 2/2] dual scan thread bug fix

2014-01-07 Thread James Bottomley
On Sat, 2014-01-04 at 11:27 -0500, Alan Stern wrote:
> On Fri, 3 Jan 2014, James Bottomley wrote:
> 
> > > I'm still concerned about one thing.  The previous patch does this in
> > > scsi_alloc_target():
> > > 
> > > >   found:
> > > > -   found_target->reap_ref++;
> > > > +   if (!kref_get_unless_zero(&found_target->reap_ref))
> > > > +   /*
> > > > +* release routine already fired.  Target is dead, but
> > > > +* STARGET_DEL may not yet be set (set in the release
> > > > +* routine), so set here as well, just in case
> > > > +*/
> > > > +   found_target->state = STARGET_DEL;
> > > > spin_unlock_irqrestore(shost->host_lock, flags);
> > > 
> > > As a result, the two comments in this patch aren't right:
> > > 
> > > > @@ -384,9 +385,15 @@ static void scsi_target_reap_ref_release(struct 
> > > > kref *kref)
> > > > struct scsi_target *starget
> > > > = container_of(kref, struct scsi_target, reap_ref);
> > > >  
> > > > -   transport_remove_device(&starget->dev);
> > > > -   device_del(&starget->dev);
> > > > -   starget->state = STARGET_DEL;
> > > > +   /*
> > > > +* if we get here and the target is still in the CREATED state 
> > > > that
> > > > +* means it was allocated but never made visible (because a scan
> > > > +* turned up no LUNs), so don't call device_del() on it.
> > > > +*/
> > > > +   if (starget->state == STARGET_RUNNING) {
> > > > +   transport_remove_device(&starget->dev);
> > > > +   device_del(&starget->dev);
> > > > +   }
> > > 
> > > Here the state could already be STARGET_DEL, even though the target is
> > > still visible.
> > 
> > Well, I agree with the theory.  In practise, there are only a few
> > machine instructions between the kref going to zero and us reaching that
> > point, because kref_release will jump into this routine next, so the
> > condition would be very hard to see.
> 
> It's true that the window is very small and not at all likely to be
> hit.  Still, I prefer eliminating such things entirely.
> 
> >  However, I suppose it's easy to
> > close by checking for != STARGET_CREATED and there's no reason not to do
> > that, so I'll change it.
> 
> Checking for != STARGET_CREATED is also wrong in principle.  The state
> could already be STARGET_DEL even though the target was never made
> visible.
> 
> The basic problem is that you are relying on the state to be an
> accurate description of the target's visibility, but
> scsi_alloc_target() changes the state without changing the visibility.  
> I really think you should get rid of that extra assignment in
> scsi_alloc_target().

OK, Agreed, but that means modifying the 1/2 patch with the below.  This
should make the proposed diff to 2/2 correct.

James

---
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index ef3f958..5fad646 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -417,7 +417,7 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
+ shost->transportt->target_size;
struct scsi_target *starget;
struct scsi_target *found_target;
-   int error;
+   int error, ref_got;
 
starget = kzalloc(size, GFP_KERNEL);
if (!starget) {
@@ -466,15 +466,15 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
return starget;
 
  found:
-   if (!kref_get_unless_zero(&found_target->reap_ref))
-   /*
-* release routine already fired.  Target is dead, but
-* STARGET_DEL may not yet be set (set in the release
-* routine), so set here as well, just in case
-*/
-   found_target->state = STARGET_DEL;
+   /*
+* release routine already fired if kref is zero, so if we can still
+* take the reference, the target must be alive.  If we can't, it must
+* be dying and we need to wait for a new target
+*/
+   ref_got = kref_get_unless_zero(&found_target->reap_ref);
+
spin_unlock_irqrestore(shost->host_lock, flags);
-   if (found_target->state != STARGET_DEL) {
+   if (ref_got) {
put_device(dev);
return fou

Re: [RFC 2/2] dual scan thread bug fix

2014-01-08 Thread James Bottomley
On Wed, 2014-01-08 at 10:57 -0500, Alan Stern wrote:
> On Wed, 8 Jan 2014, James Bottomley wrote:
> 
> > OK, Agreed, but that means modifying the 1/2 patch with the below.  This
> > should make the proposed diff to 2/2 correct.
> > 
> > James
> > 
> > ---
> > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> > index ef3f958..5fad646 100644
> > --- a/drivers/scsi/scsi_scan.c
> > +++ b/drivers/scsi/scsi_scan.c
> > @@ -417,7 +417,7 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> > + shost->transportt->target_size;
> > struct scsi_target *starget;
> > struct scsi_target *found_target;
> > -   int error;
> > +   int error, ref_got;
> >  
> > starget = kzalloc(size, GFP_KERNEL);
> > if (!starget) {
> > @@ -466,15 +466,15 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> > return starget;
> >  
> >   found:
> > -   if (!kref_get_unless_zero(&found_target->reap_ref))
> > -   /*
> > -* release routine already fired.  Target is dead, but
> > -* STARGET_DEL may not yet be set (set in the release
> > -* routine), so set here as well, just in case
> > -*/
> > -   found_target->state = STARGET_DEL;
> > +   /*
> > +* release routine already fired if kref is zero, so if we can still
> > +* take the reference, the target must be alive.  If we can't, it must
> > +* be dying and we need to wait for a new target
> > +*/
> > +   ref_got = kref_get_unless_zero(&found_target->reap_ref);
> > +
> > spin_unlock_irqrestore(shost->host_lock, flags);
> > -   if (found_target->state != STARGET_DEL) {
> > +   if (ref_got) {
> > put_device(dev);
> > return found_target;
> > }
> > @@ -482,8 +482,8 @@ static struct scsi_target *scsi_alloc_target(struct 
> > device *parent,
> >  * Unfortunately, we found a dying target; need to wait until it's
> >  * dead before we can get a new one.  There is an anomaly here.  We
> >  * *should* call scsi_target_reap() to balance the kref_get() of the
> > -* reap_ref above.  However, since the target is in state STARGET_DEL,
> > -* it's already invisible and the reap_ref is irrelevant.  If we call
> > +* reap_ref above.  However, since the target being released, it's
> > +* already invisible and the reap_ref is irrelevant.  If we call
> >  * scsi_target_reap() we might spuriously do another device_del() on
> >  * an already invisible target.
> >  */
> 
> In fact, most of this comment (everything after the first sentence) is
> no longer needed.  If the target is dying then kref_get_unless_zero
> must have failed, so there is no need to worry about unbalanced
> refcounts.

I'd like to keep the comment: I get a lot of email from people who write
static checkers for this.  In principle, I agree, they should treat
kref_get_unless_zero() as a spin_trylock(), but it's nice to have
something concrete in the code to point to when the email arrives.

> Otherwise this looks good.

Great, thanks, I'll re-roll and repost.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] target refcounting infrastructure fixes for usb

2014-01-21 Thread James Bottomley
After about 3 iterations as an RFC, I'm elevating this to a patch.  If
it works, I'll try for this merge window with a delayed backport to
stable (after the next kernel release to make sure no bugs got
introduced).

This set should fix our target problems with USB by making the target
visibility properly reference counted.

James Bottomley (2):
  [SCSI] fix our current target reap infrastructure.
  [SCSI] dual scan thread bug fix

 drivers/scsi/scsi_scan.c   | 112 -
 drivers/scsi/scsi_sysfs.c  |  20 +---
 include/scsi/scsi_device.h |   3 +-
 3 files changed, 86 insertions(+), 49 deletions(-)

-- 
1.8.4.3



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] dual scan thread bug fix

2014-01-21 Thread James Bottomley
In the highly unusual case where two threads are running concurrently through
the scanning code scanning the same target, we run into the situation where
one may allocate the target while the other is still using it.  In this case,
because the reap checks for STARGET_CREATED and kills the target without
reference counting, the second thread will do the wrong thing on reap.

Fix this by reference counting even creates and doing the STARGET_CREATED
check in the final put.

Signed-off-by: James Bottomley 
---
 drivers/scsi/scsi_scan.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 5fad646..4109530 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -320,6 +320,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
struct Scsi_Host *shost = dev_to_shost(dev->parent);
unsigned long flags;
 
+   starget->state = STARGET_DEL;
transport_destroy_device(dev);
spin_lock_irqsave(shost->host_lock, flags);
if (shost->hostt->target_destroy)
@@ -384,9 +385,15 @@ static void scsi_target_reap_ref_release(struct kref *kref)
struct scsi_target *starget
= container_of(kref, struct scsi_target, reap_ref);
 
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   starget->state = STARGET_DEL;
+   /*
+* if we get here and the target is still in the CREATED state that
+* means it was allocated but never made visible (because a scan
+* turned up no LUNs), so don't call device_del() on it.
+*/
+   if (starget->state != STARGET_CREATED) {
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   }
scsi_target_destroy(starget);
 }
 
@@ -506,11 +513,13 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
  */
 void scsi_target_reap(struct scsi_target *starget)
 {
+   /*
+* serious problem if this triggers: STARGET_DEL is only set in the if
+* the reap_ref drops to zero, so we're trying to do another final put
+* on an already released kref
+*/
BUG_ON(starget->state == STARGET_DEL);
-   if (starget->state == STARGET_CREATED)
-   scsi_target_destroy(starget);
-   else
-   scsi_target_reap_ref_put(starget);
+   scsi_target_reap_ref_put(starget);
 }
 
 /**
-- 
1.8.4.3



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] fix our current target reap infrastructure

2014-01-21 Thread James Bottomley
This patch eliminates the reap_ref and replaces it with a proper kref.
On last put of this kref, the target is removed from visibility in
sysfs.  The final call to scsi_target_reap() for the device is done from
__scsi_remove_device() and only if the device was made visible.  This
ensures that the target disappears as soon as the last device is gone
rather than waiting until final release of the device (which is often
too long).

Reviewed-by: Alan Stern 
Signed-off-by: James Bottomley 
---
 drivers/scsi/scsi_scan.c   | 99 --
 drivers/scsi/scsi_sysfs.c  | 20 +++---
 include/scsi/scsi_device.h |  3 +-
 3 files changed, 75 insertions(+), 47 deletions(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 307a811..5fad646 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -371,6 +371,31 @@ static struct scsi_target *__scsi_find_target(struct 
device *parent,
 }
 
 /**
+ * scsi_target_reap_ref_release - remove target from visibility
+ * @kref: the reap_ref in the target being released
+ *
+ * Called on last put of reap_ref, which is the indication that no device
+ * under this target is visible anymore, so render the target invisible in
+ * sysfs.  Note: we have to be in user context here because the target reaps
+ * should be done in places where the scsi device visibility is being removed.
+ */
+static void scsi_target_reap_ref_release(struct kref *kref)
+{
+   struct scsi_target *starget
+   = container_of(kref, struct scsi_target, reap_ref);
+
+   transport_remove_device(&starget->dev);
+   device_del(&starget->dev);
+   starget->state = STARGET_DEL;
+   scsi_target_destroy(starget);
+}
+
+static void scsi_target_reap_ref_put(struct scsi_target *starget)
+{
+   kref_put(&starget->reap_ref, scsi_target_reap_ref_release);
+}
+
+/**
  * scsi_alloc_target - allocate a new or find an existing target
  * @parent:parent of the target (need not be a scsi host)
  * @channel:   target channel number (zero if no channels)
@@ -392,7 +417,7 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
+ shost->transportt->target_size;
struct scsi_target *starget;
struct scsi_target *found_target;
-   int error;
+   int error, ref_got;
 
starget = kzalloc(size, GFP_KERNEL);
if (!starget) {
@@ -401,7 +426,7 @@ static struct scsi_target *scsi_alloc_target(struct device 
*parent,
}
dev = &starget->dev;
device_initialize(dev);
-   starget->reap_ref = 1;
+   kref_init(&starget->reap_ref);
dev->parent = get_device(parent);
dev_set_name(dev, "target%d:%d:%d", shost->host_no, channel, id);
dev->bus = &scsi_bus_type;
@@ -441,29 +466,36 @@ static struct scsi_target *scsi_alloc_target(struct 
device *parent,
return starget;
 
  found:
-   found_target->reap_ref++;
+   /*
+* release routine already fired if kref is zero, so if we can still
+* take the reference, the target must be alive.  If we can't, it must
+* be dying and we need to wait for a new target
+*/
+   ref_got = kref_get_unless_zero(&found_target->reap_ref);
+
spin_unlock_irqrestore(shost->host_lock, flags);
-   if (found_target->state != STARGET_DEL) {
+   if (ref_got) {
put_device(dev);
return found_target;
}
-   /* Unfortunately, we found a dying target; need to
-* wait until it's dead before we can get a new one */
+   /*
+* Unfortunately, we found a dying target; need to wait until it's
+* dead before we can get a new one.  There is an anomaly here.  We
+* *should* call scsi_target_reap() to balance the kref_get() of the
+* reap_ref above.  However, since the target being released, it's
+* already invisible and the reap_ref is irrelevant.  If we call
+* scsi_target_reap() we might spuriously do another device_del() on
+* an already invisible target.
+*/
put_device(&found_target->dev);
-   flush_scheduled_work();
+   /*
+* length of time is irrelevant here, we just want to yield the CPU
+* for a tick to avoid busy waiting for the target to die.
+*/
+   msleep(1);
goto retry;
 }
 
-static void scsi_target_reap_usercontext(struct work_struct *work)
-{
-   struct scsi_target *starget =
-   container_of(work, struct scsi_target, ew.work);
-
-   transport_remove_device(&starget->dev);
-   device_del(&starget->dev);
-   scsi_target_destroy(starget);
-}
-
 /**
  * scsi_target_reap - check to see if target is in use and destroy if not
  * @starget: target to be checked
@@ -474,28 +506,11 @@ static void scsi_target_reap_usercontext(struct

Re: v3.14-rc2+ WARNING: CPU: 0 PID: 24108 at fs/sysfs/group.c:216 sysfs_remove_group+0xa3/0xb0()

2014-02-17 Thread James Bottomley
On Mon, 2014-02-17 at 14:48 -0500, Alan Stern wrote:
> On Mon, 17 Feb 2014, Ronald wrote:
> 
> > I caught this by coincidence since I had netconsole attached to debug
> > a nouveau MSI problem (deadlock). It happenned when I disconnected my
> > external hardrive in thunar. I have attached my full dmesg.
> 
> This isn't a USB problem.
> 
> > [17764.280135] usb 1-2: new high-speed USB device number 13 using ehci-pci
> > [17764.453094] usb-storage 1-2:1.0: USB Mass Storage device detected
> > [17764.461737] scsi14 : usb-storage 1-2:1.0
> > [17772.722083] scsi 14:0:0:0: Direct-Access Maxtor   3200
> >0341 PQ: 0 ANSI: 4
> > [17772.914430] sd 14:0:0:0: [sdb] 586114704 512-byte logical blocks:
> > (300 GB/279 GiB)
> > [17772.917232] sd 14:0:0:0: [sdb] Write Protect is off
> > [17772.917246] sd 14:0:0:0: [sdb] Mode Sense: 17 00 00 00
> > [17772.919206] sd 14:0:0:0: [sdb] No Caching mode page found
> > [17772.919222] sd 14:0:0:0: [sdb] Assuming drive cache: write through
> > [17772.925185] sd 14:0:0:0: [sdb] No Caching mode page found
> > [17772.925199] sd 14:0:0:0: [sdb] Assuming drive cache: write through
> > [17773.020601]  sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
> > [17773.027194] sd 14:0:0:0: [sdb] No Caching mode page found
> > [17773.027211] sd 14:0:0:0: [sdb] Assuming drive cache: write through
> > [17773.027219] sd 14:0:0:0: [sdb] Attached SCSI disk
> > [17828.434461] Buffer I/O error on device sdb1, logical block 32
> > [17828.434930] Buffer I/O error on device sdb1, logical block 786432
> > [17828.434962] Buffer I/O error on device sdb1, logical block 16384
> > [17828.434971] Buffer I/O error on device sdb1, logical block 512
> > [17828.434984] Buffer I/O error on device sdb1, logical block 48496383
> > [17828.435024] Buffer I/O error on device sdb1, logical block 3
> > [17828.435032] Buffer I/O error on device sdb1, logical block 7
> > [17828.436022] [ cut here ]
> > [17828.436053] WARNING: CPU: 0 PID: 24108 at fs/sysfs/group.c:216
> > sysfs_remove_group+0xa3/0xb0()
> > [17828.436064] sysfs group 81a74bc0 not found for kobject 
> > 'target14:0:0'
> > [17828.436069] CPU: 0 PID: 24108 Comm: systemd-udevd Not tainted
> > 3.14.0-rc2-00750-g946dd68-dirty #3
> > [17828.436141] Hardware name: Hewlett-Packard HP Pavilion dv6000
> > (RS551EA#ABH)  /30B7, BIOS F.40 08/01/2008
> > [17828.436145]  0009 88003170fb70 8165eb64
> > 88003170fba8
> > [17828.436278]  8102c0c6  81a74bc0
> > 880070bb7038
> > [17828.436298]  880033f4f168 880048aa4000 88003170fc08
> > 8102c167
> > [17828.436309] Call Trace:
> > [17828.436321]  [] dump_stack+0x19/0x1b
> > [17828.436330]  [] warn_slowpath_common+0x76/0xa0
> > [17828.436337]  [] warn_slowpath_fmt+0x47/0x50
> > [17828.436345]  [] ? kernfs_find_and_get_ns+0x48/0x60
> > [17828.436353]  [] sysfs_remove_group+0xa3/0xb0
> > [17828.436362]  [] dpm_sysfs_remove+0x3e/0x50
> > [17828.436370]  [] device_del+0x3d/0x1b0
> > [17828.436380]  [] scsi_target_reap_usercontext+0x2b/0x40
> > [17828.436444]  [] execute_in_process_context+0x60/0x70
> > [17828.436452]  [] scsi_target_reap+0x77/0xa0
> > [17828.436458]  []
> > scsi_device_dev_release_usercontext+0x160/0x190
> > [17828.436463]  [] execute_in_process_context+0x60/0x70
> > [17828.436467]  [] scsi_device_dev_release+0x17/0x20
> > [17828.436472]  [] device_release+0x38/0xb0
> > [17828.436478]  [] kobject_cleanup+0x43/0x80
> > [17828.436651]  [] kobject_put+0x28/0x60
> > [17828.436656]  [] put_device+0x12/0x20
> > [17828.436661]  [] scsi_device_put+0x10/0x20
> > [17828.43]  [] scsi_disk_put+0x34/0x50
> > [17828.436670]  [] sd_release+0x33/0x70
> > [17828.436675]  [] __blkdev_put+0x144/0x180
> > [17828.436679]  [] __blkdev_put+0xed/0x180
> > [17828.436683]  [] blkdev_put+0x97/0x110
> > [17828.436687]  [] blkdev_close+0x20/0x30
> > [17828.436693]  [] __fput+0xe2/0x200
> > [17828.436698]  [] fput+0x9/0x10
> > [17828.436703]  [] task_work_run+0x7f/0xc0
> > [17828.436708]  [] do_notify_resume+0x66/0x70
> > [17828.436829]  [] ? filp_close+0x51/0x80
> > [17828.436835]  [] int_signal+0x12/0x17
> > [17828.436839] ---[ end trace b057fe2fdcfc8458 ]---
> > [17828.441765] usb 1-2: USB disconnect, device number 13
> 
> You can tell by the way the stack trace doesn't mention USB at all.  In 
> fact, this is a known SCSI problem.  It has been fixed by these two 
> patches:
> 
>   http://marc.info/?l=linux-scsi&m=139031645920152&w=2
>   http://marc.info/?l=linux-scsi&m=139031650720168&w=2
> 
> If these patches have appeared in any git repositories yet, I don't 
> know where.
> 
> James, what happened to those two target-reap-infrastructure patches?  
> They don't seem to have reached linux-next yet, and I can't find them 
> in your SCSI repository on git.kernel.org.

Yes, I'm just about to build it.  I usually start at -rc1, but got a
little behind because of pressure of work.  I need them to go through 

Re: [balbi-usb:xceiv 3/3] drivers/usb/phy/rcar-phy.c:75:3: error: implicit declaration of function 'iowrite32'

2012-11-01 Thread James Bottomley
On Thu, 2012-11-01 at 06:04 -0700, Greg KH wrote:
> On Thu, Nov 01, 2012 at 12:50:34PM +0200, Felipe Balbi wrote:
> > On Thu, Nov 01, 2012 at 12:33:39PM +0200, Felipe Balbi wrote:
> > > Hi,
> > > 
> > > On Thu, Nov 01, 2012 at 06:29:30PM +0800, kbuild test robot wrote:
> > > > tree:   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git 
> > > > xceiv
> > > > head:   1789e52acc90c87484a109d6349eefe63cabb257
> > > > commit: 1789e52acc90c87484a109d6349eefe63cabb257 [3/3] usb: phy: add 
> > > > R-Car USB phy driver
> > > > config: make ARCH=cris allyesconfig
> > > > 
> > > > All error/warnings:
> > > > 
> > > > drivers/usb/phy/rcar-phy.c: In function 'rcar_usb_phy_init':
> > > > drivers/usb/phy/rcar-phy.c:75:3: error: implicit declaration of 
> > > > function 'iowrite32' [-Werror=implicit-function-declaration]
> > > > drivers/usb/phy/rcar-phy.c:83:4: error: implicit declaration of 
> > > > function 'ioread32' [-Werror=implicit-function-declaration]
> > > > cc1: some warnings being treated as errors
> > > 
> > > too bad that it compiles fine on x86 and ARM :-(
> > > 
> > > Kuninori, care to send me a fixup patch ? I guess using writel()/readl()
> > > should do it ?!?
> > 
> > looks like this isn't enough. We have 6 arches which don't provide
> > writel()/readl(), namely blackfin, c6x, openrisc, s390, score, and um.
> > 
> > Should those arches be fixed instead ? Don't we have a single
> > memory-mapped access set of APIs which are supposed to be provided by
> > all arches ?
> > 
> > Greg, you've been doing this for much more time then I have. Should all
> > arches provide writeb/w/l/q and readb/w/l/q ??

Certainly not the q variants.  It's debatable on the l ones.

The reason for the q problem is that if the arch can achieve atomic bus
transactions for 8 bytes, that's fine, but if the arch can't do that,
it's up to the driver to decide what it needs.  Most can do two 4 byte
writes, but a few need the atomicity.  We can't decide that in the arch
so we'd have to assume the most extreme case, so using a spinlock around
the two 4 byte writes to ensure atomicity, which would be a killer for
performance.

> I would think so, but I really don't know.
> 
> linux-arch people, what do we do about architectures that don't provide
> these functions?  Just disable building the drivers for them, or do we
> fix them somehow?

A good driver has #if's for this case ... see for example

drivers/scsi/ipr.h

(ipr doesn't need atomicity, so it just defines writeq to be a pair of
writels.  See drivers/scsi/mpt2sas/mpt2sas_base.c for a driver that
requires atomicity.)

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB enclosures seem to require read(16) with >2TB drives

2012-11-12 Thread James Bottomley
On Fri, 2012-11-09 at 16:33 +, Elliott, Robert (Server Storage)
wrote:
> I recommend broadening this patch.  T10 is discussing making READ
> (10), WRITE (10), etc. obsolete in SBC-4 in favor of their 16-byte CDB
> counterparts.  
> 
> The algorithm should be:
> 1. During discovery, determine if 16-byte CDBs are supported.  There
> are several ways to determine this:
> a) REPORT SUPPORTED OPERATION CODES command succeeds and reports that
> READ (16) et al are supported.
> b) READ (16) command specifying a Transfer Length of zero succeeds.
> c) READ CAPACITY (16) command succeeds and reports that the capacity
> is > 2 TiB.
> d) (future) INQUIRY command succeeds fetching the Block Device
> Characteristics VPD page and notices a new field added by the SBC-4
> simplified SCSI feature set proposal.

When you consider the problem that we must support all devices (which
means the older ones as well), including the annoying USB ones which
crash on unexpected commands, you can see that d) is about the only
viable option.

We can also force RC16 if the capacity is over 2^32 blocks, because the
command will be required, so that's probably the place to start.

James

> Since REPORT SUPPORTED OPERATION CODES is optional, it won't always
> work. READ CAPACITY (16) used to be optional for < 2 TiB drives, so it
> won't always work either. READ (16) will always work, but requires the
> drive to be spun up beforehand (e.g., with START STOP UNIT).
>  
> 2. if 16-byte CDBs are supported, then use them; only drop down to
> 10-byte CDBs if 16-byte CDBs are unavailable.  Don't make the decision
> by comparing the LBA on every IO.
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB enclosures seem to require read(16) with >2TB drives

2012-11-12 Thread James Bottomley
On Fri, 2012-11-09 at 11:08 -0500, Jason J. Herne wrote:
> diff --git a/drivers/usb/storage/scsiglue.c
> b/drivers/usb/storage/scsiglue.c
> index 13b8bcd..6ff785e 100644
> --- a/drivers/usb/storage/scsiglue.c
> +++ b/drivers/usb/storage/scsiglue.c
> @@ -251,6 +251,11 @@ static int slave_configure(struct scsi_device
> *sdev)
> US_FL_SCM_MULT_TARG)) &&
> us->protocol == USB_PR_BULK)
> us->use_last_sector_hacks = 1;
> +
> +   /* Force read-16 for large capacity drives. */
> +   sdev->force_read_16 = 1;
> +
> +

This turns READ_16 on unconditionally for all USB disks ... is that
safe?  As in have you tested this with older USB thumb drives?

Also do we have to worry about TYPE_RBC? ... this looks like a bit of a
global omission in usbstorage.c


> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index 5591ed5..e92b846 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -134,6 +134,7 @@ struct scsi_device {
>  * because we did a bus reset. */
> unsigned use_10_for_rw:1; /* first try 10-byte read / write */
> unsigned use_10_for_ms:1; /* first try 10-byte mode
> sense/select */
> +   unsigned force_read_16:1; /* Use read(16) over read(10) */

This should probably be use_16_for_rw:1 for consistency.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB enclosures seem to require read(16) with >2TB drives

2012-11-12 Thread James Bottomley
On Mon, 2012-11-12 at 15:31 +0100, Paolo Bonzini wrote:
> Il 12/11/2012 12:33, James Bottomley ha scritto:
> > On Fri, 2012-11-09 at 11:08 -0500, Jason J. Herne wrote:
> >> diff --git a/drivers/usb/storage/scsiglue.c
> >> b/drivers/usb/storage/scsiglue.c
> >> index 13b8bcd..6ff785e 100644
> >> --- a/drivers/usb/storage/scsiglue.c
> >> +++ b/drivers/usb/storage/scsiglue.c
> >> @@ -251,6 +251,11 @@ static int slave_configure(struct scsi_device
> >> *sdev)
> >> US_FL_SCM_MULT_TARG)) &&
> >> us->protocol == USB_PR_BULK)
> >> us->use_last_sector_hacks = 1;
> >> +
> >> +   /* Force read-16 for large capacity drives. */
> >> +   sdev->force_read_16 = 1;
> >> +
> >> +
> > 
> > This turns READ_16 on unconditionally for all USB disks ... is that
> > safe?  As in have you tested this with older USB thumb drives?
> 
> Actually it only turns it on for large capacity drives, as said in the
> comment.  sdp->force_read_16 only matters for >2TB drives: 

If you follow the discussion, we'll need to turn it on for some drives
regardless of size.

> >> +  /* Many large capacity USB drives/controllers require the use of 
> >> read(16). */
> >> +  force_read16 = (sdkp->capacity > 0xULL && sdp->force_read_16);
> >> +
> >>if (host_dif == SD_DIF_TYPE2_PROTECTION) {
> >>SCpnt->cmnd = mempool_alloc(sd_cdb_pool, GFP_ATOMIC);
> >>  
> >> @@ -833,7 +836,7 @@ static int sd_prep_fn(struct request_queue *q, struct 
> >> request *rq)
> >>SCpnt->cmnd[29] = (unsigned char) (this_count >> 16) & 0xff;
> >>SCpnt->cmnd[30] = (unsigned char) (this_count >> 8) & 0xff;
> >>SCpnt->cmnd[31] = (unsigned char) this_count & 0xff;
> >> -  } else if (block > 0x) {
> >> +  } else if (block > 0x || force_read16) {
> 
> so the better name would be never_use_10_for_rw_on_large_disks.  For
> some definitions of better. :)

Hm.

> Any reason not to do this always on >2TB drives, which basically means
> changing this:
> 
> - } else if (block > 0x) {
> + } else if (sdkp->capacity > 0x) {
> 
> and nothing else?

Because of the coming T10 mandate in SBC-4 deprecating everything other
than the 16 byte commands.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB enclosures seem to require read(16) with >2TB drives

2012-11-12 Thread James Bottomley
On Mon, 2012-11-12 at 10:01 -0500, Jason J. Herne wrote:
> > Any reason not to do this always on >2TB drives, which basically means
> > changing this:
> >
> > -   } else if (block > 0x) {
> > +   } else if (sdkp->capacity > 0x) {
> >
> > and nothing else?
> 
> This was the intent of my patch, except I wanted to *only* affect USB based
> drives as my drive was functioning perfecting when connected to an internal
> Sata port.  I was adopting the "Do not fix what isn't broke" mentality.

There's a subtlety here: block is in units of the disk sector size,
sdkp->capacity is in units of 512 bytes (the linux native sector size),
so it would need shifting before doing the determination.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [regression] uas no longer queues commands since v4.5.2

2016-05-23 Thread James Bottomley
On Mon, 2016-05-23 at 13:48 +0200, Hans de Goede wrote:
> Hi,
> 
> On 22-05-16 12:39, Tom Yan wrote:
> > With commit 198de51dbc3454d95b015ca0a055b673f85f01bb, uas no longer
> > set `queue_depth` with scsi_change_queue_depth(), so now
> > `queue_depth`
> > of UAS drives are 1. Even though `can_queue` is set to
> > `devinfo->qdepth - 2`, but apparently that does not help, since I
> > can
> > see performance impact.
> > 
> > Suppose limiting `can_queue` is the right way to fix this bug
> > (https://bugzilla.redhat.com/show_bug.cgi?id=1315013), do we really
> > need to eliminate scsi_change_queue_depth() at the same time
> > though?
> 
> No, I thought it was no longer needed, assuming the slave would
> inherit the host's value, but I was mistaken, the slave will default
> to a queue_depth of 1 if not specified.

Just to improve understanding: shost->can_queue is the host limit: it's
the maximum number of commands that can be outstanding on the host at
any one time, which is why it's used as the count for the shared tag
map.  It's also a fixed limit and can't be altered after the host is
instantiated.  sdev->queue_depth is the device queue limit, it can be
altered dynamically during device operation.  There's no relation
between them except that if you make queue_depth larger than can_queue,
the latter rules.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] uas: leave can_queue as MAX_CMNDS if device reports larger qdepth

2016-05-23 Thread James Bottomley
On Tue, 2016-05-24 at 01:23 +0800, Tom Yan wrote:
> Nothing wrong. It's just .can_queue = MAX_CMNDS in the host template
> is no longer neceesary, since with his patch, uas will set can_queue
> again later (to devinfo->qdepth - 2).
> 
> Originally I thought .can_queue = MAX_CMNDS can hence be removed; but
> after a second thought, I think it might probably be better if we
> leave it there and make use of it, in case certain device somehow
> inapproriately reports an enormous qdepth (i.e. larger than
> MAX_CMNDS). (According to the commit message of 55ff8cfbc4e1 ("USB:
> uas: Reduce can_queue to MAX_CMNDS"), "The uas driver can never queue
> more then MAX_CMNDS...")

OK, so try this as an exercise: Why would this not be the right thing
to do after the host is prepared:  It has to do with the streams
resources the driver has already created.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB: uas: Fix slave queue_depth not being set

2016-05-23 Thread James Bottomley
On Mon, 2016-05-23 at 13:49 +0200, Hans de Goede wrote:
> Commit 198de51dbc34 ("USB: uas: Limit qdepth at the scsi-host level")
> removed the scsi_change_queue_depth() call from uas_slave_configure()
> assuming that the slave would inherit the host's queue_depth, which
> that commit sets to the same value.
> 
> This is incorrect, without the scsi_change_queue_depth() call the 
> slave's queue_depth defaults to 1, introducing a performance
> regression.
> 
> This commit restores the call, fixing the performance regression.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 198de51dbc34 ("USB: uas: Limit qdepth at the scsi-host level")
> Reported-by: Tom Yan 
> Signed-off-by: Hans de Goede 
> ---
>  drivers/usb/storage/uas.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
> index 16bc679..ecc7d4b 100644
> --- a/drivers/usb/storage/uas.c
> +++ b/drivers/usb/storage/uas.c
> @@ -835,6 +835,7 @@ static int uas_slave_configure(struct scsi_device
> *sdev)
>   if (devinfo->flags & US_FL_BROKEN_FUA)
>   sdev->broken_fua = 1;
>  
> + scsi_change_queue_depth(sdev, devinfo->qdepth - 2);

Are you sure about this?  For spinning rust, experiments imply that the
optimal queue depth per device is somewhere between 2 and 4.  Obviously
that's not true for SSDs, so it depends on your use case.  Plus, for
ATA NCQ devices (which I believe most UAS is bridged to) you have a
maximum NCQ depth of 31.

There's a good reason why you don't want a queue deeper than you can
handle: it tends to interfere with writeback because you build up a lot
of pending I/O in the queue which can't be issued (it's very similar to
why bufferbloat is a problem in networks).  In theory, as long as your
devices return the correct indicator (QUEUE_FULL status), we'll handle
most of this in the mid-layer by plugging the block queue, but given
what I've seen from UAS devices, that's less than probable.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] uas: leave can_queue as MAX_CMNDS if device reports larger qdepth

2016-05-23 Thread James Bottomley
On Tue, 2016-05-24 at 02:33 +0800, Tom Yan wrote:
> I don't quite get what you mean. Are you saying that it is impossible
> that UAS devices would report inappropriately high qdepth, because of
> this?
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/
> drivers/usb/storage/uas.c?h=v4.6#n908

OK, you found the statement, now translate what it means to resource
allocation and it will show you why the if you proposed to add would
always be true.

James

> In that case should I send another patch that simply has `.can_queue
> =
> MAX_CMNDS` in the host template removed?
> 
> On 24 May 2016 at 01:27, James Bottomley
>  wrote:
> > On Tue, 2016-05-24 at 01:23 +0800, Tom Yan wrote:
> > > Nothing wrong. It's just .can_queue = MAX_CMNDS in the host
> > > template
> > > is no longer neceesary, since with his patch, uas will set
> > > can_queue
> > > again later (to devinfo->qdepth - 2).
> > > 
> > > Originally I thought .can_queue = MAX_CMNDS can hence be removed;
> > > but
> > > after a second thought, I think it might probably be better if we
> > > leave it there and make use of it, in case certain device somehow
> > > inapproriately reports an enormous qdepth (i.e. larger than
> > > MAX_CMNDS). (According to the commit message of 55ff8cfbc4e1
> > > ("USB:
> > > uas: Reduce can_queue to MAX_CMNDS"), "The uas driver can never
> > > queue
> > > more then MAX_CMNDS...")
> > 
> > OK, so try this as an exercise: Why would this not be the right
> > thing
> > to do after the host is prepared:  It has to do with the streams
> > resources the driver has already created.
> > 
> > James
> > 
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB: uas: Fix slave queue_depth not being set

2016-05-24 Thread James Bottomley
On Tue, 2016-05-24 at 08:53 +0200, Hans de Goede wrote:
> Hi,
> 
> On 23-05-16 19:36, James Bottomley wrote:
> > On Mon, 2016-05-23 at 13:49 +0200, Hans de Goede wrote:
> > > Commit 198de51dbc34 ("USB: uas: Limit qdepth at the scsi-host
> > > level")
> > > removed the scsi_change_queue_depth() call from 
> > > uas_slave_configure() assuming that the slave would inherit the 
> > > host's queue_depth, which that commit sets to the same value.
> > > 
> > > This is incorrect, without the scsi_change_queue_depth() call the
> > > slave's queue_depth defaults to 1, introducing a performance
> > > regression.
> > > 
> > > This commit restores the call, fixing the performance regression.
> > > 
> > > Cc: sta...@vger.kernel.org
> > > Fixes: 198de51dbc34 ("USB: uas: Limit qdepth at the scsi-host
> > > level")
> > > Reported-by: Tom Yan 
> > > Signed-off-by: Hans de Goede 
> > > ---
> > >  drivers/usb/storage/uas.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/usb/storage/uas.c
> > > b/drivers/usb/storage/uas.c
> > > index 16bc679..ecc7d4b 100644
> > > --- a/drivers/usb/storage/uas.c
> > > +++ b/drivers/usb/storage/uas.c
> > > @@ -835,6 +835,7 @@ static int uas_slave_configure(struct
> > > scsi_device
> > > *sdev)
> > >   if (devinfo->flags & US_FL_BROKEN_FUA)
> > >   sdev->broken_fua = 1;
> > > 
> > > + scsi_change_queue_depth(sdev, devinfo->qdepth - 2);
> > 
> > Are you sure about this?  For spinning rust, experiments imply that 
> > the optimal queue depth per device is somewhere between 2 and 4. 
> >  Obviously that's not true for SSDs, so it depends on your use 
> > case.  Plus, for ATA NCQ devices (which I believe most UAS is 
> > bridged to) you have a maximum NCQ depth of 31.
> 
> So this value is the same as host.can_queue, and is what uas has 
> always used, basically this says it is ok to queue as much as the 
> bridge can handle. We've seen a few rare multi-lun devices, but 
> typically almost all uas devices have one lun, what I really want to 
> do here is give a maximum and let say the sd driver lower that if it
> is sub-optimal.

If that's what you actually want, you should be setting sdev
->max_queue_depth and .track_queue_depth = 1 in the template.

You might also need to add calls to scsi_track_queue_full() but only if
the devices aren't responding QUEUE_FULL correctly.

James

> Also notice that uas is used a lot with ssd-s, that is mostly what
> I want to optimize for, but it is definitely also used with spinning
> rust.
> 
> And yes almost all uas devices are bridged sata devices (this may
> change in the near future though, with ssd-s specifically designed
> for usb-3 attachment, although sofar these all seem to use an
> embbeded sata bridge), so from this pov an upper limit of 31 makes 
> sense, I guess, but I've not seen any bridges which actually do more 
> then 32 streams anyways.
> 
> Still this is a bug-fix patch, essentially a partial revert, to 
> address performance regressions, so lets get this out as is and take 
> our time to come up with some tweaks (if necessary) for the say 4.8.
> 
> > There's a good reason why you don't want a queue deeper than you 
> > can handle: it tends to interfere with writeback because you build 
> > up a lot of pending I/O in the queue which can't be issued (it's 
> > very similar to why bufferbloat is a problem in networks).  In 
> > theory, as long as your devices return the correct indicator 
> > (QUEUE_FULL status), we'll handle most of this in the mid-layer by 
> > plugging the block queue, but given what I've seen from UAS
> > devices, that's less than probable.
> 
> So any smart ideas how to be nicer to spinning rust, without 
> negatively impacting ssd-s? As said if I've to choice I think we 
> should chose optimizing ssd-s, as that is where uas is used a lot 
> (although usb  attached harddisks are switching over to it too).
> 
> Note I just checked the 1TB sata/ahci harddisk in my workstation and 
> it is using a queue_depth of 31 too, so this really does seem like a 
> mid-layer problem to me.
> 
> Regards,
> 
> Hans
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Advanced Format SAT devices show incorrect physical block size

2017-01-10 Thread James Bottomley
On Tue, 2017-01-10 at 16:00 -0500, Alan Stern wrote:
> In theory, I suppose we could change the kernel so that it would 
> default to READ CAPACITY(16) for devices that report a SCSI level >= 
> 3, or something along those lines.  In general we hesitate to make
> changes of this sort, because they almost always end up breaking 
> _some_ devices -- and if that happens then the change is reverted, 
> with no exceptions.  Linus has a very strict rule about not breaking 
> working systems.

You shouldn't have to change anything: it already does (otherwise how
else would we detect physical exponent for proper SCSI devices) see
sd.c:sd_try_rc16_first().  It always returns false for USB because you
set sdev->try_rc_10_first

James

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Seagate External SMR drive USB resets (was: Re: [PATCH] uas: Add US_FL_NO_ATA_1X quirk for one more Seagate device)

2017-11-15 Thread James Bottomley
On Wed, 2017-11-15 at 17:02 -0500, Alan Stern wrote:
> On Wed, 15 Nov 2017, Jérôme Carretero wrote:
> > >   Because with several of these drives / lots of activity /
> occasional
> > >   issues, it looks like it will be hard to catch (yes I can use
> > > usbmon).
> > > 
> > > - It looks like there is no configurable timeout for USB MSC
> requests.
> > >   Perhaps the device is not responding in time and this is why
> it's
> > >   reset?
> 
> Timeouts are set by the SCSI layer.  I believe they are rather long
> (30 seconds, by default).  Presumably they are configurable, although
> I would have to do some digging to figure out how.

They're in /sys/class/scsi_device//device/timeout

jejb@bedivere:~> ls -l /sys/class/scsi_device/0\:0\:0\:0/device/timeout
-rw-r--r-- 1 root root 4096 Nov 15 14:37 
/sys/class/scsi_device/0:0:0:0/device/timeout
jejb@bedivere:~> cat /sys/class/scsi_device/0\:0\:0\:0/device/timeout
30

You can actually have a udev rule adjust them on a per device id basis.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] uas: Add a new NO_REPORT_LUNS quirk

2016-03-31 Thread James Bottomley
On Thu, 2016-03-31 at 14:22 +0200, Hans de Goede wrote:
> Add a new NO_REPORT_LUNS quirk and set it for Seagate drives with
> an usb-id of: 0bc2:331a, as these will fail to respond to a
> REPORT_LUNS command.

Actually, if we're sending them a report luns command, they must be
reporting in at SCSI-3 SPC or higher.  Should we be quirking them down
to SCSI-2 instead because it reduces the risk of running into something
else they're not doing from the SPC command set?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] uas: Add a new NO_REPORT_LUNS quirk

2016-03-31 Thread James Bottomley
On Thu, 2016-03-31 at 17:03 +0200, Hans de Goede wrote:
> Hi,
> 
> On 31-03-16 16:48, James Bottomley wrote:
> > On Thu, 2016-03-31 at 14:22 +0200, Hans de Goede wrote:
> > > Add a new NO_REPORT_LUNS quirk and set it for Seagate drives with
> > > an usb-id of: 0bc2:331a, as these will fail to respond to a
> > > REPORT_LUNS command.
> > 
> > Actually, if we're sending them a report luns command, they must be
> > reporting in at SCSI-3 SPC or higher.  Should we be quirking them 
> > down to SCSI-2 instead because it reduces the risk of running into
> > something else they're not doing from the SPC command set?
> 
> These are fairly new devices, so they should really be scsi3, but the
> usb <-> sata bridge (presumably) used does not seem to like
> report_luns.

That's what I'm questioning: REPORT LUNS is one of the big SCSI-3
changes, if they don't support that, it really looks like someone
picked up an old engine and just fuzzed the inquiry data to return SCSI
-3.  In which case we should put it back to SCSI-2 where it belongs.

Also, if it's USB<->SCSI bridge, that isn't really UAS, is it?

> Note that usb-storage simple sets no_report_luns conditionally for 
> all usb-storage devices. The scsi people have repeatedly asked me to 
> not do this kinda blanket blacklisting for uas devices, because they 
> hope that uas will allow them to more or less do proper scsi over 
> usb, so we end up with blacklisting specific commands every now and 
> then to get devices to work.

Well, we were hoping that with UAS the USB device creators would
actually learn what a standard was when it bit them, yes.  The fact
that Seagate can release a SCSI-3 UAS device that doesn't do REPORT
LUNS kind of dashes that hope.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [-next] BUG_ON in scsi_target_destroy()

2016-04-13 Thread James Bottomley
On Wed, 2016-04-13 at 10:41 +0200, Johannes Thumshirn wrote:
> Hi Sergey,  Xiong,
> 
> Can you try below patch?
> 
> On Montag, 11. April 2016 18:01:47 CEST Sergey Senozhatsky wrote:
> > Hello,
> > 
> > commit 7b106f2de6938c31ce5e9c86bc70ad3904666b96
> > Author: Johannes Thumshirn 
> > Date:   Tue Apr 5 11:50:44 2016 +0200
> > 
> > scsi: Add intermediate STARGET_REMOVE state to
> > scsi_target_state
> > 
> > 
> > BUG_ON()s (next-20160411) each time I remove a usb flash
> > 
> > [   49.561600]  [] scsi_target_destroy+0x5a/0xcb
> > [scsi_mod]
> > [   49.561607]  [] scsi_target_reap+0x4a/0x4f
> > [scsi_mod]
> > [   49.561613]  [] __scsi_remove_device+0xc3/0xd0
> > [scsi_mod]
> > [   49.561619]  [] scsi_forget_host+0x52/0x63
> > [scsi_mod]
> > [   49.561623]  [] scsi_remove_host+0x8c/0x102
> > [scsi_mod]
> > [   49.561627]  [] usb_stor_disconnect+0x6b/0xab
> > [usb_storage]
> > [   49.561634]  []
> > usb_unbind_interface+0x77/0x1ca [usbcore]
> > [   49.561636]  []
> > __device_release_driver+0x9d/0x121
> > [   49.561638]  []
> > device_release_driver+0x23/0x30
> > [   49.561639]  [] bus_remove_device+0xfb/0x10e
> > [   49.561641]  [] device_del+0x164/0x1e6
> > [   49.561648]  [] ?
> > remove_intf_ep_devs+0x3b/0x48 [usbcore]
> > [   49.561655]  [] usb_disable_device+0x84/0x1a5
> > [usbcore]
> > [   49.561661]  [] usb_disconnect+0x94/0x19f
> > [usbcore]
> > [   49.561667]  [] hub_event+0x5c1/0xdea
> > [usbcore]
> > [   49.561670]  [] process_one_work+0x1dc/0x37f
> > [   49.561672]  [] worker_thread+0x282/0x36d
> > [   49.561673]  [] ? rescuer_thread+0x2ae/0x2ae
> > [   49.561675]  [] kthread+0xd2/0xda
> > [   49.561678]  [] ret_from_fork+0x22/0x40
> > [   49.561679]  [] ?
> > kthread_worker_fn+0x13e/0x13e
> > 
> > -ss
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux
> > -scsi" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 0734927..0c00928 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1276,6 +1276,7 @@ int scsi_sysfs_add_sdev(struct scsi_device
> *sdev)
>  void __scsi_remove_device(struct scsi_device *sdev)
>  {
>   struct device *dev = &sdev->sdev_gendev;
> + struct scsi_target *starget;
>  
>   /*
>* This cleanup path is not reentrant and while it is
> impossible
> @@ -1315,7 +1316,9 @@ void __scsi_remove_device(struct scsi_device
> *sdev)
>* remoed sysfs visibility from the device, so make the
> target
>* invisible if this was the last device underneath it.
>*/
> - scsi_target_reap(scsi_target(sdev));
> + starget = scsi_target(sdev);
> + starget->state = STARGET_REMOVE;
> + scsi_target_reap(starget);

How about good grief no!  A device with multiple targets will get it's
lists screwed with this

The STARGET_REMOVE state you added only applies to the case we're
trying to kill a target.  In the natural operation case, which is what
everyone else is running into, we will try to remove a running target
when it has no more scsi devices left on it.  So the correct patch
should be to make the BUG_ON see this:

James

---

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 27df7e7..e0a78f5 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -319,8 +319,7 @@ static void scsi_target_destroy(struct scsi_target *starget)
struct Scsi_Host *shost = dev_to_shost(dev->parent);
unsigned long flags;
 
-   BUG_ON(starget->state != STARGET_REMOVE &&
-  starget->state != STARGET_CREATED);
+   BUG_ON(starget->state == STARGET_DEL);
starget->state = STARGET_DEL;
transport_destroy_device(dev);
spin_lock_irqsave(shost->host_lock, flags);

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid 5 creation fails on usb3 connected drives kernel 4.4.x, 4.5

2016-04-16 Thread James Bottomley
On Sat, 2016-04-16 at 09:08 -0700, Greg KH wrote:
> On Sat, Apr 09, 2016 at 08:18:26PM +1000, Brian Chadwick wrote:
> > On 09/04/16 00:24, Greg KH wrote:
> > > On Fri, Apr 08, 2016 at 07:37:12PM +1000, Brian Chadwick wrote:
> > > > On 08/04/16 01:59, Greg KH wrote:
> > > > > On Fri, Apr 08, 2016 at 01:34:51AM +1000, Brian Chadwick
> > > > > wrote:
> > > > > > On 08/04/16 00:44, Greg KH wrote:
> > > > > > > On Thu, Apr 07, 2016 at 03:04:55PM +1000, Brian Chadwick
> > > > > > > wrote:
> > > > > > > > On 06/04/16 19:53, Greg KH wrote:
> > > > > > > > > On Tue, Apr 05, 2016 at 06:42:51PM +1000, Brian
> > > > > > > > > Chadwick wrote:
> > > > > > > > > > SETUP:
> > > > > > > > > > 
> > > > > > > > > > i7 16GB Computer.
> > > > > > > > > > 
> > > > > > > > > > 1 x PCI-x USB3 adaptor card (i think uses xhci
> > > > > > > > > > -hcd)04:00.0 USB controller:
> > > > > > > > > > Renesas Technology Corp. uPD720202 USB 3.0 Host
> > > > > > > > > > Controller (rev 02)
> > > > > > > > > > Kernel driver in use: xhci_hcd
> > > > > > > > > > 
> > > > > > > > > > 2 x USB3 to dual SATA HDD docks. uses JMicron
> > > > > > > > > > JMS56x Series controllers,
> > > > > > > > > > uses uas.ko kernel module
> > > > > > > > > > Bus 004 Device 002: ID 152d:9561 JMicron Technology
> > > > > > > > > > Corp. / JMicron USA
> > > > > > > > > > Technology Corp.
> > > > > > > > > > Bus 004 Device 003: ID 152d:9561 JMicron Technology
> > > > > > > > > > Corp. / JMicron USA
> > > > > > > > > > Technology Corp.
> > > > > > > > > > 
> > > > > > > > > > 3 x Western Digital 320GB 3.5" SATA drives
> > > > > > > > > > 
> > > > > > > > > > USE:
> > > > > > > > > > 
> > > > > > > > > > RAID 5
> > > > > > > > > > 
> > > > > > > > > > KERNEL:
> > > > > > > > > > 4.3.5
> > > > > > > > > > 
> > > > > > > > > > System works perfectly as expected.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > Change to kernel 4.4.x or 4.5 and RAID 5 doesnt
> > > > > > > > > > work. I am not sure whether
> > > > > > > > > > its actually RAID 5 at fault or if its the USB
> > > > > > > > > > Attached SCSI driver uas.ko,
> > > > > > > > > > or maybe the host driver xhci-hcd.
> > > > > > > > > Can you run 'git bisect' to try to track down the
> > > > > > > > > offending commit?
> 
> 
> 
> > Hi, well to my surprise git bisect has come up with a commit about
> > which I
> > had no inkling. I may have to go to another mailing list it doesnt
> > look like
> > a usb problem.
> > 
> > This is the final output of 'git bisect view' :
> > 
> > 
> > commit 64d513ac31bd02a3c9b69ef0f36c196f9a9d
> > Author: Christoph Hellwig 
> > Date:   Thu Oct 8 09:28:04 2015 +0100
> > 
> > scsi: use host wide tags by default
> > 
> > This patch changes the !blk-mq path to the same defaults as the
> > blk-mq
> > I/O path by always enabling block tagging, and always using
> > host wide
> > tags.  We've had blk-mq available for a few releases so bugs
> > with
> > this mode should have been ironed out, and this ensures we get
> > better
> > coverage of over tagging setup over different configs.
> > 
> > Signed-off-by: Christoph Hellwig 
> > Acked-by: Jens Axboe 
> > Reviewed-by: Hannes Reinecke 
> > Signed-off-by: James Bottomley 
> > 
> > 
> > 
> > Thank you for your help in narrowing this down, and in educating me
> > about
> > git bisect (its a neat tool) ... I await your advice as how to
> > proceed from
> > here.
> 
> Sorry for the delay, nice work tracking down the problem.
> 
> Christoph, any ideas?  This patch is breaking this user's system as
> described above.  Is there a more recent fix for this, or did
> something
> forget to get changed in the large patch you did here?

If this is UAS connected devices, then this is likely the fix:

http://marc.info/?l=linux-scsi&m=146045685829613

James




--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/29] drivers, md: convert mddev.active from atomic_t to refcount_t

2017-03-14 Thread James Bottomley
On Tue, 2017-03-14 at 12:29 +, Reshetova, Elena wrote:
> > Elena Reshetova  writes:
> > 
> > > refcount_t type and corresponding API should be
> > > used instead of atomic_t when the variable is used as
> > > a reference counter. This allows to avoid accidental
> > > refcounter overflows that might lead to use-after-free
> > > situations.
> > > 
> > > Signed-off-by: Elena Reshetova 
> > > Signed-off-by: Hans Liljestrand 
> > > Signed-off-by: Kees Cook 
> > > Signed-off-by: David Windsor 
> > > ---
> > >  drivers/md/md.c | 6 +++---
> > >  drivers/md/md.h | 3 ++-
> > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > 
> > When booting linux-next (specifically 5be4921c9958ec) I'm seeing
> > the
> > backtrace below. I suspect this patch is just exposing an existing
> > issue?
> 
> Yes, we have actually been following this issue in the another
> thread. 
> It looks like the object is re-used somehow, but I can't quite
> understand how just by reading the code. 
> This was what I put into the previous thread:
> 
> "The log below indicates that you are using your refcounter in a bit
> weird way in mddev_find(). 
> However, I can't find the place (just by reading the code) where you
> would increment refcounter from zero (vs. setting it to one).
> It looks like you either iterate over existing nodes (and increment
> their counters, which should be >= 1 at the time of increment) or
> create a new node, but then mddev_init() sets the counter to 1. "
> 
> If you can help to understand what is going on with the object
> creation/destruction, would be appreciated!
> 
> Also Shaohua Li stopped this patch coming from his tree since the 
> issue was caught at that time, so we are not going to merge this 
> until we figure it out. 

Asking on the correct list (dm-devel) would have got you the easy
answer:  The refcount behind mddev->active is a genuine atomic.  It has
refcount properties but only if the array fails to initialise (in that
case, final put kills it).  Once it's added to the system as a gendisk,
it cannot be freed until md_free().  Thus its ->active count can go to
zero (when it becomes inactive; usually because of an unmount). On a
simple allocation regardless of outcome, the last executed statement in
md_alloc is mddev_put(): that destroys the device if we didn't manage
to create it or returns 0 and adds an inactive device to the system
which the user can get with mddev_find().

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Large disk drives

2014-11-03 Thread James Bottomley
On Mon, 2014-11-03 at 16:06 -0500, Dale R. Worley wrote:
> Was there any resolution as to how large disk drives would be handled
> if their interface did not support the "capacity" request that would
> tell how large they were?

Realistically no ... unless someone comes up with a reliable heuristic
to give us the size.

> Or as an alternative, is there any way to avoid buying USB-SCSI
> interfaces that do not support the large-capacity request?
> Unfortunately, such devices work OK with Windows (since Windows trusts
> what the partition table says), you can't just say to the salesperson
> "It has to work on drives over 3 TB."

This is a stopgap: your 3TB drive can be guessed as the 16 bit capacity
plus 2TB, but the same won't happen for a 5TB device.  Believing the
partition table gives us a chicken and egg problem because something
still has to get the partition table on to the device.

I don't think "don't buy something that doesn't work" is a hugely
unreasonable response to this.

James

> Dale
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Large disk drives

2014-11-05 Thread James Bottomley
On Tue, 2014-11-04 at 11:14 -0500, Alan Stern wrote:
> On Tue, 4 Nov 2014, James Bottomley wrote:
> 
> > On Mon, 2014-11-03 at 16:06 -0500, Dale R. Worley wrote:
> > > Was there any resolution as to how large disk drives would be handled
> > > if their interface did not support the "capacity" request that would
> > > tell how large they were?
> > 
> > Realistically no ... unless someone comes up with a reliable heuristic
> > to give us the size.
> 
> I posted a patch to allow the user to override the reported capacity:
> 
>   http://marc.info/?l=linux-scsi&m=140993840113445&w=2
> 
> Nobody responded to it.

Sorry, meant to.  In principle I'm OK with this as the lever for the
hack (largely because it means we don't need to do anything) but will
the distributions support it?

> > > Unfortunately, such devices work OK with Windows (since Windows trusts
> > > what the partition table says), you can't just say to the salesperson
> > > "It has to work on drives over 3 TB."
> > 
> > This is a stopgap: your 3TB drive can be guessed as the 16 bit capacity
> > plus 2TB, but the same won't happen for a 5TB device.  Believing the
> > partition table gives us a chicken and egg problem because something
> > still has to get the partition table on to the device.
> > 
> > I don't think "don't buy something that doesn't work" is a hugely
> > unreasonable response to this.
> 
> The problem is knowing beforehand whether it will work.  Once you buy 
> the device and can test it, returning it is annoying and time-consuming 
> at best.

OK, but I still don't understand how windows gets the partition table on
there in the first place ... that must surely be some sort of guess the
disk size hack.

James


> Alan Stern
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Large disk drives

2014-11-06 Thread James Bottomley
On Wed, 2014-11-05 at 11:30 -0800, Christoph Hellwig wrote:
> On Wed, Nov 05, 2014 at 11:34:11AM -0500, Alan Stern wrote:
> > > Sorry, meant to.  In principle I'm OK with this as the lever for the
> > > hack (largely because it means we don't need to do anything) but will
> > > the distributions support it?
> > 
> > While I can't speak for the distributions, it's reasonable to assume 
> > that if something becomes part of the upstream kernel then they will 
> > include it.
> 
> Btw, we do have precedence for looking at partition tables from SCSI
> code with scsi_partsize(), so the layering violation of looking at
> EFI labels for disks sizes wouldn't be something entirely new even
> if we did it in kernel space.

We really don't want to make the decision within the kernel of whether
we believe the partition size or the disk capacity.   For these disk
problems we need it to be the former, but if we choose that always,
we'll get weird results on mispartitioned devices.

The usual rule is no policy in the kernel and which to choose is policy,
so just export the knob (as Alan's patch does) and then let userspace
decide.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Large disk drives

2014-11-06 Thread James Bottomley
On Thu, 2014-11-06 at 16:33 +0200, Boaz Harrosh wrote:
> On 11/06/2014 12:30 PM, James Bottomley wrote:
> > On Wed, 2014-11-05 at 11:30 -0800, Christoph Hellwig wrote:
> >> On Wed, Nov 05, 2014 at 11:34:11AM -0500, Alan Stern wrote:
> >>>> Sorry, meant to.  In principle I'm OK with this as the lever for the
> >>>> hack (largely because it means we don't need to do anything) but will
> >>>> the distributions support it?
> >>>
> >>> While I can't speak for the distributions, it's reasonable to assume 
> >>> that if something becomes part of the upstream kernel then they will 
> >>> include it.
> >>
> >> Btw, we do have precedence for looking at partition tables from SCSI
> >> code with scsi_partsize(), so the layering violation of looking at
> >> EFI labels for disks sizes wouldn't be something entirely new even
> >> if we did it in kernel space.
> > 
> > We really don't want to make the decision within the kernel of whether
> > we believe the partition size or the disk capacity.   For these disk
> > problems we need it to be the former, but if we choose that always,
> > we'll get weird results on mispartitioned devices.
> > 
> > The usual rule is no policy in the kernel and which to choose is policy,
> > so just export the knob (as Alan's patch does) and then let userspace
> > decide.
> > 
> 
> I do not think it is a matter of policy.
> 
> From what I understand the situation is that read_capacity_10() which is
> 32bit, Max 2T byte disks. fails with 0x and read_capacity_16() (64bit)
> is not supported because of wrong scsi-version or because it is blacklisted.
> (or device returns not-supported)

Actually, no, RC(10) returns the lowest 32 bits of the actual result.
Which is out of spec, but hey, this is a USB bridge ...

> Now If sd_read_capacity() succeed then off-course we choose it.

Um, the problem is we can't tell ... sd_read_capacity() returns a
number, it just may be wrong by a factor of 2TB.

>  What I'm saying
> is if read-capacity fails, then should we attempt a new read_capacity_part()?

We can't tell we have a failure.  We can tell if the partition capacity
and the read capacity differ that one of them must be wrong ...

> And sure a user-mode knob if he wants to make policy, like you said, is fine
> by me.
> 
> But just the simple case of read-capacity failure should we then?

We don't have a failure.  This is the problem.  Determining that a
problem exists 

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Large disk drives

2014-11-06 Thread James Bottomley
On Thu, 2014-11-06 at 17:08 +, David Laight wrote:
> From: Boaz Harrosh
> > On 11/06/2014 05:53 PM, Alan Stern wrote:
> > >> But just the simple case of read-capacity failure should we then?
> > >
> > > That's a separate question.  As far as I know, the case you are
> > > describing has not come up.
> > >
> > 
> > BTW: what we should do is when the partition parser at the block layer
> > see that the partition capacity as written in the partition-table is
> > bigger then the capacity reported for the device we can put a fat
> > message at dmesg with both sizes and user can decide.
> 
> One possibility is to try to read the last sector of the actual
> partition. If it succeeds there are 2 possibilities:

These are USB devices (bridges) not well behaved SCSI devices.  The last
time we tried to poke USB devices beyond their defined capacity (the USB
capacity off by one problem) the firmware on some crashed and we
eventually gave up.  That's why, unless we can find simple, functional
heuristics for the kernel, it's safer to punt to userspace.

James

> 1) The disk is as bit as the partition table indicates.
> 2) The high bit(s) of the sector number have been masked.
> (or the disk locks up)
> 
> In many cases the latter can be verified by reading the other sector.
> But if the media is new/erased then all the sectors will be 0xff
> and you'd have to do a write to ensure there was no sector aliasing.
> 
>   David
> 
> NrybXǧv^)޺{.n+{"{ayʇڙ,jfhzwj:+vwjmzZ+ݢj"!


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch] usb: storage: debug: uninitialized var in usb_stor_show_sense()

2014-11-29 Thread James Bottomley
On Sat, 2014-11-29 at 15:48 +0300, Dan Carpenter wrote:
> The "fmt" variable might be used uninitialized, it should be set to NULL
> at the start.
> 
> Fixes: d811b848ebb7 ('scsi: use sdev as argument for sense code printing')
> Signed-off-by: Dan Carpenter 

Hm, that' falls into the category of an interface that's hard to get
right if every caller has to remember to set fmt to NULL.

Could you instead set *fmt to NULL in scsi_extd_sense_format()?  That
way the interface is much easier.

Thanks,

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] target refcounting infrastructure fixes for usb

2014-03-07 Thread James Bottomley
On Thu, 2014-03-06 at 15:18 -0800, Sarah Sharp wrote:
> Hi James,
> 
> Sorry for the extremely late ack.

That's OK ... my original plan was to send them in the last merge window
even without your ack, but that got a bit derailed, so I'll do it in
this one with.

>   These patches work fine when applied
> against 3.14-rc5, and I can't trigger the oops anymore.  Hans' UAS
> patches are going to be queued for 3.15.  Any chance of these patches
> making it into 3.15 as well?

Yes, the plan is to queue for 3.15 and request a delayed backport to
stable in case any of the extensive logic changes triggers another bug
in the field.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Deadlock in usb-storage error handling

2014-03-19 Thread James Bottomley
On Wed, 2014-03-19 at 16:31 -0400, Alan Stern wrote:
> On Wed, 19 Mar 2014, Andreas Reis wrote:
> 
> > I've uploaded a dmesg with the new debugging patch to bugzilla:
> > https://bugzilla.kernel.org/attachment.cgi?id=130041
> 
> Thanks.  I have now managed to reproduce many of the features of this
> problem on my own computer.
> 
> James, I will need your help (or help from somebody who understands the 
> SCSI error handler) to figure out how this problem should be fixed.
> 
> Basically, usb-storage deadlocks when the SCSI error handler invokes
> the eh_device_reset_handler callback while a command is running.  The
> command has timed out and will never complete normally, because the
> device's firmware has crashed.  But usb-storage's device-reset routine
> waits for the current command to finish, which brings everything to a
> standstill.
> 
> Is this design wrong?  That is, should the device-reset routine wait 
> for currently executing commands to finish, or should it abort them, or 
> what?

In some sense, yes, but not necessarily from the Point of View of USB.
What we assume in SCSI is that commands are forgettable, meaning there's
always some operation we can perform (be it abort or reset) that causes
the device to forget about all outstanding commands and reset its
internal state machine to a known good state.

The cardinal SCSI assumption is that after we've successfully done an
abort or reset on a command that it will never come back to us from the
device.

> Or should the SCSI error handler abort the running command before 
> invoking the eh_device_reset_handler callback?

So this is rooted in the "Abort can be a Problem" issue:  Abort
sometimes works well (and it's not very disruptive) but sometimes if the
device is having a problem in its command state machine, adding another
command (which is what the abort is) doesn't actually do anything, so we
need error escalation to reset.  We can't wait for the abort or other
commands to complete because they never will.  The reset is expected to
clear everything from the device (including the pending aborts).

> For the record, and in case anyone is curious, here's the detailed
> sequence of events during my test:
> 
>   sd issues a READ(10) command.  For whatever reason, the device
>   goes nuts and the command times out.
> 
>   scsi_times_out() calls scsi_abort_command(), which queues an
>   abort request.
> 
>   scmd_eh_abort_handler() calls scsi_try_to_abort_cmd(), which
>   succeeds in aborting the READ.
> 
>   The READ command is retried (I didn't trace through the details
>   of this).  The retry fails with a Unit Attention (SK=6, 
>   ASC=0x29, Reset or Bus Device Reset Occurred).
> 
>   The READ command is retried a second time, and it times out 
>   again.
> 
>   This time around, scsi_times_out() calls scsi_abort_command()
>   unsuccessfully (because the SCSI_EH_ABORT_SCHEDULED flag is
>   still set).

>From the first time we sent the abort?  That sounds like a problem in
our state tracking.

>   As a result, scsi_error_handler() calls scsi_unjam_host(), 
>   which calls scsi_eh_get_sense().
> 
>   That routine calls scsi_request_sense(), which goes into
>   scsi_send_eh_cmnd().

I thought USB was autosense, so when it reports check condition, we
should already have sense ... or are we calling request_sense without
being sent a check condition status?

>   The calls to shost->hostt->queuecommand() all fail, because the
>   READ command is still running and usb-storage has a queue
>   depth of 1.  The error messages produced by these failures are
>   disconcerting but not dangerous.
> 
>   Since the REQUEST SENSE command was never issued, 
>   scsi_eh_get_sense() returns 0.
> 
>   scsi_unjam_host() goes on to call scsi_eh_abort_cmds(), which
>   does essentially nothing because the SCSI_EH_CANCEL_CMD flag
>   for the only command on work_q is clear.  
>   scsi_eh_test_devices() returns 0 because check_list is empty
>   and work_q isn't.
> 
>   scsi_unjam_host() then calls scsi_eh_ready_devs().  This
>   routine ends up calling scsi_eh_bus_device_reset(), at which 
>   point usb-storage deadlocks as described above.

OK, so in the case where the command can never complete (because the fw
has crashed), what should be the process for resetting the device so it
can again function?

James

> (On Andreas's system, the first READ retry times out as opposed to the
> second retry as on my computer.  I doubt this makes any difference.)
> 
> I can't tell if this is all working as intended or if it went off the 
> tracks somewhere.
> 
> Thanks for any guidance.
> 
> Alan Stern
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "

Re: Deadlock in usb-storage error handling

2014-03-20 Thread James Bottomley
On Thu, 2014-03-20 at 11:36 -0400, Alan Stern wrote:
> On Wed, 19 Mar 2014, James Bottomley wrote:
> 
> > > Basically, usb-storage deadlocks when the SCSI error handler invokes
> > > the eh_device_reset_handler callback while a command is running.  The
> > > command has timed out and will never complete normally, because the
> > > device's firmware has crashed.  But usb-storage's device-reset routine
> > > waits for the current command to finish, which brings everything to a
> > > standstill.
> > > 
> > > Is this design wrong?  That is, should the device-reset routine wait 
> > > for currently executing commands to finish, or should it abort them, or 
> > > what?
> > 
> > In some sense, yes, but not necessarily from the Point of View of USB.
> > What we assume in SCSI is that commands are forgettable, meaning there's
> > always some operation we can perform (be it abort or reset) that causes
> > the device to forget about all outstanding commands and reset its
> > internal state machine to a known good state.
> > 
> > The cardinal SCSI assumption is that after we've successfully done an
> > abort or reset on a command that it will never come back to us from the
> > device.
> 
> The same assumption is present in usb-storage.
> 
> The difference appears to be that SCSI assumes a reset can be issued at 
> any time, even while a command is running.  In usb-storage, that's true 
> for a _bus_ reset but it's not true for a _device_ reset.
> 
> Perhaps this restriction on device resets could be lifted, but in real
> life it probably won't make much difference.  The fact is, almost no
> USB mass-storage devices implement device reset correctly, whereas most
> of them do implement bus reset.  (Possibly related factoid: MS Windows
> uses bus resets but doesn't use device resets in its USB mass-storage
> driver.)

Actually, you don't have to lift the restriction.  Just fail the device
reset if you can't issue it (so if there's any outstanding commands).
SCSI will move on to a bus reset which you can accept.

> > > Or should the SCSI error handler abort the running command before 
> > > invoking the eh_device_reset_handler callback?
> > 
> > So this is rooted in the "Abort can be a Problem" issue:  Abort
> > sometimes works well (and it's not very disruptive) but sometimes if the
> > device is having a problem in its command state machine, adding another
> > command (which is what the abort is) doesn't actually do anything, so we
> > need error escalation to reset.  We can't wait for the abort or other
> > commands to complete because they never will.  The reset is expected to
> > clear everything from the device (including the pending aborts).
> 
> With usb-storage, aborts usually work pretty well.  But sometimes they 
> don't cure the underlying cause, so when the offending command is 
> retried, it hangs up again.  Something like that seems to be happening 
> here.
> 
> > > For the record, and in case anyone is curious, here's the detailed
> > > sequence of events during my test:
> > > 
> > >   sd issues a READ(10) command.  For whatever reason, the device
> > >   goes nuts and the command times out.
> > > 
> > >   scsi_times_out() calls scsi_abort_command(), which queues an
> > >   abort request.
> > > 
> > >   scmd_eh_abort_handler() calls scsi_try_to_abort_cmd(), which
> > >   succeeds in aborting the READ.
> > > 
> > >   The READ command is retried (I didn't trace through the details
> > >   of this).  The retry fails with a Unit Attention (SK=6, 
> > >   ASC=0x29, Reset or Bus Device Reset Occurred).
> > > 
> > >   The READ command is retried a second time, and it times out 
> > >   again.
> > > 
> > >   This time around, scsi_times_out() calls scsi_abort_command()
> > >   unsuccessfully (because the SCSI_EH_ABORT_SCHEDULED flag is
> > >   still set).
> > 
> > From the first time we sent the abort?
> 
> Yes.  As far as I can see, the only place where SCSI_EH_ABORT_SCHEDULED
> gets cleared is in scsi_eh_finish_cmd(), which never got called.  
> After the successful abort, scmd_eh_abort_handler() went directly into
> its "if (rtn == SUCCESS)" and "scmd %p retry aborted command" /
> scsi_queue_insert() code paths.

OK, that's a bug ... I'll see if I can find it.

> >  That sounds like a problem in
> > our state tracking.
> 
> Could well be.  How should I track it down fu

Re: Deadlock in usb-storage error handling

2014-03-20 Thread James Bottomley
On Thu, 2014-03-20 at 12:34 -0400, Alan Stern wrote:
> On Thu, 20 Mar 2014, James Bottomley wrote:
> 
> > OK, so I think we have three things to do
> > 
> >  1. Investigate SCSI and fix it's abort state problem that's causing
> > it not to send the abort second time around
> >  2. Fix usb-storage to fail a reset it can't do (i.e. device reset
> > with outstanding commands)
> >  3. Find out why we're sending a spurious request sense.
> > 
> > I can look at 1 and 3 if you want to take 2.
> 
> It's a deal!  Thanks for your help.

OK, I think this is the fix for 1, if you could try it out.

Thanks,

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 771c16b..c52bfb2 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -145,14 +145,14 @@ scmd_eh_abort_handler(struct work_struct *work)
"scmd %p retry "
"aborted command\n", scmd));
scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
-   return;
+   goto out;
} else {
SCSI_LOG_ERROR_RECOVERY(3,
scmd_printk(KERN_WARNING, scmd,
"scmd %p finish "
"aborted command\n", scmd));
scsi_finish_command(scmd);
-   return;
+   goto out;
}
} else {
SCSI_LOG_ERROR_RECOVERY(3,
@@ -162,6 +162,8 @@ scmd_eh_abort_handler(struct work_struct *work)
}
}
 
+   scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
+
if (!scsi_eh_scmd_add(scmd, 0)) {
SCSI_LOG_ERROR_RECOVERY(3,
scmd_printk(KERN_WARNING, scmd,
@@ -170,6 +172,10 @@ scmd_eh_abort_handler(struct work_struct *work)
scmd->result |= DID_TIME_OUT << 16;
scsi_finish_command(scmd);
}
+   return;
+ out:
+   scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
+   return;
 }
 
 /**



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Deadlock in usb-storage error handling

2014-03-20 Thread James Bottomley
On Thu, 2014-03-20 at 12:34 -0400, Alan Stern wrote:
> On Thu, 20 Mar 2014, James Bottomley wrote:
> 
> > OK, so I think we have three things to do
> > 
> >  1. Investigate SCSI and fix it's abort state problem that's causing
> > it not to send the abort second time around
> >  2. Fix usb-storage to fail a reset it can't do (i.e. device reset
> > with outstanding commands)
> >  3. Find out why we're sending a spurious request sense.
> > 
> > I can look at 1 and 3 if you want to take 2.
> 
> It's a deal!  Thanks for your help.

And this looks to be 3: a bug in the way we attach sense data to
commands (we shouldn't look for attached sense if the device error code
didn't imply there would be any).

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 771c16b..d020149 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1157,6 +1157,15 @@ int scsi_eh_get_sense(struct list_head *work_q,
 __func__));
break;
}
+   if (status_byte(scmd->result) != CHECK_CONDITION)
+   /*
+* don't request sense if there's no check condition
+* status because the error we're processing isn't one
+* that has a sense code (and some devices get
+* confused by sense requests out of the blue)
+*/
+   continue;
+
SCSI_LOG_ERROR_RECOVERY(2, scmd_printk(KERN_INFO, scmd,
  "%s: requesting sense\n",
  current->comm));



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Deadlock in usb-storage error handling

2014-03-20 Thread James Bottomley
On Thu, 2014-03-20 at 15:48 -0400, Alan Stern wrote:
> On Thu, 20 Mar 2014, James Bottomley wrote:
> 
> > On Thu, 2014-03-20 at 12:34 -0400, Alan Stern wrote:
> > > On Thu, 20 Mar 2014, James Bottomley wrote:
> > > 
> > > > OK, so I think we have three things to do
> > > > 
> > > >  1. Investigate SCSI and fix it's abort state problem that's causing
> > > > it not to send the abort second time around
> > > >  2. Fix usb-storage to fail a reset it can't do (i.e. device reset
> > > > with outstanding commands)
> > > >  3. Find out why we're sending a spurious request sense.
> > > > 
> > > > I can look at 1 and 3 if you want to take 2.
> > > 
> > > It's a deal!  Thanks for your help.
> > 
> > And this looks to be 3: a bug in the way we attach sense data to
> > commands (we shouldn't look for attached sense if the device error code
> > didn't imply there would be any).
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 771c16b..d020149 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -1157,6 +1157,15 @@ int scsi_eh_get_sense(struct list_head *work_q,
> >  __func__));
> > break;
> > }
> > +   if (status_byte(scmd->result) != CHECK_CONDITION)
> > +   /*
> > +* don't request sense if there's no check condition
> > +* status because the error we're processing isn't one
> > +* that has a sense code (and some devices get
> > +* confused by sense requests out of the blue)
> > +*/
> > +   continue;
> > +
> > SCSI_LOG_ERROR_RECOVERY(2, scmd_printk(KERN_INFO, scmd,
> >   "%s: requesting sense\n",
> >   current->comm));
> 
> I tried this patch first, because fixing the earlier bug would mask
> this one.
> 
> The patch sort of worked.  But the first time I tried it, it failed in
> a rather amusing way.  While the second retry was running and hung,
> scmd->result _was_ equal to CHECK_CONDITION -- because that was the
> result from the _first_ retry, and it had never gotten cleared!
> 
> scmd->result needs to be set to 0 before the queuecommand callback is
> invoked.  I ended up adding this to your patch, and then it worked
> perfectly:

Wow, the stale data bugs are just crawling out of the code.  Thanks for
checking.

> 
> Index: usb-3.14/drivers/scsi/scsi_error.c
> ===
> --- usb-3.14.orig/drivers/scsi/scsi_error.c
> +++ usb-3.14/drivers/scsi/scsi_error.c
> @@ -924,6 +924,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd
>   memset(scmd->cmnd, 0, BLK_MAX_CDB);
>   memset(&scmd->sdb, 0, sizeof(scmd->sdb));
>   scmd->request->next_rq = NULL;
> + scmd->result = 0;
>  
>   if (sense_bytes) {
>   scmd->sdb.length = min_t(unsigned, SCSI_SENSE_BUFFERSIZE,
> Index: usb-3.14/drivers/scsi/scsi_lib.c
> ===
> --- usb-3.14.orig/drivers/scsi/scsi_lib.c
> +++ usb-3.14/drivers/scsi/scsi_lib.c
> @@ -159,6 +159,7 @@ static void __scsi_queue_insert(struct s
>* lock such that the kblockd_schedule_work() call happens
>* before blk_cleanup_queue() finishes.
>*/
> + cmd->result = 0;
>   spin_lock_irqsave(q->queue_lock, flags);
>   blk_requeue_request(q, cmd->request);
>   kblockd_schedule_work(q, &device->requeue_work);
> 
> 
> Maybe only the second one is necessary, but it seemed best to be
> consistent.

Thanks, I'll add this one to the list as well and see if we can get it
into the merge window.  I take it you'd like a cc to stable on these
three?

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Deadlock in usb-storage error handling

2014-03-20 Thread James Bottomley
On Thu, 2014-03-20 at 15:59 -0400, Alan Stern wrote:
> On Thu, 20 Mar 2014, James Bottomley wrote:
> 
> > OK, so I think we have three things to do
> > 
> >  1. Investigate SCSI and fix it's abort state problem that's causing
> > it not to send the abort second time around
> >  2. Fix usb-storage to fail a reset it can't do (i.e. device reset
> > with outstanding commands)
> >  3. Find out why we're sending a spurious request sense.
> > 
> > I can look at 1 and 3 if you want to take 2.
> 
> I wrote a patch for 2.  It turned out not to help much, because I was
> wrong -- while a command is running, usb-storage won't do a bus reset
> either.  The lock occurs in a separate part of the code, where I wasn't
> looking earlier.  When I tested the patch, the device reset failed
> immediately (as desired) and then the attempted bus reset hung.  
> Overall, not much of an improvement...

Hm, yes, sorry, after the bus reset fails, we'll just take the device
offline.

> In fact, this restriction on bus resets is important and necessary.  
> USB devices can be composite, meaning they can contain several
> functions all packed into a single device.  If a driver for one of the
> other functions needs to perform a bus reset, we don't want it to
> interrupt an ongoing mass-storage command.  Thus, it is important for
> the reset routine to wait for an outstanding command to complete.
> 
> Anyway, this looks to be a moot point.  With your fix for 1, neither 
> sort of reset was necessary.

Well as long as the tiny hammer works ... but there are going to be
devices that need the bigger one one day.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Deadlock in usb-storage error handling

2014-03-28 Thread James Bottomley
On Thu, 2014-03-20 at 15:49 -0400, Alan Stern wrote:
> On Thu, 20 Mar 2014, James Bottomley wrote:
> 
> > On Thu, 2014-03-20 at 12:34 -0400, Alan Stern wrote:
> > > On Thu, 20 Mar 2014, James Bottomley wrote:
> > > 
> > > > OK, so I think we have three things to do
> > > > 
> > > >  1. Investigate SCSI and fix it's abort state problem that's causing
> > > > it not to send the abort second time around
> > > >  2. Fix usb-storage to fail a reset it can't do (i.e. device reset
> > > > with outstanding commands)
> > > >  3. Find out why we're sending a spurious request sense.
> > > > 
> > > > I can look at 1 and 3 if you want to take 2.
> > > 
> > > It's a deal!  Thanks for your help.
> > 
> > OK, I think this is the fix for 1, if you could try it out.
> > 
> > Thanks,
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 771c16b..c52bfb2 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -145,14 +145,14 @@ scmd_eh_abort_handler(struct work_struct *work)
> > "scmd %p retry "
> > "aborted command\n", scmd));
> > scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
> > -   return;
> > +   goto out;
> > } else {
> > SCSI_LOG_ERROR_RECOVERY(3,
> > scmd_printk(KERN_WARNING, scmd,
> > "scmd %p finish "
> > "aborted command\n", scmd));
> > scsi_finish_command(scmd);
> > -   return;
> > +   goto out;
> > }
> > } else {
> > SCSI_LOG_ERROR_RECOVERY(3,
> > @@ -162,6 +162,8 @@ scmd_eh_abort_handler(struct work_struct *work)
> > }
> > }
> >  
> > +   scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
> > +
> > if (!scsi_eh_scmd_add(scmd, 0)) {
> > SCSI_LOG_ERROR_RECOVERY(3,
> > scmd_printk(KERN_WARNING, scmd,
> > @@ -170,6 +172,10 @@ scmd_eh_abort_handler(struct work_struct *work)
> > scmd->result |= DID_TIME_OUT << 16;
> > scsi_finish_command(scmd);
> > }
> > +   return;
> > + out:
> > +   scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
> > +   return;
> >  }
> >  
> >  /**
> 
> This worked the first time.  :-)
> 
> But I wonder, is it safe to access scmd after calling 
> scsi_finish_command()?

Agree, I've redone the patch integrated into the try_to_abort call
instead.  Will post shortly.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] Fix USB deadlock caused by SCSI error handling

2014-03-28 Thread James Bottomley
This is a set of three patches we agreed to a while ago to eliminate a
USB deadlock.  I did rewrite the first patch, if it could be reviewed
and tested.

Thanks,

James

---

Alan Stern (1):
  [SCSI] Fix command result state propagation

James Bottomley (2):
  [SCSI] Fix abort state memory problem
  [SCSI] Fix spurious request sense in error handling

 drivers/scsi/scsi_error.c | 19 ---
 drivers/scsi/scsi_lib.c   |  1 +
 2 files changed, 17 insertions(+), 3 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] [SCSI] Fix command result state propagation

2014-03-28 Thread James Bottomley
From: Alan Stern 

We're seeing a case where the contents of scmd->result isn't being reset after
a SCSI command encounters an error, is resubmitted, times out and then gets
handled.  The error handler acts on the stale result of the previous error
instead of the timeout.  Fix this by properly zeroing the scmd->status before
the command is resubmitted.

Signed-off-by: Alan Stern 
Signed-off-by: James Bottomley 
---
 drivers/scsi/scsi_error.c | 1 +
 drivers/scsi/scsi_lib.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 221a5bc..335eaf4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -927,6 +927,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct 
scsi_eh_save *ses,
memset(scmd->cmnd, 0, BLK_MAX_CDB);
memset(&scmd->sdb, 0, sizeof(scmd->sdb));
scmd->request->next_rq = NULL;
+   scmd->result = 0;
 
if (sense_bytes) {
scmd->sdb.length = min_t(unsigned, SCSI_SENSE_BUFFERSIZE,
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 5681c05..b1f4867 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -137,6 +137,7 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int 
reason, int unbusy)
 * lock such that the kblockd_schedule_work() call happens
 * before blk_cleanup_queue() finishes.
 */
+   cmd->result = 0;
spin_lock_irqsave(q->queue_lock, flags);
blk_requeue_request(q, cmd->request);
kblockd_schedule_work(q, &device->requeue_work);
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] [SCSI] Fix spurious request sense in error handling

2014-03-28 Thread James Bottomley
We unconditionally execute scsi_eh_get_sense() to make sure all failed
commands that should have sense attached, do.  However, the routine forgets
that some commands, because of the way they fail, will not have any sense code
... we should not bother them with a REQUEST_SENSE command.  Fix this by
testing to see if we actually got a CHECK_CONDITION return and skip asking for
sense if we don't.

Tested-by: Alan Stern 
Signed-off-by: James Bottomley 
---
 drivers/scsi/scsi_error.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b9f3b07..221a5bc 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1160,6 +1160,15 @@ int scsi_eh_get_sense(struct list_head *work_q,
 __func__));
break;
}
+   if (status_byte(scmd->result) != CHECK_CONDITION)
+   /*
+* don't request sense if there's no check condition
+* status because the error we're processing isn't one
+* that has a sense code (and some devices get
+* confused by sense requests out of the blue)
+*/
+   continue;
+
SCSI_LOG_ERROR_RECOVERY(2, scmd_printk(KERN_INFO, scmd,
  "%s: requesting sense\n",
  current->comm));
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] [SCSI] Fix abort state memory problem

2014-03-28 Thread James Bottomley
The abort state is being stored persistently across commands, meaning if the
command times out a second time, the error handler thinks an abort is still
pending.  Fix this by properly resetting the abort flag after the abort is
finished.

Signed-off-by: James Bottomley 
---
 drivers/scsi/scsi_error.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 771c16b..b9f3b07 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -869,10 +869,13 @@ static int scsi_try_bus_device_reset(struct scsi_cmnd 
*scmd)
 
 static int scsi_try_to_abort_cmd(struct scsi_host_template *hostt, struct 
scsi_cmnd *scmd)
 {
-   if (!hostt->eh_abort_handler)
-   return FAILED;
+   int retval = FAILED;
+
+   if (hostt->eh_abort_handler)
+   retval = hostt->eh_abort_handler(scmd);
 
-   return hostt->eh_abort_handler(scmd);
+   scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
+   return retval;
 }
 
 static void scsi_abort_eh_cmnd(struct scsi_cmnd *scmd)
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Fix USB deadlock caused by SCSI error handling

2014-03-31 Thread James Bottomley
[lets split the thread]
On Mon, 2014-03-31 at 16:37 +0200, Hannes Reinecke wrote:
> On 03/31/2014 03:33 PM, Alan Stern wrote:
> > On Mon, 31 Mar 2014, Hannes Reinecke wrote:
> >> On 03/28/2014 08:29 PM, Alan Stern wrote:
> >>> On Fri, 28 Mar 2014, James Bottomley wrote:
> >>> Maybe scmd_eh_abort_handler() should check the flag before doing
> >>> anything.  Is there any sort of sychronization to prevent the same
> >>> incarnation of a command from being aborted twice (or by two different
> >>> threads at the same time)?  If there is, it isn't obvious.
> >>>
> >> See above. scsi_times_out() will only ever called once.
> >> What can happen, though, is that _theoretically_ the LLDD might
> >> decide to call ->done() on a timed out command when
> >> scsi_eh_abort_handler() is still pending.
> > 
> > That's okay.  We can expect the LLDD to have sufficient locking to
> > handle that sort of thing without confusion (usb-storage does, for
> > example).
> > 
> >>> (Also, what's going on at the start of scsi_abort_command()?  Contrary
> >>> to what one might expect, the first part of the function _cancels_ a
> >>> scheduled abort.  And it does so without clearing the
> >>> SCSI_EH_ABORT_SCHEDULED flag.)
> >>>
> >> The original idea was this:
> >>
> >> SCSI_EH_ABORT_SCHEDULED is sticky _per command_.
> >> Point is, any command abort is only ever send for a timed-out
> >> command. And the main problem for a timed-out command is that we
> >> simply _do not_ know what happened for that command. So _if_ a
> >> command timed out, _and_ we've send an abort, _and_ the command
> >> times out _again_ we'll be running into an endless loop between
> >> timeout and aborting, and never returning the command at all.
> >>
> >> So to prevent this we should set a marker on that command telling it
> >> to _not_ try to abort the command again.
> > 
> > I disagree.  We _have_ to abort the command again -- how else can we
> > stop a running command?  To prevent the loop you described, we should
> > avoid _retrying_ the command after it is aborted the second time.
> > 
> The actual question is whether it's worth aborting the same command
> a second time.
> In principle any reset (like LUN reset etc) should clear the
> command, too.
> And the EH abort functionality is geared around this.
> If, for some reason, the transport layer / device driver
> requires a command abort to be send then sure, we need
> to accommodate for that.

We already discussed this (that was my first response too).  USB needs
all outstanding commands aborted before proceeding to reset.  I'm
starting to think the actual way to fix this is to reset the abort
scheduled only if we send something else, so this might be a better fix.

It doesn't matter if we finish a command with abort scheduled because
the command then gets freed and the flags cleaned.

We can take our time with this because the other two patches, which I
can send separately fix the current deadlock (we no longer send an
unaborted request sense before the reset), and it can go via rc fixes.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 771c16b..3cfd2bf 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1007,6 +1007,12 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, 
unsigned char *cmnd,
const unsigned long stall_for = msecs_to_jiffies(100);
int rtn;
 
+   /*
+* We're sending another command we'll need to abort, so reset the
+* abort scheduled flag on the real command before we save its state
+*/
+   scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
+
 retry:
scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
shost->eh_action = &done;


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] Fix USB deadlock caused by SCSI error handling

2014-04-10 Thread James Bottomley
On Thu, 2014-04-10 at 19:52 +0200, Hannes Reinecke wrote:
> On 04/10/2014 05:31 PM, Alan Stern wrote:
> > On Thu, 10 Apr 2014, Hannes Reinecke wrote:
> >
> >> On 04/10/2014 12:58 PM, Andreas Reis wrote:
> >>> That patch appears to work in preventing the crashes, judged on one
> >>> repeated appearance of the bug.
> >>>
> >>> dmesg had the usual
> >>> [  215.229903] usb 4-2: usb_disable_lpm called, do nothing
> >>> [  215.336941] usb 4-2: reset SuperSpeed USB device number 3 using
> >>> xhci_hcd
> >>> [  215.350296] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called
> >>> with disabled ep 880427b829c0
> >>> [  215.350305] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called
> >>> with disabled ep 880427b82a08
> >>> [  215.350621] usb 4-2: usb_enable_lpm called, do nothing
> >>>
> >>> repeated five times, followed by one
> >>> [  282.795801] sd 8:0:0:0: Device offlined - not ready after error
> >>> recovery
> >>>
> >>> and then as often as something tried to read from it:
> >>> [  295.585472] sd 8:0:0:0: rejecting I/O to offline device
> >>>
> >>> The stick could then be properly un- and remounted (the latter if it
> >>> had been physically replugged) without issue � for the bug to
> >>> reoccur after one to three minutes. I tried this three times, no
> >>> dmesg difference except the ep addresses varied on two of that.
> >>>
> >> Was this just that patch you've tested with or the entire patch series?
> >>
> >> If the latter, Alan, is this the expected outcome?
> >
> > Yes, it is.  The same thing should happen with the entire patch series.
> >
> >> I would've thought the error recover should _not_ run into
> >> offlining devices here, but rather the device should be recovered
> >> eventually.
> >
> > The command times out, it is aborted, and the command is retried.  The
> > same thing happens, and we repeat five times.  Eventually the SCSI core
> > gives up and declares the device to be offline.
> >
> Hmm. Ok. If you are fine with it who am I to argue here.
> James, shall I resent the patch series?

You mean the one patch?  No, it's OK, I have it.

It's still not complete, though, as I've said a couple of times.  The
problem is that we have abort memory on any eh command as well, which
this doesn't fix.

The scenario is abort command, set flag, abort completes, send TUR, TUR
doesn't return, so we now try to abort the TUR, but scsi_abort_eh_cmnd()
will skip the abort because the flag is set and move straight to reset.

The fix is this, I can just add it as well.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 771c16b..7516e2c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -920,6 +920,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct 
scsi_eh_save *ses,
ses->prot_op = scmd->prot_op;
 
scmd->prot_op = SCSI_PROT_NORMAL;
+   scmd->eh_eflags = 0;
scmd->cmnd = ses->eh_cmnd;
memset(scmd->cmnd, 0, BLK_MAX_CDB);
memset(&scmd->sdb, 0, sizeof(scmd->sdb));


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Weird I/O errors with USB Hard disk

2014-05-10 Thread James Bottomley
On Sat, 2014-05-10 at 22:38 +0800, loody wrote:
> hi all:
> I have a USB hard disk and when I play specific file.
> it will show below message
> ( I purpose enable usb/storage/transport.c debug message about urb debug)
> what makes me confused are
> 1. Does " Unhandled sense code" mean the " SCSI Request Sense" command?

Error handling couldn't correct the error so it was passed up.

> 2. if #1 is correct, there should be any error handling for not keep
> sending following commands. Why usb driver keep clear this halt
> enpoint?
> isn't there any timeout mechanism no matter in scsi layer or usb layer?
> 
> appreciate your help
> 
> 
> [22:06:48]clearing endpoint halt for pipe 0xc0008280
> [22:06:48]sd 0:0:0:0: [sda] Unhandled sense code
> [22:06:48]sd 0:0:0:0: [sda]  Result: hostbyte=0x10 driverbyte=0x08
> [22:06:48]sd 0:0:0:0: [sda]  Sense Key : 0x3 [current]
> [22:06:48]sd 0:0:0:0: [sda]  ASC=0x11 ASCQ=0x0
> [22:06:48]sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 24 42 22 00 00 f0 00
> [22:06:48]end_request: critical target error, dev sda, sector 2376226

If you compiled your kernel with SCSI messages it would tell you that
this is a medium error, unrecovered read error in sector 2376226

The device is failing and the sector cannot be recovered.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Weird I/O errors with USB Hard disk

2014-05-10 Thread James Bottomley
On Sun, 2014-05-11 at 00:43 +0800, loody wrote:
> hi James:
> 
> 
> 2014-05-10 23:10 GMT+08:00 James Bottomley
> :
> > On Sat, 2014-05-10 at 22:38 +0800, loody wrote:
> >> hi all:
> >> I have a USB hard disk and when I play specific file.
> >> it will show below message
> >> ( I purpose enable usb/storage/transport.c debug message about urb debug)
> >> what makes me confused are
> >> 1. Does " Unhandled sense code" mean the " SCSI Request Sense" command?
> >
> > Error handling couldn't correct the error so it was passed up.
> >
> >> 2. if #1 is correct, there should be any error handling for not keep
> >> sending following commands. Why usb driver keep clear this halt
> >> enpoint?
> >> isn't there any timeout mechanism no matter in scsi layer or usb layer?
> >>
> >> appreciate your help
> >>
> >>
> >> [22:06:48]clearing endpoint halt for pipe 0xc0008280
> >> [22:06:48]sd 0:0:0:0: [sda] Unhandled sense code
> >> [22:06:48]sd 0:0:0:0: [sda]  Result: hostbyte=0x10 driverbyte=0x08
> >> [22:06:48]sd 0:0:0:0: [sda]  Sense Key : 0x3 [current]
> >> [22:06:48]sd 0:0:0:0: [sda]  ASC=0x11 ASCQ=0x0
> >> [22:06:48]sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 00 24 42 22 00 00 f0 00
> >> [22:06:48]end_request: critical target error, dev sda, sector 2376226
> >
> > If you compiled your kernel with SCSI messages it would tell you that
> > this is a medium error, unrecovered read error in sector 2376226
> >
> > The device is failing and the sector cannot be recovered.
> in my opinion, the OS will not hang up when this problem happen, right?
> But my platform is always trying to clear halt endpoint, and it seems
> Block or File system layer still try to access this sector.

That depends on what's being done and the configuration of the layers
above.  For this error SCSI passes up as fast as possible.  The log
implied some churn below in USB.  On a real FS, the page should
eventually be marked as error, which should cause this to stop, but
there are a variety of daemons which poke the disk, it could be one of
them.

> is that related to file node open method or file read method?
> such that read bad sector will hang up the OS.

Depends, it's definitely possible to trigger repeat behaviour with a DIO
read because there's no memory of the error in the page cache.  There
may be other ways.

James




--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bug 77631] New: task scsi_eh_6:537 blocked for more than 120 seconds.

2014-06-10 Thread James Bottomley
>From the trace below, this looks to be a USB issue (USB added to cc):
the scsi error handler thread is waiting for usb storage to complete the
reset.  It's a 3.14.5 kernel, so the previous reset hang because of
spurious sense requests should be fixed.

James


On Tue, 2014-06-10 at 16:50 +, bugzilla-dae...@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=77631
> 
> Bug ID: 77631
>Summary: task scsi_eh_6:537 blocked for more than 120 seconds.
>Product: IO/Storage
>Version: 2.5
> Kernel Version: 3.14.6
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: SCSI
>   Assignee: linux-s...@vger.kernel.org
>   Reporter: mikhail.v.gavri...@gmail.com
> Regression: No
> 
> Created attachment 138941
>   --> https://bugzilla.kernel.org/attachment.cgi?id=138941&action=edit
> kernel log
> 
> task scsi_eh_6:537 blocked for more than 120 seconds.
> 
> 
> [  240.036323] INFO: task scsi_eh_6:537 blocked for more than 120 seconds.
> [  240.036333]   Not tainted 3.14.6-200.fc20.x86_64+debug #1
> [  240.036337] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this
> message.
> [  240.036341] scsi_eh_6   D 880813528000  6184   537  2 
> 0x
> [  240.036355]  8808027cdc50 0046 88007c988000
> 8808027cdfd8
> [  240.036366]  001d5800 001d5800 88007c988000
> 880810ca6bf0
> [  240.036375]  880810ca6bf8 0246 88007c988000
> 88080c532600
> [  240.036384] Call Trace:
> [  240.036400]  [] schedule_preempt_disabled+0x31/0x80
> [  240.036409]  [] mutex_lock_nested+0x194/0x430
> [  240.036434]  [] ? device_reset+0x24/0x50 [usb_storage]
> [  240.036450]  [] ? device_reset+0x24/0x50 [usb_storage]
> [  240.036466]  [] device_reset+0x24/0x50 [usb_storage]
> [  240.036478]  [] scsi_try_bus_device_reset+0x2a/0x50
> [  240.036487]  [] scsi_eh_ready_devs+0x4df/0xc20
> [  240.036495]  [] ? scsi_eh_get_sense+0x176/0x2a0
> [  240.036504]  [] scsi_error_handler+0x53c/0x830
> [  240.036512]  [] ? scsi_eh_get_sense+0x2a0/0x2a0
> [  240.036522]  [] kthread+0xff/0x120
> [  240.036530]  [] ? insert_kthread_work+0x80/0x80
> [  240.036540]  [] ret_from_fork+0x7c/0xb0
> [  240.036547]  [] ? insert_kthread_work+0x80/0x80
> [  240.036553] 1 lock held by scsi_eh_6/537:
> [  240.036555]  #0:  (&us_interface_key[i]){+.+.+.}, at: []
> device_reset+0x24/0x50 [usb_storage]
> [  240.036575] INFO: task usb-storage:539 blocked for more than 120 seconds.
> [  240.036579]   Not tainted 3.14.6-200.fc20.x86_64+debug #1
> [  240.036582] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this
> message.
> [  240.036585] usb-storage D 88081351a590  5032   539  2 
> 0x
> [  240.036595]  8808025ebae0 0046 880800c28000
> 8808025ebfd8
> [  240.036604]  001d5800 001d5800 880800c28000
> 7fff
> [  240.036612]  880810ca6dd0 880810ca6dc8 880800c28000
> 
> [  240.036621] Call Trace:
> [  240.036629]  [] schedule+0x29/0x70
> [  240.036637]  [] schedule_timeout+0x1f9/0x350
> [  240.036648]  [] ? mark_held_locks+0xb9/0x140
> [  240.036657]  [] ? _raw_spin_unlock_irq+0x2c/0x40
> [  240.036665]  [] ? trace_hardirqs_on_caller+0x105/0x1d0
> [  240.036673]  [] wait_for_completion+0xfc/0x140
> [  240.036684]  [] ? wake_up_state+0x20/0x20
> [  240.036692]  [] usb_sg_wait+0x12d/0x190
> [  240.036709]  []
> usb_stor_bulk_transfer_sglist.part.4+0x8b/0xf0 [usb_storage]
> [  240.036725]  [] usb_stor_bulk_srb+0x71/0x80 [usb_storage]
> [  240.036741]  [] usb_stor_Bulk_transport+0x15c/0x360
> [usb_storage]
> [  240.036758]  [] usb_stor_invoke_transport+0x3b/0x570
> [usb_storage]
> [  240.036766]  [] ? mark_held_locks+0xb9/0x140
> [  240.036774]  [] ? _raw_spin_unlock_irq+0x2c/0x40
> [  240.036790]  [] 
> usb_stor_transparent_scsi_command+0xe/0x10
> [usb_storage]
> [  240.036807]  [] usb_stor_control_thread+0x176/0x290
> [usb_storage]
> [  240.036824]  [] ? fill_inquiry_response+0x20/0x20
> [usb_storage]
> [  240.036840]  [] ? fill_inquiry_response+0x20/0x20
> [usb_storage]
> [  240.036847]  [] kthread+0xff/0x120
> [  240.036854]  [] ? insert_kthread_work+0x80/0x80
> [  240.036862]  [] ret_from_fork+0x7c/0xb0
> [  240.036869]  [] ? insert_kthread_work+0x80/0x80
> [  240.036873] 1 lock held by usb-storage/539:
> [  240.036876]  #0:  (&us_interface_key[i]){+.+.+.}, at: []
> usb_stor_control_thread+0x8a/0x290 [usb_storage]
> [  359.971932] INFO: task scsi_eh_6:537 blocked for more than 120 seconds.
> [  359.971935]   Not tainted 3.14.6-200.fc20.x86_64+debug #1
> [  359.971936] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this
> message.
> [  359.971937] scsi_eh_6   D 880813528000  6184   537  2 
> 0x
> [  359.97

Re: [usb-storage] Re: External USB3 disk fails with "Invalid field in cdb"

2014-06-30 Thread James Bottomley
On Fri, 2014-06-27 at 15:23 -0400, Alan Stern wrote:
> On Fri, 27 Jun 2014, Michael Büsch wrote:
> 
> > On Fri, 27 Jun 2014 14:42:01 -0400 (EDT)
> > Alan Stern  wrote:
> > 
> > > Michael, can you post the "lsusb -v" output for this device?  I see it 
> > > is made by JMicron; they are notorious for buggy USB-ATA bridges.
> > 
> > Of course. Here you go:
> > 
> > Bus 004 Device 009: ID 152d:0567 JMicron Technology Corp. / JMicron USA 
> > Technology Corp. 
> > Device Descriptor:
> >   bLength18
> >   bDescriptorType 1
> >   bcdUSB   3.00
> >   bDeviceClass0 (Defined at Interface level)
> >   bDeviceSubClass 0 
> >   bDeviceProtocol 0 
> >   bMaxPacketSize0 9
> >   idVendor   0x152d JMicron Technology Corp. / JMicron USA 
> > Technology Corp.
> >   idProduct  0x0567 
> >   bcdDevice1.14
> >   iManufacturer   1 JMicron
> >   iProduct2 USB to ATA/ATAPI Bridge
> >   iSerial 3 xxx
> >   bNumConfigurations  1
> >   Configuration Descriptor:
> > bLength 9
> > bDescriptorType 2
> > wTotalLength  121
> > bNumInterfaces  1
> > bConfigurationValue 1
> > iConfiguration  4 USB Mass Storage
> > bmAttributes 0xc0
> >   Self Powered
> > MaxPower2mA
> ...
> 
> > MaxPower=2mA is a nice guess for a hard disk. ;)
> 
> That refers to the amount of power the device draws from the USB bus.  
> Since the disk drive is self-powered, it doesn't use much bus power.
> 
> Does the patch below do what you and James want?

Yes, that's the usual annoying additions to our blacklist.  You can add
my acked-by and could you cc stable?

Thanks,

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] uas: Limit qdepth at the scsi-host level

2016-03-19 Thread James Bottomley
On Sat, 2016-03-19 at 09:59 +0100, Hans de Goede wrote:
> Commit 64d513ac31bd ("scsi: use host wide tags by default") causes
> the scsi-core to queue more cmnds then we can handle on devices with
> multiple LUNs, limit the qdepth at the scsi-host level instead of
> per slave to fix this.

Help me understand this bug a bit more.  Are you saying that the commit
you identify is causing the block layer to queue more commands than
you've set the per-lun limit to?  In which case we have a serious
problem for more than just UAS.  Or are you saying that UAS always had
a global command limit, but it just didn't get set correctly; however,
it mostly worked until the above commit exposed the problem?


James

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 1/2] ACPI / EC: Fix broken 64bit big-endian users of 'global_lock'

2015-09-26 Thread James Bottomley
On Fri, 2015-09-25 at 22:58 +0200, Rafael J. Wysocki wrote:
> On Friday, September 25, 2015 01:25:49 PM Viresh Kumar wrote:
> > On 25 September 2015 at 13:33, Rafael J. Wysocki  wrote:
> > > You're going to change that into bool in the next patch, right?
> > 
> > Yeah.
> > 
> > > So what if bool is a byte and the field is not word-aligned
> > 
> > Its between two 'unsigned long' variables today, and the struct isn't 
> > packed.
> > So, it will be aligned, isn't it?
> > 
> > > and changing
> > > that byte requires a read-modify-write.  How do we ensure that things 
> > > remain
> > > consistent in that case?
> > 
> > I didn't understood why a read-modify-write is special here? That's
> > what will happen
> > to most of the non-word-sized fields anyway?
> > 
> > Probably I didn't understood what you meant..
> 
> Say you have three adjacent fields in a structure, x, y, z, each one byte 
> long.
> Initially, all of them are equal to 0.
> 
> CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> 
> What's the result?

I think every CPU's  cache architecure guarantees adjacent store
integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
thinking of old alpha SMP system where the lowest store width is 32 bits
and thus you have to do RMW to update a byte, this was usually fixed by
padding (assuming the structure is not packed).  However, it was such a
problem that even the later alpha chips had byte extensions.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 1/2] ACPI / EC: Fix broken 64bit big-endian users of 'global_lock'

2015-09-28 Thread James Bottomley
On Mon, 2015-09-28 at 08:58 +, David Laight wrote:
> From: Rafael J. Wysocki
> > Sent: 27 September 2015 15:09
> ...
> > > > Say you have three adjacent fields in a structure, x, y, z, each one 
> > > > byte long.
> > > > Initially, all of them are equal to 0.
> > > >
> > > > CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> > > >
> > > > What's the result?
> > >
> > > I think every CPU's  cache architecure guarantees adjacent store
> > > integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
> > > thinking of old alpha SMP system where the lowest store width is 32 bits
> > > and thus you have to do RMW to update a byte, this was usually fixed by
> > > padding (assuming the structure is not packed).  However, it was such a
> > > problem that even the later alpha chips had byte extensions.
> 
> Does linux still support those old Alphas?
> 
> The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.

That's different: it's an atomic RMW operation.  The problem with the
alpha was that the operation wasn't atomic (meaning that it can't be
interrupted and no intermediate output states are visible).

> > OK, thanks!
> 
> You still have to ensure the compiler doesn't do wider rmw cycles.
> I believe the recent versions of gcc won't do wider accesses for volatile 
> data.

I don't understand this comment.  You seem to be implying gcc would do a
64 bit RMW for a 32 bit store ... that would be daft when a single
instruction exists to perform the operation on all architectures.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V4 1/2] ACPI / EC: Fix broken 64bit big-endian users of 'global_lock'

2015-09-28 Thread James Bottomley
On Mon, 2015-09-28 at 14:50 +, David Laight wrote:
> From: James Bottomley [mailto:james.bottom...@hansenpartnership.com]
> > Sent: 28 September 2015 15:27
> > On Mon, 2015-09-28 at 08:58 +, David Laight wrote:
> > > From: Rafael J. Wysocki
> > > > Sent: 27 September 2015 15:09
> > > ...
> > > > > > Say you have three adjacent fields in a structure, x, y, z, each 
> > > > > > one byte long.
> > > > > > Initially, all of them are equal to 0.
> > > > > >
> > > > > > CPU A writes 1 to x and CPU B writes 2 to y at the same time.
> > > > > >
> > > > > > What's the result?
> > > > >
> > > > > I think every CPU's  cache architecure guarantees adjacent store
> > > > > integrity, even in the face of SMP, so it's x==1 and y==2.  If you're
> > > > > thinking of old alpha SMP system where the lowest store width is 32 
> > > > > bits
> > > > > and thus you have to do RMW to update a byte, this was usually fixed 
> > > > > by
> > > > > padding (assuming the structure is not packed).  However, it was such 
> > > > > a
> > > > > problem that even the later alpha chips had byte extensions.
> > >
> > > Does linux still support those old Alphas?
> > >
> > > The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.
> > 
> > That's different: it's an atomic RMW operation.  The problem with the
> > alpha was that the operation wasn't atomic (meaning that it can't be
> > interrupted and no intermediate output states are visible).
> 
> It is only atomic if prefixed by the 'lock' prefix.
> Normally the read and write are separate bus cycles.

The essential point is that x86 has atomic bit ops and byte writes.
Early alpha did not.

> > > You still have to ensure the compiler doesn't do wider rmw cycles.
> > > I believe the recent versions of gcc won't do wider accesses for volatile 
> > > data.
> > 
> > I don't understand this comment.  You seem to be implying gcc would do a
> > 64 bit RMW for a 32 bit store ... that would be daft when a single
> > instruction exists to perform the operation on all architectures.
> 
> Read the object code and weep...
> It is most likely to happen for operations that are rmw (eg bit set).
> For instance the arm cpu has limited offsets for 16bit accesses, for
> normal structures the compiler is likely to use a 32bit rmw sequence
> for a 16bit field that has a large offset.
> The C language allows the compiler to do it for any access (IIRC including
> volatiles).

I think you might be confusing different things.  Most RISC CPUs can't
do 32 bit store immediates because there aren't enough bits in their
arsenal, so they tend to split 32 bit loads into a left and right part
(first the top then the offset).  This (and other things) are mostly
what you see in code.  However, 32 bit register stores are still atomic,
which is all we require.  It's not really the compiler's fault, it's
mostly an architectural limitation.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 0/4] scsi: ufs & ums-* & esp_scsi: fix module reference counting

2015-05-05 Thread James Bottomley
On Tue, 2015-05-05 at 10:25 -0400, Alan Stern wrote:
> On Mon, 4 May 2015, James Bottomley wrote:
> 
> > On Mon, 2015-05-04 at 16:09 -0400, Alan Stern wrote:
> > > On Mon, 4 May 2015, James Bottomley wrote:
> > > 
> > > > However, it does also strike me that these three drivers have problems
> > > > because they're using the wrong initialisation pattern: the template is
> > > > supposed to be in the bus connector for compound drivers not in the
> > > > core.
> > > 
> > > Why is it supposed to be done that way?  Isn't that less efficient?  It 
> > > means you have to have a separate copy of the template for each bus 
> > > connector driver, instead of letting them all share a common template 
> > > in the core.
> > 
> > Well, no it doesn't.  The way 53c700 implements it is that there is a
> > common template in the core.  The drivers just initialise their variant
> > fields (for 53c700 it's name, proc_name and this_id) and the core fills
> > in the rest.  Admittedly wd33c93 doesn't quite get this right, that's
> > why I cited 53c700.
> 
> You seem to be agreeing with me, even though you think you are 
> disagreeing.
> 
>   "... there is a common template in the core." -- that's one
>   scsi_host_template structure.
> 
>   "The drivers just initialize their variant fields ... and the
>   core fills in the rest." -- that's an additional
>   scsi_host_template structure for each driver.
> 
> The total comes to N+1 scsi_host_templates, where N is the number of 
> drivers.
> 
> Now consider the way usb-storage does things.
> 
>   There is a common template in the core.  That's one
>   scsi_host_template structure.
> 
>   The drivers use the core's template.  They have _no_ variant
>   fields other than .owner.  That's why I thought Akinobu's idea
>   of putting the owner field in the Scsi_Host structure was a
>   good idea.
> 
> The total comes to 1 scsi_host_template.  Is that not better than N+1?
> 
> Consider the patch Akinobu just posted.  In addition to a whole bunch 
> of new code, it adds this:
> 
> > --- a/drivers/usb/storage/alauda.c
> > +++ b/drivers/usb/storage/alauda.c
> ...
> > @@ -1232,6 +1233,8 @@ static int alauda_transport(struct scsi_cmnd *srb, 
> > struct us_data *us)
> > return USB_STOR_TRANSPORT_FAILED;
> >  }
> >  
> > +static struct scsi_host_template alauda_host_template;
> 
> alauda.c was the only driver modified in that patch, and it gained an 
> new scsi_host_template.
> 
> For the case where the variants are identical in all respects other 
> than .owner, doesn't it make sense to allow them to share a single 
> scsi_host_template?
> 
> The original design of the SCSI stack envisioned multiple drivers, each 
> in control of multiple SCSI hosts.  The idea was that 
> scsi_host_template would be associated with the driver and Scsi_Host 
> with the individual host.
> 
> Now the kernel has evolved, and we have multiple drivers, some of which
> contain multiple subdrivers, each in control of multiple SCSI hosts.  
> In this situation we should be flexible enough to allow the
> scsi_host_template to be associated with either the driver or the
> subdriver (decision to be made by the driver).  When the only variant
> field is .owner, it makes sense to associate the scsi_host_template
> with the driver, not force it to be associated with the subdriver.
> 
> The cost is duplication of the .owner field in every Scsi_Host.  The 
> savings is a reduction in the number of scsi_host_templates.

So your essential objection is the host template duplication?  I know
it's a couple of hundred bytes, but surely its dwarfed by all the other
stuff you have to duplicate ... the module size of each of these is
around 0.25MB, so a couple of hundred bytes would seem a bit
insignificant.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] USB: storage: add "no SYNCHRONIZE CACHE" quirk

2015-06-22 Thread James Bottomley
On Mon, 2015-06-22 at 10:55 -0400, Alan Stern wrote:
> Some USB mass-storage devices claim to have a write-back cache but
> don't support the SYNCHRONIZE CACHE command, which means there is no
> way to tell these devices to flush their caches out to permanent
> storage.  Unfortunately, there is nothing we can do about this.
> 
> Until recently this deficiency did not cause any noticeable problems.
> But following commit 89fb4cd1f717 ("scsi: handle flush errors
> properly"), the errors are propagated to userspace where they get
> reported as failures.
> 
> As a workaround, this patch adds a quirks flag for these devices to
> the usb-storage driver, and a corresponding SCSI device flag, to
> indicate the device can't handle SYNCHRONIZE CACHE.  If the flag is
> set, the sd driver will override the cache settings reported by the
> device, so that the device appears to be write-through.  This will
> prevent the unsupported command from being sent.
> 
> This addresses Bugzilla #89511.

I'm not sure I entirely like this:  we are back again treating data
corruption problems silently.

However, I also believe treating a single flush failure as a critical
filesystem error is also wrong:  The data's all there correctly; all it
does is introduce a potential window were the FS could get corrupted in
the unlikely event the system crashed.

Obviously, for a disk with a writeback cache that can't do flush, that
window is much wider and the real solution should be to try to switch
the cache to write through.
How about something like this patch?  It transforms FS FLUSH into a log
warning from an error but preserves the error on any other path.  You'll
still get a fairly continuous dump of warnings for one of these devices,
though ... do they respond to mode selects turning off the writeback?

James

---

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 20badd7..bb9f9f3 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -173,10 +173,21 @@ static bool blk_flush_complete_seq(struct request *rq,
BUG_ON(rq->flush.seq & seq);
rq->flush.seq |= seq;
 
-   if (likely(!error))
+   if (likely(!error)) {
seq = blk_flush_cur_seq(rq);
-   else
+   } else {
seq = REQ_FSEQ_DONE;
+   printk_ratelimited(KERN_ERR "%s: flush failed: data integrity 
problem\n",
+  rq->rq_disk ? rq->rq_disk->disk_name : "?");
+   /*
+* returning an error to the FS is wrong: the data is all
+* there, it just might not be written out in the expected
+* order and thus have a window where the integrity is suspect
+* in a crash.  Given the small likelihood of actually
+* crashing, we should just log a warning here.
+*/
+   error = 0;
+   }
 
switch (seq) {
case REQ_FSEQ_PREFLUSH:


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in


Re: [PATCH] USB: storage: add "no SYNCHRONIZE CACHE" quirk

2015-06-22 Thread James Bottomley
On Mon, 2015-06-22 at 13:30 -0400, Alan Stern wrote:
> On Mon, 22 Jun 2015, James Bottomley wrote:
> 
> > I'm not sure I entirely like this:  we are back again treating data
> > corruption problems silently.
> > 
> > However, I also believe treating a single flush failure as a critical
> > filesystem error is also wrong:  The data's all there correctly; all it
> > does is introduce a potential window were the FS could get corrupted in
> > the unlikely event the system crashed.
> > 
> > Obviously, for a disk with a writeback cache that can't do flush, that
> > window is much wider and the real solution should be to try to switch
> > the cache to write through.
> 
> I agree.  Doing the switch manually (by writing to the "cache_type" 
> attribute file) works, but it's a nuisance to do this when you have a 
> portable USB drive that gets moved among a bunch of machines.

Perhaps it might be wise to do this to every USB device ... for external
devices, the small performance gain doesn't really make up for the
potential data loss.

> > How about something like this patch?  It transforms FS FLUSH into a log
> > warning from an error but preserves the error on any other path.  You'll
> > still get a fairly continuous dump of warnings for one of these devices,
> > though ... do they respond to mode selects turning off the writeback?
> 
> I would be very surprised if any of those drives support MODE SELECT at 
> all.

I assume the cache type attribute file you refer to above is just
pretending their cache is write through rather than actually setting it
to be so?  The original IDE device had no way of turning their cache
types to write through either, but the manufacturers were eventually
convinced of the error of their ways.

> Maybe your patch will be acceptable, though.  We'll have to hear from 
> Markus and Matt.

We'll probably have to take this to fsdevel as well ... they might have
opinions about whether the FS wants to be informed about flush failure.

James

> Alan Stern
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in


Re: [PATCH] USB: storage: add "no SYNCHRONIZE CACHE" quirk

2015-06-22 Thread James Bottomley
On Mon, 2015-06-22 at 13:48 -0400, Alan Stern wrote:
> On Mon, 22 Jun 2015, James Bottomley wrote:
> 
> > On Mon, 2015-06-22 at 13:30 -0400, Alan Stern wrote:
> > > On Mon, 22 Jun 2015, James Bottomley wrote:
> > > 
> > > > I'm not sure I entirely like this:  we are back again treating data
> > > > corruption problems silently.
> > > > 
> > > > However, I also believe treating a single flush failure as a critical
> > > > filesystem error is also wrong:  The data's all there correctly; all it
> > > > does is introduce a potential window were the FS could get corrupted in
> > > > the unlikely event the system crashed.
> > > > 
> > > > Obviously, for a disk with a writeback cache that can't do flush, that
> > > > window is much wider and the real solution should be to try to switch
> > > > the cache to write through.
> > > 
> > > I agree.  Doing the switch manually (by writing to the "cache_type" 
> > > attribute file) works, but it's a nuisance to do this when you have a 
> > > portable USB drive that gets moved among a bunch of machines.
> > 
> > Perhaps it might be wise to do this to every USB device ... for external
> > devices, the small performance gain doesn't really make up for the
> > potential data loss.
> > 
> > > > How about something like this patch?  It transforms FS FLUSH into a log
> > > > warning from an error but preserves the error on any other path.  You'll
> > > > still get a fairly continuous dump of warnings for one of these devices,
> > > > though ... do they respond to mode selects turning off the writeback?
> > > 
> > > I would be very surprised if any of those drives support MODE SELECT at 
> > > all.
> > 
> > I assume the cache type attribute file you refer to above is just
> > pretending their cache is write through rather than actually setting it
> > to be so?
> 
> Yes; I'm referring to cache_type_store() in sd.c, and writing
> "temporary write through", which does not issue a MODE SELECT command.  
> It would be easy enough for people to try leaving out the "temporary", 
> but I don't expect it to work.
> 
> >  The original IDE device had no way of turning their cache
> > types to write through either, but the manufacturers were eventually
> > convinced of the error of their ways.
> 
> In this case the stupidity resides in the USB-ATA bridge.  You can see 
> the gory details at
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=89511#c19

OK, so that says the SAT in the bridge doesn't know what to do with
MODE_SELECT (probably unsurprising given that it's a usb bridge).  the
SATA disk should respond to the ATA command SET FEATURES, though.
Presuming we can get it through the bridge.

You can try it with

hdparm -W 0 

optionally with --prefer_ata_12 to do the 12 instead of 16 byte
encapsulation and see if that makes a difference.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in


Re: [PATCH] USB: storage: add "no SYNCHRONIZE CACHE" quirk

2015-06-26 Thread James Bottomley
On Fri, 2015-06-26 at 11:43 +0200, Stefan Richter wrote:
> On Jun 22 James Bottomley wrote:
> > On Mon, 2015-06-22 at 13:30 -0400, Alan Stern wrote:
> > > On Mon, 22 Jun 2015, James Bottomley wrote:
> > > > Obviously, for a disk with a writeback cache that can't do flush, that
> > > > window is much wider and the real solution should be to try to switch
> > > > the cache to write through.
> > > 
> > > I agree.  Doing the switch manually (by writing to the "cache_type" 
> > > attribute file) works, but it's a nuisance to do this when you have a 
> > > portable USB drive that gets moved among a bunch of machines.
> > 
> > Perhaps it might be wise to do this to every USB device ... for external
> > devices, the small performance gain doesn't really make up for the
> > potential data loss.
> 
> Just a small note on the assumption of externally (and in extension,
> temporarily) attached devices:  Not all USB-attached devices are external,
> and not all external devices are used as removable devices.

The problems don't depend on the connection type: internal devices which
have a writeback cache and don't accept flush commands have data
integrity problems too.  I can't really think of many situations where
you'd be willing to sacrifice data integrity for performance.

James

> (For example, I am using 2 internal CD-ROMs via USB-2 + usb-storage and 2
> internal HDDs via USB 3 + uas in a PC with too few SATA ports and no PCIe
> slot to spare for a controller add-on card, but plenty of USB headers
> available on the mainboard.  Similarly, some NASes have their operating
> system located on a USB-attached device.  Small offices use USB-attached
> disks for backup and won't detach such a disk until rotation for off-site
> deposit.  Not to mention embedded computers with USB-attached but fixed
> disks.)



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with USB-to-SATA adapters

2014-08-20 Thread James Bottomley
On Wed, 2014-08-20 at 16:18 -0400, Dale R. Worley wrote:
> I don't know if this is the correct place for this problem, but you
> people can probably tell me the correct place.

linux-usb can probably also help (cc'd).

> I have two "USB to SATA adapter" dongles.  In general, they work fine.
> However I've discovered that one of them handles large (1 TB and
> above) drives correctly and one does not.  In particular, when I use
> the unsuccessful one, neither fdisk nor gdisk reports the disk size
> correctly, and the kernel does not read the partition table.
> 
> More confusingly, both dongles work correctly with large disks on
> Windows 7 Professional.
> 
> Both of the following cases are when a 3.0 TB drive is attached.
> 
> The successful device has the "Diablo" brand.  The most interesting
> messages in the log when I plug in the USB are (the complete message
> set is appended below):
> 
> Aug 18 21:56:53 hobgoblin kernel: [  294.229462] usb 2-1.3: New USB device 
> found, idVendor=152d, idProduct=2338
> Aug 18 21:56:53 hobgoblin kernel: [  294.229475] usb 2-1.3: New USB device 
> strings: Mfr=1, Product=2, SerialNumber=5
> Aug 18 21:56:53 hobgoblin kernel: [  294.229482] usb 2-1.3: Product: USB to 
> ATA/ATAPI bridge
> Aug 18 21:56:53 hobgoblin kernel: [  294.229488] usb 2-1.3: Manufacturer: 
> JMicron
> Aug 18 21:56:53 hobgoblin kernel: [  294.229495] usb 2-1.3: SerialNumber: 
> 01D91CC4
> Aug 18 21:56:54 hobgoblin kernel: [  295.236576] sd 7:0:0:0: [sdb] Very big 
> device. Trying to use READ CAPACITY(16).
> Aug 18 21:56:54 hobgoblin kernel: [  295.236955] sd 7:0:0:0: [sdb] 5860533168 
> 512-byte logical blocks: (3.00 TB/2.72 TiB)
> 
> The unsuccessful device has the "Ez-Connect" brand.  The most
> interesting messages in the log when I plug in the USB are (the
> complete message set is appended below):
> 
> Aug 18 21:54:06 hobgoblin kernel: [  127.674374] usb 2-2: new high-speed USB 
> device number 4 using ehci-pci
> Aug 18 21:54:06 hobgoblin kernel: [  127.791128] usb 2-2: New USB device 
> found, idVendor=05e3, idProduct=0718
> Aug 18 21:54:06 hobgoblin kernel: [  127.791135] usb 2-2: New USB device 
> strings: Mfr=0, Product=1, SerialNumber=2
> Aug 18 21:54:06 hobgoblin kernel: [  127.791138] usb 2-2: Product: USB Storage
> Aug 18 21:54:06 hobgoblin kernel: [  127.791142] usb 2-2: SerialNumber: 
> 0033
> Aug 18 21:54:07 hobgoblin kernel: [  128.847269] sd 6:0:0:0: [sdb] 1565565872 
> 512-byte logical blocks: (801 GB/746 GiB)

So the basic problem here seems to be that this adapter has wrapped the
INQUIRY response instead of giving us the the conventional response that
means my device is too big.  This could be because the processing engine
on the card doesn't understand 16 byte commands (which are required for
addressing capacity > 2TB) or it could just be a firmware SCSI emulator
mistake.  The best way of telling might be to use the SG_IO interface to
send a read capacity 16.  sg_readcap --16 should be able to do this.

If you get a sensible response, chances are we should just quirk this
device to apply RC16 first.  If something goes wrong, it's likely
unfixably broken for devices > 2TB

James


> Dale
> 
> --
> For the Diablo adapter:
> 
> Aug 18 21:56:53 hobgoblin kernel: [  294.153343] usb 2-1.3: new high-speed 
> USB device number 5 using ehci-pci
> Aug 18 21:56:53 hobgoblin kernel: [  294.229462] usb 2-1.3: New USB device 
> found, idVendor=152d, idProduct=2338
> Aug 18 21:56:53 hobgoblin kernel: [  294.229475] usb 2-1.3: New USB device 
> strings: Mfr=1, Product=2, SerialNumber=5
> Aug 18 21:56:53 hobgoblin kernel: [  294.229482] usb 2-1.3: Product: USB to 
> ATA/ATAPI bridge
> Aug 18 21:56:53 hobgoblin kernel: [  294.229488] usb 2-1.3: Manufacturer: 
> JMicron
> Aug 18 21:56:53 hobgoblin kernel: [  294.229495] usb 2-1.3: SerialNumber: 
> 01D91CC4
> Aug 18 21:56:53 hobgoblin kernel: [  294.231159] usb-storage 2-1.3:1.0: USB 
> Mass Storage device detected
> Aug 18 21:56:53 hobgoblin kernel: [  294.233951] scsi7 : usb-storage 2-1.3:1.0
> Aug 18 21:56:53 hobgoblin mtp-probe: checking bus 2, device 5: 
> "/sys/devices/pci:00/:00:1d.7/usb2/2-1/2-1.3"
> Aug 18 21:56:53 hobgoblin mtp-probe: bus: 2, device: 5 was not an MTP device
> Aug 18 21:56:54 hobgoblin kernel: [  295.235735] scsi 7:0:0:0: Direct-Access  
>WDC WD30 EZRX-00SPEB0  PQ: 0 ANSI: 5
> Aug 18 21:56:54 hobgoblin kernel: [  295.236209] sd 7:0:0:0: Attached scsi 
> generic sg2 type 0
> Aug 18 21:56:54 hobgoblin kernel: [  295.236576] sd 7:0:0:0: [sdb] Very big 
> device. Trying to use READ CAPACITY(16).
> Aug 18 21:56:54 hobgoblin kernel: [  295.236955] sd 7:0:0:0: [sdb] 5860533168 
> 512-byte logical blocks: (3.00 TB/2.72 TiB)
> Aug 18 21:56:54 hobgoblin kernel: [  295.237957] sd 7:0:0:0: [sdb] Write 
> Protect is off
> Aug 18 21:56:54 hobgoblin kernel: [  295.238957] sd 7:0:0:0: [sdb] No Caching 
> mode page found
> Aug 18 21:56:5

Re: [Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN

2014-08-22 Thread James Bottomley

On Fri, 2014-08-22 at 10:53 -0400, Alan Stern wrote:
> On Thu, 21 Aug 2014, Christoph Hellwig wrote:
> 
> > On Thu, Aug 21, 2014 at 05:43:41PM -0400, Martin K. Petersen wrote:
> > > Alan> Okay, here's a patch that implements the suggestion, except that I
> > > Alan> put the flag in the Scsi_Host structure instead of the template.
> > > Alan> This was to minimize the impact of the change.  Among the various
> > > Alan> SCSI-over-USB transports, only the Bulk-Only transport gives the
> > > Alan> LUN separately from the CDB.  I don't know if there are any
> > > Alan> multi-LUN USB devices that don't use the Bulk-Only transport, but
> > > Alan> if there are then they won't work if the LUN isn't stored in
> > > Alan> CDB[1].
> > > 
> > > I'm in agreement with this approach.
> > 
> > I like it too. One idea to unclutter the fastpath would be to have a
> > single flag that controls if the LUN is set which is based on the
> > host(-template) flag and the scsi level, which would allow us to remove
> > all the clutter around that area.
> 
> Good idea.  An enhanced patch is below.  If I can get a Tested-By: from
> Tiziano and one or two Acked-By: responses, I'll submit this for the
> current and stable kernels.
> 
> Sending the initial INQUIRY command to LUNs larger than 0 involves a
> chicken-and-egg problem -- we don't know whether to fill in the LUN
> bits in the command until we know the SCSI level, and we don't know the
> SCSI level until the INQUIRY response is received.  The solution we 
> have been using is to store the most recently discovered level in the 
> target structure, and use it as a default.  If probing starts with LUN 
> 0, and if all the LUNs have similar levels, this ought to work.
> 
> Except for one thing: The code does store the default level in the
> scsi_target, but forgets to copy it back into each newly allocated
> scsi_device!  I added a line to do that into the patch.
> 
> Alan Stern
> 
> 
> 
> Index: usb-3.16/include/scsi/scsi_host.h
> ===
> --- usb-3.16.orig/include/scsi/scsi_host.h
> +++ usb-3.16/include/scsi/scsi_host.h
> @@ -695,6 +695,9 @@ struct Scsi_Host {
>   /* The controller does not support WRITE SAME */
>   unsigned no_write_same:1;
>  
> + /* The transport requires the LUN bits NOT to be stored in CDB[1] */
> + unsigned no_scsi2_lun_in_cdb:1;

I think Christoph mentioned shortening this flag length, but personally
I'm not that bothered.

>   /*
>* Optional work queue to be utilized by the transport
>*/
> Index: usb-3.16/include/scsi/scsi_device.h
> ===
> --- usb-3.16.orig/include/scsi/scsi_device.h
> +++ usb-3.16/include/scsi/scsi_device.h
> @@ -174,6 +174,7 @@ struct scsi_device {
>   unsigned wce_default_on:1;  /* Cache is ON by default */
>   unsigned no_dif:1;  /* T10 PI (DIF) should be disabled */
>   unsigned broken_fua:1;  /* Don't set FUA bit */
> + unsigned lun_in_cdb:1;  /* Store LUN bits in CDB[1] */
>  
>   atomic_t disk_events_disable_depth; /* disable depth for disk events */
>  
> Index: usb-3.16/drivers/scsi/scsi.c
> ===
> --- usb-3.16.orig/drivers/scsi/scsi.c
> +++ usb-3.16/drivers/scsi/scsi.c
> @@ -674,14 +674,10 @@ int scsi_dispatch_cmd(struct scsi_cmnd *
>   goto out;
>   }
>  
> - /* 
> -  * If SCSI-2 or lower, store the LUN value in cmnd.
> -  */
> - if (cmd->device->scsi_level <= SCSI_2 &&
> - cmd->device->scsi_level != SCSI_UNKNOWN) {
> + /* Store the LUN value in cmnd, if needed. */
> + if (cmd->device->lun_in_cdb)
>   cmd->cmnd[1] = (cmd->cmnd[1] & 0x1f) |
>  (cmd->device->lun << 5 & 0xe0);
> - }
>  
>   scsi_log_send(cmd);
>  
> Index: usb-3.16/drivers/scsi/scsi_scan.c
> ===
> --- usb-3.16.orig/drivers/scsi/scsi_scan.c
> +++ usb-3.16/drivers/scsi/scsi_scan.c
> @@ -257,6 +257,7 @@ static struct scsi_device *scsi_alloc_sd
>  
>   sdev->sdev_gendev.parent = get_device(&starget->dev);
>   sdev->sdev_target = starget;
> + sdev->scsi_level = starget->scsi_level;

Why is this necessary?  Isn't this set correctly in scsi_probe_lun?  The
target level is actually set from the device level.

Other than this, I'm fine with the code ... you can add the acked by
from me when we resolve the above question.

James


>   /* usually NULL and set by ->slave_alloc instead */
>   sdev->hostdata = hostdata;
> @@ -735,6 +736,15 @@ static int scsi_probe_lun(struct scsi_de
>   sdev->scsi_level++;
>   sdev->sdev_target->scsi_level = sdev->scsi_level;
> + /*
> +  * If SCSI-2 or lower, and if the transport requires it,
> +  * store the LUN value in CDB[1].
> +  */
> + if (sdev->scsi_level <= SC

Re: RES: AS2105-based enclosure size issues with >2TB HDDs

2014-08-25 Thread James Bottomley
On Mon, 2014-08-25 at 18:48 +, Alfredo Dal Ava Junior wrote:
> On Mon, 25 Aug 2014, Alan Stern wrote:
> > 
> > Don't forget that lots of disks go crazy if you try to read from a 
> > nonexistent
> > block, that is, one beyond the end of the disk.
> > IMO, this bug cannot be worked around in any reasonable manner.  The
> > device simply cannot handle disks larger than 2 TB.
> 
> 
> This device works well on Windows 7 if HDD is already partitioned.

So how did the partition get on there at the correct size in the first
place?  Even under windows partition managers believe the disk size they
get from the system if the disk is blank.

>  Sounds like Win7 gnores the READ_CAPACITY value on a partitioned HDD.
> It shows 4TB on disk manager, but will fall back to 1.8TB if I remove
> the partition.

I assume for those of us on linux-scsi who don't have the start of this
thread, you already tried read capacity(16) and it has this same
problem?

> Could we do the same?

Hm, allowing users to set desired capacity by overriding the partition
size ... I'm sure that's not going to cause support problems ...

>  Would be possible to signalize to upper layers that capacity is not
> accurate (or return zero) on this device, and tell GPT handlers to
> bypass it's partition_size vs disk size consistency check?

If we can find a heuristic to set the correct capacity in the kernel,
then we may be able to fix this ... but as Alain says, it looks hard.
We can't ask userspace to hack tools for broken devices.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN

2014-08-25 Thread James Bottomley
On Mon, 2014-08-25 at 10:44 -0400, Alan Stern wrote:

> James, can you explain how the INQUIRY command in scsi_probe_lun()  
> managed to work back in the days when multi-lun SCSI-2 devices were
> common?  sdev->scsi_level doesn't get set when sdev is allocated, so it
> initially contains 0, so the LUN bits won't get filled in when the
> first INQUIRY command is sent.  Then how could the target know which
> logical unit the INQUIRY was meant for?

Best guess, some patches over the course of time altered the way we do
this and no-one noticed.  I think it was probably the introduction of
the unknown SCSI data level that caused the breakage.

Historically, the LUN in CMD bits is left over from SCSI-1; it was
incorporated into SCSI-2 for backward compatibility (even though SCSI-2
moved the LUN specification to the identify message).  In SCSI-3 and
beyond, those bits were obsoleted and transports took sole
responsibility for LUN handling.  I'm fairly certain all the SCSI-1
devices relying on this behaviour have long ago migrated to the great
data centre in the sky.

Alan's fix looks reasonable because we probe LUN 0 first (for SCSI-1 and
2 which has parallel scanning), which is why it doesn't matter (the bits
are set to zero) and once we have LUN 0 we propagate the data to the
target and make it the basis for future checks.  I'd like to see a
comment explaining this in the code, though ...
James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bug 80711] [PATCH]SG_FLAG_LUN_INHIBIT is no longer implemented and there's not way to prevent the kernel from using the 2nd cdb byte for the LUN

2014-08-25 Thread James Bottomley
On Mon, 2014-08-25 at 17:19 -0400, Alan Stern wrote:
> On Mon, 25 Aug 2014, Alan Stern wrote:
> 
> > On Mon, 25 Aug 2014, James Bottomley wrote:
> > 
> > > On Mon, 2014-08-25 at 10:44 -0400, Alan Stern wrote:
> > > 
> > > > James, can you explain how the INQUIRY command in scsi_probe_lun()  
> > > > managed to work back in the days when multi-lun SCSI-2 devices were
> > > > common?  sdev->scsi_level doesn't get set when sdev is allocated, so it
> > > > initially contains 0, so the LUN bits won't get filled in when the
> > > > first INQUIRY command is sent.  Then how could the target know which
> > > > logical unit the INQUIRY was meant for?
> > > 
> > > Best guess, some patches over the course of time altered the way we do
> > > this and no-one noticed.  I think it was probably the introduction of
> > > the unknown SCSI data level that caused the breakage.
> > 
> > Heh.  The change was made by commit 4d7db04a7a69 ([SCSI] add
> > SCSI_UNKNOWN and LUN transfer limit restrictions) back in 2006 -- the
> > 2.6.17 kernel.  If nobody has complained in all this time then it's
> > probably not worth changing.
> 
> It turns out the code is already there, and I didn't realize because I
> was looking at the wrong source file.  scsi_sysfs_device_initialize()
> already does:
> 
>   sdev->scsi_level = starget->scsi_level;
> 
> Here's the update to the patch, adding an appropriate comment and
> setting the new sdev->lun_in_sdb flag properly:

Looks good.  Add your signed-off-by and you can add my acked-by as well.

James

> 
> Index: usb-3.16/drivers/scsi/scsi_sysfs.c
> ===
> --- usb-3.16.orig/drivers/scsi/scsi_sysfs.c
> +++ usb-3.16/drivers/scsi/scsi_sysfs.c
> @@ -1238,7 +1238,19 @@ void scsi_sysfs_device_initialize(struct
>   sdev->sdev_dev.class = &sdev_class;
>   dev_set_name(&sdev->sdev_dev, "%d:%d:%d:%d",
>sdev->host->host_no, sdev->channel, sdev->id, sdev->lun);
> + /*
> +  * Get a default scsi_level from the target (derived from sibling
> +  * devices).  This is the best we can do for guessing how to set
> +  * sdev->lun_in_cdb for the initial INQUIRY command.  For LUN 0 the
> +  * setting doesn't matter, because all the bits are zero anyway.
> +  * But it does matter for higher LUNs.
> +  */
>   sdev->scsi_level = starget->scsi_level;
> + if (sdev->scsi_level <= SCSI_2 &&
> + sdev->scsi_level != SCSI_UNKNOWN &&
> + !shost->no_scsi2_lun_in_cdb)
> + sdev->lun_in_cdb = 1;
> +
>   transport_setup_device(&sdev->sdev_gendev);
>   spin_lock_irqsave(shost->host_lock, flags);
>   list_add_tail(&sdev->same_target_siblings, &starget->devices);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with >2TB HDDs)

2014-08-26 Thread James Bottomley
On Tue, 2014-08-26 at 15:39 -0400, Dale R. Worley wrote:
> This is almost certainly a form of the problem reported in
> "AS2105-based enclosure size issues with >2TB HDDs".  I'm repeating my
> original message here so linux-usb can see it, and so it can be
> connected to the older thread.  I'll address it in another message.

Did you try read capacity 16 on it?  What happened?  (the AS2105 rejects
read capacity 16, so there's no reliable way to deduce the capacity of
drives over 2TB).

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with >2TB HDDs)

2014-08-27 Thread James Bottomley
On Wed, 2014-08-27 at 15:21 -0400, Dale R. Worley wrote:
> > From: James Bottomley 
> 
> > Did you try read capacity 16 on it?  What happened?  (the AS2105 rejects
> > read capacity 16, so there's no reliable way to deduce the capacity of
> > drives over 2TB).
> 
> OK, I had to track down which package contains sg_readcap.
> 
> The adapter that fails gives this output:
> 
> # sg_readcap --16 /dev/sdb
> bad field in READ CAPACITY (16) cdb including unsupported service action

OK, so that's definitely the tame taxonomy.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with >2TB HDDs)

2014-09-03 Thread James Bottomley
On Wed, 2014-09-03 at 15:05 -0400, Alan Stern wrote:
> On Wed, 3 Sep 2014, Dale R. Worley wrote:
> 
> > > From: Alan Stern 
> > > 
> > > On Fri, 29 Aug 2014, Matthew Dharm wrote:
> > > > Is there an 'easy' way to override the detected size of a storage
> > > > device from userspace?  If we had that, someone could write a helper
> > > > application which looked for this particular fubar and try to Do The
> > > > Right Thing(tm), or at least offer the user some options.
> > > 
> > > You mean, force a Media Change event and override the capacity reported 
> > > by the hardware?  I'm not aware of any API for doing that, although it 
> > > probably wouldn't be too hard to add one.
> > > 
> > > How would the user know what value to put in for the capacity?  Unless 
> > > the drive had been hooked up to a different computer and the user 
> > > manually noted the correct capacity and typed it in, it would have to 
> > > be guesswork.
> > 
> > The value would most sanely be extracted from the partition table.
> > (After verifying that the partition table looks correct.)  That seems
> > to be what Windows does, and it seems to work consistently enough for
> > Windows to trust that method.  Or at least, it could take the disk
> > size to be the end of the last partition, which would at least make
> > all the partitions accessible.
> 
> If there is a partition table.  It might be worthwhile to try an ATA 
> pass-through command as well.
> 
> > As somebody else hinted at, the userspace program could check the USB
> > device against a list of device types known to have this problem.
> > 
> > It could even verify that the SCSI-reported size matches the size
> > reported by the partition table (modulo two-to-the-whatever) (at least
> > for GPT tables, I don't know if MBR tables report the disk size).
> 
> They don't.  Just the start and end of each partition.
> 
> > Do we have any way of knowing what algorithm Windows uses in this
> > situation?
> 
> Ask Microsoft.  I suspect you're not likely to get an answer, though.
> 
> Anyway, I can try writing a patch to add this capability.  We'll see if 
> it can solve your problem.

Before we embark on elaborate hacks, why don't we just make the capacity
writeable (by root) in sysfs from userspace (will require block change)?
We can then encode all the nasty heuristics (including gpt reading) in
userspace as a udev rule.

James



--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with >2TB HDDs)

2014-09-03 Thread James Bottomley
On Wed, 2014-09-03 at 16:30 -0400, Alan Stern wrote:
> On Wed, 3 Sep 2014, James Bottomley wrote:
> 
> > Before we embark on elaborate hacks, why don't we just make the capacity
> > writeable (by root) in sysfs from userspace (will require block change)?
> > We can then encode all the nasty heuristics (including gpt reading) in
> > userspace as a udev rule.
> 
> That's what I'm working on.  Except that I don't know where to do it in
> the block layer, so for now I'm adding the attribute to sd.c.
> 
> Where in the block layer would the right place be?  We want this to
> apply only to entire devices, not to individual partitions.

The bottom layer for this is part0.nr_sects which is the size attribute
you see in the block sysfs.  However, it looks like we keep a separate
value in sdkp, but we don't ever seem to use it (except to see if the
capacity has changed).

So this could be done in two ways: add a writeable capacity attribute in
sd.c, as you were originally thinking (it would call set_capacity() on
write and that would update the block layer) or make the size parameter
writeable.

This is how you would do the block layer one below, but there's no way
to inform the lower layer (so it better not have any information cached
that it makes use of).

James

---
diff --git a/block/genhd.c b/block/genhd.c
index 791f419..a114636 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -980,7 +980,7 @@ static DEVICE_ATTR(range, S_IRUGO, disk_range_show, NULL);
 static DEVICE_ATTR(ext_range, S_IRUGO, disk_ext_range_show, NULL);
 static DEVICE_ATTR(removable, S_IRUGO, disk_removable_show, NULL);
 static DEVICE_ATTR(ro, S_IRUGO, disk_ro_show, NULL);
-static DEVICE_ATTR(size, S_IRUGO, part_size_show, NULL);
+static DEVICE_ATTR(size, S_IRUGO|S_IWUSR, part_size_show, part_size_store);
 static DEVICE_ATTR(alignment_offset, S_IRUGO, disk_alignment_offset_show, 
NULL);
 static DEVICE_ATTR(discard_alignment, S_IRUGO, disk_discard_alignment_show,
   NULL);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 789cdea..d0cc418 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -87,6 +87,20 @@ ssize_t part_size_show(struct device *dev,
return sprintf(buf, "%llu\n",(unsigned long long)part_nr_sects_read(p));
 }
 
+ssize_t part_size_store(struct device *dev, struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   struct hd_struct *p = dev_to_part(dev);
+   u64 size;
+
+   if (count > 0 && sscanf(buf, "%llu", &size) > 0)
+   part_nr_sects_write(p, size);
+   else
+   return -EINVAL;
+
+   return count;
+}
+
 static ssize_t part_ro_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ec274e0..c9b3473 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -628,6 +628,9 @@ extern void blk_unregister_region(dev_t devt, unsigned long 
range);
 
 extern ssize_t part_size_show(struct device *dev,
  struct device_attribute *attr, char *buf);
+extern ssize_t part_size_store(struct device *dev,
+  struct device_attribute *attr,
+  const char *buf, size_t count);
 extern ssize_t part_stat_show(struct device *dev,
  struct device_attribute *attr, char *buf);
 extern ssize_t part_inflight_show(struct device *dev,


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >