Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-03 Thread Hans de Goede

Hi,

On 01/03/2014 11:00 PM, Sarah Sharp wrote:

On Fri, Jan 03, 2014 at 09:56:45AM +0100, Hans de Goede wrote:

Hi,

On 01/03/2014 01:45 AM, Sarah Sharp wrote:

Ok, Alan's additional patch fixed the warnings I was seeing on UAS
device unplug.  James, can you Cc me on the finished patch when you send
it in?

Hans, I don't want to send the UAS patches off to Greg until James'
patches get into mainline.  I believe Greg's usb-next tree is frozen at
this point, so they'll have to wait until 3.15.


Ugh, I must say I'm rather unhappy about this, waiting till 3.14 to give
time to shake things out was fine. But since the start of the 3.13 cycle,
there have been no issues found in the xhci / uas code.


I completely understand why you're unhappy.  I agree that the pull
request you sent me on Nov 18th is fine (aside from not being a GPG
signed git tag). [1]  I also agree that the UAS and xHCI driver changes
have been stable during my testing, other than triggering the SCSI oops
on UAS disconnect.


Yes it triggered an existing problem in another subsys, but the code itself
has been issue free all this time. If Greg's tree is indeed already frozen I
would rather have us asking an exception for this.


Detach yourself emotionally from your code and look at this request from
a maintainer's perspective.

You're asking me to push 69 patches to Greg after -rc6 is out, for a
driver that's been marked broken for several kernel releases (uas).  The
patches enable a feature that's been basically untested across all xHCI
host controllers (streams), and they add new userspace API for usbfs to>
expose streams.


Yes 69 patches which have been ready since November 18. Your argument is
that they were not in linux-next before now, my unhappiness comes from
that they could have been in linux-next for some time now already.


In the meantime, there's a big push to get code into linux-next at least
a week or two before the merge window opens.  I sent my last pull request
on Dec 20th, and Felipe closed his USB gadget tree on Dec 26th.  I had
hoped to get the SCSI issue settled so I could send in the UAS patches
on the 20th, but that didn't happen.  My tree is closed.

These fixes should get merged (and will!), but I will not ask Greg for
an exception to get these patches into 3.14.


Alternatively we could add all of it to 3.14 except for the patch removing
the BROKEN marking from uas. Or at least all the non uas bits, which are
useful to have by themselves.


I think the best way to proceed with this is for me to queue all the
xHCI patches in that pull request for usb-next once 3.14-rc1 is out, and
then you send a separate pull request to Greg.  You're going to need to
be able to send him pull requests with a signed git tag in the future
anyway.


Since you have all the patches in your tree now anyways I believe it would
be best if you just send a pull-req for all of them. I see little value
in me sending a separate pull-req for the uas patches. As discussed before
I'm fine with picking up uas maintenance from then on.

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-03 Thread Sarah Sharp
On Fri, Jan 03, 2014 at 09:56:45AM +0100, Hans de Goede wrote:
> Hi,
> 
> On 01/03/2014 01:45 AM, Sarah Sharp wrote:
> >Ok, Alan's additional patch fixed the warnings I was seeing on UAS
> >device unplug.  James, can you Cc me on the finished patch when you send
> >it in?
> >
> >Hans, I don't want to send the UAS patches off to Greg until James'
> >patches get into mainline.  I believe Greg's usb-next tree is frozen at
> >this point, so they'll have to wait until 3.15.
> 
> Ugh, I must say I'm rather unhappy about this, waiting till 3.14 to give
> time to shake things out was fine. But since the start of the 3.13 cycle,
> there have been no issues found in the xhci / uas code.

I completely understand why you're unhappy.  I agree that the pull
request you sent me on Nov 18th is fine (aside from not being a GPG
signed git tag). [1]  I also agree that the UAS and xHCI driver changes
have been stable during my testing, other than triggering the SCSI oops
on UAS disconnect.

> Yes it triggered an existing problem in another subsys, but the code itself
> has been issue free all this time. If Greg's tree is indeed already frozen I
> would rather have us asking an exception for this.

Detach yourself emotionally from your code and look at this request from
a maintainer's perspective.

You're asking me to push 69 patches to Greg after -rc6 is out, for a
driver that's been marked broken for several kernel releases (uas).  The
patches enable a feature that's been basically untested across all xHCI
host controllers (streams), and they add new userspace API for usbfs to
expose streams.

In the meantime, there's a big push to get code into linux-next at least
a week or two before the merge window opens.  I sent my last pull request
on Dec 20th, and Felipe closed his USB gadget tree on Dec 26th.  I had
hoped to get the SCSI issue settled so I could send in the UAS patches
on the 20th, but that didn't happen.  My tree is closed.

These fixes should get merged (and will!), but I will not ask Greg for
an exception to get these patches into 3.14.

> Alternatively we could add all of it to 3.14 except for the patch removing
> the BROKEN marking from uas. Or at least all the non uas bits, which are
> useful to have by themselves.

I think the best way to proceed with this is for me to queue all the
xHCI patches in that pull request for usb-next once 3.14-rc1 is out, and
then you send a separate pull request to Greg.  You're going to need to
be able to send him pull requests with a signed git tag in the future
anyway.

> James, what is the status of getting the fix for the refcount issue into
> mainline ?

(Greg, James will be queuing the SCSI fix for the 3.14 merge window.)

Sarah Sharp

[1] http://marc.info/?l=linux-usb&m=138478555324055&w=2
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-03 Thread James Bottomley
On Fri, 2014-01-03 at 09:56 +0100, Hans de Goede wrote:
> James, what is the status of getting the fix for the refcount issue into
> mainline ?

The plan is what I wrote in the patch intro:

"Since it's a major change to the infrastructure, we'll incubate
upstream first before backporting to stable."

That means with the merge window and delayed stable backport to make
sure it's sound.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-03 Thread Hans de Goede

Hi,

On 01/03/2014 01:45 AM, Sarah Sharp wrote:

On Sat, Dec 21, 2013 at 03:51:25PM -0500, Alan Stern wrote:

On Fri, 20 Dec 2013, Sarah Sharp wrote:


On Thu, Dec 19, 2013 at 11:48:47AM -0800, James Bottomley wrote:

It should apply incrementally on top of the previous two.  If it
actually works, I'll fold it into the first patch.


Well, it doesn't oops anymore, but it does generate a pile of warnings:


This was a simple oversight.  scsi_target_reap() was called at the
start of __scsi_remove_device(), but it should be called at the end.
The patch below, applied on top of James's three patches, will fix the
warnings.

Alan Stern

P.S.: The comment isn't entirely correct any more...


Ok, Alan's additional patch fixed the warnings I was seeing on UAS
device unplug.  James, can you Cc me on the finished patch when you send
it in?

Hans, I don't want to send the UAS patches off to Greg until James'
patches get into mainline.  I believe Greg's usb-next tree is frozen at
this point, so they'll have to wait until 3.15.


Ugh, I must say I'm rather unhappy about this, waiting till 3.14 to give
time to shake things out was fine. But since the start of the 3.13 cycle,
there have been no issues found in the xhci / uas code.

Yes it triggered an existing problem in another subsys, but the code itself
has been issue free all this time. If Greg's tree is indeed already frozen I
would rather have us asking an exception for this.

Alternatively we could add all of it to 3.14 except for the patch removing
the BROKEN marking from uas. Or at least all the non uas bits, which are
useful to have by themselves.

James, what is the status of getting the fix for the refcount issue into
mainline ?

Regards,

Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-02 Thread James Bottomley
On Thu, 2014-01-02 at 16:45 -0800, Sarah Sharp wrote:
> On Sat, Dec 21, 2013 at 03:51:25PM -0500, Alan Stern wrote:
> > On Fri, 20 Dec 2013, Sarah Sharp wrote:
> > 
> > > On Thu, Dec 19, 2013 at 11:48:47AM -0800, James Bottomley wrote:
> > > > It should apply incrementally on top of the previous two.  If it
> > > > actually works, I'll fold it into the first patch.
> > > 
> > > Well, it doesn't oops anymore, but it does generate a pile of warnings:
> > 
> > This was a simple oversight.  scsi_target_reap() was called at the
> > start of __scsi_remove_device(), but it should be called at the end.
> > The patch below, applied on top of James's three patches, will fix the
> > warnings.
> > 
> > Alan Stern
> > 
> > P.S.: The comment isn't entirely correct any more...
> 
> Ok, Alan's additional patch fixed the warnings I was seeing on UAS
> device unplug.  James, can you Cc me on the finished patch when you send
> it in?

A bit late: it's here, but it should be on your usb list as well:

http://marc.info/?l=linux-scsi&m=138870949118304

James

> Hans, I don't want to send the UAS patches off to Greg until James'
> patches get into mainline.  I believe Greg's usb-next tree is frozen at
> this point, so they'll have to wait until 3.15.
> 
> Sarah Sharp
> 
> > Index: usb-3.13/drivers/scsi/scsi_sysfs.c
> > ===
> > --- usb-3.13.orig/drivers/scsi/scsi_sysfs.c
> > +++ usb-3.13/drivers/scsi/scsi_sysfs.c
> > @@ -1028,13 +1028,6 @@ void __scsi_remove_device(struct scsi_de
> >  {
> > struct device *dev = &sdev->sdev_gendev;
> >  
> > -   /*
> > -* Paired with the kref_get() in scsi_sysfs_initialize().  We're
> > -* removing sysfs visibility from the device, so make the target
> > -* invisible if this was the last device underneath it.
> > -*/
> > -   scsi_target_reap(scsi_target(sdev));
> > -
> > if (sdev->is_visible) {
> > if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
> > return;
> > @@ -1059,6 +1052,13 @@ void __scsi_remove_device(struct scsi_de
> > sdev->host->hostt->slave_destroy(sdev);
> > transport_destroy_device(dev);
> >  
> > +   /*
> > +* Paired with the kref_get() in scsi_sysfs_initialize().  We're
> > +* removing sysfs visibility from the device, so make the target
> > +* invisible if this was the last device underneath it.
> > +*/
> > +   scsi_target_reap(scsi_target(sdev));
> > +
> > put_device(dev);
> >  }
> >  
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-02 Thread Sarah Sharp
On Sat, Dec 21, 2013 at 03:51:25PM -0500, Alan Stern wrote:
> On Fri, 20 Dec 2013, Sarah Sharp wrote:
> 
> > On Thu, Dec 19, 2013 at 11:48:47AM -0800, James Bottomley wrote:
> > > It should apply incrementally on top of the previous two.  If it
> > > actually works, I'll fold it into the first patch.
> > 
> > Well, it doesn't oops anymore, but it does generate a pile of warnings:
> 
> This was a simple oversight.  scsi_target_reap() was called at the
> start of __scsi_remove_device(), but it should be called at the end.
> The patch below, applied on top of James's three patches, will fix the
> warnings.
> 
> Alan Stern
> 
> P.S.: The comment isn't entirely correct any more...

Ok, Alan's additional patch fixed the warnings I was seeing on UAS
device unplug.  James, can you Cc me on the finished patch when you send
it in?

Hans, I don't want to send the UAS patches off to Greg until James'
patches get into mainline.  I believe Greg's usb-next tree is frozen at
this point, so they'll have to wait until 3.15.

Sarah Sharp

> Index: usb-3.13/drivers/scsi/scsi_sysfs.c
> ===
> --- usb-3.13.orig/drivers/scsi/scsi_sysfs.c
> +++ usb-3.13/drivers/scsi/scsi_sysfs.c
> @@ -1028,13 +1028,6 @@ void __scsi_remove_device(struct scsi_de
>  {
>   struct device *dev = &sdev->sdev_gendev;
>  
> - /*
> -  * Paired with the kref_get() in scsi_sysfs_initialize().  We're
> -  * removing sysfs visibility from the device, so make the target
> -  * invisible if this was the last device underneath it.
> -  */
> - scsi_target_reap(scsi_target(sdev));
> -
>   if (sdev->is_visible) {
>   if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
>   return;
> @@ -1059,6 +1052,13 @@ void __scsi_remove_device(struct scsi_de
>   sdev->host->hostt->slave_destroy(sdev);
>   transport_destroy_device(dev);
>  
> + /*
> +  * Paired with the kref_get() in scsi_sysfs_initialize().  We're
> +  * removing sysfs visibility from the device, so make the target
> +  * invisible if this was the last device underneath it.
> +  */
> + scsi_target_reap(scsi_target(sdev));
> +
>   put_device(dev);
>  }
>  
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-02 Thread James Bottomley
This set should fix our target problems with USB by making the target
visibility properly reference counted.  Since it's a major change to the
infrastructure, we'll incubate upstream first before backporting to
stable.

James Bottomley (2):
  [SCSI] fix our current target reap infrastructure.
  [SCSI] dual scan thread bug fix

 drivers/scsi/scsi_scan.c   | 108 +
 drivers/scsi/scsi_sysfs.c  |  20 ++---
 include/scsi/scsi_device.h |   3 +-
 3 files changed, 84 insertions(+), 47 deletions(-)

James

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2014-01-02 Thread James Bottomley
On Sat, 2013-12-21 at 15:51 -0500, Alan Stern wrote:
> On Fri, 20 Dec 2013, Sarah Sharp wrote:
> 
> > On Thu, Dec 19, 2013 at 11:48:47AM -0800, James Bottomley wrote:
> > > It should apply incrementally on top of the previous two.  If it
> > > actually works, I'll fold it into the first patch.
> > 
> > Well, it doesn't oops anymore, but it does generate a pile of warnings:
> 
> This was a simple oversight.  scsi_target_reap() was called at the
> start of __scsi_remove_device(), but it should be called at the end.
> The patch below, applied on top of James's three patches, will fix the
> warnings.

Thanks, I'll do a repost of the combined series

> Alan Stern
> 
> P.S.: The comment isn't entirely correct any more...

Yes, fixed the tense.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-21 Thread Alan Stern
On Fri, 20 Dec 2013, Sarah Sharp wrote:

> On Thu, Dec 19, 2013 at 11:48:47AM -0800, James Bottomley wrote:
> > It should apply incrementally on top of the previous two.  If it
> > actually works, I'll fold it into the first patch.
> 
> Well, it doesn't oops anymore, but it does generate a pile of warnings:

This was a simple oversight.  scsi_target_reap() was called at the
start of __scsi_remove_device(), but it should be called at the end.
The patch below, applied on top of James's three patches, will fix the
warnings.

Alan Stern

P.S.: The comment isn't entirely correct any more...



Index: usb-3.13/drivers/scsi/scsi_sysfs.c
===
--- usb-3.13.orig/drivers/scsi/scsi_sysfs.c
+++ usb-3.13/drivers/scsi/scsi_sysfs.c
@@ -1028,13 +1028,6 @@ void __scsi_remove_device(struct scsi_de
 {
struct device *dev = &sdev->sdev_gendev;
 
-   /*
-* Paired with the kref_get() in scsi_sysfs_initialize().  We're
-* removing sysfs visibility from the device, so make the target
-* invisible if this was the last device underneath it.
-*/
-   scsi_target_reap(scsi_target(sdev));
-
if (sdev->is_visible) {
if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0)
return;
@@ -1059,6 +1052,13 @@ void __scsi_remove_device(struct scsi_de
sdev->host->hostt->slave_destroy(sdev);
transport_destroy_device(dev);
 
+   /*
+* Paired with the kref_get() in scsi_sysfs_initialize().  We're
+* removing sysfs visibility from the device, so make the target
+* invisible if this was the last device underneath it.
+*/
+   scsi_target_reap(scsi_target(sdev));
+
put_device(dev);
 }
 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-19 Thread James Bottomley
On Thu, 2013-12-19 at 10:26 -0800, Sarah Sharp wrote:
> On Wed, Dec 18, 2013 at 04:05:05PM -0800, James Bottomley wrote:
> > On Wed, 2013-12-18 at 16:50 -0500, Alan Stern wrote:
> > > On Wed, 18 Dec 2013, Sarah Sharp wrote:
> > > 
> > > > On Mon, Dec 16, 2013 at 07:10:19AM -0800, James Bottomley wrote:
> > > > > This set should fix our target problems with USB by making the target
> > > > > visibility properly reference counted.  Since it's a major change to 
> > > > > the
> > > > > infrastructure, we'll incubate upstream first before backporting to
> > > > > stable.
> > > > > 
> > > > > James
> > > > 
> > > > I tried these patches, and they cause an oops when a USB mass storage
> > > > device is plugged in.  Note that this device uses the usb-storage
> > > > driver, not the uas driver.
> > > > 
> > > > [14248.340064] scsi6 : usb-storage 2-2:1.0
> > > > [14248.341083] usbcore: registered new interface driver usb-storage
> > > > [14248.346211] usbcore: registered new interface driver uas
> > > > [14249.339937] scsi 6:0:0:0: Direct-Access LexarJumpDrive   
> > > >  1.00 PQ: 0 ANSI: 6
> > > > [14249.340988] [ cut here ]
> > > > [14249.340999] WARNING: CPU: 3 PID: 5578 at lib/kobject.c:227 
> > > > kobject_add_internal+0x13f/0x350()
> > > > [14249.341003] kobject_add_internal failed for 6:0:0:0 (error: -2 
> > > > parent: target6:0:0)
> > > > [14249.341005] Modules linked in: uas usb_storage ctr ccm cuse dm_crypt 
> > > > uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev 
> > > > btusb x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel 
> > > > aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd arc4 iwldvm 
> > > > mac80211 microcode snd_hda_codec_hdmi iwlwifi psmouse 
> > > > snd_hda_codec_realtek serio_raw snd_hda_intel snd_usb_audio 
> > > > snd_hda_codec thinkpad_acpi cfg80211 joydev snd_usbmidi_lib nvram 
> > > > snd_hwdep snd_seq_midi snd_pcm snd_seq_midi_event snd_rawmidi lpc_ich 
> > > > rfcomm bnep snd_seq bluetooth snd_seq_device snd_page_alloc ehci_pci 
> > > > snd_timer ehci_hcd snd soundcore tpm_tis binfmt_misc btrfs libcrc32c 
> > > > xor raid6_pq hid_generic usbhid hid i915 ahci libahci e1000e sdhci_pci 
> > > > sdhci i2c_algo_bit drm_kms_helper ptp pps_core drm xhci_hcd video
> > > > [14249.341095] CPU: 3 PID: 5578 Comm: kworker/u16:0 Not tainted 
> > > > 3.13.0-rc1+ #142
> > > > [14249.341098] Hardware name: LENOVO 2325AP7/2325AP7, BIOS G2ET82WW 
> > > > (2.02 ) 09/11/2012
> > > > [14249.341105] Workqueue: events_unbound async_run_entry_fn
> > > > [14249.341108]  0009 88003aa9db60 81658a4e 
> > > > 88003aa9dba8
> > > > [14249.341115]  88003aa9db98 81048c3d 88006bc551b0 
> > > > fffe
> > > > [14249.341121]   8800bec22838 0200 
> > > > 88003aa9dbf8
> > > > [14249.341127] Call Trace:
> > > > [14249.341135]  [] dump_stack+0x4d/0x66
> > > > [14249.341142]  [] warn_slowpath_common+0x7d/0xa0
> > > > [14249.341148]  [] warn_slowpath_fmt+0x4c/0x50
> > > > [14249.341154]  [] ? _raw_spin_unlock+0x27/0x40
> > > > [14249.341159]  [] kobject_add_internal+0x13f/0x350
> > > > [14249.341163]  [] kobject_add+0x65/0xb0
> > > > [14249.341170]  [] ? get_device+0x30/0x30
> > > > [14249.341175]  [] ? klist_init+0x31/0x40
> > > > [14249.341181]  [] device_add+0x128/0x660
> > > > [14249.341186]  [] ? __pm_runtime_resume+0x5c/0x90
> > > > [14249.341193]  [] scsi_sysfs_add_sdev+0xac/0x340
> > > > [14249.341199]  [] do_scan_async+0x83/0x180
> > > > [14249.341204]  [] async_run_entry_fn+0x37/0x130
> > > > [14249.341210]  [] process_one_work+0x1f4/0x550
> > > > [14249.341215]  [] ? process_one_work+0x192/0x550
> > > > [14249.341220]  [] worker_thread+0x121/0x3a0
> > > > [14249.341225]  [] ? 
> > > > manage_workers.isra.22+0x2a0/0x2a0
> > > > [14249.341231]  [] kthread+0xfc/0x120
> > > > [14249.341238]  [] ? 
> > > > kthread_create_on_node+0x230/0x230
> > > > [14249.341243]  [] ret_from_fork+0x7c/0xb0
> > > > [14249.341249]  [] ? 
> > > > kthread_create_on_node+0x230/0x230
> > > > [14249.341253] ---[ end trace 7f1d8a449e6af5aa ]---
> > > > [14249.341259] scsi 6:0:0:0: failed to add device: -2
> > > 
> > > James:
> > > 
> > > The problem occurs when scsi_finish_async_scan() calls 
> > > scsi_sysfs_add_devices().
> > > 
> > > During an async scan, the devices get stored up and not made visible as
> > > they are found (see the end of scsi_add_lun()).  At the end, the target
> > > gets removed because it has no visible children, of course.  Then when
> > > the children do get added all at once, when the scan is over, it's too
> > > late.
> > > 
> > > How should this be fixed?  Forget about the en-masse registration and 
> > > do each device as it is found?
> > 
> > Great, I knew I'd find a reason to hate the async scanning code
> > eventually.
> > 
> > However, the solution is just to make the kref work for us.  We already
> > properly refcount everything, so we just ta

Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-19 Thread Sarah Sharp
On Wed, Dec 18, 2013 at 04:05:05PM -0800, James Bottomley wrote:
> On Wed, 2013-12-18 at 16:50 -0500, Alan Stern wrote:
> > On Wed, 18 Dec 2013, Sarah Sharp wrote:
> > 
> > > On Mon, Dec 16, 2013 at 07:10:19AM -0800, James Bottomley wrote:
> > > > This set should fix our target problems with USB by making the target
> > > > visibility properly reference counted.  Since it's a major change to the
> > > > infrastructure, we'll incubate upstream first before backporting to
> > > > stable.
> > > > 
> > > > James
> > > 
> > > I tried these patches, and they cause an oops when a USB mass storage
> > > device is plugged in.  Note that this device uses the usb-storage
> > > driver, not the uas driver.
> > > 
> > > [14248.340064] scsi6 : usb-storage 2-2:1.0
> > > [14248.341083] usbcore: registered new interface driver usb-storage
> > > [14248.346211] usbcore: registered new interface driver uas
> > > [14249.339937] scsi 6:0:0:0: Direct-Access LexarJumpDrive
> > > 1.00 PQ: 0 ANSI: 6
> > > [14249.340988] [ cut here ]
> > > [14249.340999] WARNING: CPU: 3 PID: 5578 at lib/kobject.c:227 
> > > kobject_add_internal+0x13f/0x350()
> > > [14249.341003] kobject_add_internal failed for 6:0:0:0 (error: -2 parent: 
> > > target6:0:0)
> > > [14249.341005] Modules linked in: uas usb_storage ctr ccm cuse dm_crypt 
> > > uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev btusb 
> > > x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel aes_x86_64 
> > > lrw gf128mul glue_helper ablk_helper cryptd arc4 iwldvm mac80211 
> > > microcode snd_hda_codec_hdmi iwlwifi psmouse snd_hda_codec_realtek 
> > > serio_raw snd_hda_intel snd_usb_audio snd_hda_codec thinkpad_acpi 
> > > cfg80211 joydev snd_usbmidi_lib nvram snd_hwdep snd_seq_midi snd_pcm 
> > > snd_seq_midi_event snd_rawmidi lpc_ich rfcomm bnep snd_seq bluetooth 
> > > snd_seq_device snd_page_alloc ehci_pci snd_timer ehci_hcd snd soundcore 
> > > tpm_tis binfmt_misc btrfs libcrc32c xor raid6_pq hid_generic usbhid hid 
> > > i915 ahci libahci e1000e sdhci_pci sdhci i2c_algo_bit drm_kms_helper ptp 
> > > pps_core drm xhci_hcd video
> > > [14249.341095] CPU: 3 PID: 5578 Comm: kworker/u16:0 Not tainted 
> > > 3.13.0-rc1+ #142
> > > [14249.341098] Hardware name: LENOVO 2325AP7/2325AP7, BIOS G2ET82WW (2.02 
> > > ) 09/11/2012
> > > [14249.341105] Workqueue: events_unbound async_run_entry_fn
> > > [14249.341108]  0009 88003aa9db60 81658a4e 
> > > 88003aa9dba8
> > > [14249.341115]  88003aa9db98 81048c3d 88006bc551b0 
> > > fffe
> > > [14249.341121]   8800bec22838 0200 
> > > 88003aa9dbf8
> > > [14249.341127] Call Trace:
> > > [14249.341135]  [] dump_stack+0x4d/0x66
> > > [14249.341142]  [] warn_slowpath_common+0x7d/0xa0
> > > [14249.341148]  [] warn_slowpath_fmt+0x4c/0x50
> > > [14249.341154]  [] ? _raw_spin_unlock+0x27/0x40
> > > [14249.341159]  [] kobject_add_internal+0x13f/0x350
> > > [14249.341163]  [] kobject_add+0x65/0xb0
> > > [14249.341170]  [] ? get_device+0x30/0x30
> > > [14249.341175]  [] ? klist_init+0x31/0x40
> > > [14249.341181]  [] device_add+0x128/0x660
> > > [14249.341186]  [] ? __pm_runtime_resume+0x5c/0x90
> > > [14249.341193]  [] scsi_sysfs_add_sdev+0xac/0x340
> > > [14249.341199]  [] do_scan_async+0x83/0x180
> > > [14249.341204]  [] async_run_entry_fn+0x37/0x130
> > > [14249.341210]  [] process_one_work+0x1f4/0x550
> > > [14249.341215]  [] ? process_one_work+0x192/0x550
> > > [14249.341220]  [] worker_thread+0x121/0x3a0
> > > [14249.341225]  [] ? manage_workers.isra.22+0x2a0/0x2a0
> > > [14249.341231]  [] kthread+0xfc/0x120
> > > [14249.341238]  [] ? kthread_create_on_node+0x230/0x230
> > > [14249.341243]  [] ret_from_fork+0x7c/0xb0
> > > [14249.341249]  [] ? kthread_create_on_node+0x230/0x230
> > > [14249.341253] ---[ end trace 7f1d8a449e6af5aa ]---
> > > [14249.341259] scsi 6:0:0:0: failed to add device: -2
> > 
> > James:
> > 
> > The problem occurs when scsi_finish_async_scan() calls 
> > scsi_sysfs_add_devices().
> > 
> > During an async scan, the devices get stored up and not made visible as
> > they are found (see the end of scsi_add_lun()).  At the end, the target
> > gets removed because it has no visible children, of course.  Then when
> > the children do get added all at once, when the scan is over, it's too
> > late.
> > 
> > How should this be fixed?  Forget about the en-masse registration and 
> > do each device as it is found?
> 
> Great, I knew I'd find a reason to hate the async scanning code
> eventually.
> 
> However, the solution is just to make the kref work for us.  We already
> properly refcount everything, so we just take the reap_ref on the target
> at the point the disk has to go through the remove device path, then
> just rely on refcounting ... a bit like this.

Do you want me to test this?  If so, should I apply it on top of the
previous patches, or separately?

Sarah Sharp

> ---
> 
> d

Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-18 Thread James Bottomley
On Wed, 2013-12-18 at 16:50 -0500, Alan Stern wrote:
> On Wed, 18 Dec 2013, Sarah Sharp wrote:
> 
> > On Mon, Dec 16, 2013 at 07:10:19AM -0800, James Bottomley wrote:
> > > This set should fix our target problems with USB by making the target
> > > visibility properly reference counted.  Since it's a major change to the
> > > infrastructure, we'll incubate upstream first before backporting to
> > > stable.
> > > 
> > > James
> > 
> > I tried these patches, and they cause an oops when a USB mass storage
> > device is plugged in.  Note that this device uses the usb-storage
> > driver, not the uas driver.
> > 
> > [14248.340064] scsi6 : usb-storage 2-2:1.0
> > [14248.341083] usbcore: registered new interface driver usb-storage
> > [14248.346211] usbcore: registered new interface driver uas
> > [14249.339937] scsi 6:0:0:0: Direct-Access LexarJumpDrive
> > 1.00 PQ: 0 ANSI: 6
> > [14249.340988] [ cut here ]
> > [14249.340999] WARNING: CPU: 3 PID: 5578 at lib/kobject.c:227 
> > kobject_add_internal+0x13f/0x350()
> > [14249.341003] kobject_add_internal failed for 6:0:0:0 (error: -2 parent: 
> > target6:0:0)
> > [14249.341005] Modules linked in: uas usb_storage ctr ccm cuse dm_crypt 
> > uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev btusb 
> > x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel aes_x86_64 
> > lrw gf128mul glue_helper ablk_helper cryptd arc4 iwldvm mac80211 microcode 
> > snd_hda_codec_hdmi iwlwifi psmouse snd_hda_codec_realtek serio_raw 
> > snd_hda_intel snd_usb_audio snd_hda_codec thinkpad_acpi cfg80211 joydev 
> > snd_usbmidi_lib nvram snd_hwdep snd_seq_midi snd_pcm snd_seq_midi_event 
> > snd_rawmidi lpc_ich rfcomm bnep snd_seq bluetooth snd_seq_device 
> > snd_page_alloc ehci_pci snd_timer ehci_hcd snd soundcore tpm_tis 
> > binfmt_misc btrfs libcrc32c xor raid6_pq hid_generic usbhid hid i915 ahci 
> > libahci e1000e sdhci_pci sdhci i2c_algo_bit drm_kms_helper ptp pps_core drm 
> > xhci_hcd video
> > [14249.341095] CPU: 3 PID: 5578 Comm: kworker/u16:0 Not tainted 3.13.0-rc1+ 
> > #142
> > [14249.341098] Hardware name: LENOVO 2325AP7/2325AP7, BIOS G2ET82WW (2.02 ) 
> > 09/11/2012
> > [14249.341105] Workqueue: events_unbound async_run_entry_fn
> > [14249.341108]  0009 88003aa9db60 81658a4e 
> > 88003aa9dba8
> > [14249.341115]  88003aa9db98 81048c3d 88006bc551b0 
> > fffe
> > [14249.341121]   8800bec22838 0200 
> > 88003aa9dbf8
> > [14249.341127] Call Trace:
> > [14249.341135]  [] dump_stack+0x4d/0x66
> > [14249.341142]  [] warn_slowpath_common+0x7d/0xa0
> > [14249.341148]  [] warn_slowpath_fmt+0x4c/0x50
> > [14249.341154]  [] ? _raw_spin_unlock+0x27/0x40
> > [14249.341159]  [] kobject_add_internal+0x13f/0x350
> > [14249.341163]  [] kobject_add+0x65/0xb0
> > [14249.341170]  [] ? get_device+0x30/0x30
> > [14249.341175]  [] ? klist_init+0x31/0x40
> > [14249.341181]  [] device_add+0x128/0x660
> > [14249.341186]  [] ? __pm_runtime_resume+0x5c/0x90
> > [14249.341193]  [] scsi_sysfs_add_sdev+0xac/0x340
> > [14249.341199]  [] do_scan_async+0x83/0x180
> > [14249.341204]  [] async_run_entry_fn+0x37/0x130
> > [14249.341210]  [] process_one_work+0x1f4/0x550
> > [14249.341215]  [] ? process_one_work+0x192/0x550
> > [14249.341220]  [] worker_thread+0x121/0x3a0
> > [14249.341225]  [] ? manage_workers.isra.22+0x2a0/0x2a0
> > [14249.341231]  [] kthread+0xfc/0x120
> > [14249.341238]  [] ? kthread_create_on_node+0x230/0x230
> > [14249.341243]  [] ret_from_fork+0x7c/0xb0
> > [14249.341249]  [] ? kthread_create_on_node+0x230/0x230
> > [14249.341253] ---[ end trace 7f1d8a449e6af5aa ]---
> > [14249.341259] scsi 6:0:0:0: failed to add device: -2
> 
> James:
> 
> The problem occurs when scsi_finish_async_scan() calls 
> scsi_sysfs_add_devices().
> 
> During an async scan, the devices get stored up and not made visible as
> they are found (see the end of scsi_add_lun()).  At the end, the target
> gets removed because it has no visible children, of course.  Then when
> the children do get added all at once, when the scan is over, it's too
> late.
> 
> How should this be fixed?  Forget about the en-masse registration and 
> do each device as it is found?

Great, I knew I'd find a reason to hate the async scanning code
eventually.

However, the solution is just to make the kref work for us.  We already
properly refcount everything, so we just take the reap_ref on the target
at the point the disk has to go through the remove device path, then
just rely on refcounting ... a bit like this.

James

---

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 34eab7b..cc6e5bd 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -996,7 +996,6 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
return error;
}
transport_add_device(&sdev->sdev_gendev);
-   kref_get(&starget->reap_ref); /* device 

Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-18 Thread Alan Stern
On Wed, 18 Dec 2013, Sarah Sharp wrote:

> On Mon, Dec 16, 2013 at 07:10:19AM -0800, James Bottomley wrote:
> > This set should fix our target problems with USB by making the target
> > visibility properly reference counted.  Since it's a major change to the
> > infrastructure, we'll incubate upstream first before backporting to
> > stable.
> > 
> > James
> 
> I tried these patches, and they cause an oops when a USB mass storage
> device is plugged in.  Note that this device uses the usb-storage
> driver, not the uas driver.
> 
> [14248.340064] scsi6 : usb-storage 2-2:1.0
> [14248.341083] usbcore: registered new interface driver usb-storage
> [14248.346211] usbcore: registered new interface driver uas
> [14249.339937] scsi 6:0:0:0: Direct-Access LexarJumpDrive1.00 
> PQ: 0 ANSI: 6
> [14249.340988] [ cut here ]
> [14249.340999] WARNING: CPU: 3 PID: 5578 at lib/kobject.c:227 
> kobject_add_internal+0x13f/0x350()
> [14249.341003] kobject_add_internal failed for 6:0:0:0 (error: -2 parent: 
> target6:0:0)
> [14249.341005] Modules linked in: uas usb_storage ctr ccm cuse dm_crypt 
> uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev btusb 
> x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel aes_x86_64 lrw 
> gf128mul glue_helper ablk_helper cryptd arc4 iwldvm mac80211 microcode 
> snd_hda_codec_hdmi iwlwifi psmouse snd_hda_codec_realtek serio_raw 
> snd_hda_intel snd_usb_audio snd_hda_codec thinkpad_acpi cfg80211 joydev 
> snd_usbmidi_lib nvram snd_hwdep snd_seq_midi snd_pcm snd_seq_midi_event 
> snd_rawmidi lpc_ich rfcomm bnep snd_seq bluetooth snd_seq_device 
> snd_page_alloc ehci_pci snd_timer ehci_hcd snd soundcore tpm_tis binfmt_misc 
> btrfs libcrc32c xor raid6_pq hid_generic usbhid hid i915 ahci libahci e1000e 
> sdhci_pci sdhci i2c_algo_bit drm_kms_helper ptp pps_core drm xhci_hcd video
> [14249.341095] CPU: 3 PID: 5578 Comm: kworker/u16:0 Not tainted 3.13.0-rc1+ 
> #142
> [14249.341098] Hardware name: LENOVO 2325AP7/2325AP7, BIOS G2ET82WW (2.02 ) 
> 09/11/2012
> [14249.341105] Workqueue: events_unbound async_run_entry_fn
> [14249.341108]  0009 88003aa9db60 81658a4e 
> 88003aa9dba8
> [14249.341115]  88003aa9db98 81048c3d 88006bc551b0 
> fffe
> [14249.341121]   8800bec22838 0200 
> 88003aa9dbf8
> [14249.341127] Call Trace:
> [14249.341135]  [] dump_stack+0x4d/0x66
> [14249.341142]  [] warn_slowpath_common+0x7d/0xa0
> [14249.341148]  [] warn_slowpath_fmt+0x4c/0x50
> [14249.341154]  [] ? _raw_spin_unlock+0x27/0x40
> [14249.341159]  [] kobject_add_internal+0x13f/0x350
> [14249.341163]  [] kobject_add+0x65/0xb0
> [14249.341170]  [] ? get_device+0x30/0x30
> [14249.341175]  [] ? klist_init+0x31/0x40
> [14249.341181]  [] device_add+0x128/0x660
> [14249.341186]  [] ? __pm_runtime_resume+0x5c/0x90
> [14249.341193]  [] scsi_sysfs_add_sdev+0xac/0x340
> [14249.341199]  [] do_scan_async+0x83/0x180
> [14249.341204]  [] async_run_entry_fn+0x37/0x130
> [14249.341210]  [] process_one_work+0x1f4/0x550
> [14249.341215]  [] ? process_one_work+0x192/0x550
> [14249.341220]  [] worker_thread+0x121/0x3a0
> [14249.341225]  [] ? manage_workers.isra.22+0x2a0/0x2a0
> [14249.341231]  [] kthread+0xfc/0x120
> [14249.341238]  [] ? kthread_create_on_node+0x230/0x230
> [14249.341243]  [] ret_from_fork+0x7c/0xb0
> [14249.341249]  [] ? kthread_create_on_node+0x230/0x230
> [14249.341253] ---[ end trace 7f1d8a449e6af5aa ]---
> [14249.341259] scsi 6:0:0:0: failed to add device: -2

James:

The problem occurs when scsi_finish_async_scan() calls 
scsi_sysfs_add_devices().

During an async scan, the devices get stored up and not made visible as
they are found (see the end of scsi_add_lun()).  At the end, the target
gets removed because it has no visible children, of course.  Then when
the children do get added all at once, when the scan is over, it's too
late.

How should this be fixed?  Forget about the en-masse registration and 
do each device as it is found?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-18 Thread Sarah Sharp
On Mon, Dec 16, 2013 at 07:10:19AM -0800, James Bottomley wrote:
> This set should fix our target problems with USB by making the target
> visibility properly reference counted.  Since it's a major change to the
> infrastructure, we'll incubate upstream first before backporting to
> stable.
> 
> James

I tried these patches, and they cause an oops when a USB mass storage
device is plugged in.  Note that this device uses the usb-storage
driver, not the uas driver.

[14248.340064] scsi6 : usb-storage 2-2:1.0
[14248.341083] usbcore: registered new interface driver usb-storage
[14248.346211] usbcore: registered new interface driver uas
[14249.339937] scsi 6:0:0:0: Direct-Access LexarJumpDrive1.00 
PQ: 0 ANSI: 6
[14249.340988] [ cut here ]
[14249.340999] WARNING: CPU: 3 PID: 5578 at lib/kobject.c:227 
kobject_add_internal+0x13f/0x350()
[14249.341003] kobject_add_internal failed for 6:0:0:0 (error: -2 parent: 
target6:0:0)
[14249.341005] Modules linked in: uas usb_storage ctr ccm cuse dm_crypt 
uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev btusb 
x86_pkg_temp_thermal coretemp ghash_clmulni_intel aesni_intel aes_x86_64 lrw 
gf128mul glue_helper ablk_helper cryptd arc4 iwldvm mac80211 microcode 
snd_hda_codec_hdmi iwlwifi psmouse snd_hda_codec_realtek serio_raw 
snd_hda_intel snd_usb_audio snd_hda_codec thinkpad_acpi cfg80211 joydev 
snd_usbmidi_lib nvram snd_hwdep snd_seq_midi snd_pcm snd_seq_midi_event 
snd_rawmidi lpc_ich rfcomm bnep snd_seq bluetooth snd_seq_device snd_page_alloc 
ehci_pci snd_timer ehci_hcd snd soundcore tpm_tis binfmt_misc btrfs libcrc32c 
xor raid6_pq hid_generic usbhid hid i915 ahci libahci e1000e sdhci_pci sdhci 
i2c_algo_bit drm_kms_helper ptp pps_core drm xhci_hcd video
[14249.341095] CPU: 3 PID: 5578 Comm: kworker/u16:0 Not tainted 3.13.0-rc1+ #142
[14249.341098] Hardware name: LENOVO 2325AP7/2325AP7, BIOS G2ET82WW (2.02 ) 
09/11/2012
[14249.341105] Workqueue: events_unbound async_run_entry_fn
[14249.341108]  0009 88003aa9db60 81658a4e 
88003aa9dba8
[14249.341115]  88003aa9db98 81048c3d 88006bc551b0 
fffe
[14249.341121]   8800bec22838 0200 
88003aa9dbf8
[14249.341127] Call Trace:
[14249.341135]  [] dump_stack+0x4d/0x66
[14249.341142]  [] warn_slowpath_common+0x7d/0xa0
[14249.341148]  [] warn_slowpath_fmt+0x4c/0x50
[14249.341154]  [] ? _raw_spin_unlock+0x27/0x40
[14249.341159]  [] kobject_add_internal+0x13f/0x350
[14249.341163]  [] kobject_add+0x65/0xb0
[14249.341170]  [] ? get_device+0x30/0x30
[14249.341175]  [] ? klist_init+0x31/0x40
[14249.341181]  [] device_add+0x128/0x660
[14249.341186]  [] ? __pm_runtime_resume+0x5c/0x90
[14249.341193]  [] scsi_sysfs_add_sdev+0xac/0x340
[14249.341199]  [] do_scan_async+0x83/0x180
[14249.341204]  [] async_run_entry_fn+0x37/0x130
[14249.341210]  [] process_one_work+0x1f4/0x550
[14249.341215]  [] ? process_one_work+0x192/0x550
[14249.341220]  [] worker_thread+0x121/0x3a0
[14249.341225]  [] ? manage_workers.isra.22+0x2a0/0x2a0
[14249.341231]  [] kthread+0xfc/0x120
[14249.341238]  [] ? kthread_create_on_node+0x230/0x230
[14249.341243]  [] ret_from_fork+0x7c/0xb0
[14249.341249]  [] ? kthread_create_on_node+0x230/0x230
[14249.341253] ---[ end trace 7f1d8a449e6af5aa ]---
[14249.341259] scsi 6:0:0:0: failed to add device: -2

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 0/2] target refcounting infrastructure fixes for usb

2013-12-16 Thread James Bottomley
This set should fix our target problems with USB by making the target
visibility properly reference counted.  Since it's a major change to the
infrastructure, we'll incubate upstream first before backporting to
stable.

James

---

James Bottomley (2):
  [SCSI] fix our current target reap infrastructure.
  [SCSI] dual scan thread bug fix

 drivers/scsi/scsi_scan.c   | 102 -
 drivers/scsi/scsi_sysfs.c  |  15 ---
 include/scsi/scsi_device.h |   3 +-
 3 files changed, 74 insertions(+), 46 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html