Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-16 Thread Alan Stern
On Wed, 16 Jan 2013, Tejun Heo wrote: > Hello, Alan. > > On Wed, Jan 16, 2013 at 12:01:53PM -0500, Alan Stern wrote: > > > The problem here is that "flush everything which comes before me" is > > > used to order async jobs. e.g. after async jobs probe the hardware > > > they order themselves by

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-16 Thread Tejun Heo
Hello, Alan. On Wed, Jan 16, 2013 at 12:01:53PM -0500, Alan Stern wrote: > > The problem here is that "flush everything which comes before me" is > > used to order async jobs. e.g. after async jobs probe the hardware > > they order themselves by flushing before registering them, so unless > > I

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-16 Thread Alan Stern
On Wed, 16 Jan 2013, Tejun Heo wrote: > Hello, Alan. > > On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote: > > > The current domain implementation is somewhere in between. It's not > > > a completely simplistic system and at the same time not developed enough > > > to do properly stacked

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-16 Thread Tejun Heo
Hello, Alan. On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote: > > The current domain implementation is somewhere in between. It's not > > a completely simplistic system and at the same time not developed enough > > to do properly stacked flushing. > > I like your idea of chronological sy

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 7:05 PM, Ming Lei wrote: > > So looks only sd.c and floppy.c are to be synchronized suppose > some sync interfaces are introduced, doesn't it? What about ata_host_register() (usually called through ata_host_activate())? I don't understand why you continue to push for some

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Alan Stern
On Tue, 15 Jan 2013, Tejun Heo wrote: > Hello, Arjan. > > On Tue, Jan 15, 2013 at 04:25:54PM -0800, Arjan van de Ven wrote: > > async fundamentally had the concept of a monotonic increasing number, > > and that you could always wait for "everyone before me". > > then people (like me) wanted excep

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Ming Lei
On Wed, Jan 16, 2013 at 1:36 AM, Linus Torvalds wrote: > > Because it's not just sd.c that uses async_schedule(), and would need > the async synchronize. It's floppy.c, it's generic scsi scanning (so > scsi tapes etc), and it's libata-core.c. As discussed previously, only the module which will po

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Tejun Heo
On Tue, Jan 15, 2013 at 04:36:34PM -0800, Linus Torvalds wrote: > The thing is, the module loading in particular is not necessarily > happening in the same context as what *started* the module loading. A > module loader will request the module from user space, and then later > user space - through

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 4:36 PM, Linus Torvalds wrote: > > There's a reason I asked for a warning for this. Or the "let's flag > the current thread if it ever started anything asynchronous". Because > it's complicated. Btw, the sequence counter (that is *not* taking anything else into account) is

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 3:50 PM, Tejun Heo wrote: > > For now, I'm gonna implement simple "I'm not gonna wait for myself" > self-deadlock avoidance. You can't really do that. Or rather, it won't *help*. The thing is, the module loading in particular is not necessarily happening in the same conte

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Tejun Heo
Hello, Arjan. On Tue, Jan 15, 2013 at 04:25:54PM -0800, Arjan van de Ven wrote: > async fundamentally had the concept of a monotonic increasing number, > and that you could always wait for "everyone before me". > then people (like me) wanted exceptions to what "everyone" means ;-( > I'm ok with go

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Arjan van de Ven
For now, I'm gonna implement simple "I'm not gonna wait for myself" self-deadlock avoidance. If this needs any more sophistication, I think we better reimplement it so that we can explicitly match up and track who's gonna wait for what instead of throwing everything into a single cookie space a

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Tejun Heo
cc'ing Arjan. Arjan, the original thread can be read from http://thread.gmane.org/gmane.linux.kernel/1420814 Hello, again. On Tue, Jan 15, 2013 at 12:18:01PM -0800, Linus Torvalds wrote: > I think that is a good solution if it works, but look out: we need to > synchronize across *all* domains

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Tejun Heo
Hello, Linus Will continue on another reply but this one is relevant so... On Tue, Jan 15, 2013 at 10:18:45AM -0800, Linus Torvalds wrote: > Tejun, is there a good way for code to see "I'm running in async > context"? Then we could do something like Almost. With a bit of modification we can ask

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 10:32 AM, Tejun Heo wrote: > > I think the root problem here, apart from request_module() from block > - which is a bit nasty but making that part completely async would too > be quite nasty albeit in a different way - is that > async_synchronize_full() is way too indescrim

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Tejun Heo
Hello, Alan. On Tue, Jan 15, 2013 at 01:20:58PM -0500, Alan Stern wrote: > It may not be so easy. When the SCSI async thread probes the new disk, > it has to do I/O. So it needs to use a scheduler. > > But maybe it could use a built-in trivial scheduler until the proper > one is loaded. Then

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Tejun Heo
Hello, Linus. On Tue, Jan 15, 2013 at 09:36:57AM -0800, Linus Torvalds wrote: > Tejun, comments? You can see the whole thread on lkml, but the basic > problem is that the module loading doing the unconditional > async_synchronize_full() has caused problems, because we have > > - load module A >

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Alan Stern
On Tue, 15 Jan 2013, Linus Torvalds wrote: > Tejun, comments? You can see the whole thread on lkml, but the basic > problem is that the module loading doing the unconditional > async_synchronize_full() has caused problems, because we have > > - load module A > - module A does per-controller a

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Linus Torvalds
On Tue, Jan 15, 2013 at 9:36 AM, Linus Torvalds wrote: > > This kind of "let's randomly encourage people to write subtly buggy > code that has magical timing dependencies, so that the developer won't > likely even see it because he has fast disks etc" code is totally > unacceptable. And this code

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-15 Thread Linus Torvalds
[ Added Tejun to the discussion, since he's the async go-to-guy ] On Mon, Jan 14, 2013 at 10:23 PM, Ming Lei wrote: > > But I have another idea to address the problem, and let module code call > async_synchronize_full() only if the module requires that explicitly, so how > about the below draft p

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Ming Lei
On Tue, Jan 15, 2013 at 9:53 AM, Ming Lei wrote: > > I will try to figure out one patch to address the scsi block async probe > issue first, and see if it can fix the problem by moving add_disk() > into sd_probe() > and calling async_synchronize_full_domain(&scsi_sd_probe_domain) > in the entry of

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Ming Lei
On Tue, Jan 15, 2013 at 1:30 AM, Linus Torvalds wrote: > On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei wrote: >> >> The deadlock problem is caused by calling request_module() inside >> async function of do_scan_async(), and it was introduced by Linus's >> below commit: >> >> commit d6de2c80e9d758d2e

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Linus Torvalds
On Mon, Jan 14, 2013 at 10:04 AM, Alan Stern wrote: > > How about skipping that call if the current thread is one of the async > helpers? Is it possible to detect when that happens? > > Or maybe such a check should go inside async_synchronize_full() itself. Do we have some idea of exactly what i

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Alan Stern
On Mon, 14 Jan 2013, Linus Torvalds wrote: > > - from view of driver, introducing async_synchronize_full() after > > do_one_initcall() inside do_init_module() is like a sync probe > > for drivers built as module, and cause this kind of deadlock easily. > > > > So could we revert the commit and fix

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Linus Torvalds
On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei wrote: > > The deadlock problem is caused by calling request_module() inside > async function of do_scan_async(), and it was introduced by Linus's > below commit: > > commit d6de2c80e9d758d2e36c21699117db6178c0f517 > Author: Linus Torvalds > Date: Fri

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Alex Riesen
On Mon, Jan 14, 2013 at 3:39 AM, Alan Stern wrote: > On Sun, 13 Jan 2013, Oliver Neukum wrote: >> This is not a USB problem. You need to involve the SCSI people. >> khubd just stops working because disconnects are processed >> in its context and the removal deadlocks. > > Then why would building t

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Ming Lei
On Mon, Jan 14, 2013 at 4:22 PM, Oliver Neukum wrote: > > OK, your trace is totally different. If your hangs are related, as is likely, > my explanation goes out of the window. If I run 'shutdown' after unplugging the USB storage device, another hang with the same trace as Alex's can be triggered too, so it

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-14 Thread Oliver Neukum
On Monday 14 January 2013 11:47:57 Ming Lei wrote: > [ 181.175323] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 181.183624] modprobe D c04f1920 0 2462 2461 0x > [ 181.183685] [] (__schedule+0x5fc/0x6d4) from [] > (async_synchronize_cookie_

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-13 Thread Ming Lei
On Mon, Jan 14, 2013 at 11:47 AM, Ming Lei wrote: > On Mon, Jan 14, 2013 at 1:42 AM, Alex Riesen wrote: > [ 86.901367] io scheduler deadline registered (default) > [ 181.168487] INFO: task modprobe:2462 blocked for more than 90 seconds. > [ 181.175323] "echo 0 > /proc/sys/kernel/hung_task_tim

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-13 Thread Ming Lei
On Mon, Jan 14, 2013 at 1:42 AM, Alex Riesen wrote: > > 1. Compile a kernel with the deadline elevator as a module > 2. Boot into it, make sure the elevator is selected > (I used "elevator=deadline" in the kernel command line) > 3. Insert a FAT-formatted mass storage device in a USB2 port > Observ

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-13 Thread Alan Stern
On Sun, 13 Jan 2013, Oliver Neukum wrote: > On Sunday 13 January 2013 18:42:49 Alex Riesen wrote: > > On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern > > wrote: > > > On Sun, 13 Jan 2013, Alex Riesen wrote: > > >> > > >> Yes, almost. What about khubd hanging when machine is shutdown? > > > > > > Wha

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-13 Thread Oliver Neukum
On Sunday 13 January 2013 18:42:49 Alex Riesen wrote: > On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern wrote: > > On Sun, 13 Jan 2013, Alex Riesen wrote: > >> > >> Yes, almost. What about khubd hanging when machine is shutdown? > > > > What about it? I have trouble understanding all the descriptions

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-13 Thread Alex Riesen
On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern wrote: > On Sun, 13 Jan 2013, Alex Riesen wrote: >> >> Yes, almost. What about khubd hanging when machine is shutdown? > > What about it? I have trouble understanding all the descriptions you > have provided so far, because you talk about several differ

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-13 Thread Alan Stern
On Sun, 13 Jan 2013, Alex Riesen wrote: > On Sat, Jan 12, 2013 at 11:52 PM, Alan Stern > wrote: > > On Sat, 12 Jan 2013, Alex Riesen wrote: > >> Now, who would be interested to handle this kind of misconfiguration ... > > > > So the whole thing was a false alarm? > > Yes, almost. What about khu

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-13 Thread Alex Riesen
On Sat, Jan 12, 2013 at 11:52 PM, Alan Stern wrote: > On Sat, 12 Jan 2013, Alex Riesen wrote: >> Now, who would be interested to handle this kind of misconfiguration ... > > So the whole thing was a false alarm? Yes, almost. What about khubd hanging when machine is shutdown? > Maybe you should r

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-12 Thread Alan Stern
On Sat, 12 Jan 2013, Alex Riesen wrote: > On Sat, Jan 12, 2013 at 8:39 PM, Alex Riesen wrote: > > On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern > > wrote: > >> On Sat, 12 Jan 2013, Alex Riesen wrote: > >>> One more detail: I usually use the "noop" elevator. That time it was > >>> the "deadline".

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-12 Thread Alex Riesen
On Sat, Jan 12, 2013 at 8:39 PM, Alex Riesen wrote: > On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern wrote: >> On Sat, 12 Jan 2013, Alex Riesen wrote: >>> One more detail: I usually use the "noop" elevator. That time it was >>> the "deadline". And I just reproduced it easily with "deadline". >> >> I

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-12 Thread Alex Riesen
On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern wrote: > On Sat, 12 Jan 2013, Alex Riesen wrote: > >> On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote: >> > >> > the USB stick (a Cruzer Titanium 2GB) was not recognized at any of >> > the USB ports of this system (a System76 lemu4 laptop, XHCI de

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-12 Thread Alex Riesen
On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern wrote: > On Sat, 12 Jan 2013, Alex Riesen wrote: >> One more detail: I usually use the "noop" elevator. That time it was >> the "deadline". And I just reproduced it easily with "deadline". > > I doubt the elevator has anything to do with this. But it lo

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-12 Thread Alan Stern
On Sat, 12 Jan 2013, Alex Riesen wrote: > On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote: > > Hi, > > > > the USB stick (a Cruzer Titanium 2GB) was not recognized at any of > > the USB ports of this system (a System76 lemu4 laptop, XHCI device) > > after it was removed. If I attempt to ins

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-12 Thread Lan Tianyu
On January 12, 2013, 15:48:59, Alex Riesen wrote: On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote: Hi, the USB stick (a Cruzer Titanium 2GB) was not recognized at any of the USB ports of this system (a System76 lemu4 laptop, XHCI device) after it was removed. If I attempt to insert it again in

Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

2013-01-11 Thread Alex Riesen
On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote: > Hi, > > the USB stick (a Cruzer Titanium 2GB) was not recognized at any of > the USB ports of this system (a System76 lemu4 laptop, XHCI device) > after it was removed. If I attempt to insert it again in any of the > ports (one of the two US