On Wed, 16 Jan 2013, Tejun Heo wrote:
> Hello, Alan.
>
> On Wed, Jan 16, 2013 at 12:01:53PM -0500, Alan Stern wrote:
> > > The problem here is that "flush everything which comes before me" is
> > > used to order async jobs. e.g. after async jobs probe the hardware
> > > they order themselves by
Hello, Alan.
On Wed, Jan 16, 2013 at 12:01:53PM -0500, Alan Stern wrote:
> > The problem here is that "flush everything which comes before me" is
> > used to order async jobs. e.g. after async jobs probe the hardware
> > they order themselves by flushing before registering them, so unless
>
> I
On Wed, 16 Jan 2013, Tejun Heo wrote:
> Hello, Alan.
>
> On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote:
> > > The current domain implementation is somewhere inbetween. It's not
> > > completely simplistic system and at the same time not developed enough
> > > to do properly stacked
Hello, Alan.
On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote:
> > The current domain implementation is somewhere inbetween. It's not
> > completely simplistic system and at the same time not developed enough
> > to do properly stacked flushing.
>
> I like your idea of chronological sy
On Tue, Jan 15, 2013 at 7:05 PM, Ming Lei wrote:
>
> So looks only sd.c and floppy.c are to be synchronized suppose
> some sync interfaces are introduced, doesn't it?
What about ata_host_register() (usually called through ata_host_activate())?
I don't understand why you continue to push for some
On Tue, 15 Jan 2013, Tejun Heo wrote:
> Hello, Arjan.
>
> On Tue, Jan 15, 2013 at 04:25:54PM -0800, Arjan van de Ven wrote:
> > async fundamentally had the concept of a monotonic increasing number,
> > and that you could always wait for "everyone before me".
> > then people (like me) wanted excep
On Wed, Jan 16, 2013 at 1:36 AM, Linus Torvalds wrote:
>
> Because it's not just sd.c that uses async_schedule(), and would need
> the async synchronize. It's floppy.c, it's generic scsi scanning (so
> scsi tapes etc), and it's libata-core.c.
As discussed previously, only the module which will po
On Tue, Jan 15, 2013 at 04:36:34PM -0800, Linus Torvalds wrote:
> The thing is, the module loading in particular is not necessarily
> happening in the same context as what *started* the module loading. A
> module loader will request the module from user space, and then later
> user space - through
On Tue, Jan 15, 2013 at 4:36 PM, Linus Torvalds wrote:
>
> There's a reason I asked for a warning for this. Or the "let's flag
> the current thread if it ever started anything asynchronous". Because
> it's complicated.
Btw, the sequence counter (that is *not* taking anything else into
account) is
On Tue, Jan 15, 2013 at 3:50 PM, Tejun Heo wrote:
>
> For now, I'm gonna implement simple "I'm not gonna wait for myself"
> self-deadlock avoidance.
You can't really do that. Or rather, it won't *help*.
The thing is, the module loading in particular is not necessarily
happening in the same conte
Hello, Arjan.
On Tue, Jan 15, 2013 at 04:25:54PM -0800, Arjan van de Ven wrote:
> async fundamentally had the concept of a monotonic increasing number,
> and that you could always wait for "everyone before me".
> then people (like me) wanted exceptions to what "everyone" means ;-(
> I'm ok with go
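
The cookie idea quoted above can be pictured with a small userspace model. This is only a sketch under stated assumptions: struct job_table, job_schedule(), job_complete(), must_wait_before() and must_wait_full() are hypothetical names made up for illustration and are not the kernel's async API or data structures. It shows why "wait for everyone before me" is safe to call from inside a job, while an unconditional full wait also covers the caller's own entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_JOBS 16

/* Hypothetical model of a single cookie space; not kernel code. */
struct job_table {
    unsigned long next_cookie;        /* monotonically increasing */
    unsigned long pending[MAX_JOBS];  /* cookies of jobs not yet finished */
    size_t npending;
};

/* Register a new job and hand back its cookie. */
static unsigned long job_schedule(struct job_table *t)
{
    assert(t->npending < MAX_JOBS);
    unsigned long cookie = t->next_cookie++;
    t->pending[t->npending++] = cookie;
    return cookie;
}

/* Mark a job as finished by removing its cookie from the pending set. */
static void job_complete(struct job_table *t, unsigned long cookie)
{
    for (size_t i = 0; i < t->npending; i++) {
        if (t->pending[i] == cookie) {
            t->pending[i] = t->pending[--t->npending];
            return;
        }
    }
}

/* "Everyone before me": true while any job with a lower cookie is pending. */
static bool must_wait_before(const struct job_table *t, unsigned long cookie)
{
    for (size_t i = 0; i < t->npending; i++) {
        if (t->pending[i] < cookie)
            return true;
    }
    return false;
}

/* Full wait: true while anything is pending, the caller's own job included. */
static bool must_wait_full(const struct job_table *t)
{
    return t->npending > 0;
}
```

In this model a job holding cookie c never blocks on itself through must_wait_before(t, c), but must_wait_full(t) stays true for as long as that same job is running, which is the self-wait shape the thread is discussing.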
For now, I'm gonna implement simple "I'm not gonna wait for myself"
self-deadlock avoidance. If this needs any more sophistication, I
think we better reimplement it so that we can explicitly match up and
track who's gonna wait for what instead of throwing everything into a
single cookie space a
cc'ing Arjan. Arjan, the original thread can be read from
http://thread.gmane.org/gmane.linux.kernel/1420814
Hello, again.
On Tue, Jan 15, 2013 at 12:18:01PM -0800, Linus Torvalds wrote:
> I think that is a good solution if it works, but look out: we need to
> synchronize across *all* domains
Hello, Linus
Will continue on another reply but this one is relevant so...
On Tue, Jan 15, 2013 at 10:18:45AM -0800, Linus Torvalds wrote:
> Tejun, is there a good way for code to see "I'm running in async
> context"? Then we could do something like
Almost. With a bit of modification we can ask
On Tue, Jan 15, 2013 at 10:32 AM, Tejun Heo wrote:
>
> I think the root problem here, apart from request_module() from block
> - which is a bit nasty but making that part completely async would too
> be quite nasty albeit in a different way - is that
> async_synchronize_full() is way too indescrim
Hello, Alan.
On Tue, Jan 15, 2013 at 01:20:58PM -0500, Alan Stern wrote:
> It may not be so easy. When the SCSI async thread probes the new disk,
> it has to do I/O. So it needs to use a scheduler.
>
> But maybe it could use a built-in trivial scheduler until the proper
> one is loaded. Then
Hello, Linus.
On Tue, Jan 15, 2013 at 09:36:57AM -0800, Linus Torvalds wrote:
> Tejun, comments? You can see the whole thread on lkml, but the basic
> problem is that the module loading doing the unconditional
> async_synchronize_full() has caused problems, because we have
>
> - load module A
>
On Tue, 15 Jan 2013, Linus Torvalds wrote:
> Tejun, comments? You can see the whole thread on lkml, but the basic
> problem is that the module loading doing the unconditional
> async_synchronize_full() has caused problems, because we have
>
> - load module A
> - module A does per-controller a
On Tue, Jan 15, 2013 at 9:36 AM, Linus Torvalds wrote:
>
> This kind of "let's randomly encourage people to write subtly buggy
> code that has magical timing dependencies, so that the developer won't
> likely even see it because he has fast disks etc" code is totally
> unacceptable. And this code
[ Added Tejun to the discussion, since he's the async go-to-guy ]
On Mon, Jan 14, 2013 at 10:23 PM, Ming Lei wrote:
>
> But I have another idea to address the problem, and let module code call
> async_synchronize_full() only if the module requires that explicitly, so how
> about the below draft p
On Tue, Jan 15, 2013 at 9:53 AM, Ming Lei wrote:
>
> I will try to figure out one patch to address the scsi block async probe
> issue first, and see if it can fix the problem by moving add_disk()
> into sd_probe()
> and calling async_synchronize_full_domain(&scsi_sd_probe_domain)
> in the entry of
On Tue, Jan 15, 2013 at 1:30 AM, Linus Torvalds wrote:
> On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei wrote:
>>
>> The deadlock problem is caused by calling request_module() inside
>> async function of do_scan_async(), and it was introduced by Linus's
>> below commit:
>>
>> commit d6de2c80e9d758d2e
On Mon, Jan 14, 2013 at 10:04 AM, Alan Stern wrote:
>
> How about skipping that call if the current thread is one of the async
> helpers? Is it possible to detect when that happens?
>
> Or maybe such a check should go inside async_synchronize_full() itself.
Do we have some idea of exactly what i
On Mon, 14 Jan 2013, Linus Torvalds wrote:
> > - from view of driver, introducing async_synchronize_full() after
> > do_one_initcall() inside do_init_module() is like a sync probe
> > for drivers built as module, and cause this kind of deadlock easily.
> >
> > So could we revert the commit and fix
On Sun, Jan 13, 2013 at 11:15 PM, Ming Lei wrote:
>
> The deadlock problem is caused by calling request_module() inside
> async function of do_scan_async(), and it was introduced by Linus's
> below commit:
>
> commit d6de2c80e9d758d2e36c21699117db6178c0f517
> Author: Linus Torvalds
> Date: Fri
On Mon, Jan 14, 2013 at 3:39 AM, Alan Stern wrote:
> On Sun, 13 Jan 2013, Oliver Neukum wrote:
>> This is not a USB problem. You need to involve the SCSI people.
>> khubd just stops working because disconnects are processed
>> in its context and the removal deadlocks.
>
> Then why would building t
On Mon, Jan 14, 2013 at 4:22 PM, Oliver Neukum wrote:
>
> OK, your trace is totally different. If your hangs are related, as is likely,
> my explanation goes out of the window.
If I run 'shutdown' after unplugging usb storage device, another hang trace
same with Alex's can be triggered too, so it
On Monday 14 January 2013 11:47:57 Ming Lei wrote:
> [ 181.175323] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 181.183624] modprobe D c04f1920 0 2462 2461 0x
> [ 181.183685] [] (__schedule+0x5fc/0x6d4) from []
> (async_synchronize_cookie_
On Mon, Jan 14, 2013 at 11:47 AM, Ming Lei wrote:
> On Mon, Jan 14, 2013 at 1:42 AM, Alex Riesen wrote:
> [ 86.901367] io scheduler deadline registered (default)
> [ 181.168487] INFO: task modprobe:2462 blocked for more than 90 seconds.
> [ 181.175323] "echo 0 > /proc/sys/kernel/hung_task_tim
On Mon, Jan 14, 2013 at 1:42 AM, Alex Riesen wrote:
>
> 1. Compile a kernel with deadline elevator as module
> 2. Boot into it, make sure the elevator is selected
> (I used "elevator=deadline" in the kernel command line)
> 3. Insert a FAT formatted mass storage device in an USB2 port
> Observ
On Sun, 13 Jan 2013, Oliver Neukum wrote:
> On Sunday 13 January 2013 18:42:49 Alex Riesen wrote:
> > On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern wrote:
> > > On Sun, 13 Jan 2013, Alex Riesen wrote:
> > >>
> > >> Yes, almost. What about khubd hanging when machine is shutdown?
> > >
> > > Wha
On Sunday 13 January 2013 18:42:49 Alex Riesen wrote:
> On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern wrote:
> > On Sun, 13 Jan 2013, Alex Riesen wrote:
> >>
> >> Yes, almost. What about khubd hanging when machine is shutdown?
> >
> > What about it? I have trouble understanding all the descriptions
On Sun, Jan 13, 2013 at 5:56 PM, Alan Stern wrote:
> On Sun, 13 Jan 2013, Alex Riesen wrote:
>>
>> Yes, almost. What about khubd hanging when machine is shutdown?
>
> What about it? I have trouble understanding all the descriptions you
> have provided so far, because you talk about several differ
On Sun, 13 Jan 2013, Alex Riesen wrote:
> On Sat, Jan 12, 2013 at 11:52 PM, Alan Stern wrote:
> > On Sat, 12 Jan 2013, Alex Riesen wrote:
> >> Now, who would be interested to handle this kind of misconfiguration ...
> >
> > So the whole thing was a false alarm?
>
> Yes, almost. What about khu
On Sat, Jan 12, 2013 at 11:52 PM, Alan Stern wrote:
> On Sat, 12 Jan 2013, Alex Riesen wrote:
>> Now, who would be interested to handle this kind of misconfiguration ...
>
> So the whole thing was a false alarm?
Yes, almost. What about khubd hanging when machine is shutdown?
> Maybe you should r
On Sat, 12 Jan 2013, Alex Riesen wrote:
> On Sat, Jan 12, 2013 at 8:39 PM, Alex Riesen wrote:
> > On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern wrote:
> >> On Sat, 12 Jan 2013, Alex Riesen wrote:
> >>> One more detail: I usually use the "noop" elevator. That time it was
> >>> the "deadline".
On Sat, Jan 12, 2013 at 8:39 PM, Alex Riesen wrote:
> On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern wrote:
>> On Sat, 12 Jan 2013, Alex Riesen wrote:
>>> One more detail: I usually use the "noop" elevator. That time it was
>>> the "deadline". And I just reproduced it easily with "deadline".
>>
>> I
On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern wrote:
> On Sat, 12 Jan 2013, Alex Riesen wrote:
>
>> On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote:
>> >
>> > the USB stick (an Cruzer Titanium 2GB) was not recognized at any of
>> > the USB ports of this system (an System76 lemu4 laptop, XHCI de
On Sat, Jan 12, 2013 at 6:37 PM, Alan Stern wrote:
> On Sat, 12 Jan 2013, Alex Riesen wrote:
>> One more detail: I usually use the "noop" elevator. That time it was
>> the "deadline". And I just reproduced it easily with "deadline".
>
> I doubt the elevator has anything to do with this.
But it lo
On Sat, 12 Jan 2013, Alex Riesen wrote:
> On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote:
> > Hi,
> >
> > the USB stick (an Cruzer Titanium 2GB) was not recognized at any of
> > the USB ports of this system (an System76 lemu4 laptop, XHCI device)
> > after it was removed. If I attempt to ins
On January 12, 2013 15:48:59, Alex Riesen wrote:
On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote:
Hi,
the USB stick (an Cruzer Titanium 2GB) was not recognized at any of
the USB ports of this system (an System76 lemu4 laptop, XHCI device)
after it was removed. If I attempt to insert it again in
On Fri, Jan 11, 2013 at 10:04 PM, Alex Riesen wrote:
> Hi,
>
> the USB stick (an Cruzer Titanium 2GB) was not recognized at any of
> the USB ports of this system (an System76 lemu4 laptop, XHCI device)
> after it was removed. If I attempt to insert it again in any of the
> ports (one of the two US