Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Fri, Apr 08 2005, James Bottomley wrote:
> On Thu, 2005-04-07 at 16:45 +0200, Jens Axboe wrote:
> > So clear ->request_queue instead.
>
> Will do. Did you want me to look after your patch and add this, or do
> you want to send it to Linus (after the purdah is over)?

Just queue it with the rest of your changes, that is fine with me.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, 2005-04-07 at 16:45 +0200, Jens Axboe wrote:
> So clear ->request_queue instead.

Will do. Did you want me to look after your patch and add this, or do
you want to send it to Linus (after the purdah is over)?

James
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, Apr 07 2005, James Bottomley wrote:
> On Thu, 2005-04-07 at 15:32 +0200, Jens Axboe wrote:
> > I think Christoph's point is: why add sdev_lock as a pointer, instead
> > of just killing it? It's only used in one location, so it's not really
> > that confusing (and a comment could fix that).
>
> Because any use of sdev->request_queue->queue_lock would likely succeed
> even after we've freed the device and released the queue. If it's a
> pointer and we null it after free and release, then any later use will
> trigger an immediate NULL deref oops.

So clear ->request_queue instead.

-- 
Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, 2005-04-07 at 15:32 +0200, Jens Axboe wrote:
> I think Christoph's point is: why add sdev_lock as a pointer, instead
> of just killing it? It's only used in one location, so it's not really
> that confusing (and a comment could fix that).

Because any use of sdev->request_queue->queue_lock would likely succeed
even after we've freed the device and released the queue. If it's a
pointer and we null it after free and release, then any later use will
trigger an immediate NULL deref oops.

Since we've had so many nasty problems around refcounting, I would just
like to assure myself that we're doing everything correctly (I really
believe we are, but empirical evidence is also nice).

James
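[Editor's note: the failure mode James wants here, where a stale user oopses immediately on a NULL pointer instead of silently taking a lock in freed memory, can be sketched in miniature. All names below are illustrative stand-ins, not the kernel's types.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of the sdev_lock scheme: the device does not own a lock, it
 * points at one owned by the queue.  On release the pointer is nulled,
 * so any later use dereferences NULL and faults at once rather than
 * "succeeding" against freed memory. */
struct toy_queue  { int queue_lock; };           /* stand-in for spinlock_t */
struct toy_device { struct toy_queue *q; int *sdev_lock; };

static void toy_attach(struct toy_device *d, struct toy_queue *q)
{
	d->q = q;
	d->sdev_lock = &q->queue_lock;   /* share the queue's lock */
}

static void toy_release(struct toy_device *d)
{
	d->sdev_lock = NULL;             /* poison before freeing */
	free(d->q);
	d->q = NULL;
}
```

A late `spin_lock(d->sdev_lock)` after `toy_release()` would crash immediately, which is exactly the "empirical evidence" being asked for.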
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, Apr 07 2005, James Bottomley wrote:
> On Thu, 2005-04-07 at 14:22 +0100, Christoph Hellwig wrote:
> > Do we really need the sdev_lock pointer? There's just a single place
> > where we're using it and the code would be much more clear if it had just
> > one name.
>
> Humour me for a while. I don't believe we have any way the lock can be
> used after calling queue free, but nulling the sdev_lock pointer will
> surely catch them. If nothing turns up after a few kernel revisions,
> feel free to kill it.

I think Christoph's point is: why add sdev_lock as a pointer, instead
of just killing it? It's only used in one location, so it's not really
that confusing (and a comment could fix that).

-- 
Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, 2005-04-07 at 14:22 +0100, Christoph Hellwig wrote:
> Do we really need the sdev_lock pointer? There's just a single place
> where we're using it and the code would be much more clear if it had just
> one name.

Humour me for a while. I don't believe we have any way the lock can be
used after calling queue free, but nulling the sdev_lock pointer will
surely catch them. If nothing turns up after a few kernel revisions,
feel free to kill it.

James
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, Apr 07 2005, James Bottomley wrote:
> On Thu, 2005-04-07 at 08:49 +0200, Jens Axboe wrote:
> > On Wed, Apr 06 2005, James Bottomley wrote:
> > > My proposal is to correct this by moving the data back to the correct
> > > object, and make any object using it hold a reference, so this would
> > > make the provider of the block request_fn hold a reference to the queue
> > > and initialise the queue lock pointer with the lock currently in the
> > > queue. Drivers that still use a global lock would be unaffected. This
> >
> > But this is the current requirement, as long as you use the queue you
> > must hold a reference to it.
>
> Exactly! that's why I think this solution must work independently of
> subsystem.
>
> > What do you think of the attached, then? Allow NULL lock to be passed
> > in, in which case we use the queue private lock (that no one should ever
> > ever touch). It looks a little confusing that
> > sdev->request_queue->queue_lock now protects some sdev structures, if
> > you want we can retain ->sdev_lock but as a pointer to the queue lock
> > instead.
>
> Looks good. How about the attached modification? It makes sdev_lock a
> pointer that uses the queue lock which we null out when we release it
> (not that I don't trust SCSI or anything ;-)

That's fine with me, up to you. As I mentioned, it does look less
confusing to have the lock still associated with sdev.

-- 
Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, Apr 07 2005, Christoph Hellwig wrote:
> On Thu, Apr 07, 2005 at 09:18:38AM -0400, James Bottomley wrote:
> > On Thu, 2005-04-07 at 08:49 +0200, Jens Axboe wrote:
> > > On Wed, Apr 06 2005, James Bottomley wrote:
> > > > My proposal is to correct this by moving the data back to the correct
> > > > object, and make any object using it hold a reference, so this would
> > > > make the provider of the block request_fn hold a reference to the queue
> > > > and initialise the queue lock pointer with the lock currently in the
> > > > queue. Drivers that still use a global lock would be unaffected. This
> > >
> > > But this is the current requirement, as long as you use the queue you
> > > must hold a reference to it.
> >
> > Exactly! that's why I think this solution must work independently of
> > subsystem.
> >
> > > What do you think of the attached, then? Allow NULL lock to be passed
> > > in, in which case we use the queue private lock (that no one should ever
> > > ever touch). It looks a little confusing that
> > > sdev->request_queue->queue_lock now protects some sdev structures, if
> > > you want we can retain ->sdev_lock but as a pointer to the queue lock
> > > instead.
> >
> > Looks good. How about the attached modification? It makes sdev_lock a
> > pointer that uses the queue lock which we null out when we release it
> > (not that I don't trust SCSI or anything ;-)
>
> Do we really need the sdev_lock pointer? There's just a single place
> where we're using it and the code would be much more clear if it had just
> one name.

A comment would work equally well, and save space of course :-)

-- 
Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, Apr 07, 2005 at 09:18:38AM -0400, James Bottomley wrote:
> On Thu, 2005-04-07 at 08:49 +0200, Jens Axboe wrote:
> > On Wed, Apr 06 2005, James Bottomley wrote:
> > > My proposal is to correct this by moving the data back to the correct
> > > object, and make any object using it hold a reference, so this would
> > > make the provider of the block request_fn hold a reference to the queue
> > > and initialise the queue lock pointer with the lock currently in the
> > > queue. Drivers that still use a global lock would be unaffected. This
> >
> > But this is the current requirement, as long as you use the queue you
> > must hold a reference to it.
>
> Exactly! that's why I think this solution must work independently of
> subsystem.
>
> > What do you think of the attached, then? Allow NULL lock to be passed
> > in, in which case we use the queue private lock (that no one should ever
> > ever touch). It looks a little confusing that
> > sdev->request_queue->queue_lock now protects some sdev structures, if
> > you want we can retain ->sdev_lock but as a pointer to the queue lock
> > instead.
>
> Looks good. How about the attached modification? It makes sdev_lock a
> pointer that uses the queue lock which we null out when we release it
> (not that I don't trust SCSI or anything ;-)

Do we really need the sdev_lock pointer? There's just a single place
where we're using it and the code would be much more clear if it had just
one name.
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Thu, 2005-04-07 at 08:49 +0200, Jens Axboe wrote:
> On Wed, Apr 06 2005, James Bottomley wrote:
> > My proposal is to correct this by moving the data back to the correct
> > object, and make any object using it hold a reference, so this would
> > make the provider of the block request_fn hold a reference to the queue
> > and initialise the queue lock pointer with the lock currently in the
> > queue. Drivers that still use a global lock would be unaffected. This
>
> But this is the current requirement, as long as you use the queue you
> must hold a reference to it.

Exactly! that's why I think this solution must work independently of
subsystem.

> What do you think of the attached, then? Allow NULL lock to be passed
> in, in which case we use the queue private lock (that no one should ever
> ever touch). It looks a little confusing that
> sdev->request_queue->queue_lock now protects some sdev structures, if
> you want we can retain ->sdev_lock but as a pointer to the queue lock
> instead.

Looks good. How about the attached modification?
It makes sdev_lock a pointer that uses the queue lock which we null out
when we release it (not that I don't trust SCSI or anything ;-)

James

= drivers/block/ll_rw_blk.c 1.287 vs edited =
--- 1.287/drivers/block/ll_rw_blk.c	2005-03-11 15:32:27 -05:00
+++ edited/drivers/block/ll_rw_blk.c	2005-04-07 09:05:19 -04:00
@@ -1714,6 +1714,15 @@
 	if (blk_init_free_list(q))
 		goto out_init;
 
+	/*
+	 * if caller didn't supply a lock, they get per-queue locking with
+	 * our embedded lock
+	 */
+	if (!lock) {
+		spin_lock_init(&q->__queue_lock);
+		lock = &q->__queue_lock;
+	}
+
 	q->request_fn		= rfn;
 	q->back_merge_fn	= ll_back_merge_fn;
 	q->front_merge_fn	= ll_front_merge_fn;
= drivers/scsi/scsi_lib.c 1.151 vs edited =
--- 1.151/drivers/scsi/scsi_lib.c	2005-02-17 14:17:22 -05:00
+++ edited/drivers/scsi/scsi_lib.c	2005-04-07 09:10:31 -04:00
@@ -349,9 +349,9 @@
 		    shost->host_failed))
 			scsi_eh_wakeup(shost);
 	spin_unlock(shost->host_lock);
-	spin_lock(&sdev->sdev_lock);
+	spin_lock(sdev->sdev_lock);
 	sdev->device_busy--;
-	spin_unlock_irqrestore(&sdev->sdev_lock, flags);
+	spin_unlock_irqrestore(sdev->sdev_lock, flags);
 }
 
 /*
@@ -1391,7 +1391,7 @@
 	struct Scsi_Host *shost = sdev->host;
 	struct request_queue *q;
 
-	q = blk_init_queue(scsi_request_fn, &sdev->sdev_lock);
+	q = blk_init_queue(scsi_request_fn, NULL);
 	if (!q)
 		return NULL;
= drivers/scsi/scsi_scan.c 1.142 vs edited =
--- 1.142/drivers/scsi/scsi_scan.c	2005-02-16 13:21:35 -05:00
+++ edited/drivers/scsi/scsi_scan.c	2005-04-07 09:10:01 -04:00
@@ -249,7 +249,6 @@
 	 */
 	sdev->borken = 1;
 
-	spin_lock_init(&sdev->sdev_lock);
 	sdev->request_queue = scsi_alloc_queue(sdev);
 	if (!sdev->request_queue) {
 		/* release fn is set up in scsi_sysfs_device_initialise, so
@@ -257,7 +256,8 @@
 		put_device(&starget->dev);
 		goto out;
 	}
-
+	/* Now make the sdev_lock point to the request queue lock */
+	sdev->sdev_lock = q->queue_lock;
 	sdev->request_queue->queuedata = sdev;
 	scsi_adjust_queue_depth(sdev, 0, sdev->host->cmd_per_lun);
= drivers/scsi/scsi_sysfs.c 1.69 vs edited =
--- 1.69/drivers/scsi/scsi_sysfs.c	2005-02-16 20:05:37 -05:00
+++ edited/drivers/scsi/scsi_sysfs.c	2005-04-07 09:12:17 -04:00
@@ -168,6 +168,9 @@
 		list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
+	/* After releasing the queue we may no longer access its lock */
+	BUG_ON(spin_is_locked(sdev->sdev_lock));
+	sdev->sdev_lock = NULL;
 	if (sdev->request_queue)
 		scsi_free_queue(sdev->request_queue);
= include/linux/blkdev.h 1.161 vs edited =
--- 1.161/include/linux/blkdev.h	2005-03-09 03:03:24 -05:00
+++ edited/include/linux/blkdev.h	2005-04-07 09:05:21 -04:00
@@ -355,8 +355,11 @@
 	unsigned long		queue_flags;
 
 	/*
-	 * protects queue structures from reentrancy
+	 * protects queue structures from reentrancy. ->__queue_lock should
+	 * _never_ be used directly, it is queue private. always use
+	 * ->queue_lock.
 	 */
+	spinlock_t		__queue_lock;
 	spinlock_t		*queue_lock;
 
 	/*
= include/scsi/scsi_device.h 1.32 vs edited =
--- 1.32/include/scsi/scsi_device.h	2005-02-17 14:15:51 -05:00
+++ edited/include/scsi/scsi_device.h	2005-04-07 09:08:25 -04:00
@@ -44,7 +44,7 @@
 	struct list_head	same_target_siblings; /* just the devices sharing same target id */
 	volatile unsigned short	device_busy;	/* commands ac
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Wed, Apr 06 2005, James Bottomley wrote:
> On Wed, 2005-04-06 at 21:08 +0200, Jens Axboe wrote:
> > > I think the correct model for all of this is that the block driver
> > > shouldn't care (or be tied to) the scsi one. Thus, as long as SCSI can
> > > reject requests from a queue whose device has been released (without
> > > checking the device) then everything is fine as long as we sort out the
> > > lock lifetime problem.
> >
> > But you are tying it completely to the SCSI problem, since we have no
> > locking problems of this sort elsewhere. At least the notifier could
> > potentially be used for something else. The lock is just hack number two
> > to work around the problem.
>
> Then help me understand the problem as you see it.
>
> My current understanding is that these problems occur because we've
> mixed data in two objects of different lifetimes. So far, this is stack
> independent.

Correct.

> My proposal is to correct this by moving the data back to the correct
> object, and make any object using it hold a reference, so this would
> make the provider of the block request_fn hold a reference to the queue
> and initialise the queue lock pointer with the lock currently in the
> queue. Drivers that still use a global lock would be unaffected. This

But this is the current requirement, as long as you use the queue you
must hold a reference to it.

> also means that any provider of a request_fn may expect to process (and
> reject) requests for a period of time after blk_cleanup_queue().
> Really, this refcounting is inherent in blk_init_queue(),
> blk_cleanup_queue() so the only additional requirement is sanely
> processing queue requests after you think it's been cleaned up. This is
> request_fn() provider independent, I think (providers who currently
> don't violate the lifetime rules don't need fixing).
>
> I claim that this proposal solves the current problem, and any other
> ones we run across that occur because of data mixing.
The lock is just one piece, but I guess it is the only one remaining
after the ->request_fn switchover killing requests. So perhaps it is
just easier to embed the lock to kill off that problem.

What do you think of the attached, then? Allow NULL lock to be passed
in, in which case we use the queue private lock (that no one should ever
ever touch). It looks a little confusing that
sdev->request_queue->queue_lock now protects some sdev structures, if
you want we can retain ->sdev_lock but as a pointer to the queue lock
instead.

> Your current patch tries to solve the problem by tying the two objects
> together; unifying the lifetimes. I agree this can be done (it was how
> the sd/sr close race was fixed). But the current way the freeing
> functions thread across the subsystems doesn't look nice and I don't
> currently see a way to get the queue freed: We free the queue on scsi
> device release; if the queue holds a reference to the scsi_device, the
> release function will never be called. Our only other choice is to try
> to free the queue on device_del instead, but that's too early, I think.

Hmm yes, it probably doesn't work. Unless we properly embed the affected
structures, I don't think it can work.
= drivers/block/ll_rw_blk.c 1.288 vs edited =
--- 1.288/drivers/block/ll_rw_blk.c	2005-03-31 12:47:54 +02:00
+++ edited/drivers/block/ll_rw_blk.c	2005-04-07 08:38:01 +02:00
@@ -1714,6 +1711,15 @@ request_queue_t *blk_init_queue(request_
 	if (blk_init_free_list(q))
 		goto out_init;
 
+	/*
+	 * if caller didn't supply a lock, they get per-queue locking with
+	 * our embedded lock
+	 */
+	if (!lock) {
+		spin_lock_init(&q->__queue_lock);
+		lock = &q->__queue_lock;
+	}
+
 	q->request_fn		= rfn;
 	q->back_merge_fn	= ll_back_merge_fn;
 	q->front_merge_fn	= ll_front_merge_fn;
= drivers/scsi/scsi_lib.c 1.153 vs edited =
--- 1.153/drivers/scsi/scsi_lib.c	2005-03-30 21:49:45 +02:00
+++ edited/drivers/scsi/scsi_lib.c	2005-04-07 08:42:30 +02:00
@@ -360,9 +360,9 @@ void scsi_device_unbusy(struct scsi_devi
 		    shost->host_failed))
 			scsi_eh_wakeup(shost);
 	spin_unlock(shost->host_lock);
-	spin_lock(&sdev->sdev_lock);
+	spin_lock(sdev->request_queue->queue_lock);
 	sdev->device_busy--;
-	spin_unlock_irqrestore(&sdev->sdev_lock, flags);
+	spin_unlock_irqrestore(sdev->request_queue->queue_lock, flags);
 }
 
 /*
@@ -1425,7 +1425,7 @@ struct request_queue *scsi_alloc_queue(s
 	struct Scsi_Host *shost = sdev->host;
 	struct request_queue *q;
 
-	q = blk_init_queue(scsi_request_fn, &sdev->sdev_lock);
+	q = blk_init_queue(scsi_request_fn, NULL);
 	if (!q)
 		return NULL;
= drivers/scsi/scsi_scan.c 1.143 vs edited =
--- 1.143/drivers/scsi/scsi_scan.c	2005-03-23 22:58:13 +01:00
+++ edited/drivers/scsi/scs
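[Editor's note: the core of the patch above is the NULL-lock fallback added to blk_init_queue(). A minimal, self-contained sketch of that logic follows; a plain int stands in for spinlock_t and `queue_init` for the real function, so this mirrors the shape of the diff rather than the kernel's actual types.]

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed-down request_queue mirroring the hunk above: __queue_lock is
 * queue-private and must never be used directly; everyone goes through
 * the queue_lock pointer. */
struct request_queue {
	int  __queue_lock;    /* stand-in for the embedded spinlock_t */
	int *queue_lock;
};

/* If the caller supplies no lock, they get per-queue locking with the
 * embedded lock -- the same fallback the patch adds to blk_init_queue(). */
static void queue_init(struct request_queue *q, int *lock)
{
	if (!lock)
		lock = &q->__queue_lock;   /* real code also spin_lock_init()s it */
	q->queue_lock = lock;
}
```

Callers that still want a shared or global lock pass one in and are unaffected; only NULL callers get the embedded per-queue lock.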
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Wed, 2005-04-06 at 21:08 +0200, Jens Axboe wrote:
> > I think the correct model for all of this is that the block driver
> > shouldn't care (or be tied to) the scsi one. Thus, as long as SCSI can
> > reject requests from a queue whose device has been released (without
> > checking the device) then everything is fine as long as we sort out the
> > lock lifetime problem.
>
> But you are tying it completely to the SCSI problem, since we have no
> locking problems of this sort elsewhere. At least the notifier could
> potentially be used for something else. The lock is just hack number two
> to work around the problem.

Then help me understand the problem as you see it.

My current understanding is that these problems occur because we've
mixed data in two objects of different lifetimes. So far, this is stack
independent.

My proposal is to correct this by moving the data back to the correct
object, and make any object using it hold a reference, so this would
make the provider of the block request_fn hold a reference to the queue
and initialise the queue lock pointer with the lock currently in the
queue. Drivers that still use a global lock would be unaffected. This
also means that any provider of a request_fn may expect to process (and
reject) requests for a period of time after blk_cleanup_queue().
Really, this refcounting is inherent in blk_init_queue(),
blk_cleanup_queue() so the only additional requirement is sanely
processing queue requests after you think it's been cleaned up. This is
request_fn() provider independent, I think (providers who currently
don't violate the lifetime rules don't need fixing).

I claim that this proposal solves the current problem, and any other
ones we run across that occur because of data mixing.

Your current patch tries to solve the problem by tying the two objects
together; unifying the lifetimes. I agree this can be done (it was how
the sd/sr close race was fixed).
But the current way the freeing functions thread across the subsystems
doesn't look nice and I don't currently see a way to get the queue
freed: We free the queue on scsi device release; if the queue holds a
reference to the scsi_device, the release function will never be called.
Our only other choice is to try to free the queue on device_del instead,
but that's too early, I think.

James
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
Tejun Heo [EMAIL PROTECTED] wrote:
> Jens Axboe wrote:
> > On Wed, Apr 06 2005, Arjan van de Ven wrote:
> >
> > > > @@ -324,6 +334,7 @@
> > > >  	issue_flush_fn		*issue_flush_fn;
> > > >  	prepare_flush_fn	*prepare_flush_fn;
> > > >  	end_flush_fn		*end_flush_fn;
> > > > +	release_queue_data_fn	*release_queue_data_fn;
> > > >
> > > >  	/*
> > > >  	 * Auto-unplugging state
> > >
> > > where does this function method actually get called?
> >
> > I missed the hunk in ll_rw_blk.c, rmk pointed the same thing out not 5
> > minutes ago :-)
> >
> > The patch would not work anyways, as scsi_sysfs.c clears queuedata
> > unconditionally. This is a better work-around, it just makes the queue
> > hold a reference to the device as well, only killing it when the queue is
> > torn down.
> >
> > Still not super happy with it, but I don't see how to solve the circular
> > dependency problem otherwise.
>
> Hello, Jens.
>
> I've been thinking about it for a while. The problem is that we're
> reference counting two different objects to track the lifetime of one
> entity. This happens in both the SCSI upper and mid layers. In the upper
> layer, genhd and scsi_disk (or scsi_cd, ...) are ref'ed separately while
> they share their destiny together (not really different entities), and in
> the mid layer scsi_device and request_queue do the same thing.
> Circular dependency is occurring because we separate one entity into two
> and reference count them separately. The two are actually one and
> necessarily want each other. (until death do us part. Wow, serious. :-)
>
> IMHO, what we need to do is consolidate ref counting such that in each
> layer only one object is reference counted, and the other object is
> freed when the ref counted object is released. The object of choice
> would be genhd in the upper layer and request_queue in the mid layer. All
> ref-counting should be updated to only ref those objects. We'll need to
> add a release callback to genhd and make request_queue properly
> reference counted.
> Conceptually, scsi_disk extends genhd and scsi_device extends
> request_queue. So, to go one step further, as what UL represents is
> genhd (disk device) and ML request_queue (request-based device),
> embedding scsi_disk into genhd and scsi_device into request_queue will
> make the architecture clearer. To do this, we'll need something like
> alloc_disk_with_udata(int minors, size_t udata_len) and the equivalent
> for request_queue.
>
> I've done this half-way, and then doing it without fixing the SCSI
> model seemed silly, so I got into working on the state model. (BTW, the
> state model is almost done, I'm about to run tests.)
>
> What do you think? Jens?

Well, I think "extends" is one way to look at the subsystem objects.
Couldn't it also be said that these objects from each subsystem just
have a relationship (parent / child, etc.)? As reference counting has
been implemented in each subsystem, interfaces that cross subsystem
boundaries sometimes (had / have) not been converted to use similar
lifetime rules.

Your solution tries to solve the problem by creating a new, larger
object that contains both of the old objects. Another solution would be
to use consistent lifetime rules and stay with the smaller objects.
Unless going to large objects helps with allocation fragmentation, or we
get some other benefit, these combined structures may at some point
limit the creation of lighter or more flexible objects.

It would appear another solution is that when you allocate a resource
from another subsystem (i.e. blk_init_queue), both subsystems
participate in the same reference counting model, and in the allocation
routine you pass in your object to be reference counted by the
allocating subsystem. Then when it is time to shut down you do not free
the other subsystem's object directly, but use the normal release
routines.
-andmike
--
Michael Anderson
[EMAIL PROTECTED]
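[Editor's note: the consolidation Tejun proposes above, with one refcounted object per layer whose release also frees its companion, can be sketched as follows. Names and the alloc/get/put API are illustrative, not the kernel's.]

```c
#include <assert.h>
#include <stdlib.h>

struct toy_sdev { int id; };             /* the dependent object */

/* Only the queue carries a refcount; the sdev rides along and dies
 * with it, so there is no second counter to create a circular
 * dependency between the two lifetimes. */
struct toy_queue {
	int refcount;
	struct toy_sdev *sdev;
};

static struct toy_queue *toy_queue_alloc(void)
{
	struct toy_queue *q = calloc(1, sizeof *q);
	q->refcount = 1;
	q->sdev = calloc(1, sizeof *q->sdev);
	return q;
}

static void toy_queue_get(struct toy_queue *q)
{
	q->refcount++;
}

/* Returns 1 when the final reference is dropped and both objects are
 * freed together; 0 while other users still hold references. */
static int toy_queue_put(struct toy_queue *q)
{
	if (--q->refcount > 0)
		return 0;
	free(q->sdev);      /* one release frees both objects */
	free(q);
	return 1;
}
```

Every user refs the queue only; the sdev's lifetime is a consequence, never a cause, of the queue's, which is the property the thread is after.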
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Wed, Apr 06 2005, James Bottomley wrote:
> On Wed, 2005-04-06 at 19:58 +0200, Jens Axboe wrote:
> > I rather like the queue lock being a pointer, so you can share at
> > whatever level you want. Let's not grow the request_queue a full lock
> > just to work around a bug elsewhere.
>
> I'm not proposing that it not be a pointer, merely that it could be
> initialised to point to a lock structure within the request queue.

Sorry, that's what I meant: you are adding a lock inside the queue. I
didn't mean removing q->queue_lock or making it a non-pointer.

> Doing this looks much simpler than your current patch ... one of the
> problems with which looks to be that removing the scsi_driver module is
> in trouble because we currently have the queue_release in the sdev
> release (which won't get called while the queue holds a reference).

It is simpler, but it only solves this particular problem of the lock
going away.

> I think the correct model for all of this is that the block driver
> shouldn't care (or be tied to) the scsi one. Thus, as long as SCSI can
> reject requests from a queue whose device has been released (without
> checking the device) then everything is fine as long as we sort out the
> lock lifetime problem.

But you are tying it completely to the SCSI problem, since we have no
locking problems of this sort elsewhere. At least the notifier could
potentially be used for something else. The lock is just hack number two
to work around the problem.

-- 
Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Wed, 2005-04-06 at 19:58 +0200, Jens Axboe wrote:
> I rather like the queue lock being a pointer, so you can share at
> whatever level you want. Let's not grow the request_queue a full lock
> just to work around a bug elsewhere.

I'm not proposing that it not be a pointer, merely that it could be
initialised to point to a lock structure within the request queue.

Doing this looks much simpler than your current patch ... one of the
problems with which looks to be that removing the scsi_driver module is
in trouble because we currently have the queue_release in the sdev
release (which won't get called while the queue holds a reference).

I think the correct model for all of this is that the block driver
shouldn't care (or be tied to) the scsi one. Thus, as long as SCSI can
reject requests from a queue whose device has been released (without
checking the device) then everything is fine as long as we sort out the
lock lifetime problem.

James
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Wed, Apr 06 2005, Tejun Heo wrote:
> Jens Axboe wrote:
> > On Wed, Apr 06 2005, Arjan van de Ven wrote:
> >
> > > > @@ -324,6 +334,7 @@
> > > >  	issue_flush_fn		*issue_flush_fn;
> > > >  	prepare_flush_fn	*prepare_flush_fn;
> > > >  	end_flush_fn		*end_flush_fn;
> > > > +	release_queue_data_fn	*release_queue_data_fn;
> > > >
> > > >  	/*
> > > >  	 * Auto-unplugging state
> > >
> > > where does this function method actually get called?
> >
> > I missed the hunk in ll_rw_blk.c, rmk pointed the same thing out not 5
> > minutes ago :-)
> >
> > The patch would not work anyways, as scsi_sysfs.c clears queuedata
> > unconditionally. This is a better work-around, it just makes the queue
> > hold a reference to the device as well, only killing it when the queue is
> > torn down.
> >
> > Still not super happy with it, but I don't see how to solve the circular
> > dependency problem otherwise.
>
> Hello, Jens.
>
> I've been thinking about it for a while. The problem is that we're
> reference counting two different objects to track the lifetime of one
> entity. This happens in both the SCSI upper and mid layers. In the upper
> layer, genhd and scsi_disk (or scsi_cd, ...) are ref'ed separately while
> they share their destiny together (not really different entities), and in
> the mid layer scsi_device and request_queue do the same thing.
> Circular dependency is occurring because we separate one entity into two
> and reference count them separately. The two are actually one and
> necessarily want each other. (until death do us part. Wow, serious. :-)

That's precisely correct.

> IMHO, what we need to do is consolidate ref counting such that in each
> layer only one object is reference counted, and the other object is
> freed when the ref counted object is released. The object of choice
> would be genhd in the upper layer and request_queue in the mid layer. All
> ref-counting should be updated to only ref those objects. We'll need to
> add a release callback to genhd and make request_queue properly
> reference counted.
> Conceptually, scsi_disk extends genhd and scsi_device extends
> request_queue. So, to go one step further, as what UL represents is
> genhd (disk device) and ML request_queue (request-based device),
> embedding scsi_disk into genhd and scsi_device into request_queue will
> make the architecture clearer. To do this, we'll need something like
> alloc_disk_with_udata(int minors, size_t udata_len) and the equivalent
> for request_queue.
>
> I've done this half-way, and then doing it without fixing the SCSI
> model seemed silly, so I got into working on the state model. (BTW, the
> state model is almost done, I'm about to run tests.)
>
> What do you think? Jens?

It is of course the optimal solution to really solve the hierarchy of
references, but more involved. If you have the time / desire to do it,
I'd be happy to review it :-)

For now I'm happy with the work-around. It's not too ugly, and at least
it makes it possible to kill the worse work-around of
scsi_kill_requests().

-- 
Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Wed, Apr 06 2005, James Bottomley wrote: > On Tue, 2005-03-29 at 14:03 +0200, Jens Axboe wrote: > > It is quite a serious problem, not just for CFQ. SCSI referencing is > > badly broken there. > > OK ... I accept that with regard to the queue lock. It is much deeper than that. The recent hack to kill requests is yet another example of that. At least this work-around makes it a little better, but the mid layer assumption that sdev going to zero implies the queue going away at the same time is inherently broken. > However, rather than trying to work out a way to tie all the refcounted > objects together, what about the simpler solution of making the lock > bound to the lifetime of the queue? That's essentially what the work-around does. > As far as SCSI is concerned, we could simply move the lock into the > request_queue structure and everything would work since the device holds > a reference to the queue. The way it would work is that we'd simply > have a lock in the request_queue structure, but it would be up to the > device to pass it in to blk_init_queue. Then we'd alter the scsi_device > sdev_lock to be a pointer to the queue lock? This scheme would also > work for the current users who have a global lock (they simply wouldn't > use the lock in the request_queue). > > The only cloud on the horizon with this scheme is that there may > genuinely be places where we want multiple queues to share a non-global > lock: situations where we have shared issue queues (like IDE), or > shared tag resources are a possibility. To cope with those, we'd > probably have to have a separately allocated, reference counted lock. > > However, I'm happy to implement the simpler solution (lock in > request_queue) if you agree. I rather like the queue lock being a pointer, so you can share at whatever level you want. Let's not grow the request_queue a full lock just to work around a bug elsewhere. 
-- Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Tue, 2005-03-29 at 14:03 +0200, Jens Axboe wrote: > It is quite a serious problem, not just for CFQ. SCSI referencing is > badly broken there. OK ... I accept that with regard to the queue lock. However, rather than trying to work out a way to tie all the refcounted objects together, what about the simpler solution of making the lock bound to the lifetime of the queue? As far as SCSI is concerned, we could simply move the lock into the request_queue structure and everything would work since the device holds a reference to the queue. The way it would work is that we'd simply have a lock in the request_queue structure, but it would be up to the device to pass it in to blk_init_queue. Then we'd alter the scsi_device sdev_lock to be a pointer to the queue lock? This scheme would also work for the current users who have a global lock (they simply wouldn't use the lock in the request_queue). The only cloud on the horizon with this scheme is that there may genuinely be places where we want multiple queues to share a non-global lock: situations where we have shared issue queues (like IDE), or shared tag resources are a possibility. To cope with those, we'd probably have to have a separately allocated, reference counted lock. However, I'm happy to implement the simpler solution (lock in request_queue) if you agree. James
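The layout James proposes can be sketched in a few lines of C. The struct and field names here (demo_queue, demo_sdev, __lock) are invented for illustration and deliberately simplified from the real request_queue/scsi_device: the queue owns the lock storage, the device keeps only a pointer to it, so the lock lives exactly as long as the queue does:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int locked; } demo_spinlock_t;  /* stand-in for spinlock_t */

/* Proposed layout: lock storage lives inside the queue itself. */
struct demo_queue {
    demo_spinlock_t __lock;       /* owned by the queue */
    demo_spinlock_t *queue_lock;  /* what users actually take; a driver
                                   * could point this at a shared lock
                                   * instead (the pointer scheme Jens
                                   * prefers) */
};

struct demo_sdev {
    demo_spinlock_t *sdev_lock;   /* now only a pointer, owns no storage */
};

static void demo_init_queue(struct demo_queue *q, struct demo_sdev *sdev)
{
    q->__lock.locked = 0;
    q->queue_lock = &q->__lock;   /* default: use the embedded lock */
    sdev->sdev_lock = q->queue_lock;
}
```

Because the device no longer owns any lock storage, freeing the scsi_device cannot invalidate the lock that a lingering io context is still taking.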
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
Jens Axboe wrote: > On Wed, Apr 06 2005, Arjan van de Ven wrote: > > > @@ -324,6 +334,7 @@ > > > issue_flush_fn *issue_flush_fn; > > > prepare_flush_fn *prepare_flush_fn; > > > end_flush_fn *end_flush_fn; > > > + release_queue_data_fn *release_queue_data_fn; > > > /* > > > * Auto-unplugging state > > where does this function method actually get called? > I missed the hunk in ll_rw_blk.c, rmk pointed the same thing out not 5 minutes ago :-) > The patch would not work anyways, as scsi_sysfs.c clears queuedata unconditionally. This is a better work-around, it just makes the queue hold a reference to the device as well, only killing it when the queue is torn down. > Still not super happy with it, but I don't see how to solve the circular dependency problem otherwise. Hello, Jens. I've been thinking about it for a while. The problem is that we're reference counting two different objects to track the lifetime of one entity. This happens in both SCSI upper and mid layers. In the upper layer, genhd and scsi_disk (or scsi_cd, ...) are ref'ed separately while they share their destiny together (not really a different entity) and in the middle layer scsi_device and request_queue do the same thing. Circular dependency is occurring because we separate one entity into two and reference count them separately. Two are actually one and necessarily want each other. (until death do us part. Wow, serious. :-) IMHO, what we need to do is consolidate ref counting such that in each layer only one object is reference counted, and the other object is freed when the ref counted object is released. The object of choice would be genhd in the upper layer and request_queue in the mid layer. All ref-counting should be updated to only ref those objects. We'll need to add a release callback to genhd and make request_queue properly reference counted. Conceptually, scsi_disk extends genhd and scsi_device extends request_queue. 
So, to go one step further, as what the UL represents is genhd (disk device) and the ML request_queue (request-based device), embedding scsi_disk into genhd and scsi_device into request_queue will make the architecture clearer. To do this, we'll need something like alloc_disk_with_udata(int minors, size_t udata_len) and the equivalent for request_queue. I've done this half-way, and then doing it without fixing the SCSI model seemed silly, so I got into working on the state model instead. (BTW, the state model is almost done, I'm about to run tests.) What do you think? Jens? -- tejun
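The alloc_disk_with_udata() idea amounts to allocating the private object in the same chunk as the refcounted one, so the two cannot die separately. Below is a userspace sketch of that allocation pattern using invented names (refobj, alloc_with_udata), not any real block-layer API:

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch of "embed the private object behind the refcounted one":
 * one allocation, one refcount, the user data freed together with
 * its owner. Names are hypothetical, not kernel APIs. */
struct refobj {
    int refcount;
    /* udata_len bytes of caller-private data follow this header */
};

static struct refobj *alloc_with_udata(size_t udata_len)
{
    struct refobj *obj = calloc(1, sizeof(*obj) + udata_len);
    if (obj)
        obj->refcount = 1;
    return obj;
}

static void *refobj_udata(struct refobj *obj)
{
    return obj + 1;              /* udata sits right after the header */
}

static void refobj_get(struct refobj *obj) { obj->refcount++; }

/* returns 1 when the object (header and udata together) was freed */
static int refobj_put(struct refobj *obj)
{
    if (--obj->refcount)
        return 0;
    free(obj);                   /* header and udata die as one */
    return 1;
}
```

With this shape there is nothing left to get out of sync: whoever holds the last reference to the refcounted object implicitly keeps the embedded private data alive too.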
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Wed, Apr 06 2005, Arjan van de Ven wrote: > > @@ -324,6 +334,7 @@ > > issue_flush_fn *issue_flush_fn; > > prepare_flush_fn *prepare_flush_fn; > > end_flush_fn *end_flush_fn; > > + release_queue_data_fn *release_queue_data_fn; > > > > /* > > * Auto-unplugging state > > where does this function method actually get called? I missed the hunk in ll_rw_blk.c, rmk pointed the same thing out not 5 minutes ago :-) The patch would not work anyways, as scsi_sysfs.c clears queuedata unconditionally. This is a better work-around, it just makes the queue hold a reference to the device as well, only killing it when the queue is torn down. Still not super happy with it, but I don't see how to solve the circular dependency problem otherwise. = drivers/scsi/scsi_lib.c 1.153 vs edited = --- 1.153/drivers/scsi/scsi_lib.c 2005-03-30 21:49:45 +02:00 +++ edited/drivers/scsi/scsi_lib.c 2005-04-06 14:41:15 +02:00 @@ -1420,6 +1420,13 @@ } EXPORT_SYMBOL(scsi_calculate_bounce_limit); +static void scsi_release_queue_data(request_queue_t *q) +{ + struct scsi_device *sdev = q->queuedata; + + put_device(&sdev->sdev_gendev); +} + struct request_queue *scsi_alloc_queue(struct scsi_device *sdev) { struct Scsi_Host *shost = sdev->host; @@ -1437,6 +1444,12 @@ blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost)); blk_queue_segment_boundary(q, shost->dma_boundary); blk_queue_issue_flush_fn(q, scsi_issue_flush_fn); + + /* + * let the queue drop a reference, when it is killed + */ + get_device(&sdev->sdev_gendev); + q->release_queue_data_fn = scsi_release_queue_data; /* * ordered tags are superior to flush ordering = include/linux/blkdev.h 1.162 vs edited = --- 1.162/include/linux/blkdev.h 2005-03-29 03:42:37 +02:00 +++ edited/include/linux/blkdev.h 2005-04-06 11:22:44 +02:00 @@ -279,6 +288,7 @@ typedef int (issue_flush_fn) (request_queue_t *, struct gendisk *, sector_t *); typedef int (prepare_flush_fn) (request_queue_t *, struct request *); typedef void (end_flush_fn) (request_queue_t *, struct request *); +typedef void (release_queue_data_fn) (request_queue_t *); enum blk_queue_state { Queue_down, @@ -324,6 +334,7 @@ issue_flush_fn *issue_flush_fn; prepare_flush_fn *prepare_flush_fn; end_flush_fn *end_flush_fn; + release_queue_data_fn *release_queue_data_fn; /* * Auto-unplugging state = drivers/block/ll_rw_blk.c 1.288 vs edited = --- 1.288/drivers/block/ll_rw_blk.c 2005-03-31 12:47:54 +02:00 +++ edited/drivers/block/ll_rw_blk.c 2005-04-06 11:24:51 +02:00 @@ -1621,6 +1623,9 @@ blk_sync_queue(q); + if (q->release_queue_data_fn) + q->release_queue_data_fn(q); + if (rl->rq_pool) mempool_destroy(rl->rq_pool); -- Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
> @@ -324,6 +334,7 @@ > issue_flush_fn *issue_flush_fn; > prepare_flush_fn *prepare_flush_fn; > end_flush_fn *end_flush_fn; > + release_queue_data_fn *release_queue_data_fn; > > /* > * Auto-unplugging state where does this function method actually get called?
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Tue, Mar 29 2005, Jens Axboe wrote: > On Tue, Mar 29 2005, Chris Rankin wrote: > > >> > I have one IDE hard disc, but I was using a USB memory stick at one > > > > point. (Notice the usb-storage and vfat modules in my list.) Could > > > > that be the troublesome SCSI device? > > --- Jens Axboe <[EMAIL PROTECTED]> wrote: > > > Yes, it probably is. What happens is that you insert the stick and do io > > > against it, which sets up a process io context for that device. That > > > context persists until the process exits (or later, if someone still > > > holds a reference to it), but the queue_lock will be dead when you yank > > > the usb device. > > > > > > It is quite a serious problem, not just for CFQ. SCSI referencing is > > > badly broken there. > > > > That would explain why it was nautilus which caused the oops then. > > Does this mean that the major distros aren't using the CFQ then? > > Because how else can they be avoiding this oops with USB storage > > devices? > > CFQ with io contexts is relatively new, only there since 2.6.10 or so. > On UP, we don't grab the queue lock effectively so the problem isn't seen > there. > > You can work around this issue by using a different default io scheduler > at boot time, and then select cfq for your ide hard drive when the > system has booted with: > > # echo cfq > /sys/block/hda/queue/scheduler > > (substitute hda for any other solid storage device). Can you check if this work-around solves the problem for you? Thanks! 
= include/linux/blkdev.h 1.162 vs edited = --- 1.162/include/linux/blkdev.h 2005-03-29 03:42:37 +02:00 +++ edited/include/linux/blkdev.h 2005-04-06 11:22:44 +02:00 @@ -279,6 +288,7 @@ typedef int (issue_flush_fn) (request_queue_t *, struct gendisk *, sector_t *); typedef int (prepare_flush_fn) (request_queue_t *, struct request *); typedef void (end_flush_fn) (request_queue_t *, struct request *); +typedef void (release_queue_data_fn) (request_queue_t *); enum blk_queue_state { Queue_down, @@ -324,6 +334,7 @@ issue_flush_fn *issue_flush_fn; prepare_flush_fn *prepare_flush_fn; end_flush_fn *end_flush_fn; + release_queue_data_fn *release_queue_data_fn; /* * Auto-unplugging state = drivers/scsi/scsi_sysfs.c 1.72 vs edited = --- 1.72/drivers/scsi/scsi_sysfs.c 2005-03-28 07:03:53 +02:00 +++ edited/drivers/scsi/scsi_sysfs.c 2005-04-06 11:24:27 +02:00 @@ -175,9 +175,6 @@ scsi_target_reap(scsi_target(sdev)); - kfree(sdev->inquiry); - kfree(sdev); - if (parent) put_device(parent); } = drivers/scsi/scsi_lib.c 1.153 vs edited = --- 1.153/drivers/scsi/scsi_lib.c 2005-03-30 21:49:45 +02:00 +++ edited/drivers/scsi/scsi_lib.c 2005-04-06 11:24:32 +02:00 @@ -1420,6 +1420,18 @@ } EXPORT_SYMBOL(scsi_calculate_bounce_limit); +static void scsi_release_queue_data(request_queue_t *q) +{ + struct scsi_device *sdev = q->queuedata; + + if (sdev) { + kfree(sdev->inquiry); + kfree(sdev); + } + + q->queuedata = NULL; +} + struct request_queue *scsi_alloc_queue(struct scsi_device *sdev) { struct Scsi_Host *shost = sdev->host; @@ -1437,6 +1449,8 @@ blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost)); blk_queue_segment_boundary(q, shost->dma_boundary); blk_queue_issue_flush_fn(q, scsi_issue_flush_fn); + + q->release_queue_data_fn = scsi_release_queue_data; /* * ordered tags are superior to flush ordering -- Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Tue, Mar 29 2005, Chris Rankin wrote: > >> > I have one IDE hard disc, but I was using a USB memory stick at one > > > point. (Notice the usb-storage and vfat modules in my list.) Could > > > that be the troublesome SCSI device? > --- Jens Axboe <[EMAIL PROTECTED]> wrote: > > Yes, it probably is. What happens is that you insert the stick and do io > > against it, which sets up a process io context for that device. That > > context persists until the process exits (or later, if someone still > > holds a reference to it), but the queue_lock will be dead when you yank > > the usb device. > > > > It is quite a serious problem, not just for CFQ. SCSI referencing is > > badly broken there. > > That would explain why it was nautilus which caused the oops then. > Does this mean that the major distros aren't using the CFQ then? > Because how else can they be avoiding this oops with USB storage > devices? CFQ with io contexts is relatively new, only there since 2.6.10 or so. On UP, we don't grab the queue lock effectively so the problem isn't seen there. You can work around this issue by using a different default io scheduler at boot time, and then select cfq for your ide hard drive when the system has booted with: # echo cfq > /sys/block/hda/queue/scheduler (substitute hda for any other solid storage device). -- Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
>> > I have one IDE hard disc, but I was using a USB memory stick at one > > point. (Notice the usb-storage and vfat modules in my list.) Could > > that be the troublesome SCSI device? --- Jens Axboe <[EMAIL PROTECTED]> wrote: > Yes, it probably is. What happens is that you insert the stick and do io > against it, which sets up a process io context for that device. That > context persists until the process exits (or later, if someone still > holds a reference to it), but the queue_lock will be dead when you yank > the usb device. > > It is quite a serious problem, not just for CFQ. SCSI referencing is > badly broken there. That would explain why it was nautilus which caused the oops then. Does this mean that the major distros aren't using the CFQ then? Because how else can they be avoiding this oops with USB storage devices?
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Tue, Mar 29 2005, Chris Rankin wrote: (please don't top post) > --- Jens Axboe <[EMAIL PROTECTED]> wrote: > > On Sun, Mar 27 2005, Chris Rankin wrote: > > > [gcc-3.4.3, Linux-2.6.11-SMP, Dual P4 Xeon with HT enabled] > > > > > > Hi, > > > > > > My Linux 2.6.11 box oopsed when I tried to logout. I have switched to > > > using the anticipatory > > > scheduler instead. > > > > > > Cheers, > > > Chris > > > > > > NMI Watchdog detected LOCKUP on CPU1, eip c0275cc7, registers: > > > Modules linked in: snd_pcm_oss snd_mixer_oss snd_usb_audio snd_usb_lib > > > snd_intel8x0 > > snd_seq_oss > > > snd_seq_midi snd_emu10k1_synth snd_emu10k1 snd_ac97_codec snd_pcm > > > snd_page_alloc > > snd_emux_synth > > > snd_seq_virmidi snd_rawmidi snd_seq_midi_event snd_seq_midi_emul > > > snd_hwdep snd_util_mem > > snd_seq > > > snd_seq_device snd_rtctimer snd_timer snd nls_iso8859_1 nls_cp437 vfat > > > fat usb_storage radeon > > drm > > > i2c_algo_bit emu10k1_gp gameport deflate zlib_deflate zlib_inflate > > > twofish serpent aes_i586 > > > blowfish des sha256 crypto_null af_key binfmt_misc eeprom i2c_sensor > > > button processor psmouse > > > pcspkr p4_clockmod speedstep_lib usbserial lp nfsd exportfs md5 ipv6 > > > sd_mod scsi_mod autofs > > nfs > > > lockd sunrpc af_packet ohci_hcd parport_pc parport e1000 video1394 > > > raw1394 i2c_i801 i2c_core > > > ohci1394 ieee1394 ehci_hcd soundcore pwc videodev uhci_hcd usbcore > > > intel_agp agpgart ide_cd > > cdrom > > > ext3 jbd > > > CPU:1 > > > EIP:0060:[]Not tainted VLI > > > EFLAGS: 00200086 (2.6.11) > > > EIP is at _spin_lock+0x7/0xf > > > eax: f7b8b01c ebx: f7c82b88 ecx: f7c82b94 edx: f6c33714 > > > esi: eb68ad88 edi: f6c33708 ebp: f6c33714 esp: f5b32f70 > > > ds: 007b es: 007b ss: 0068 > > > Process nautilus (pid: 5757, threadinfo=f5b32000 task=f7518020) > > > Stack: c01f7f79 00200282 f76bda24 f6c323e4 f7518020 > > > c01f1d0c > > >f5b32000 c011d7b3 0001 b65ffa40 f5b32fac > > > > > > f5b32000 c011d8d6 c0102e7f b65ffbf0 > > > 
b6640bf0 > > > Call Trace: > > [] cfq_exit_io_context+0x54/0xb3 > > [] exit_io_context+0x45/0x51 > > [] do_exit+0x205/0x308 > > [] next_thread+0x0/0xc > > [] syscall_call+0x7/0xb > > Code: 05 e8 3a e6 ff ff c3 ba 00 f0 ff ff 21 e2 81 42 14 00 01 00 00 f0 > > 81 28 00 00 00 01 74 > 05 e8 > > 1d e6 ff ff c3 f0 fe 08 79 09 f3 90 <80> 38 00 7e f9 eb f2 c3 f0 81 28 00 > > 00 00 01 74 05 e8 ff > e5 > > ff > > console shuts up ... > > The queue was gone by the time the process exited. What type of storage > do you have attached to the box? At least with SCSI, it has some > problems in this area - it will gladly free the scsi device structure > (where the queue lock is located) while the queue reference count still > hasn't dropped to zero. > > I have one IDE hard disc, but I was using a USB memory stick at one > point. (Notice the usb-storage and vfat modules in my list.) Could > that be the troublesome SCSI device? Yes, it probably is. What happens is that you insert the stick and do io against it, which sets up a process io context for that device. That context persists until the process exits (or later, if someone still holds a reference to it), but the queue_lock will be dead when you yank the usb device. It is quite a serious problem, not just for CFQ. SCSI referencing is badly broken there. -- Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
I have one IDE hard disc, but I was using a USB memory stick at one point. (Notice the usb-storage and vfat modules in my list.) Could that be the troublesome SCSI device? --- Jens Axboe <[EMAIL PROTECTED]> wrote: > On Sun, Mar 27 2005, Chris Rankin wrote: > > [gcc-3.4.3, Linux-2.6.11-SMP, Dual P4 Xeon with HT enabled] > > > > Hi, > > > > My Linux 2.6.11 box oopsed when I tried to logout. I have switched to using > > the anticipatory > > scheduler instead. > > > > Cheers, > > Chris > > > > NMI Watchdog detected LOCKUP on CPU1, eip c0275cc7, registers: > > Modules linked in: snd_pcm_oss snd_mixer_oss snd_usb_audio snd_usb_lib > > snd_intel8x0 > snd_seq_oss > > snd_seq_midi snd_emu10k1_synth snd_emu10k1 snd_ac97_codec snd_pcm > > snd_page_alloc > snd_emux_synth > > snd_seq_virmidi snd_rawmidi snd_seq_midi_event snd_seq_midi_emul snd_hwdep > > snd_util_mem > snd_seq > > snd_seq_device snd_rtctimer snd_timer snd nls_iso8859_1 nls_cp437 vfat fat > > usb_storage radeon > drm > > i2c_algo_bit emu10k1_gp gameport deflate zlib_deflate zlib_inflate twofish > > serpent aes_i586 > > blowfish des sha256 crypto_null af_key binfmt_misc eeprom i2c_sensor button > > processor psmouse > > pcspkr p4_clockmod speedstep_lib usbserial lp nfsd exportfs md5 ipv6 sd_mod > > scsi_mod autofs > nfs > > lockd sunrpc af_packet ohci_hcd parport_pc parport e1000 video1394 raw1394 > > i2c_i801 i2c_core > > ohci1394 ieee1394 ehci_hcd soundcore pwc videodev uhci_hcd usbcore > > intel_agp agpgart ide_cd > cdrom > > ext3 jbd > > CPU:1 > > EIP:0060:[]Not tainted VLI > > EFLAGS: 00200086 (2.6.11) > > EIP is at _spin_lock+0x7/0xf > > eax: f7b8b01c ebx: f7c82b88 ecx: f7c82b94 edx: f6c33714 > > esi: eb68ad88 edi: f6c33708 ebp: f6c33714 esp: f5b32f70 > > ds: 007b es: 007b ss: 0068 > > Process nautilus (pid: 5757, threadinfo=f5b32000 task=f7518020) > > Stack: c01f7f79 00200282 f76bda24 f6c323e4 f7518020 > > c01f1d0c > >f5b32000 c011d7b3 0001 b65ffa40 f5b32fac > > > > f5b32000 c011d8d6 c0102e7f b65ffbf0 > > 
b6640bf0 > > Call Trace: > > [] cfq_exit_io_context+0x54/0xb3 > > [] exit_io_context+0x45/0x51 > > [] do_exit+0x205/0x308 > > [] next_thread+0x0/0xc > > [] syscall_call+0x7/0xb > > Code: 05 e8 3a e6 ff ff c3 ba 00 f0 ff ff 21 e2 81 42 14 00 01 00 00 f0 81 > > 28 00 00 00 01 74 > 05 e8 > > 1d e6 ff ff c3 f0 fe 08 79 09 f3 90 <80> 38 00 7e f9 eb f2 c3 f0 81 28 00 > > 00 00 01 74 05 e8 ff > e5 > > ff > > console shuts up ... > > The queue was gone by the time the process exited. What type of storage > do you have attached to the box? At least with SCSI, it has some > problems in this area - it will gladly free the scsi device structure > (where the queue lock is located) while the queue reference count still > hasn't dropped to zero. > > -- > Jens Axboe
Re: [OOPS] 2.6.11 - NMI lockup with CFQ scheduler
On Sun, Mar 27 2005, Chris Rankin wrote: > [gcc-3.4.3, Linux-2.6.11-SMP, Dual P4 Xeon with HT enabled] > > Hi, > > My Linux 2.6.11 box oopsed when I tried to logout. I have switched to using > the anticipatory > scheduler instead. > > Cheers, > Chris > > NMI Watchdog detected LOCKUP on CPU1, eip c0275cc7, registers: > Modules linked in: snd_pcm_oss snd_mixer_oss snd_usb_audio snd_usb_lib > snd_intel8x0 snd_seq_oss > snd_seq_midi snd_emu10k1_synth snd_emu10k1 snd_ac97_codec snd_pcm > snd_page_alloc snd_emux_synth > snd_seq_virmidi snd_rawmidi snd_seq_midi_event snd_seq_midi_emul snd_hwdep > snd_util_mem snd_seq > snd_seq_device snd_rtctimer snd_timer snd nls_iso8859_1 nls_cp437 vfat fat > usb_storage radeon drm > i2c_algo_bit emu10k1_gp gameport deflate zlib_deflate zlib_inflate twofish > serpent aes_i586 > blowfish des sha256 crypto_null af_key binfmt_misc eeprom i2c_sensor button > processor psmouse > pcspkr p4_clockmod speedstep_lib usbserial lp nfsd exportfs md5 ipv6 sd_mod > scsi_mod autofs nfs > lockd sunrpc af_packet ohci_hcd parport_pc parport e1000 video1394 raw1394 > i2c_i801 i2c_core > ohci1394 ieee1394 ehci_hcd soundcore pwc videodev uhci_hcd usbcore intel_agp > agpgart ide_cd cdrom > ext3 jbd > CPU:1 > EIP:0060:[]Not tainted VLI > EFLAGS: 00200086 (2.6.11) > EIP is at _spin_lock+0x7/0xf > eax: f7b8b01c ebx: f7c82b88 ecx: f7c82b94 edx: f6c33714 > esi: eb68ad88 edi: f6c33708 ebp: f6c33714 esp: f5b32f70 > ds: 007b es: 007b ss: 0068 > Process nautilus (pid: 5757, threadinfo=f5b32000 task=f7518020) > Stack: c01f7f79 00200282 f76bda24 f6c323e4 f7518020 > c01f1d0c >f5b32000 c011d7b3 0001 b65ffa40 f5b32fac > > f5b32000 c011d8d6 c0102e7f b65ffbf0 > b6640bf0 > Call Trace: > [] cfq_exit_io_context+0x54/0xb3 > [] exit_io_context+0x45/0x51 > [] do_exit+0x205/0x308 > [] next_thread+0x0/0xc > [] syscall_call+0x7/0xb > Code: 05 e8 3a e6 ff ff c3 ba 00 f0 ff ff 21 e2 81 42 14 00 01 00 00 f0 81 28 > 00 00 00 01 74 05 e8 > 1d e6 ff ff c3 f0 fe 08 79 09 f3 90 <80> 38 00 7e 
f9 eb f2 c3 f0 81 28 00 00 > 00 01 74 05 e8 ff e5 > ff > console shuts up ... The queue was gone by the time the process exited. What type of storage do you have attached to the box? At least with SCSI, it has some problems in this area - it will gladly free the scsi device structure (where the queue lock is located) while the queue reference count still hasn't dropped to zero. -- Jens Axboe