On Thu, Mar 01 2007, Frank Seidel wrote:
> On Wednesday, 28 February 2007 19:02, Dan Williams wrote:
> > I can reliably reproduce a null pointer dereference on 2.6.20 and
> > 2.6.21-rc2. I will keep digging to find the kernel version where
> > this last worked, but wanted to see if there were any immediate
> > experiments I should try.
> > ...
> > Kernel 2.6.21-rc2 on an i686
> > ...
> > [ 431.709022] BUG: unable to handle kernel NULL pointer dereference
> > at virtual address 0000005c
> > [ 431.717993] printing eip:
> > ...
> > [ 431.825386] EIP is at cfq_dispatch_insert+0xb/0x53
> > ...
> > [ 431.887396] [<c01e1fc9>] cfq_dispatch_requests+0x138/0x3f0
>
> Hi,
> unfortunately I don't yet have enough knowledge of cfq and the
> kernel's internals...
> but looking at cfq_dispatch_insert+0xb it seems the struct request
> pointer given (as second parameter by cfq_dispatch_request) was NULL
> and dereferencing it in the RQ_CFQQ macro leads to this oops.
>
> The "break"-out patch below for __cfq_dispatch_request might be at least
> a possible workaround for this, but it could also be total bullsh..
> Perhaps someone smarter might pick this up.. and give a real fix.
>
> Have fun,
> Frank
> ---
>
>  block/cfq-iosched.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/block/cfq-iosched.c
> ===================================================================
> --- linux-2.6.orig/block/cfq-iosched.c
> +++ linux-2.6/block/cfq-iosched.c
> @@ -962,7 +962,8 @@ __cfq_dispatch_requests(struct cfq_data
>  		 * follow expired path, else get first next available
>  		 */
>  		if ((rq = cfq_check_fifo(cfqq)) == NULL)
> -			rq = cfqq->next_rq;
> +			if ((rq = cfqq->next_rq) == NULL)
> +				break;
>
>  		/*
>  		 * finally, insert request into driver dispatch list

That is not the right fix. A little further up in this function, a check
(well, a BUG_ON()) is done for a non-empty sort list, so we know at this
point that we have requests pending for this queue. When that is the
case, ->next_rq must always be kept up to date and non-NULL. The oops at
least tells us this; it should not be papered over.

The real fix is finding out _where_ this now isn't being updated. I'm
puzzled why this is hitting Dan, but no one else has reported anything.
Dan, did 2.6.19 work for you?

--
Jens Axboe
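
To illustrate the invariant described above (a non-empty sort list implies a non-NULL ->next_rq), here is a minimal userspace sketch. It is not the kernel code: the struct layouts, the nr_sorted counter, and the model_* helpers are hypothetical stand-ins, and the "buggy" removal path is an invented example of the kind of missed update one would be hunting for, not a claim about where the real bug is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's structures. */
struct request {
	int sector;
};

struct cfq_queue {
	size_t nr_sorted;        /* assumed: number of requests on the sort list */
	struct request *next_rq; /* cached "next request to dispatch" */
};

/*
 * The invariant from the mail: if requests are pending on the sort
 * list, next_rq must be non-NULL. Returns 1 if it holds, 0 otherwise.
 */
static int cfqq_invariant_holds(const struct cfq_queue *cfqq)
{
	return cfqq->nr_sorted == 0 || cfqq->next_rq != NULL;
}

/* Add a request and keep next_rq up to date. */
static void model_add_rq(struct cfq_queue *cfqq, struct request *rq)
{
	assert(rq != NULL);
	cfqq->nr_sorted++;
	if (cfqq->next_rq == NULL)
		cfqq->next_rq = rq;
}

/*
 * Invented buggy removal: drops the cached pointer without choosing a
 * successor, so next_rq can be NULL while requests are still pending.
 */
static void model_del_rq_buggy(struct cfq_queue *cfqq, struct request *rq)
{
	cfqq->nr_sorted--;
	if (cfqq->next_rq == rq)
		cfqq->next_rq = NULL; /* missing: pick the next pending request */
}
```

Checking cfqq_invariant_holds() after every update site, rather than breaking out of the dispatch loop when next_rq is NULL, is one way to catch the first place where the pointer goes stale. In this model, adding two requests and then removing the first with model_del_rq_buggy() leaves nr_sorted at 1 with next_rq NULL, which is the state the oops points at.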