Re: Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)

2008-01-07 Thread Mike Christie

James Bottomley wrote:
>>> However, there's still devloss_tmo to consider ... even in
>>> multipath, I don't think you want to signal path failure until
>>> devloss_tmo has fired otherwise you'll get too many transient up/down
>>> events which damage performance if the array has an expensive failover
>>> model.
>>>
>> Yes. But currently we have a very high failover latency as we always have
>> to wait for the requeued commands to time-out.
>> Hence we're damaging performance on arrays with inexpensive failover.
>
> If it's an either/or choice between the two that's showing our current
> approach to multi-path is broken.
>
>>> The other problem is what to do with in-flight commands at the time the
>>> link went down.  With your current patch, they're still stuck until they
>>> time out ... surely there needs to be some type of recovery mechanism
>>> for these?
>>>
>> Well, the in-flight commands are owned by the HBA driver, which should
>> have the proper code to terminate / return those commands with the
>> appropriate codes. They will then be rescheduled and will be caught
>> like 'normal' IO requests.
>
> But my point is that if a driver goes blocked, those commands will be
> forced to wait the blocked timeout anyway, so your proposed patch does
> nothing to improve the case for dm anyway ... you only avoid commands
> stuck when a device goes blocked if by chance its request queue was
> empty.
>


How about my patches to use new transport error values and make
iSCSI and FC behave the same?


The problem I think Hannes and I are both trying to solve is this:

1. We do not want to wait for dev_loss_tmo seconds for failover.

2. The FC drivers can hook into the fast_io_fail_tmo related callouts, and 
with that the tmo can be set to a very low value, like a couple of seconds, 
when multipath is in use, so failovers are fast. However, there is a bug: 
when the fast_io_fail_tmo fires, requests that made it to the driver get 
failed and returned to the multipath layer, but commands in the blocked 
request queue are stuck there until dev_loss_tmo fires.


With my patches here (need to be rediffed and for FC I need to handle 
JamesS's comments about not using a new field for the fast_fail_timeout 
state bit):


http://marc.info/?l=linux-scsi&m=117399843216280&w=2
http://marc.info/?l=linux-scsi&m=117399544112073&w=2
http://marc.info/?l=linux-scsi&m=117399844316771&w=2
http://marc.info/?l=linux-scsi&m=117400203324693&w=2
http://marc.info/?l=linux-scsi&m=117400203324690&w=2

For FC we can use the fast_io_fail_tmo for fast failovers, and commands 
will not get stuck in a blocked queue for dev_loss_tmo seconds because 
when the fast_io_fail_tmo fires the target's queues are unblocked and 
fc_remote_port_chkready() kicks in (iSCSI does the same with the 
patches in the links). And with the patches, if multipath-tools is 
sending its path testing IO it will get a DID_TRANSPORT_* error code 
that it can use to make a decent path failing decision.
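
The difference between the two timers can be sketched as a small model. This is only an illustration of the behaviour described above, not kernel code: the sketch_* names, the struct layout and the unblock_on_fast_fail switch are all hypothetical, and the only thing taken from the mail is which timer bounds the wait for a command depending on where it sits when the link goes down.

```c
#include <assert.h>

/* Where a command is when the link goes down (hypothetical model). */
enum sketch_cmd_location {
	SKETCH_IN_DRIVER,		/* already handed to the HBA driver */
	SKETCH_IN_BLOCKED_QUEUE		/* still in the blocked request queue */
};

/* Hypothetical stand-in for the per-remote-port timer settings. */
struct sketch_rport {
	unsigned int fast_io_fail_tmo;	/* seconds */
	unsigned int dev_loss_tmo;	/* seconds */
	int unblock_on_fast_fail;	/* 0: the bug described above, 1: with the patches */
};

/*
 * Seconds after link loss at which a command is failed back to the
 * multipath layer in this model.
 */
static unsigned int sketch_failover_latency(const struct sketch_rport *rp,
					    enum sketch_cmd_location where)
{
	/* The model assumes the fast-fail timer is the shorter one. */
	assert(rp->fast_io_fail_tmo <= rp->dev_loss_tmo);

	/* Commands owned by the driver are failed when fast_io_fail_tmo fires. */
	if (where == SKETCH_IN_DRIVER)
		return rp->fast_io_fail_tmo;

	/* Queued commands wait for dev_loss_tmo unless the queue is unblocked. */
	return rp->unblock_on_fast_fail ? rp->fast_io_fail_tmo
					: rp->dev_loss_tmo;
}
```

With illustrative values of 5 seconds for fast_io_fail_tmo and 60 seconds for dev_loss_tmo, a command caught in the blocked queue waits 60 seconds in the buggy case and 5 seconds once the queue is unblocked on fast fail.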

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)

2008-01-07 Thread James Bottomley
On Mon, 2008-01-07 at 15:05 +0100, Hannes Reinecke wrote:
> James Bottomley wrote:
> > On Fri, 2007-12-14 at 10:00 +0100, Hannes Reinecke wrote:
> >> James Bottomley wrote:
> >>> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> >>>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> >>>> have, by mid-week) and I won't do anything here.
> >>> Just to confirm what I think I'm going to be doing:  rebasing the
> >>> scsi-misc tree to remove this commit:
> >>>
> >>> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> >>> Author: Hannes Reinecke <[EMAIL PROTECTED]>
> >>> Date:   Tue Nov 6 09:23:40 2007 +0100
> >>>
> >>> [SCSI] Do not requeue requests if REQ_FAILFAST is set
> >>>
> >>> And its allied fix ups:
> >>>
> >>> commit 983289045faa96fba8841d3c51b98bb8623d9504
> >>> Author: James Bottomley <[EMAIL PROTECTED]>
> >>> Date:   Sat Nov 24 19:47:25 2007 +0200
> >>>
> >>> [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> >>>
> >>> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> >>> Author: James Bottomley <[EMAIL PROTECTED]>
> >>> Date:   Sat Nov 24 19:55:53 2007 +0200
> >>>
> >>> [SCSI] fix domain validation to work again
> >>>
> >>> James
> >>>
> >>>
> >> Or just apply my latest patch (cf Undo __scsi_kill_request).
> >> The main point is that we shouldn't retry requests
> >> with FAILFAST set when the queue is blocked. AFAICS
> >> only FC and iSCSI transports set the queue to blocked,
> >> and use this to indicate a loss of connection. So any
> >> retry with queue blocked is futile.
> > 
> > I still don't think this is the right approach.
> > 
> > For link up/down events, those are direct pathing events and should be
> > signalled along a kernel notifier, not by mucking with the SCSI state
> > machine.
> Of course they will be signalled. And eventually we should patch up
> multipath-tools to read the existing events from the uevent socket.
> But even with that patch there is a fairly large window during
> which IOs will be sent to the blocked device, and hence will be
> stuck in the request queue until the timer expires.

But the assumption your code makes is that if REQ_FAILFAST is set then
it's a dm request ... and that's not true.  The code in question
negatively impacts other users of REQ_FAILFAST.  For every user other
than dm, the right thing to do is to wait out the block.
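
The disagreement can be put as two decision functions. This is a hedged sketch, not the patch itself: the sketch_* names are hypothetical, and the from_dm field in particular is an assumption made for illustration, since the point being argued is that the SCSI layer cannot tell from REQ_FAILFAST alone who issued the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_action { SKETCH_DISPATCH, SKETCH_REQUEUE, SKETCH_FAIL };

struct sketch_request {
	bool failfast;	/* stands in for REQ_FAILFAST */
	bool from_dm;	/* hypothetical: no such information exists in the request */
};

/* Policy of the patch under discussion: FAILFAST on a blocked queue fails. */
static enum sketch_action sketch_patch_policy(const struct sketch_request *rq,
					      bool queue_blocked)
{
	assert(rq != NULL);
	if (!queue_blocked)
		return SKETCH_DISPATCH;
	return rq->failfast ? SKETCH_FAIL : SKETCH_REQUEUE;
}

/* What the objection asks for: only dm requests fail early, others wait. */
static enum sketch_action sketch_wait_policy(const struct sketch_request *rq,
					     bool queue_blocked)
{
	assert(rq != NULL);
	if (!queue_blocked)
		return SKETCH_DISPATCH;
	return (rq->failfast && rq->from_dm) ? SKETCH_FAIL : SKETCH_REQUEUE;
}
```

A FAILFAST request that does not come from dm is failed by the first function and requeued by the second, which is the case the objection is about.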

> > However, there's still devloss_tmo to consider ... even in
> > multipath, I don't think you want to signal path failure until
> > devloss_tmo has fired otherwise you'll get too many transient up/down
> > events which damage performance if the array has an expensive failover
> > model.
> > 
> Yes. But currently we have a very high failover latency as we always have
> to wait for the requeued commands to time-out.
> Hence we're damaging performance on arrays with inexpensive failover.

If it's an either/or choice between the two that's showing our current
approach to multi-path is broken.

> > The other problem is what to do with in-flight commands at the time the
> > link went down.  With your current patch, they're still stuck until they
> > time out ... surely there needs to be some type of recovery mechanism
> > for these?
> > 
> Well, the in-flight commands are owned by the HBA driver, which should
> have the proper code to terminate / return those commands with the
> appropriate codes. They will then be rescheduled and will be caught
> like 'normal' IO requests.

But my point is that if a driver goes blocked, those commands will be
forced to wait the blocked timeout anyway, so your proposed patch does
nothing to improve the case for dm anyway ... you only avoid commands
stuck when a device goes blocked if by chance its request queue was
empty.

James




Multipath failover handling (Was: Re: 2.6.24-rc3-mm1)

2008-01-07 Thread Hannes Reinecke
James Bottomley wrote:
> On Fri, 2007-12-14 at 10:00 +0100, Hannes Reinecke wrote:
>> James Bottomley wrote:
>>> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
>>>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
>>>> have, by mid-week) and I won't do anything here.
>>> Just to confirm what I think I'm going to be doing:  rebasing the
>>> scsi-misc tree to remove this commit:
>>>
>>> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
>>> Author: Hannes Reinecke <[EMAIL PROTECTED]>
>>> Date:   Tue Nov 6 09:23:40 2007 +0100
>>>
>>> [SCSI] Do not requeue requests if REQ_FAILFAST is set
>>>
>>> And its allied fix ups:
>>>
>>> commit 983289045faa96fba8841d3c51b98bb8623d9504
>>> Author: James Bottomley <[EMAIL PROTECTED]>
>>> Date:   Sat Nov 24 19:47:25 2007 +0200
>>>
>>> [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
>>>
>>> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
>>> Author: James Bottomley <[EMAIL PROTECTED]>
>>> Date:   Sat Nov 24 19:55:53 2007 +0200
>>>
>>> [SCSI] fix domain validation to work again
>>>
>>> James
>>>
>>>
>> Or just apply my latest patch (cf Undo __scsi_kill_request).
>> The main point is that we shouldn't retry requests
>> with FAILFAST set when the queue is blocked. AFAICS
>> only FC and iSCSI transports set the queue to blocked,
>> and use this to indicate a loss of connection. So any
>> retry with queue blocked is futile.
> 
> I still don't think this is the right approach.
> 
> For link up/down events, those are direct pathing events and should be
> signalled along a kernel notifier, not by mucking with the SCSI state
> machine.
Of course they will be signalled. And eventually we should patch up
multipath-tools to read the existing events from the uevent socket.
But even with that patch there is a fairly large window during
which IOs will be sent to the blocked device, and hence will be
stuck in the request queue until the timer expires.

> However, there's still devloss_tmo to consider ... even in
> multipath, I don't think you want to signal path failure until
> devloss_tmo has fired otherwise you'll get too many transient up/down
> events which damage performance if the array has an expensive failover
> model.
> 
Yes. But currently we have a very high failover latency as we always have
to wait for the requeued commands to time-out.
Hence we're damaging performance on arrays with inexpensive failover.

> The other problem is what to do with in-flight commands at the time the
> link went down.  With your current patch, they're still stuck until they
> time out ... surely there needs to be some type of recovery mechanism
> for these?
> 
Well, the in-flight commands are owned by the HBA driver, which should
have the proper code to terminate / return those commands with the
appropriate codes. They will then be rescheduled and will be caught
like 'normal' IO requests.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)


Re: 2.6.24-rc3-mm1

2007-12-14 Thread James Bottomley

On Fri, 2007-12-14 at 10:00 +0100, Hannes Reinecke wrote:
> James Bottomley wrote:
> > On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> >> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> >> have, by mid-week) and I won't do anything here.
> > 
> > Just to confirm what I think I'm going to be doing:  rebasing the
> > scsi-misc tree to remove this commit:
> > 
> > commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> > Author: Hannes Reinecke <[EMAIL PROTECTED]>
> > Date:   Tue Nov 6 09:23:40 2007 +0100
> > 
> > [SCSI] Do not requeue requests if REQ_FAILFAST is set
> > 
> > And its allied fix ups:
> > 
> > commit 983289045faa96fba8841d3c51b98bb8623d9504
> > Author: James Bottomley <[EMAIL PROTECTED]>
> > Date:   Sat Nov 24 19:47:25 2007 +0200
> > 
> > [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> > 
> > commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> > Author: James Bottomley <[EMAIL PROTECTED]>
> > Date:   Sat Nov 24 19:55:53 2007 +0200
> > 
> > [SCSI] fix domain validation to work again
> > 
> > James
> > 
> > 
> Or just apply my latest patch (cf Undo __scsi_kill_request).
> The main point is that we shouldn't retry requests
> with FAILFAST set when the queue is blocked. AFAICS
> only FC and iSCSI transports set the queue to blocked,
> and use this to indicate a loss of connection. So any
> retry with queue blocked is futile.

I still don't think this is the right approach.

For link up/down events, those are direct pathing events and should be
signalled along a kernel notifier, not by mucking with the SCSI state
machine.  However, there's still devloss_tmo to consider ... even in
multipath, I don't think you want to signal path failure until
devloss_tmo has fired otherwise you'll get too many transient up/down
events which damage performance if the array has an expensive failover
model.

The other problem is what to do with in-flight commands at the time the
link went down.  With your current patch, they're still stuck until they
time out ... surely there needs to be some type of recovery mechanism
for these?

James




Re: 2.6.24-rc3-mm1

2007-12-14 Thread Hannes Reinecke
James Bottomley wrote:
> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
>> have, by mid-week) and I won't do anything here.
> 
> Just to confirm what I think I'm going to be doing:  rebasing the
> scsi-misc tree to remove this commit:
> 
> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> Author: Hannes Reinecke <[EMAIL PROTECTED]>
> Date:   Tue Nov 6 09:23:40 2007 +0100
> 
> [SCSI] Do not requeue requests if REQ_FAILFAST is set
> 
> And its allied fix ups:
> 
> commit 983289045faa96fba8841d3c51b98bb8623d9504
> Author: James Bottomley <[EMAIL PROTECTED]>
> Date:   Sat Nov 24 19:47:25 2007 +0200
> 
> [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> 
> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> Author: James Bottomley <[EMAIL PROTECTED]>
> Date:   Sat Nov 24 19:55:53 2007 +0200
> 
> [SCSI] fix domain validation to work again
> 
> James
> 
> 
Or just apply my latest patch (cf Undo __scsi_kill_request).
The main point is that we shouldn't retry requests
with FAILFAST set when the queue is blocked. AFAICS
only FC and iSCSI transports set the queue to blocked,
and use this to indicate a loss of connection. So any
retry with queue blocked is futile.

Cheers,

Hannes

-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)


Re: 2.6.24-rc3-mm1

2007-12-12 Thread Jens Axboe
On Wed, Dec 12 2007, Boaz Harrosh wrote:
> On Tue, Dec 11 2007 at 18:33 +0200, James Bottomley <[EMAIL PROTECTED]> wrote:
> > On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> >> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> >> have, by mid-week) and I won't do anything here.
> > 
> > Just to confirm what I think I'm going to be doing:  rebasing the
> > scsi-misc tree to remove this commit:
> > 
> > commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> > Author: Hannes Reinecke <[EMAIL PROTECTED]>
> > Date:   Tue Nov 6 09:23:40 2007 +0100
> > 
> > [SCSI] Do not requeue requests if REQ_FAILFAST is set
> > 
> > And its allied fix ups:
> > 
> > commit 983289045faa96fba8841d3c51b98bb8623d9504
> > Author: James Bottomley <[EMAIL PROTECTED]>
> > Date:   Sat Nov 24 19:47:25 2007 +0200
> > 
> > [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> > 
> > commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> > Author: James Bottomley <[EMAIL PROTECTED]>
> > Date:   Sat Nov 24 19:55:53 2007 +0200
> > 
> > [SCSI] fix domain validation to work again
> > 
> > James
> > 
> 
> The problems caused by this patch were nagging me at the back of my head
> from the beginning. Why should we fail on a check of FAIL_FAST in all kinds
> of weird places like boots, when the only place that should ever set the 
> flag should be one of the multi-path drivers? Finally it struck me:
> 
> It might be a bug in ll_rw_blk: in blk_rq_bio_prep() there is this:
> 
> static void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
>   struct bio *bio)
> {
>   /* first two bits are identical in rq->cmd_flags and bio->bi_rw */
>   rq->cmd_flags |= (bio->bi_rw & 3);
>   ...
> 
> Now this is no longer true and is a bug.
> Second bit of bio->bi_rw defined in bio.h is:
> #define BIO_RW_AHEAD  1
> but
> Second bit of rq->cmd_flags is __REQ_FAILFAST
> 
> so maybe we are getting FAILFAST in the wrong places?

But that's actually on purpose, though the comment is pretty much crap.
We don't want to be retrying readahead requests, those should always
just be tossable.

-- 
Jens Axboe



Re: 2.6.24-rc3-mm1

2007-12-12 Thread Boaz Harrosh
On Tue, Dec 11 2007 at 18:33 +0200, James Bottomley <[EMAIL PROTECTED]> wrote:
> On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
>> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
>> have, by mid-week) and I won't do anything here.
> 
> Just to confirm what I think I'm going to be doing:  rebasing the
> scsi-misc tree to remove this commit:
> 
> commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> Author: Hannes Reinecke <[EMAIL PROTECTED]>
> Date:   Tue Nov 6 09:23:40 2007 +0100
> 
> [SCSI] Do not requeue requests if REQ_FAILFAST is set
> 
> And its allied fix ups:
> 
> commit 983289045faa96fba8841d3c51b98bb8623d9504
> Author: James Bottomley <[EMAIL PROTECTED]>
> Date:   Sat Nov 24 19:47:25 2007 +0200
> 
> [SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE
> 
> commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
> Author: James Bottomley <[EMAIL PROTECTED]>
> Date:   Sat Nov 24 19:55:53 2007 +0200
> 
> [SCSI] fix domain validation to work again
> 
> James
> 

The problems caused by this patch were nagging me at the back of my head
from the beginning. Why should we fail on a check of FAIL_FAST in all kinds
of weird places like boots, when the only place that should ever set the 
flag should be one of the multi-path drivers? Finally it struck me:

It might be a bug in ll_rw_blk: in blk_rq_bio_prep() there is this:

static void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
			    struct bio *bio)
{
	/* first two bits are identical in rq->cmd_flags and bio->bi_rw */
	rq->cmd_flags |= (bio->bi_rw & 3);
	...

Now this is no longer true and is a bug.
Second bit of bio->bi_rw defined in bio.h is:
#define BIO_RW_AHEAD	1
but
Second bit of rq->cmd_flags is __REQ_FAILFAST

so maybe we are getting FAILFAST in the wrong places?

(I will look for an old patch I sent a year ago that fixes
this bug)

Boaz
 


Re: 2.6.24-rc3-mm1

2007-12-11 Thread James Bottomley

On Mon, 2007-11-26 at 22:15 -0800, Andrew Morton wrote:
> OK, thanks.  I'll assume that James and Hannes have this in hand (or will
> have, by mid-week) and I won't do anything here.

Just to confirm what I think I'm going to be doing:  rebasing the
scsi-misc tree to remove this commit:

commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
Author: Hannes Reinecke <[EMAIL PROTECTED]>
Date:   Tue Nov 6 09:23:40 2007 +0100

[SCSI] Do not requeue requests if REQ_FAILFAST is set

And its allied fix ups:

commit 983289045faa96fba8841d3c51b98bb8623d9504
Author: James Bottomley <[EMAIL PROTECTED]>
Date:   Sat Nov 24 19:47:25 2007 +0200

[SCSI] fix up REQ_FASTFAIL not to fail when state is QUIESCE

commit 9dd15a13b332e9f5c8ee752b1ccd9b84cb5bdf17
Author: James Bottomley <[EMAIL PROTECTED]>
Date:   Sat Nov 24 19:55:53 2007 +0200

[SCSI] fix domain validation to work again

James




Re: 2.6.24-rc3-mm1 make headers_check fails

2007-12-02 Thread Avi Kivity

Andrew Morton wrote:
> On Wed, 21 Nov 2007 12:17:14 +0200 Avi Kivity <[EMAIL PROTECTED]> wrote:
>
> > Avi Kivity wrote:
> > >>>>>> The make headers_check fails,
> > >>>>>>
> > >>>>>>  CHECK   include/linux/usb/gadgetfs.h
> > >>>>>>  CHECK   include/linux/usb/ch9.h
> > >>>>>>  CHECK   include/linux/usb/cdc.h
> > >>>>>>  CHECK   include/linux/usb/audio.h
> > >>>>>>  CHECK   include/linux/kvm.h
> > >>>>>> /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires
> > >>>>>> asm/kvm.h, which does not exist in exported headers
> > >>>>>
> > >>>>> hm, works for me, on i386 and x86_64.  What's different over there?
> > >>>>
> > >>>> Hi Andrew,
> > >>>>
> > >>>> It fails on the powerpc box, with allyesconfig option.
> > >>>
> > >>> How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.
> > >>
> > >> Is kvm x86 specific? Then move the .h file to asm-x86.
> > >> Otherwise no good idea...
> > >
> > > kvm.h is x86 specific today, but will be s390, ppc, ia64, and x86
> > > specific tomorrow.
> > >
> > > What about having an asm-generic/kvm.h with a nice #error? would
> > > that suit?
> >
> > headers_check continues to complain.  Is the only recourse to add
> > asm/kvm.h for all archs?
>
> That would work.
>
> Meanwhile my recourse is to drop the kvm tree ;)

Since you put it this way...

I committed the attached (sorry) patch to kvm.git.  Rather than 
touching 2*($NARCH - 1) files, I changed include/linux/Kbuild to only 
export kvm.h if the arch actually supports it.  Currently that's just x86.



--
error compiling committee.c: too many arguments to function

From a393444c97f6d7355a6d7d6d7aeb80f1e72472b1 Mon Sep 17 00:00:00 2001
From: Avi Kivity <[EMAIL PROTECTED]>
Date: Sun, 2 Dec 2007 10:50:06 +0200
Subject: [PATCH] KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h>
includes, not existing.  Rather than add a zillion <asm/kvm.h>s, export kvm.h
only if the arch actually supports it.

Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig     |    3 +++
 drivers/kvm/Kconfig  |    4 ++--
 include/linux/Kbuild |    2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 368864d..eded44e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -112,6 +112,9 @@ config GENERIC_TIME_VSYSCALL
 	bool
 	default X86_64
 
+config ARCH_SUPPORTS_KVM
+	bool
+	default y
 
 
 
diff --git a/drivers/kvm/Kconfig b/drivers/kvm/Kconfig
index 6569206..4086080 100644
--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -3,7 +3,7 @@
 #
 menuconfig VIRTUALIZATION
 	bool "Virtualization"
-	depends on X86
+	depends on ARCH_SUPPORTS_KVM || X86
 	default y
 	---help---
 	  Say Y here to get to see options for using your Linux host to run other
@@ -16,7 +16,7 @@ if VIRTUALIZATION
 
 config KVM
 	tristate "Kernel-based Virtual Machine (KVM) support"
-	depends on X86 && EXPERIMENTAL
+	depends on ARCH_SUPPORTS_KVM && EXPERIMENTAL
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	---help---
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index 105c5d6..397197f 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -254,7 +254,7 @@ unifdef-y += kd.h
 unifdef-y += kernelcapi.h
 unifdef-y += kernel.h
 unifdef-y += keyboard.h
-unifdef-y += kvm.h
+unifdef-$(CONFIG_ARCH_SUPPORTS_KVM) += kvm.h
 unifdef-y += llc.h
 unifdef-y += loop.h
 unifdef-y += lp.h
-- 
1.5.3



Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-28 Thread Laurent Riffard
Le 25.11.2007 21:39, Laurent Riffard a écrit :
> Le 25.11.2007 08:37, James Bottomley a écrit :
>> On Sat, 2007-11-24 at 23:59 +0100, Laurent Riffard wrote:
>>> Le 24.11.2007 14:26, James Bottomley a écrit :
>>>> OK, could you post dmesgs again, please.  I actually tested this with an
>>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>>> again.
>>> James, 
>>>
>>> Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates the
>>> BLOCK and QUIESCE states correctly" (http://lkml.org/lkml/2007/11/24/8).
>>>
[...]
>>> [   25.521256] scsi0 : pata_via
>>> [   25.521711] scsi1 : pata_via
>>> [   25.524089] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xb800 irq 14
>>> [   25.524176] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xb808 irq 15
>>> [   25.683141] ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
>>> [   25.683208] ata1.00: 78165360 sectors, multi 16: LBA 
>>> [   25.683475] ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
>>> [   25.684116] ata1.01: 160086528 sectors, multi 16: LBA 
>>> [   25.691127] ata1.00: configured for UDMA/100
>>> [   25.699142] ata1.01: configured for UDMA/100
>>> [   26.170860] ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL05, max UDMA/33
>>> [   26.171562] ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
>>> [   26.330839] ata2.00: configured for UDMA/33
>>> [   26.490828] ata2.01: configured for MWDMA2
>>> [   26.503014] scsi 0:0:0:0: Direct-Access ATA  ST340016A 3.75 PQ: 0 ANSI: 5
>>> [   26.504670] scsi 0:0:1:0: Direct-Access ATA  Maxtor 6Y080L0 YAR4 PQ: 0 ANSI: 5
>>> [   26.509842] scsi 1:0:0:0: CD-ROM HL-DT-ST DVDRAM GSA-4165B DL05 PQ: 0 ANSI: 5
>>> [   26.511673] scsi 1:0:1:0: CD-ROM E-IDE CD-950E/AKU A4Q PQ: 0 ANSI: 5
>> [...]
>>> [   60.216113] sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>> [   60.216124] end_request: I/O error, dev sda, sector 16460
>> I think this one's quite easy:  PATA devices in libata are queue depth 1
>> (since they don't do NCQ).  Thus, they're peculiarly sensitive to the
>> bug where we fail over queue depth requests.
>>
>> On the other hand, I don't see how a filesystem request is getting
>> REQ_FAILFAST ... unless there's a bio or readahead issue involved.
>> Anyway, could you try this patch:
>>
>> http://marc.info/?l=linux-scsi&m=119592627425498
>>
>> Which should fix the queue depth issue, and see if the errors go away?
> 
> No, this one doesn't help...
 
still happens with 2.6.24-rc3-mm2...
-- 
laurent


Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-27 Thread Andrew Morton
On Wed, 21 Nov 2007 12:17:14 +0200 Avi Kivity <[EMAIL PROTECTED]> wrote:

> Avi Kivity wrote:
> >>>>>> The make headers_check fails,
> >>>>>>
> >>>>>>  CHECK   include/linux/usb/gadgetfs.h
> >>>>>>  CHECK   include/linux/usb/ch9.h
> >>>>>>  CHECK   include/linux/usb/cdc.h
> >>>>>>  CHECK   include/linux/usb/audio.h
> >>>>>>  CHECK   include/linux/kvm.h
> >>>>>> /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires
> >>>>>> asm/kvm.h, which does not exist in exported headers
> >>>>>
> >>>>> hm, works for me, on i386 and x86_64.  What's different over there?
> >>>>
> >>>> Hi Andrew,
> >>>>
> >>>> It fails on the powerpc box, with allyesconfig option.
> >>>
> >>> How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.
> >>
> >> Is kvm x86 specific? Then move the .h file to asm-x86.
> >> Otherwise no good idea...
> >
> > kvm.h is x86 specific today, but will be s390, ppc, ia64, and x86
> > specific tomorrow.
> >
> > What about having an asm-generic/kvm.h with a nice #error? would
> > that suit?
>
> headers_check continues to complain.  Is the only recourse to add
> asm/kvm.h for all archs?
> 

That would work.

Meanwhile my recourse is to drop the kvm tree ;)


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-27 Thread Rik van Riel
On Mon, 26 Nov 2007 15:28:32 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Tue, 27 Nov 2007 00:14:17 +0100
> Jiri Slaby <[EMAIL PROTECTED]> wrote:
> 
> > On 11/26/2007 11:17 PM, Andrew Morton wrote:
> > >> Maybe if you can emit a broken-out with the fresh pull to test?
> > > 
> > > http://userweb.kernel.org/~akpm/mmotm/ is current.  But it probably won't
> > > compile. 
> > 
> > Yes it did :). And it worked. Both in qemu and on my desktop...
> 
> boggle.  Let's slap 2.6.25 on it and take the rest of the year off.

No worries, the mmotm compiling issue seems to have been fixed:

  CC [M]  drivers/scsi/libsas/sas_ata.o
drivers/scsi/libsas/sas_ata.c:39: error: field ‘rphy’ has incomplete type
drivers/scsi/libsas/sas_ata.c: In function ‘sas_discover_sata’:
drivers/scsi/libsas/sas_ata.c:773: error: implicit declaration of function ‘ata_sas_rphy_alloc’
drivers/scsi/libsas/sas_ata.c:775: error: dereferencing pointer to incomplete type
drivers/scsi/libsas/sas_ata.c:775: warning: assignment makes pointer from integer without a cast
drivers/scsi/libsas/sas_ata.c:781: error: dereferencing pointer to incomplete type
drivers/scsi/libsas/sas_ata.c:782: error: dereferencing pointer to incomplete type
drivers/scsi/libsas/sas_ata.c:784: warning: type defaults to ‘int’ in declaration of ‘__mptr’
drivers/scsi/libsas/sas_ata.c:784: warning: initialization from incompatible pointer type
drivers/scsi/libsas/sas_ata.c:791: error: implicit declaration of function ‘ata_sas_rphy_add’
drivers/scsi/libsas/sas_ata.c:807: error: implicit declaration of function ‘ata_sas_rphy_delete’
drivers/scsi/libsas/sas_ata.c:809: error: implicit declaration of function ‘ata_sas_rphy_free’
make[3]: *** [drivers/scsi/libsas/sas_ata.o] Error 1
make[2]: *** [drivers/scsi/libsas] Error 2
make[1]: *** [drivers/scsi] Error 2
make: *** [drivers] Error 2

So much for continuing the bisect with that tree, to find the
cause of the second bug :)

Guess I'll extract an x86 tree changeset first, to place into
the 2.6.23-rc3-mm1 broken out tree and work from there...

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


Re: 2.6.24-rc3-mm1 -- arch/x86/xen/enlighten.c:591: error: 'TLB_FLUSH_ALL' undeclared (first use in this function)

2007-11-27 Thread Jeremy Fitzhardinge
Miles Lane wrote:
> arch/x86/xen/enlighten.c: In function 'xen_flush_tlb_others':
> arch/x86/xen/enlighten.c:591: error: 'TLB_FLUSH_ALL' undeclared (first
> use in this function)
> arch/x86/xen/enlighten.c:591: error: (Each undeclared identifier is
> reported only once
> arch/x86/xen/enlighten.c:591: error: for each function it appears in.)
> make[1]: *** [arch/x86/xen/enlighten.o] Error 1
>   

Hm, I can't reproduce this in current git with your .config.  Is there
something in -mm which touches the tlb headers?

I do have a stack of tglx's x86 unification patches applied as well. 
Perhaps they help.

J


Re: 2.6.24-rc3-mm1 - brick my Dell Latitude D820

2007-11-27 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> Otherwise, please proceed to work out which diff I need to drop and 
> hope like hell that it isn't git-x86..

hm? x86.git is fully bisectable - so a more accurate statement would be 
"and hope that it's x86.git, so that it can be properly bisected" :-) 
For x86.git bisection, pull the 'mm' branch from:

   git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Ingo


Re: 2.6.24-rc3-mm1 - brick my Dell Latitude D820

2007-11-27 Thread Valdis . Kletnieks
On Tue, 27 Nov 2007 16:25:22 +0800, Dave Young said:

> Does boot_delay help?

It might, if the kernel lived long enough to output a first printk for
us to delay after.  :)

Shooting this one would be *easy* if the problem was a boot-time oops that
would otherwise scroll off the screen without a boot_delay...




Re: 2.6.24-rc3-mm1 - brick my Dell Latitude D820

2007-11-27 Thread Dave Young
On Nov 27, 2007 3:16 PM,  <[EMAIL PROTECTED]> wrote:
> On Tue, 20 Nov 2007 20:45:25 PST, Andrew Morton said:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>
> Finally got both time and motivation to at least start a bisect..
>
> 2.6.23-mm1 works on my D820 (x86_64 kernel, Core2 Duo T7200)
>
> 24-rc3-mm1 (plus 3 patches from hotfixes/) bricks *instantly* at boot - grub
> prints its 3 or 4 lines saying what it loaded, the screen clears, and *blam*
> dead. No serial console output, no pair of penguins on the monitor, no
> netconsole, no earlyprintk=vga output, no alt-sysrq, only thing that does
> *anything* is "hold the power button for 5 seconds".  Whatever it is, it
> happens *very* early (before we get as far as the 'Linux version 2.6.mumble'
> banner), and happens *hard*.
>
> I've bisected it down this far:
>
> git-ipwireless_cs.patch GOOD
> git-x86.patch
> git-x86-fixup.patch
> git-x86-thread_order-borkage.patch
> git-x86-thread_order-borkage-fix.patch
> git-x86-identify_cpu-fix.patch
> git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko.patch
> git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko-checkpatch-fixes.patch
> git-x86-inlining-borkage.patch
> x86_64-set-cpu_index-to-nr_cpus-instead-of-0.patch
> x86_64-make-sparsemem-vmemmap-the-default-memory-model-v2.patch BAD
>
> Anybody got any good debugging ideas before I go through and do the final
> 3 or 4 bisects?  I suspect I'll need them once I find the offending patch
> to tell *why* said patch dies on my box - I've seen enough traffic regarding
> -rc3-mm1 dying *later* to know it's probably a subtle issue and not one
> that will be obvious once I finger a specific patch.  For example, it's
> probably not the IO-APIC panic that people are seeing, because their kernels
> live long enough to panic. ;)

Hi,
Does boot_delay help?

Regards
dave


Re: 2.6.24-rc3-mm1 - brick my Dell Latitude D820

2007-11-27 Thread Andrew Morton
On Tue, 27 Nov 2007 02:54:56 -0500 [EMAIL PROTECTED] wrote:

> On Mon, 26 Nov 2007 23:27:03 PST, Andrew Morton said:
> 
> > > git-x86.patch
> > > git-x86-fixup.patch
> > > git-x86-thread_order-borkage.patch
> > > git-x86-thread_order-borkage-fix.patch
> > > git-x86-identify_cpu-fix.patch
> > > git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko.patch
> > > git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko-checkpatch-fixes.patch
> > > git-x86-inlining-borkage.patch
> > > x86_64-set-cpu_index-to-nr_cpus-instead-of-0.patch
> > > x86_64-make-sparsemem-vmemmap-the-default-memory-model-v2.patch BAD
> 
> > You could try http://userweb.kernel.org/~akpm/mmotm/ - we might have already
> > fixed it.
> 
> I suspect that trying -rc3-mm1 but refreshing just the 10 patches above
> from -mmotm would be far less likely to pull in other heartburn?

All the above are no longer in -mm.  They got merged, dropped,
otherwise-fixed, etc.

> > Otherwise, please proceed to work out which diff I need to drop and hope like
> > hell that it isn't git-x86..
> 
> That's a 41,240 line diff, the rest *total* to about 400 lines.  I don't have
> warm-n-fuzzies about my odds here. ;)

No.

> I'm a git-idiot, but *do* know how to git-bisect through Linus tree - what
> would I need to do to git-bisect through git-x86.patch? (I do *not* know how
> to deal with more than 1 source git tree, so if the magic is just "get a
> linus tree, merge git-x86, then bisect as usual", I'm stuck on "merge
> git-x86")..

umm, I'm minimally git-afflicted hence am the wrong person to ask. 
Something like:


- checkout Linus's tree

- echo 'git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git#mm' > .git/branches/git-x86

- git-fetch git-x86

- git-checkout git-x86

- start bisecting.
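
For readers following along with a current git, the recipe above can be approximated as below. This is only a sketch: `git remote add -t` stands in for the legacy `.git/branches/` file, the dashed `git-fetch` and `git-checkout` spellings become `git fetch` and `git checkout`, and `KNOWN_GOOD` is a placeholder for a revision the thread does not supply. The helper names are hypothetical, and nothing is executed unless the caller asks for it.

```python
import subprocess

# URL as given in the thread; it may no longer be reachable today.
X86_URL = "git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git"


def x86_bisect_commands(known_good, remote="git-x86", branch="mm"):
    """Return the git command lines for bisecting the x86 'mm' branch.

    known_good is a placeholder for a revision the tester has verified to
    boot; the thread itself does not name one.
    """
    return [
        ["git", "remote", "add", "-t", branch, remote, X86_URL],
        ["git", "fetch", remote],
        ["git", "checkout", f"{remote}/{branch}"],  # detached HEAD at the branch tip
        ["git", "bisect", "start"],
        ["git", "bisect", "bad"],                   # the checked-out tip fails to boot
        ["git", "bisect", "good", known_good],
    ]


def run_all(commands, cwd):
    """Run the commands inside an existing clone of Linus's tree."""
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in x86_bisect_commands("KNOWN_GOOD"):
        print(" ".join(argv))
```

After each test boot, `git bisect good` or `git bisect bad` moves to the next candidate until git names a single commit.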


Re: 2.6.24-rc3-mm1 - brick my Dell Latitude D820

2007-11-26 Thread Valdis . Kletnieks
On Mon, 26 Nov 2007 23:27:03 PST, Andrew Morton said:

> > git-x86.patch
> > git-x86-fixup.patch
> > git-x86-thread_order-borkage.patch
> > git-x86-thread_order-borkage-fix.patch
> > git-x86-identify_cpu-fix.patch
> > git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko.patch
> > git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko-checkpatch-fixes.patch
> > git-x86-inlining-borkage.patch
> > x86_64-set-cpu_index-to-nr_cpus-instead-of-0.patch
> > x86_64-make-sparsemem-vmemmap-the-default-memory-model-v2.patch BAD

> You could try http://userweb.kernel.org/~akpm/mmotm/ - we might have already
> fixed it.

I suspect that trying -rc3-mm1 but refreshing just the 10 patches above
from -mmotm would be far less likely to pull in other heartburn?

> Otherwise, please proceed to work out which diff I need to drop and hope like
> hell that it isn't git-x86..

That's a 41,240 line diff, the rest *total* to about 400 lines.  I don't have
warm-n-fuzzies about my odds here. ;)

I'm a git-idiot, but *do* know how to git-bisect through Linus tree - what
would I need to do to git-bisect through git-x86.patch? (I do *not* know how
to deal with more than 1 source git tree, so if the magic is just "get a
linus tree, merge git-x86, then bisect as usual", I'm stuck on "merge
git-x86")..





Re: 2.6.24-rc3-mm1 - brick my Dell Latitude D820

2007-11-26 Thread Andrew Morton
On Tue, 27 Nov 2007 02:16:26 -0500 [EMAIL PROTECTED] wrote:

> On Tue, 20 Nov 2007 20:45:25 PST, Andrew Morton said:
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> 
> Finally got both time and motivation to at least start a bisect..
> 
> 2.6.23-mm1 works on my D820 (x86_64 kernel, Core2 Duo T7200)
> 
> 24-rc3-mm1 (plus 3 patches from hotfixes/) bricks *instantly* at boot - grub
> prints its 3 or 4 lines saying what it loaded, the screen clears, and *blam*
> dead. No serial console output, no pair of penguins on the monitor, no
> netconsole, no earlyprintk=vga output, no alt-sysrq, only thing that does
> *anything* is "hold the power button for 5 seconds".  Whatever it is, it
> happens *very* early (before we get as far as the 'Linux version 2.6.mumble'
> banner), and happens *hard*.
> 
> I've bisected it down this far:
> 
> git-ipwireless_cs.patch GOOD
> git-x86.patch
> git-x86-fixup.patch
> git-x86-thread_order-borkage.patch
> git-x86-thread_order-borkage-fix.patch
> git-x86-identify_cpu-fix.patch
> git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko.patch
> git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko-checkpatch-fixes.patch
> git-x86-inlining-borkage.patch
> x86_64-set-cpu_index-to-nr_cpus-instead-of-0.patch
> x86_64-make-sparsemem-vmemmap-the-default-memory-model-v2.patch BAD
> 
> Anybody got any good debugging ideas before I go through and do the final
> 3 or 4 bisects?  I suspect I'll need them once I find the offending patch
> to tell *why* said patch dies on my box - I've seen enough traffic regarding
> -rc3-mm1 dying *later* to know it's probably a subtle issue and not one
> that will be obvious once I finger a specific patch.  For example, it's
> probably not the IO-APIC panic that people are seeing, because their kernels
> live long enough to panic. ;)
> 

You could try http://userweb.kernel.org/~akpm/mmotm/ - we might have already
fixed it.

Otherwise, please proceed to work out which diff I need to drop and hope like
hell that it isn't git-x86..


Re: 2.6.24-rc3-mm1 - brick my Dell Latitude D820

2007-11-26 Thread Valdis . Kletnieks
On Tue, 20 Nov 2007 20:45:25 PST, Andrew Morton said:
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/

Finally got both time and motivation to at least start a bisect..

2.6.23-mm1 works on my D820 (x86_64 kernel, Core2 Duo T7200)

24-rc3-mm1 (plus 3 patches from hotfixes/) bricks *instantly* at boot - grub
prints its 3 or 4 lines saying what it loaded, the screen clears, and *blam*
dead. No serial console output, no pair of penguins on the monitor, no
netconsole, no earlyprintk=vga output, no alt-sysrq, only thing that does
*anything* is "hold the power button for 5 seconds".  Whatever it is, it
happens *very* early (before we get as far as the 'Linux version 2.6.mumble'
banner), and happens *hard*.

I've bisected it down this far:

git-ipwireless_cs.patch GOOD
git-x86.patch
git-x86-fixup.patch
git-x86-thread_order-borkage.patch
git-x86-thread_order-borkage-fix.patch
git-x86-identify_cpu-fix.patch
git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko.patch
git-x86-memory_add_physaddr_to_nid-export-for-acpi-memhotplugko-checkpatch-fixes.patch
git-x86-inlining-borkage.patch
x86_64-set-cpu_index-to-nr_cpus-instead-of-0.patch
x86_64-make-sparsemem-vmemmap-the-default-memory-model-v2.patch BAD

Anybody got any good debugging ideas before I go through and do the final
3 or 4 bisects?  I suspect I'll need them once I find the offending patch
to tell *why* said patch dies on my box - I've seen enough traffic regarding
-rc3-mm1 dying *later* to know it's probably a subtle issue and not one
that will be obvious once I finger a specific patch.  For example, it's
probably not the IO-APIC panic that people are seeing, because their kernels
live long enough to panic. ;)





Re: 2.6.24-rc3-mm1

2007-11-26 Thread Andrew Morton
On Fri, 23 Nov 2007 06:55:41 +0100 Gabriel C <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <[EMAIL PROTECTED]> wrote:
> > 
> >> I have some warnings on each SCSI disc:
> >>
> >>
> >> ...
> >>
> >> [   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   0109 PQ: 0 ANSI: 3
> >> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
> >> [   30.724435]  target0:0:0: Beginning Domain Validation
> >> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
> >> [   30.724572]  target0:0:0: Ending Domain Validation
> >> [   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP 0114 PQ: 0 ANSI: 4
> >> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
> >> [   30.729771]  target0:0:1: Beginning Domain Validation
> >> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
> >> [   30.729908]  target0:0:1: Ending Domain Validation
> >>
> > 
> > Don't know what would have caused that.  But yes, something is wrong in
> > scsi land.
> 
> Actually I'm lucky the author didn't fix that FIXME in scsi_transport_spi.c 
> and I still can boot ;)
> 
> > 
> >> no idea whatever this is related but buffered disk reads are 2.XX MB/sec 
> >> and the box is somewhat laggy.
> >>
> >> hdparm -t on sda and sdb reports :
> >>
> >> /dev/sda:
> >>  Timing buffered disk reads:8 MB in  3.26 seconds =   2.46 MB/sec
> >>
> >> /dev/sdb:
> >>  Timing buffered disk reads:8 MB in  3.56 seconds =   2.25 MB/sec
> >>
> >> My IDE discs are fine.
> >>
> >> Please let me know if you need my config or any other informations.
> >>
> > 
> > And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
> > 
> 
> I found the commit which causes these problems; it is in the git-scsi-misc patch,
> and reverting it fixes both problems for me.
> 
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d
> 

OK, thanks.  I'll assume that James and Hannes have this in hand (or will
have, by mid-week) and I won't do anything here.



Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Andrew Morton
On Tue, 27 Nov 2007 00:14:17 +0100
Jiri Slaby <[EMAIL PROTECTED]> wrote:

> On 11/26/2007 11:17 PM, Andrew Morton wrote:
> >> Maybe if you can emit a broken-out with the fresh pull to test?
> > 
> > http://userweb.kernel.org/~akpm/mmotm/ is current.  But it probably won't
> > compile. 
> 
> Yes it did :). And it worked. Both in qemu and on my desktop...

boggle.  Let's slap 2.6.25 on it and take the rest of the year off.

> qemu output at:
> http://www.fi.muni.cz/~xslaby/sklad/qemu-output.txt

Thanks for testing.



Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Jiri Slaby
On 11/26/2007 11:17 PM, Andrew Morton wrote:
>> Maybe if you can emit a broken-out with the fresh pull to test?
> 
> http://userweb.kernel.org/~akpm/mmotm/ is current.  But it probably won't
> compile. 

Yes it did :). And it worked. Both in qemu and on my desktop...

qemu output at:
http://www.fi.muni.cz/~xslaby/sklad/qemu-output.txt

thanks,
-- 
Jiri Slaby ([EMAIL PROTECTED])
Faculty of Informatics, Masaryk University


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Jiri Slaby
On 11/26/2007 11:17 PM, Andrew Morton wrote:
> http://userweb.kernel.org/~akpm/mmotm/ is current.  But it probably won't
> compile.  I'd suggest bisecting 2.6.24-rc3-mm1 would be easier.  

Yes, I've bisected this and it pointed to git-x86.patch + 2 pushed fixes from
the series. Then I tried x86 git, but its HEAD was OK.


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Andrew Morton
On Mon, 26 Nov 2007 23:08:33 +0100
Jiri Slaby <[EMAIL PROTECTED]> wrote:

> On 11/26/2007 09:45 PM, Ingo Molnar wrote:
> > * Andrew Morton <[EMAIL PROTECTED]> wrote:
> > 
> >> On Mon, 26 Nov 2007 14:39:43 -0500
> >> Rik van Riel <[EMAIL PROTECTED]> wrote:
> >>
> >>> On Tue, 20 Nov 2007 22:18:39 -0800
> >>> Andrew Morton <[EMAIL PROTECTED]> wrote:
> >>>
> >>>>> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> >>>>> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the
> >>>>> 'noapic' kernel parameter
> >>>> ACPI or x86 breakage, I guess.
> >>>>
> >>>> Did 'noapic' work?
> >>> I got the same bug as above, 'noapic' gets past that point 
> >> We still don't know what caused this, afaik.
> > 
> > yes. Is it a regression? If yes, could someone try to bisect it so that 
> > we can fix it? If it's caused by x86.git then the 'mm' branch of the x86 
> > git tree can be used for bisection:
> > 
> >git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git
> 
> I did, but it's hard, if you don't know the BAD point. HEAD boots fine and
> 'x86: randomize brk' too (the top of git-x86.patch).

So the bug wasn't in git-x86 in 2.6.24-rc3-mm1.

But it might be in there now, as some patches got moved over.

Or it could be git-acpi.  Or lots of other things.

> Andrew, how do you pull it, git
> #mm doesn't fit to the ids from the patch.

The -mm git tree reimports the plain git-foo.patch files back into a new
git tree, so the commit IDs won't line up.

The way to find the culprit patch in 2.6.24-rc3-mm1 is
http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt.  It
will be quite quick.

> Maybe if you can emit a broken-out with the fresh pull to test?

http://userweb.kernel.org/~akpm/mmotm/ is current.  But it probably won't
compile.  I'd suggest bisecting 2.6.24-rc3-mm1 would be easier.  


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Jiri Slaby
On 11/26/2007 09:45 PM, Ingo Molnar wrote:
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
>> On Mon, 26 Nov 2007 14:39:43 -0500
>> Rik van Riel <[EMAIL PROTECTED]> wrote:
>>
>>> On Tue, 20 Nov 2007 22:18:39 -0800
>>> Andrew Morton <[EMAIL PROTECTED]> wrote:
>>>
>>>>> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>>>>> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the
>>>>> 'noapic' kernel parameter
>>>> ACPI or x86 breakage, I guess.
>>>>
>>>> Did 'noapic' work?
>>> I got the same bug as above, 'noapic' gets past that point 
>> We still don't know what caused this, afaik.
> 
> yes. Is it a regression? If yes, could someone try to bisect it so that 
> we can fix it? If it's caused by x86.git then the 'mm' branch of the x86 
> git tree can be used for bisection:
> 
>git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

I did, but it's hard, if you don't know the BAD point. HEAD boots fine and 'x86:
randomize brk' too (the top of git-x86.patch). Andrew, how do you pull it, git
#mm doesn't fit to the ids from the patch.

Maybe if you can emit a broken-out with the fresh pull to test?

regards,
-- 
Jiri Slaby ([EMAIL PROTECTED])
Faculty of Informatics, Masaryk University


Re: 2.6.24-rc3-mm1

2007-11-26 Thread Christoph Lameter
On Mon, 26 Nov 2007, Randy Dunlap wrote:

> ARCH_SELECT_MEMORY_MODEL depends on X86_32.  Is that too restrictive?

No. X86_64 only has one memory model.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Christoph Lameter
On Mon, 26 Nov 2007, Andrew Morton wrote:

> hm.  This smells like a startup ordering problem, but everything which
> refresh_zone_stat_thresholds() uses should be set up by the time we run
> initcalls.  Maybe the zone lists are bad?

refresh_zone_stat_thresholds goes through each zone and updates
the stat threshold for every per cpu structure in each zone.

So this could be a processor marked online where the pcp structures have 
not been allocated or a zone NULL pointer.
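
To make the two failure modes concrete, here is a toy Python model of the loop described above. It is not the kernel code: `Zone`, `PerCpuPageset` and the caller-supplied threshold are hypothetical stand-ins, and Python's `None` plays the role of a NULL pointer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PerCpuPageset:
    stat_threshold: int = 0


@dataclass
class Zone:
    name: str
    # cpu number -> per-cpu structure; None (or a missing key) models
    # a structure that has not been allocated yet
    pageset: Dict[int, Optional[PerCpuPageset]] = field(default_factory=dict)


def refresh_thresholds(zones: List[Optional[Zone]],
                       online_cpus: List[int],
                       threshold: int) -> None:
    """Walk every zone and store the threshold in each online CPU's structure."""
    for zone in zones:
        if zone is None:
            # second case from the message: a zone NULL pointer
            raise RuntimeError("zone NULL pointer")
        for cpu in online_cpus:
            pcp = zone.pageset.get(cpu)
            if pcp is None:
                # first case: CPU marked online, per-cpu structure missing
                raise RuntimeError(
                    f"cpu {cpu} online but no per-cpu structure in zone {zone.name}")
            pcp.stat_threshold = threshold
```

In the model both cases fail at the first store into the missing structure. The oops posted in this thread faults on a write to a small address, which is at least consistent with storing a field through a NULL structure pointer, though the trace alone does not say which of the two cases applies.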



Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Rik van Riel
On Mon, 26 Nov 2007 12:33:19 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> > Unable to handle kernel NULL pointer dereference at 0021 RIP:
> >  [] refresh_zone_stat_thresholds+0x6d/0x90
> > PGD 0
> > Oops: 0002 [1] SMP
> > last sysfs file:
> > CPU 0
> > Modules linked in:
> > Pid: 1, comm: swapper Not tainted 2.6.24-rc3-mm1 #2
> > RIP: 0010:[]  [] 
> > refresh_zone_stat_thresholds+0x6d/0x90
> > RSP: :81007fb59ec0  EFLAGS: 00010293
> > RAX:  RBX: 0004 RCX: 0001
> > RDX: 0001 RSI: 8146fb38 RDI: 0001
> > RBP: 8100c000 R08:  R09: 
> > R10: 81007fb59e60 R11: 0028 R12: 814d4558
> > R13:  R14: 814b62c0 R15: 
> > FS:  () GS:813d9000() knlGS:
> > CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
> > CR2: 0021 CR3: 00201000 CR4: 06a0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: 0ff0 DR7: 0400
> > Process swapper (pid: 1, threadinfo 81007FB58000, task 81007FB56000)
> > Stack:     814a3839
> >   8148e626 81007fb56000 8126d36a
> >    8105786b 
> > Call Trace:
> >  [] setup_vmstat+0x6/0x40
> >  [] kernel_init+0x169/0x2d8
> >  [] trace_hardirqs_on_thunk+0x35/0x3a
> >  [] trace_hardirqs_on+0x115/0x138
> >  [] child_rip+0xa/0x12
> >  [] restore_args+0x0/0x30
> >  [] kernel_init+0x0/0x2d8
> >  [] child_rip+0x0/0x12
> > 
> > INFO: lockdep is turned off.
> 
> hm.  This smells like a startup ordering problem, but everything which
> refresh_zone_stat_thresholds() uses should be set up by the time we run
> initcalls.  Maybe the zone lists are bad?

Or the CPU array. Look at the oops Kamalesh got a few mails upthread...

I guess I'll have to start a bisect - can't port the VM code to a kernel
that doesn't boot...

-- 
All Rights Reversed
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Mon, 26 Nov 2007 14:39:43 -0500
> Rik van Riel <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 20 Nov 2007 22:18:39 -0800
> > Andrew Morton <[EMAIL PROTECTED]> wrote:
> > 
> > > > ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> > > > Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> > > > 'noapic' kernel parameter
> > > 
> > > ACPI or x86 breakage, I guess.
> > > 
> > > Did 'noapic' work?
> > 
> > I got the same bug as above, 'noapic' gets past that point 
> 
> We still don't know what caused this, afaik.

yes. Is it a regression? If yes, could someone try to bisect it so that 
we can fix it? If it's caused by x86.git then the 'mm' branch of the x86 
git tree can be used for bisection:

   git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

it's supposed to build and boot fine at every bisection point. The 
bisection run can be cut significantly by narrowing the bisection to the 
arch/x86 changes only:

  git-bisect start arch/x86 include/asm-x86/

(and if it finds a nonsensical commit, i.e. the breakage is not caused 
by the x86 commits, save the "git-bisect log" output into a file, 
restart the git bisection and use "git-bisect replay" to insert all the 
test points into a fuller bisection run - this saves quite some time.)
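
The time saving can be seen in a small simulation. This is a model, not git itself: commits are plain integers on a linear history, the "every fourth commit touches arch/x86" rule and the culprit position are made-up test data, and "replay" is modelled as feeding earlier verdicts back in before testing anything new.

```python
def bisect(candidates, is_bad, known=None):
    """Find the first bad entry in `candidates` (ordered, last one bad).

    `known` maps commit -> verdict from an earlier run; those verdicts
    narrow the range for free, which is the effect of replaying a log.
    Returns (first_bad, verdicts, number_of_new_tests).
    """
    verdicts = dict(known or {})
    lo, hi = 0, len(candidates) - 1
    for i, commit in enumerate(candidates):
        if commit in verdicts:
            if verdicts[commit]:
                hi = min(hi, i)        # a known-bad commit caps the range
            else:
                lo = max(lo, i + 1)    # a known-good commit raises the floor
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        bad = is_bad(candidates[mid])  # one build-and-boot cycle
        verdicts[candidates[mid]] = bad
        tests += 1
        if bad:
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo], verdicts, tests


CULPRIT = 41  # hypothetical: a commit that does not touch arch/x86


def is_bad(commit):
    return commit >= CULPRIT


history = list(range(64))                      # 64 commits, tip (63) is bad
x86_only = [c for c in history if c % 4 == 3]  # pretend these touch arch/x86

limited, log, t_limited = bisect(x86_only, is_bad)         # path-limited run
full, _, t_replay = bisect(history, is_bad, known=log)     # full run, log replayed
_, _, t_scratch = bisect(history, is_bad)                  # full run from scratch
print(limited, t_limited, full, t_replay, t_scratch)
```

In this model the path-limited run lands on commit 43 after four boots. Once a human judges that result nonsensical, the replayed full run needs only two more boots to reach commit 41, against six for a restart from scratch. Had the culprit been an x86 commit, the four-boot run would have been the whole job.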

Ingo


Re: 2.6.24-rc3-mm1

2007-11-26 Thread Randy Dunlap
On Mon, 26 Nov 2007 11:34:15 -0800 (PST) Christoph Lameter wrote:

> On Mon, 26 Nov 2007, Randy Dunlap wrote:
> 
> > On Tue, 20 Nov 2007 20:45:25 -0800 Andrew Morton wrote:
> > 
> > > 
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> > 
> > allnoconfig on x86_64 gives:
> > 
> > arch/x86/mm/init_64.c:84: error: implicit declaration of function 'pfn_valid'
> > mm/page_alloc.c:2533: error: implicit declaration of function 'pfn_valid'
> > mm/vmstat.c:518: error: implicit declaration of function 'pfn_valid'
> > mm/memory.c:400: error: implicit declaration of function 'pfn_valid'
> > drivers/char/mem.c:312: error: implicit declaration of function 'pfn_valid'
> 
> Hmmm... CONFIG_SPARSEMEM is not set if you do allnoconfig
> 
> config SPARSEMEM
> 	def_bool y
> 	depends on SPARSEMEM_MANUAL
> 
> So I guess we need to set SPARSEMEM_MANUAL
> 
> But arch/x86/Kconfig has
> 
> config SPARSEMEM_MANUAL
> 	bool "Sparse Memory"
> 	depends on ARCH_SPARSEMEM_ENABLE
> 	help
> 	  This will be the only option for some systems, including
> 	  memory hotplug systems.  This is normal.
> 
> It needs to be not deselectable for x86_64. 
> 
> Inserting
> 
>   def_bool y if X86_64
> 
> did not help
> 
> Somehow make menuconfig did not give me an ability to even enable this 
> again.

Thanks for the hint.

ARCH_SELECT_MEMORY_MODEL depends on X86_32.  Is that too restrictive?

config ARCH_SELECT_MEMORY_MODEL
	def_bool y
	depends on X86_32 && ARCH_SPARSEMEM_ENABLE

---
~Randy


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Andrew Morton
On Mon, 26 Nov 2007 14:39:43 -0500
Rik van Riel <[EMAIL PROTECTED]> wrote:

> On Tue, 20 Nov 2007 22:18:39 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > > ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> > > Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> > > 'noapic' kernel parameter
> > 
> > ACPI or x86 breakage, I guess.
> > 
> > Did 'noapic' work?
> 
> I got the same bug as above, 'noapic' gets past that point 

We still don't know what caused this, afaik.

> and right to the
> next oops.  I'm posting it here because this one is different from the others
> in the thread, yet looks vaguely related:
> 
> Unable to handle kernel NULL pointer dereference at 0021 RIP:
>  [] refresh_zone_stat_thresholds+0x6d/0x90
> PGD 0
> Oops: 0002 [1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.24-rc3-mm1 #2
> RIP: 0010:[]  [] 
> refresh_zone_stat_thresholds+0x6d/0x90
> RSP: :81007fb59ec0  EFLAGS: 00010293
> RAX:  RBX: 0004 RCX: 0001
> RDX: 0001 RSI: 8146fb38 RDI: 0001
> RBP: 8100c000 R08:  R09: 
> R10: 81007fb59e60 R11: 0028 R12: 814d4558
> R13:  R14: 814b62c0 R15: 
> FS:  () GS:813d9000() knlGS:
> CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
> CR2: 0021 CR3: 00201000 CR4: 06a0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process swapper (pid: 1, threadinfo 81007FB58000, task 81007FB56000)
> Stack:     814a3839
>   8148e626 81007fb56000 8126d36a
>    8105786b 
> Call Trace:
>  [] setup_vmstat+0x6/0x40
>  [] kernel_init+0x169/0x2d8
>  [] trace_hardirqs_on_thunk+0x35/0x3a
>  [] trace_hardirqs_on+0x115/0x138
>  [] child_rip+0xa/0x12
>  [] restore_args+0x0/0x30
>  [] kernel_init+0x0/0x2d8
>  [] child_rip+0x0/0x12
> 
> INFO: lockdep is turned off.

hm.  This smells like a startup ordering problem, but everything which
refresh_zone_stat_thresholds() uses should be set up by the time we run
initcalls.  Maybe the zone lists are bad?



Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-26 Thread Rik van Riel
On Tue, 20 Nov 2007 22:18:39 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> > ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> > Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> > 'noapic' kernel parameter
> 
> ACPI or x86 breakage, I guess.
> 
> Did 'noapic' work?

I got the same bug as above, 'noapic' gets past that point and right to the
next oops.  I'm posting it here because this one is different from the others
in the thread, yet looks vaguely related:

Unable to handle kernel NULL pointer dereference at 0021 RIP:
 [] refresh_zone_stat_thresholds+0x6d/0x90
PGD 0
Oops: 0002 [1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.24-rc3-mm1 #2
RIP: 0010:[]  [] 
refresh_zone_stat_thresholds+0x6d/0x90
RSP: :81007fb59ec0  EFLAGS: 00010293
RAX:  RBX: 0004 RCX: 0001
RDX: 0001 RSI: 8146fb38 RDI: 0001
RBP: 8100c000 R08:  R09: 
R10: 81007fb59e60 R11: 0028 R12: 814d4558
R13:  R14: 814b62c0 R15: 
FS:  () GS:813d9000() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 0021 CR3: 00201000 CR4: 06a0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process swapper (pid: 1, threadinfo 81007FB58000, task 81007FB56000)
Stack:     814a3839
  8148e626 81007fb56000 8126d36a
   8105786b 
Call Trace:
 [] setup_vmstat+0x6/0x40
 [] kernel_init+0x169/0x2d8
 [] trace_hardirqs_on_thunk+0x35/0x3a
 [] trace_hardirqs_on+0x115/0x138
 [] child_rip+0xa/0x12
 [] restore_args+0x0/0x30
 [] kernel_init+0x0/0x2d8
 [] child_rip+0x0/0x12

INFO: lockdep is turned off.

-- 
All Rights Reversed


Re: 2.6.24-rc3-mm1

2007-11-26 Thread Christoph Lameter
On Mon, 26 Nov 2007, Randy Dunlap wrote:

> On Tue, 20 Nov 2007 20:45:25 -0800 Andrew Morton wrote:
> 
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> 
> allnoconfig on x86_64 gives:
> 
> arch/x86/mm/init_64.c:84: error: implicit declaration of function 'pfn_valid'
> mm/page_alloc.c:2533: error: implicit declaration of function 'pfn_valid'
> mm/vmstat.c:518: error: implicit declaration of function 'pfn_valid'
> mm/memory.c:400: error: implicit declaration of function 'pfn_valid'
> drivers/char/mem.c:312: error: implicit declaration of function 'pfn_valid'

Hmmm... CONFIG_SPARSEMEM is not set if you do allnoconfig

config SPARSEMEM
	def_bool y
	depends on SPARSEMEM_MANUAL

So I guess we need to set SPARSEMEM_MANUAL

But arch/x86/Kconfig has

config SPARSEMEM_MANUAL
	bool "Sparse Memory"
	depends on ARCH_SPARSEMEM_ENABLE
	help
	  This will be the only option for some systems, including
	  memory hotplug systems.  This is normal.

It needs to be not deselectable for x86_64. 

Inserting

	def_bool y if X86_64

did not help

Somehow make menuconfig did not give me an ability to even enable this 
again.



Re: 2.6.24-rc3-mm1

2007-11-26 Thread Jiri Slaby
On 11/26/2007 07:48 PM, Rik van Riel wrote:
>>>> ERROR: "empty_zero_page" [drivers/kvm/kvm.ko] undefined!
[...]
> FYI, x86_64 has the exact same issue.

yes:
hot-fixes/git-x86-dont-unexport-empty_zero_page.patch

regards,
-- 
Jiri Slaby ([EMAIL PROTECTED])
Faculty of Informatics, Masaryk University


Re: 2.6.24-rc3-mm1

2007-11-26 Thread Randy Dunlap
On Tue, 20 Nov 2007 20:45:25 -0800 Andrew Morton wrote:

> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/

allnoconfig on x86_64 gives:

arch/x86/mm/init_64.c:84: error: implicit declaration of function 'pfn_valid'
mm/page_alloc.c:2533: error: implicit declaration of function 'pfn_valid'
mm/vmstat.c:518: error: implicit declaration of function 'pfn_valid'
mm/memory.c:400: error: implicit declaration of function 'pfn_valid'
drivers/char/mem.c:312: error: implicit declaration of function 'pfn_valid'


---
~Randy


Re: 2.6.24-rc3-mm1

2007-11-26 Thread Rik van Riel
On Wed, 21 Nov 2007 14:03:34 +0800
"Dave Young" <[EMAIL PROTECTED]> wrote:
> On Nov 21, 2007 2:00 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 21 Nov 2007 13:51:47 +0800 "Dave Young" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, andrew
> > >
> > > modpost failed for me:
> > >   MODPOST 360 modules
> > > ERROR: "empty_zero_page" [drivers/kvm/kvm.ko] undefined!
> > > make[1]: *** [__modpost] Error 1
> > > make: *** [modules] Error 2
> > >
> >
> > You're a victim of the hasty unexporting fad.  Which architecture?
> > x86_64 I guess?

> ia32 instead.

FYI, x86_64 has the exact same issue.


KVM needs the empty_zero_page export reinstated.

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

diff -up linux-2.6.24-rc3-mm1/arch/x86/kernel/x8664_ksyms_64.c.export-empty-zero-page linux-2.6.24-rc3-mm1/arch/x86/kernel/x8664_ksyms_64.c
--- linux-2.6.24-rc3-mm1/arch/x86/kernel/x8664_ksyms_64.c.export-empty-zero-page  2007-11-26 13:47:53.0 -0500
+++ linux-2.6.24-rc3-mm1/arch/x86/kernel/x8664_ksyms_64.c  2007-11-26 13:41:32.0 -0500
@@ -33,6 +33,7 @@ EXPORT_SYMBOL(__copy_from_user_inatomic)
 
 EXPORT_SYMBOL(copy_page);
 EXPORT_SYMBOL(clear_page);
+EXPORT_SYMBOL(empty_zero_page);
 
 /* Export string functions. We normally rely on gcc builtin for most of these,
    but gcc sometimes decides not to inline them. */


Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-25 Thread Hannes Reinecke
On Sat, Nov 24, 2007 at 07:44:13PM +0200, James Bottomley wrote:
> Probing intermittent failures in Domain Validation, even with the fixes
> applied leads me to the conclusion that there are further problems with
> this commit:
> 
> commit fc5eb4facedbd6d7117905e775cee1975f894e79
> Author: Hannes Reinecke <[EMAIL PROTECTED]>
> Date:   Tue Nov 6 09:23:40 2007 +0100
> 
> [SCSI] Do not requeue requests if REQ_FAILFAST is set
>  
> The essence of the problems is that you're causing REQ_FAILFAST to
> terminate commands with error on requeuing conditions, some of which are
> relatively common on most SCSI devices.  While this may be the correct
> behaviour for multi-path, it's certainly wrong for the previously
> understood meaning of REQ_FAILFAST, which was don't retry on error,
> which is why domain validation and other applications use it to control
> error handling, but don't expect to get failures for a simple requeue
> are now spitting errors.
> 
> I honestly can't see that, even for the multi-path case, returning an
> error when we're over queue depth is the correct thing to do (it may not
> matter to something like a symmetrix, but an array that has a non-zero
> cost associated with a path change, like a CPQ HSV or the AVT
> controllers, will show fairly large slow downs if you do this).  Even if
> this is the desired behaviour (and I think that's a policy issue),
> DID_NO_CONNECT is almost certainly the wrong error to be sending back.
> 
> This patch fixes up domain validation to work again correctly, however,
> I really think it's just a bandaid.  Do you want to rethink the above
> commit?
> 
Given the number of errors this has caused, yes, I'll have to.
But we still face the initial problem that requeued requests will be
stuck in the queue forever (ie until the timeout catches it), causing
failover to be painfully slow.

Anyway, I'll think it over.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
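
The distinction drawn in the quoted text above, between not retrying after
an error and requeueing because the device cannot take a command yet, can
be sketched in user-space C. This is only an illustrative model under
assumptions: request_model, on_device_busy, on_command_error and the OUT_*
codes are hypothetical names, not kernel identifiers, and the real SCSI
midlayer logic is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes for this sketch; not kernel identifiers. */
enum outcome { OUT_REQUEUE, OUT_RETRY, OUT_FAIL };

struct request_model {
    bool failfast;     /* models REQ_FAILFAST on the request */
    int retries_left;  /* retries the submitter still allows */
};

/*
 * The device cannot accept the command right now (busy, or already at its
 * queue depth). Nothing has failed yet, so the request is put back on the
 * queue whether or not failfast is set.
 */
enum outcome on_device_busy(const struct request_model *rq)
{
    (void)rq;
    return OUT_REQUEUE;
}

/*
 * The command was executed and came back with an error. Under the older
 * meaning of failfast ("don't retry on error") this is the only place
 * where the flag changes the result.
 */
enum outcome on_command_error(struct request_model *rq)
{
    assert(rq->retries_left >= 0);
    if (rq->failfast || rq->retries_left == 0)
        return OUT_FAIL;
    rq->retries_left--;
    return OUT_RETRY;
}
```

In this model a failfast request that meets a busy device is simply put
back, so Domain Validation would not see spurious errors. The cost is the
one described in the reply above: if the path is really gone, such a
request may sit in the queue until a timeout catches it.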


Re: 2.6.24-rc3-mm1 (sync is slow ?)

2007-11-25 Thread KAMEZAWA Hiroyuki
On Sat, 24 Nov 2007 19:04:34 +0100
Gabriel C <[EMAIL PROTECTED]> wrote:
> >> It seems OK here from a quick test (i386, ext3-on-IDE).
> >>
> >> Maybe device driver/block breakage?
> 
> Try revert
> 
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d
> 
> See also :
> http://lkml.org/lkml/2007/11/23/5
> 
> and search for '2.6.24-rc3-mm1: I/O error, system hangs' on LKML
> 

Thank you!
The problem was fixed by reverting the patch you pointed out.

-Kame



Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-25 Thread Laurent Riffard
On 25.11.2007 08:37, James Bottomley wrote:
> On Sat, 2007-11-24 at 23:59 +0100, Laurent Riffard wrote:
>> On 24.11.2007 14:26, James Bottomley wrote:
>>> OK, could you post dmesgs again, please.  I actually tested this with an
>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>> again.
>> James,
>>
>> Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates the
>> BLOCK and QUIESCE states correctly" (http://lkml.org/lkml/2007/11/24/8).
>>
>> How to reproduce:
>> - boot
>> - switch to a text console
>> - capture dmesg in a file, sync, etc. There are 3 I/O errors, but the
>>   system does work.
>> - switch to X console, log in the Gnome Desktop, the system partially
>>   hangs.
>> - switch back to a text console: dmesg(1) still works, it shows some
>>   additional I/O errors. At this point, any disk access makes the system
>>   completely hung.
>>
>> Additional data:
>> - the I/O errors always happen on the same blocks.
>>
>> plain text document attachment (dmesg-2.6.24-rc3-mm1-patched)
> [...]
>> [   25.521256] scsi0 : pata_via
>> [   25.521711] scsi1 : pata_via
>> [   25.524089] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xb800 irq 14
>> [   25.524176] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xb808 irq 15
>> [   25.683141] ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
>> [   25.683208] ata1.00: 78165360 sectors, multi 16: LBA
>> [   25.683475] ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
>> [   25.684116] ata1.01: 160086528 sectors, multi 16: LBA
>> [   25.691127] ata1.00: configured for UDMA/100
>> [   25.699142] ata1.01: configured for UDMA/100
>> [   26.170860] ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL05, max UDMA/33
>> [   26.171562] ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
>> [   26.330839] ata2.00: configured for UDMA/33
>> [   26.490828] ata2.01: configured for MWDMA2
>> [   26.503014] scsi 0:0:0:0: Direct-Access ATA  ST340016A 3.75 PQ: 0 ANSI: 5
>> [   26.504670] scsi 0:0:1:0: Direct-Access ATA  Maxtor 6Y080L0 YAR4 PQ: 0 ANSI: 5
>> [   26.509842] scsi 1:0:0:0: CD-ROM HL-DT-ST DVDRAM GSA-4165B DL05 PQ: 0 ANSI: 5
>> [   26.511673] scsi 1:0:1:0: CD-ROM E-IDE CD-950E/AKU A4Q  PQ: 0 ANSI: 5
> [...]
>> [   60.216113] sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>> [   60.216124] end_request: I/O error, dev sda, sector 16460
> 
> I think this one's quite easy:  PATA devices in libata are queue depth 1
> (since they don't do NCQ).  Thus, they're peculiarly sensitive to the
> bug where we fail over queue depth requests.
> 
> On the other hand, I don't see how a filesystem request is getting
> REQ_FAILFAST ... unless there's a bio or readahead issue involved.
> Anyway, could you try this patch:
> 
> http://marc.info/?l=linux-scsi&m=119592627425498
> 
> Which should fix the queue depth issue, and see if the errors go away?

No, this one doesn't help...

-- 
laurent


Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread James Bottomley
On Sat, 2007-11-24 at 23:59 +0100, Laurent Riffard wrote:
> On 24.11.2007 14:26, James Bottomley wrote:
> > OK, could you post dmesgs again, please.  I actually tested this with an
> > aic79xx card, and for me it does cause Domain Validation to succeed
> > again.
> 
> James,
> 
> Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates the
> BLOCK and QUIESCE states correctly" (http://lkml.org/lkml/2007/11/24/8).
> 
> How to reproduce:
> - boot
> - switch to a text console
> - capture dmesg in a file, sync, etc. There are 3 I/O errors, but the
>   system does work.
> - switch to X console, log in the Gnome Desktop, the system partially
>   hangs.
> - switch back to a text console: dmesg(1) still works, it shows some
>   additional I/O errors. At this point, any disk access makes the system
>   completely hung.
> 
> Additional data:
> - the I/O errors always happen on the same blocks.
> 
> plain text document attachment (dmesg-2.6.24-rc3-mm1-patched)
[...]
> [   25.521256] scsi0 : pata_via
> [   25.521711] scsi1 : pata_via
> [   25.524089] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xb800 irq 14
> [   25.524176] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xb808 irq 15
> [   25.683141] ata1.00: ATA-5: ST340016A, 3.75, max UDMA/100
> [   25.683208] ata1.00: 78165360 sectors, multi 16: LBA
> [   25.683475] ata1.01: ATA-7: Maxtor 6Y080L0, YAR41BW0, max UDMA/133
> [   25.684116] ata1.01: 160086528 sectors, multi 16: LBA
> [   25.691127] ata1.00: configured for UDMA/100
> [   25.699142] ata1.01: configured for UDMA/100
> [   26.170860] ata2.00: ATAPI: HL-DT-ST DVDRAM GSA-4165B, DL05, max UDMA/33
> [   26.171562] ata2.01: ATAPI: CD-950E/AKU, A4Q, max MWDMA2, CDB intr
> [   26.330839] ata2.00: configured for UDMA/33
> [   26.490828] ata2.01: configured for MWDMA2
> [   26.503014] scsi 0:0:0:0: Direct-Access ATA  ST340016A 3.75 PQ: 0 ANSI: 5
> [   26.504670] scsi 0:0:1:0: Direct-Access ATA  Maxtor 6Y080L0 YAR4 PQ: 0 ANSI: 5
> [   26.509842] scsi 1:0:0:0: CD-ROM HL-DT-ST DVDRAM GSA-4165B DL05 PQ: 0 ANSI: 5
> [   26.511673] scsi 1:0:1:0: CD-ROM E-IDE CD-950E/AKU A4Q  PQ: 0 ANSI: 5
[...]
> [   60.216113] sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> [   60.216124] end_request: I/O error, dev sda, sector 16460

I think this one's quite easy:  PATA devices in libata are queue depth 1
(since they don't do NCQ).  Thus, they're peculiarly sensitive to the
bug where we fail over queue depth requests.

On the other hand, I don't see how a filesystem request is getting
REQ_FAILFAST ... unless there's a bio or readahead issue involved.
Anyway, could you try this patch:

http://marc.info/?l=linux-scsi&m=119592627425498

Which should fix the queue depth issue, and see if the errors go away?

Thanks,

James
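
To illustrate the queue depth point made above, here is a minimal
user-space sketch, assuming a toy device model. The names device_model,
busy_policy and submit_burst are hypothetical and only serve the
illustration; this is not the libata or SCSI midlayer code. It counts how
many failfast requests in a burst end with an error when an over-depth
request is failed instead of requeued.

```c
#include <assert.h>

/* Hypothetical policies for this sketch; not kernel identifiers. */
enum busy_policy {
    POLICY_REQUEUE,       /* put the request back and try again later */
    POLICY_FAIL_FAILFAST  /* regressed behaviour: fail it immediately */
};

struct device_model {
    int queue_depth;  /* 1 for a PATA device without NCQ */
    int in_flight;    /* commands currently owned by the device */
};

/*
 * Submit a burst of `count` failfast requests with no completions in
 * between. Returns how many of them end with an I/O error.
 */
int submit_burst(struct device_model *dev, int count, enum busy_policy policy)
{
    int failed = 0;

    assert(dev->queue_depth > 0);
    for (int i = 0; i < count; i++) {
        if (dev->in_flight < dev->queue_depth)
            dev->in_flight++;   /* slot free: command is dispatched */
        else if (policy == POLICY_FAIL_FAILFAST)
            failed++;           /* over queue depth: completed with error */
        /* POLICY_REQUEUE: the request just waits, nothing fails */
    }
    return failed;
}
```

With a queue depth of 1, a burst of four requests gives three failures
under the failing policy and none under requeueing, while a device with a
deep queue does not notice a burst of that size at all. That matches the
observation that queue depth 1 devices are the ones hit hardest.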




Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread Laurent Riffard
On 24.11.2007 14:26, James Bottomley wrote:
> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>> On 24.11.2007 07:42, James Bottomley wrote:
>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>> On 23.11.2007 12:38, Hannes Reinecke wrote:
[snip]
>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
>>>> does fix the problem.
>>>>
>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>> I shouldn't. Checking ...
>>>>>>
>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>> is the state that the domain validation uses and which we cannot kill
>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>> state is QUIESCE.
>>>
>>> This patch (which is applied on top of Hannes original) separates the
>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>
>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
> 
> OK, could you post dmesgs again, please.  I actually tested this with an
> aic79xx card, and for me it does cause Domain Validation to succeed
> again.

James, 

Here is a dmesg produced by 2.6.24-rc3-mm1 + your patch "separates the 
BLOCK and QUIESCE states correctly" (http://lkml.org/lkml/2007/11/24/8).

How to reproduce :
- boot
- switch to a text console
- capture dmesg in a file, sync, etc. There are 3 I/O errors, but the 
  system does work.
- switch to X console, log in the Gnome Desktop, the system partially 
  hangs.
- switch back to a text console: dmesg(1) still works, it shows some 
  additional I/O errors. At this point, any disk access makes the system 
  completely hung.

Additional data:
- the I/O errors always happen on the same blocks.

-- 
laurent
[0.00] Linux version 2.6.24-rc3-mm1 ([EMAIL PROTECTED]) (gcc version 
4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #122 PREEMPT Fri Nov 23 
18:47:58 CET 2007
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009fc00 (usable)
[0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
[0.00]  BIOS-e820: 000f - 0010 (reserved)
[0.00]  BIOS-e820: 0010 - 1ffec000 (usable)
[0.00]  BIOS-e820: 1ffec000 - 1ffef000 (ACPI data)
[0.00]  BIOS-e820: 1ffef000 - 1000 (reserved)
[0.00]  BIOS-e820: 1000 - 2000 (ACPI NVS)
[0.00]  BIOS-e820:  - 0001 (reserved)
[0.00] 511MB LOWMEM available.
[0.00] Entering add_active_range(0, 0, 131052) 0 entries of 256 used
[0.00] sizeof(struct page) = 32
[0.00] Zone PFN ranges:
[0.00]   DMA 0 -> 4096
[0.00]   Normal   4096 ->   131052
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[1] active PFN ranges
[0.00] 0:0 ->   131052
[0.00] On node 0 totalpages: 131052
[0.00] Node 0 memmap at 0xC100 size 4194304 first pfn 0xC100
[0.00]   DMA zone: 32 pages used for memmap
[0.00]   DMA zone: 0 pages reserved
[0.00]   DMA zone: 4064 pages, LIFO batch:0
[0.00]   Normal zone: 991 pages used for memmap
[0.00]   Normal zone: 125965 pages, LIFO batch:31
[0.00]   Movable zone: 0 pages used for memmap
[0.00] DMI 2.3 present.
[0.00] ACPI: RSDP 000F6A80, 0014 (r0 ASUS  )
[0.00] ACPI: RSDT 1FFEC000, 002C (r1 ASUS   A7V133-C 30303031 MSFT 
31313031)
[0.00] ACPI: FACP 1FFEC080, 0074 (r1 ASUS   A7V133-C 30303031 MSFT 
31313031)
[0.00] ACPI: DSDT 1FFEC100, 2CE1 (r1   ASUS A7V133-C 1000 MSFT  
10B)
[0.00] ACPI: FACS 1000, 0040
[0.00] ACPI: BOOT 1FFEC040, 0028 (r1 ASUS   A7V133-C 30303031 MSFT 
31313031)
[0.00] ACPI: PM-Timer IO Port: 0xe408
[0.00] Allocating PCI resources starting at 3000 (gap: 
2000:dfff)
[0.00] swsusp: Registered nosave memory region: 0009f000 - 
000a
[0.00] swsusp: Registered nosave memory region: 000a - 
000f
[0.00] swsusp: Registered nosave memory region: 000f - 
0010
[0.00] Built 1 zonelists in Zone order, mobility grouping on.  Total 
pages: 130029
[0.00] Kernel command line: root=/dev/mapper/vglinux1-lv_ubuntu2 ro 
locale=fr_FR video=radeonfb:[EMAIL PROTECTED] resume=/dev/mapper/vglinux1-lvswap
[0.00] Local APIC disabled by BIOS -- you can enable it with "lapic"
[0.00] mapped APIC to b000 (0140600

Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread Gabriel C
Gabriel C wrote:
> James Bottomley wrote:
>> On Sat, 2007-11-24 at 18:54 +0100, Gabriel C wrote:
>>> James Bottomley wrote:
>>>> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>>>>> On 24.11.2007 07:42, James Bottomley wrote:
>>>>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>>>>> On 23.11.2007 12:38, Hannes Reinecke wrote:
>>>>>>>> Hannes Reinecke wrote:
>>>>>>>>> Laurent Riffard wrote:
>>>>>>>>>> On 21.11.2007 23:41, Andrew Morton wrote:
>>>>>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>>>>>>>>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On 21.11.2007 05:45, Andrew Morton wrote:
>>>>>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>>>>>>>>
>>>>>>>>>>>> I found these messages in dmesg:
>>>>>>>>>>>>
>>>>>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1
>>>>>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>>> end_request: I/O error, dev sda, sector 16460
>>>>>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>>>>>>>>> ReiserFS: sda7: using ordered data mode
>>>>>>>>>>>> --
>>>>>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>>>>>>>>> lp0: using parport0 (interrupt-driven).
>>>>>>>>>>>>
>>>>>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe something is broken in pata_via driver ?
>>>>>>>>>>>>
>>>>>>>>>>> Could be -
>>>>>>>>>>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>>>>>>>>> and
>>>>>>>>>>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>>>>>>>>> touch pata_via.c.
>>>>>>>>>> None of the above...
>>>>>>>>>>
>>>>>>>>>> I did a bisection, it spotted git-scsi-misc.patch.
>>>>>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>>>>>>>>
>>>>>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not
>>>>>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other
>>>>>>>>>> commits are touching documentation or drivers I don't use. I'll try
>>>>>>>>>> to revert only this one this evening.
>>>>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
>>>>>>> does fix the problem.
>>>>>>>
>>>>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>>>>> I shouldn't. Checking ...
>>>>>>>>>
>>>>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>>>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>>>>> is the state that the domain validation uses and which we cannot kill
>>>>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>>>>> state is QUIESCE.
>>>>>>
>>>>>> This patch (which is applied on top of Hannes original) separates the
>>>>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>>>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
>>>> OK, could you post dmesgs again, please.  I actually tested this with an
>>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>>> again.
>>>>
>>> Are the patches indeed to fix that problem as well ? 
>>>
>>> http://lkml.org/lkml/2007/11/23/5
>> That dmesg is from an unknown SCSI card exhibiting Domain Validation
>> problems, so it's a reasonable probability, yes ... but you'll need the
>> additional hack I just did to prevent further intermittent failures.
> 
> My controller is:
> 
> 03:0e.0 SCSI storage controller [0100]: Adaptec AIC-7892P U160/m [9005:008f] 
> (rev 02)
> 
> I'll try the patches in a bit.

With your patches my problem(s) are solved. Domain Validation works again.

...

[   32.179521] scsi 0:0:

Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread Gabriel C
James Bottomley wrote:
> On Sat, 2007-11-24 at 18:54 +0100, Gabriel C wrote:
>> James Bottomley wrote:
>>> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>>>> On 24.11.2007 07:42, James Bottomley wrote:
>>>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>>>> On 23.11.2007 12:38, Hannes Reinecke wrote:
>>>>>>> Hannes Reinecke wrote:
>>>>>>>> Laurent Riffard wrote:
>>>>>>>>> On 21.11.2007 23:41, Andrew Morton wrote:
>>>>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>>>>>>>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
>>>>>>>>>>
>>>>>>>>>>> On 21.11.2007 05:45, Andrew Morton wrote:
>>>>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>>>>>>>
>>>>>>>>>>> I found these messages in dmesg:
>>>>>>>>>>>
>>>>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1
>>>>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>> end_request: I/O error, dev sda, sector 16460
>>>>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>>>>>>>> ReiserFS: sda7: using ordered data mode
>>>>>>>>>>> --
>>>>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>>>>>>>> lp0: using parport0 (interrupt-driven).
>>>>>>>>>>>
>>>>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>>>>>>>
>>>>>>>>>>> Maybe something is broken in pata_via driver ?
>>>>>>>>>>>
>>>>>>>>>> Could be -
>>>>>>>>>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>>>>>>>> and
>>>>>>>>>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>>>>>>>> touch pata_via.c.
>>>>>>>>> None of the above...
>>>>>>>>>
>>>>>>>>> I did a bisection, it spotted git-scsi-misc.patch.
>>>>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>>>>>>>
>>>>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not
>>>>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other
>>>>>>>>> commits are touching documentation or drivers I don't use. I'll try
>>>>>>>>> to revert only this one this evening.
>>>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
>>>>>> does fix the problem.
>>>>>>
>>>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>>>> I shouldn't. Checking ...
>>>>>>>>
>>>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>>>> is the state that the domain validation uses and which we cannot kill
>>>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>>>> state is QUIESCE.
>>>>>
>>>>> This patch (which is applied on top of Hannes original) separates the
>>>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
>>> OK, could you post dmesgs again, please.  I actually tested this with an
>>> aic79xx card, and for me it does cause Domain Validation to succeed
>>> again.
>>>
>> Are the patches indeed to fix that problem as well ? 
>>
>> http://lkml.org/lkml/2007/11/23/5
> 
> That dmesg is from an unknown SCSI card exhibiting Domain Validation
> problems, so it's a reasonable probability, yes ... but you'll need the
> additional hack I just did to prevent further intermittent failures.

My controller is:

03:0e.0 SCSI storage controller [0100]: Adaptec AIC-7892P U160/m [9005:008f] 
(rev 02)

I'll try the patches in a bit.

> 
> James
> 

Gabriel

Re: 2.6.24-rc3-mm1 (sync is slow ?)

2007-11-24 Thread Gabriel C
kosaki wrote:
> Hi, Andrew 
> 
>>> Hi, Andrew
>>>
>>> I got following result in 'sync' command.
>>> It was too slow. (memory controller config is off ;)
>>> I attaches my .config.
>>> ==
>  (snip)
>> Well I wonder how we did that.
>>
>> It seems OK here from a quick test (i386, ext3-on-IDE).
>>
>> Maybe device driver/block breakage?

Try revert

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d

See also :
http://lkml.org/lkml/2007/11/23/5

and search for '2.6.24-rc3-mm1: I/O error, system hangs' on LKML

> 
> I tested x86, ext3-on-SATA(/dev/sda).
> It seems works well.
> 
> Hmm...

IDE/SATA is fine here as well just SCSI broke


Regards,

Gabriel 


Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread James Bottomley

On Sat, 2007-11-24 at 18:54 +0100, Gabriel C wrote:
> James Bottomley wrote:
> > On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
> >> On 24.11.2007 07:42, James Bottomley wrote:
> >>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
> >>>> On 23.11.2007 12:38, Hannes Reinecke wrote:
> >>>>> Hannes Reinecke wrote:
> >>>>>> Laurent Riffard wrote:
> >>>>>>> On 21.11.2007 23:41, Andrew Morton wrote:
> >>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
> >>>>>>>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
> >>>>>>>>
> >>>>>>>>> On 21.11.2007 05:45, Andrew Morton wrote:
> >>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> >>>>>>>>> Hello,
> >>>>>>>>>
> >>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> >>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
> >>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
> >>>>>>>>>
> >>>>>>>>> I found these messages in dmesg:
> >>>>>>>>>
> >>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1
> >>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
> >>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>>>> end_request: I/O error, dev sda, sector 16460
> >>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> >>>>>>>>> ReiserFS: sda7: using ordered data mode
> >>>>>>>>> --
> >>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
> >>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>>>> end_request: I/O error, dev sdb, sector 19632
> >>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
> >>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
> >>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
> >>>>>>>>> lp0: using parport0 (interrupt-driven).
> >>>>>>>>>
> >>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
> >>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
> >>>>>>>>>
> >>>>>>>>> Maybe something is broken in pata_via driver ?
> >>>>>>>>>
> >>>>>>>> Could be -
> >>>>>>>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> >>>>>>>> and
> >>>>>>>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> >>>>>>>> touch pata_via.c.
> >>>>>>> None of the above...
> >>>>>>>
> >>>>>>> I did a bisection, it spotted git-scsi-misc.patch.
> >>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> >>>>>>>
> >>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not
> >>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other
> >>>>>>> commits are touching documentation or drivers I don't use. I'll try
> >>>>>>> to revert only this one this evening.
> >>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
> >>>> does fix the problem.
> >>>>
> >>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
> >>>>>> I shouldn't. Checking ...
> >>>>>>
> >>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
> >>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
> >>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
> >>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
> >>> is the state that the domain validation uses and which we cannot kill
> >>> fastfail on).  It's definitely wrong to kill fastfail requests when the
> >>> state is QUIESCE.
> >>>
> >>> This patch (which is applied on top of Hannes original) separates the
> >>> BLOCK and QUIESCE states correctly ... does this fix the problem?
> >>
> >> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
> > 
> > OK, could you post dmesgs again, please.  I actually tested this with an
> > aic79xx card, and for me it does cause Domain Validation to succeed
> > again.
> > 
> 
> Are the patches indeed to fix that problem as well ? 
> 
> http://lkml.org/lkml/2007/11/23/5

That dmesg is from an unknown SCSI card exhibiting Domain Validation
problems, so it's a reasonable probability, yes ... but you'll need the
additional hack I just did to prevent further intermittent failures.

James
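
The BLOCK versus QUIESCE separation discussed in the quoted text above can
be sketched as a small decision function. This is a hedged user-space
model, not the actual patch: dev_state, prep_verdict, req_model and
prep_check are hypothetical names, and the policy shown (failfast requests
are only ever returned early in the blocked state, never in the quiesced
state) is an assumption drawn from the description in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for this sketch; the kernel has its own enums. */
enum dev_state { DEV_RUNNING, DEV_QUIESCE, DEV_BLOCK };
enum prep_verdict { PREP_OK, PREP_DEFER, PREP_KILL };

struct req_model {
    bool failfast;  /* models REQ_FAILFAST */
    bool preempt;   /* models a special command, e.g. Domain Validation */
};

enum prep_verdict prep_check(enum dev_state state, const struct req_model *rq)
{
    assert(rq != NULL);
    switch (state) {
    case DEV_QUIESCE:
        /* Special commands pass; everything else waits, failfast or not. */
        return rq->preempt ? PREP_OK : PREP_DEFER;
    case DEV_BLOCK:
        /* Assumed policy: only here may a failfast request be ended early. */
        return rq->failfast ? PREP_KILL : PREP_DEFER;
    default:
        return PREP_OK;
    }
}
```

Under these assumptions a Domain Validation command, which is both special
and failfast, goes through while the device is quiesced, and an ordinary
failfast request is deferred there instead of being killed.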




Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread Gabriel C
James Bottomley wrote:
> On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
>> On 24.11.2007 07:42, James Bottomley wrote:
>>> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>>>> On 23.11.2007 12:38, Hannes Reinecke wrote:
>>>>> Hannes Reinecke wrote:
>>>>>> Laurent Riffard wrote:
>>>>>>> On 21.11.2007 23:41, Andrew Morton wrote:
>>>>>>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>>>>>>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>>> On 21.11.2007 05:45, Andrew Morton wrote:
>>>>>>>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>>>>>>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>>>>>>>> some I/O completion. I can try to hand-copy some data if requested.
>>>>>>>>>
>>>>>>>>> I found these messages in dmesg:
>>>>>>>>>
>>>>>>>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1
>>>>>>>>> EXT3-fs: mounted filesystem with ordered data mode.
>>>>>>>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>> end_request: I/O error, dev sda, sector 16460
>>>>>>>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>>>>>>>> ReiserFS: sda7: using ordered data mode
>>>>>>>>> --
>>>>>>>>> ReiserFS: sda7: Using r5 hash to sort names
>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>> end_request: I/O error, dev sdb, sector 19632
>>>>>>>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
>>>>>>>>> end_request: I/O error, dev sdb, sector 40037363
>>>>>>>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 across:1048568k
>>>>>>>>> lp0: using parport0 (interrupt-driven).
>>>>>>>>>
>>>>>>>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>>>>>>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>>>>>>>
>>>>>>>>> Maybe something is broken in pata_via driver ?
>>>>>>>>>
>>>>>>>> Could be -
>>>>>>>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>>>>>>> and
>>>>>>>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>>>>>>> touch pata_via.c.
>>>>>>> None of the above...
>>>>>>>
>>>>>>> I did a bisection, it spotted git-scsi-misc.patch.
>>>>>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>>>>>
>>>>>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not
>>>>>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other
>>>>>>> commits are touching documentation or drivers I don't use. I'll try
>>>>>>> to revert only this one this evening.
>>>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0
>>>> does fix the problem.
>>>>
>>>>>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error where
>>>>>> I shouldn't. Checking ...
>>>>>>
>>>>> Ok, found it. We are blocking even special commands (ie requests with PREEMPT not set)
>>>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>>>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.
>>> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
>>> is the state that the domain validation uses and which we cannot kill
>>> fastfail on).  It's definitely wrong to kill fastfail requests when the
>>> state is QUIESCE.
>>>
>>> This patch (which is applied on top of Hannes original) separates the
>>> BLOCK and QUIESCE states correctly ... does this fix the problem?
>>
>> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)
> 
> OK, could you post dmesgs again, please.  I actually tested this with an
> aic79xx card, and for me it does cause Domain Validation to succeed
> again.
> 

Are the patches intended to fix that problem as well?

http://lkml.org/lkml/2007/11/23/5

> James

Gabriel 



Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread James Bottomley
Probing intermittent failures in Domain Validation, even with the fixes
applied, leads me to the conclusion that there are further problems with
this commit:

commit fc5eb4facedbd6d7117905e775cee1975f894e79
Author: Hannes Reinecke <[EMAIL PROTECTED]>
Date:   Tue Nov 6 09:23:40 2007 +0100

[SCSI] Do not requeue requests if REQ_FAILFAST is set
 
The essence of the problem is that you're causing REQ_FAILFAST to
terminate commands with an error on requeuing conditions, some of which
are relatively common on most SCSI devices.  While this may be the
correct behaviour for multi-path, it's certainly wrong for the previously
understood meaning of REQ_FAILFAST, which was "don't retry on error".
Domain validation and other applications use it to control error
handling; they don't expect failures for a simple requeue, yet they are
now spitting errors.

I honestly can't see that, even for the multi-path case, returning an
error when we're over queue depth is the correct thing to do (it may not
matter to something like a symmetrix, but an array that has a non-zero
cost associated with a path change, like a CPQ HSV or the AVT
controllers, will show fairly large slow downs if you do this).  Even if
this is the desired behaviour (and I think that's a policy issue),
DID_NO_CONNECT is almost certainly the wrong error to be sending back.

This patch fixes up domain validation to work again correctly, however,
I really think it's just a bandaid.  Do you want to rethink the above
commit?

James

Index: BUILD-2.6/drivers/scsi/scsi_lib.c
===
--- BUILD-2.6.orig/drivers/scsi/scsi_lib.c  2007-11-24 11:25:20.0 -0600
+++ BUILD-2.6/drivers/scsi/scsi_lib.c   2007-11-24 11:26:22.0 -0600
@@ -1552,7 +1552,8 @@ static void scsi_request_fn(struct reque
 			break;
 
 		if (!scsi_dev_queue_ready(q, sdev)) {
-			if (req->cmd_flags & REQ_FAILFAST) {
+			if ((req->cmd_flags & REQ_FAILFAST) &&
+			    !(req->cmd_flags & REQ_PREEMPT)) {
 				scsi_kill_request(req, q);
 				continue;
 			}
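
To make the two readings of REQ_FAILFAST discussed in this message concrete, here is a small standalone sketch. It is not kernel code: the event and outcome names are hypothetical, and the model is deliberately simplified. It treats "queue not ready" as the only requeuing condition and ignores REQ_PREEMPT.

```c
/* Hypothetical events a queued command can run into. */
enum event { EVENT_QUEUE_NOT_READY, EVENT_IO_ERROR };

/* Hypothetical outcomes for the command. */
enum outcome { OUTCOME_REQUEUE, OUTCOME_RETRY, OUTCOME_FAIL };

/*
 * REQ_FAILFAST as previously understood: "don't retry on error".
 * A busy queue is not an error, so the command is simply requeued
 * whether or not the flag is set.
 */
enum outcome failfast_classic(enum event ev, int failfast)
{
	if (ev == EVENT_QUEUE_NOT_READY)
		return OUTCOME_REQUEUE;
	return failfast ? OUTCOME_FAIL : OUTCOME_RETRY;
}

/*
 * Behaviour after the commit under discussion: a failfast command is
 * terminated with an error even when the only problem is that the
 * device queue cannot take it right now.
 */
enum outcome failfast_after_commit(enum event ev, int failfast)
{
	if (failfast)
		return OUTCOME_FAIL;
	return ev == EVENT_QUEUE_NOT_READY ? OUTCOME_REQUEUE : OUTCOME_RETRY;
}
```

The two functions differ only for a failfast command that meets a busy queue, which is the case domain validation trips over.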




Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-24 Thread Adrian Bunk
On Wed, Nov 21, 2007 at 10:58:21AM +0100, Sam Ravnborg wrote:
> On Wed, Nov 21, 2007 at 10:44:40AM +0200, Avi Kivity wrote:
> > Kamalesh Babulal wrote:
> > >Andrew Morton wrote:
> > >  
> > >>On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal 
> > >><[EMAIL PROTECTED]> wrote:
> > >>
> > >>
> > >>>The make headers_check fails,
> > >>>
> > >>>  CHECK   include/linux/usb/gadgetfs.h
> > >>>  CHECK   include/linux/usb/ch9.h
> > >>>  CHECK   include/linux/usb/cdc.h
> > >>>  CHECK   include/linux/usb/audio.h
> > >>>  CHECK   include/linux/kvm.h
> > >>>/root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires 
> > >>>asm/kvm.h, which does not exist in exported headers
> > >>>  
> > >>hm, works for me, on i386 and x86_64.  What's different over there?
> > >>
> > >Hi Andrew,
> > >
> > >It fails on the powerpc box, with allyesconfig option.
> > >
> > >  
> > 
> > How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.
> 
> Is kvm x86 specific? Then move the .h file to asm-x86.
> Otherwise no good idea...

What about adding a whitelist in hdrcheck.sh?

For all practical purposes in userspace the compile error due to the 
non-existing asm header should be fine, so there's no reason to change 
the code in such cases. 

>   Sam

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread James Bottomley
On Sat, 2007-11-24 at 13:57 +0100, Laurent Riffard wrote:
> On 24.11.2007 07:42, James Bottomley wrote:
> > On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
> >> On 23.11.2007 12:38, Hannes Reinecke wrote:
> >>> Hannes Reinecke wrote:
>  Laurent Riffard wrote:
> > On 21.11.2007 23:41, Andrew Morton wrote:
> >> On Wed, 21 Nov 2007 22:45:22 +0100
> >> Laurent Riffard <[EMAIL PROTECTED]> wrote:
> >>
> >>> On 21.11.2007 05:45, Andrew Morton wrote:
>  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> >>> Hello, 
> >>>
> >>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> >>> that a bunch of task are blocked in "D" state, they seem to wait for
> >>> some I/O completion. I can try to hand-copy some data if requested.
> >>>
> >>> I found these messages in dmesg:
> >>>
> >>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
> >>> EXT3-fs: mounted filesystem with ordered data mode.
> >>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
> >>> driverbyte=DRIVER_OK,SUGGEST_OK
> >>> end_request: I/O error, dev sda, sector 16460
> >>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> >>> ReiserFS: sda7: using ordered data mode
> >>> --
> >>> ReiserFS: sda7: Using r5 hash to sort names
> >>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> >>> driverbyte=DRIVER_OK,SUGGEST_OK
> >>> end_request: I/O error, dev sdb, sector 19632
> >>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> >>> driverbyte=DRIVER_OK,SUGGEST_OK
> >>> end_request: I/O error, dev sdb, sector 40037363
> >>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 
> >>> extents:1 across:1048568k
> >>> lp0: using parport0 (interrupt-driven).
> >>>
> >>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% 
> >>> reproducible.
> >>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
> >>>
> >>> Maybe something is broken in pata_via driver ?
> >>>
> >> Could be - 
> >> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> >> and 
> >> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> >> touch pata_via.c.
> > None of the above...
> >
> > I did a bisection, it spotted git-scsi-misc.patch. 
> > I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works 
> > fine.
> >
> > I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> > requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> > commits are touching documentation or drivers I don't use. I'll try 
> > to revert only this one this evening.
> >> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
> >> does fix the problem.
> >>
>  Hmm. Weird. I'll have a look into it. Apparently I'll be returning an 
>  error where
>  I shouldn't. Checking ...
> 
> >>> Ok, found it. We are blocking even special commands (ie requests with 
> >>> PREEMPT not set)
> >>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes 
> >>> this.
> >> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O 
> >> errors.
> > 
> > I think the problem is the way we treat BLOCKED and QUIESCED (the latter
> > is the state that the domain validation uses and which we cannot kill
> > fastfail on).  It's definitely wrong to kill fastfail requests when the
> > state is QUIESCE.
> > 
> > This patch (which is applied on top of Hannes original) separates the
> > BLOCK and QUIESCE states correctly ... does this fix the problem?
> 
> 
> No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)

OK, could you post dmesgs again, please.  I actually tested this with an
aic79xx card, and for me it does cause Domain Validation to succeed
again.

James




Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-24 Thread Laurent Riffard


On 24.11.2007 07:42, James Bottomley wrote:
> On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
>> On 23.11.2007 12:38, Hannes Reinecke wrote:
>>> Hannes Reinecke wrote:
 Laurent Riffard wrote:
> On 21.11.2007 23:41, Andrew Morton wrote:
>> On Wed, 21 Nov 2007 22:45:22 +0100
>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
>>
>>> On 21.11.2007 05:45, Andrew Morton wrote:
 ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>> Hello, 
>>>
>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>> some I/O completion. I can try to hand-copy some data if requested.
>>>
>>> I found these messages in dmesg:
>>>
>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>> EXT3-fs: mounted filesystem with ordered data mode.
>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sda, sector 16460
>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>> ReiserFS: sda7: using ordered data mode
>>> --
>>> ReiserFS: sda7: Using r5 hash to sort names
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 19632
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 40037363
>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 
>>> extents:1 across:1048568k
>>> lp0: using parport0 (interrupt-driven).
>>>
>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% 
>>> reproducible.
>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>
>>> Maybe something is broken in pata_via driver ?
>>>
>> Could be - 
>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>> and 
>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>> touch pata_via.c.
> None of the above...
>
> I did a bisection, it spotted git-scsi-misc.patch. 
> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>
> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> commits are touching documentation or drivers I don't use. I'll try 
> to revert only this one this evening.
>> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
>> does fix the problem.
>>
 Hmm. Weird. I'll have a look into it. Apparently I'll be returning an 
 error where
 I shouldn't. Checking ...

>>> Ok, found it. We are blocking even special commands (ie requests with 
>>> PREEMPT not set)
>>> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
>> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O 
>> errors.
> 
> I think the problem is the way we treat BLOCKED and QUIESCED (the latter
> is the state that the domain validation uses and which we cannot kill
> fastfail on).  It's definitely wrong to kill fastfail requests when the
> state is QUIESCE.
> 
> This patch (which is applied on top of Hannes original) separates the
> BLOCK and QUIESCE states correctly ... does this fix the problem?


No, it doesn't help... (2.6.24-rc3-mm1 + your patch still has problems)


> James
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 13e7e09..a7cf23a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c

> @@ -1279,18 +1279,21 @@ int scsi_prep_state_check(struct scsi_device *sdev, 
> struct request *req)
>   "rejecting I/O to dead device\n");
>   ret = BLKPREP_KILL;
>   break;
> - case SDEV_QUIESCE:
>   case SDEV_BLOCK:
>   /*
> -  * If the devices is blocked we defer normal commands.
> -  */
> - if (!(req->cmd_flags & REQ_PREEMPT))
> - ret = BLKPREP_DEFER;
> - /*
>* Return failfast requests immediately
>*/
>   if (req->cmd_flags & REQ_FAILFAST)
>   ret = BLKPREP_KILL;
> +
> + /* fall through */
> +
> + case SDEV_QUIESCE:
> + /*
> +  * If the devices is blocked we defer normal commands.
> +  */
> + if (!(req->cmd_flags & REQ_PREEMPT))
> + ret = BLKPREP_DEFER;
>   break;
>   default:
>   /*
> 

Re: 2.6.24-rc3-mm1 (sync is slow ?)

2007-11-24 Thread kosaki
Hi, Andrew 

> > Hi, Andrew
> > 
> > I got following result in 'sync' command.
> > It was too slow. (memory controller config is off ;)
> > I attaches my .config.
> > ==
 (snip)
> 
> Well I wonder how we did that.
> 
> It seems OK here from a quick test (i386, ext3-on-IDE).
> 
> Maybe device driver/block breakage?

I tested x86, ext3-on-SATA (/dev/sda).
It seems to work well.

Hmm...


-- 
kosaki




Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-23 Thread James Bottomley

On Fri, 2007-11-23 at 18:52 +0100, Laurent Riffard wrote:
> On 23.11.2007 12:38, Hannes Reinecke wrote:
> > Hannes Reinecke wrote:
> >> Laurent Riffard wrote:
> >>> On 21.11.2007 23:41, Andrew Morton wrote:
>  On Wed, 21 Nov 2007 22:45:22 +0100
>  Laurent Riffard <[EMAIL PROTECTED]> wrote:
> 
> > On 21.11.2007 05:45, Andrew Morton wrote:
> >> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> > Hello, 
> >
> > My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> > that a bunch of task are blocked in "D" state, they seem to wait for
> > some I/O completion. I can try to hand-copy some data if requested.
> >
> > I found these messages in dmesg:
> >
> > ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
> > EXT3-fs: mounted filesystem with ordered data mode.
> > sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
> > driverbyte=DRIVER_OK,SUGGEST_OK
> > end_request: I/O error, dev sda, sector 16460
> > ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> > ReiserFS: sda7: using ordered data mode
> > --
> > ReiserFS: sda7: Using r5 hash to sort names
> > sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> > driverbyte=DRIVER_OK,SUGGEST_OK
> > end_request: I/O error, dev sdb, sector 19632
> > sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> > driverbyte=DRIVER_OK,SUGGEST_OK
> > end_request: I/O error, dev sdb, sector 40037363
> > Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 
> > extents:1 across:1048568k
> > lp0: using parport0 (interrupt-driven).
> >
> > These errors occur *only* with 2.6.24-rc3-mm1, they are 100% 
> > reproducible.
> > 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
> >
> > Maybe something is broken in pata_via driver ?
> >
>  Could be - 
>  libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>  and 
>  pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>  touch pata_via.c.
> >>> None of the above...
> >>>
> >>> I did a bisection, it spotted git-scsi-misc.patch. 
> >>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> >>>
> >>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> >>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> >>> commits are touching documentation or drivers I don't use. I'll try 
> >>> to revert only this one this evening.
> 
> I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
> does fix the problem.
> 
> >> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an 
> >> error where
> >> I shouldn't. Checking ...
> >>
> > Ok, found it. We are blocking even special commands (ie requests with 
> > PREEMPT not set)
> > when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.
> 
> Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O 
> errors.

I think the problem is the way we treat BLOCKED and QUIESCED (the latter
is the state that the domain validation uses and which we cannot kill
fastfail on).  It's definitely wrong to kill fastfail requests when the
state is QUIESCE.

This patch (which is applied on top of Hannes' original) separates the
BLOCK and QUIESCE states correctly ... does this fix the problem?

James

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 13e7e09..a7cf23a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1279,18 +1279,21 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
 				    "rejecting I/O to dead device\n");
 			ret = BLKPREP_KILL;
 			break;
-		case SDEV_QUIESCE:
 		case SDEV_BLOCK:
 			/*
-			 * If the devices is blocked we defer normal commands.
-			 */
-			if (!(req->cmd_flags & REQ_PREEMPT))
-				ret = BLKPREP_DEFER;
-			/*
 			 * Return failfast requests immediately
 			 */
 			if (req->cmd_flags & REQ_FAILFAST)
 				ret = BLKPREP_KILL;
+
+			/* fall through */
+
+		case SDEV_QUIESCE:
+			/*
+			 * If the devices is blocked we defer normal commands.
+			 */
+			if (!(req->cmd_flags & REQ_PREEMPT))
+				ret = BLKPREP_DEFER;
 			break;
 		default:
 			/*
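
The separation of the BLOCK and QUIESCE states described in this message can be sketched as a standalone model. This is an illustration of the stated intent (kill failfast requests only in BLOCK, never in QUIESCE), not a transcription of the diff; every name here (STATE_*, VERDICT_*, F_*) is hypothetical. One difference is deliberate: in the diff as posted, the BLKPREP_KILL assignment falls through into the PREEMPT check, which can overwrite it with BLKPREP_DEFER, whereas this sketch returns early.

```c
/* Hypothetical stand-ins for the SDEV_* device states. */
enum state { STATE_RUNNING, STATE_BLOCK, STATE_QUIESCE };

/* Hypothetical stand-ins for the BLKPREP_* return codes. */
enum verdict { VERDICT_OK, VERDICT_DEFER, VERDICT_KILL };

/* Hypothetical stand-ins for REQ_PREEMPT and REQ_FAILFAST. */
#define F_PREEMPT  (1u << 0)
#define F_FAILFAST (1u << 1)

enum verdict state_check(enum state state, unsigned int flags)
{
	switch (state) {
	case STATE_BLOCK:
		/* Only a blocked device fails failfast requests at once. */
		if (flags & F_FAILFAST)
			return VERDICT_KILL;
		/* fall through */
	case STATE_QUIESCE:
		/* Normal commands wait; PREEMPT commands are let through. */
		if (!(flags & F_PREEMPT))
			return VERDICT_DEFER;
		return VERDICT_OK;
	default:
		return VERDICT_OK;
	}
}
```

Under this model a failfast request issued while the device is quiesced, as during domain validation, is deferred or passed through but never killed.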



Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-23 Thread Alexey Dobriyan
On Tue, Nov 20, 2007 at 10:18:39PM -0800, Andrew Morton wrote:
> On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
> 
> > Hi Andrew,
> > 
> > Kernel panic's across different architectures like powerpc, x86_64, 
> 
> powerpc complains about IO-APICs??
> 
> > Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
> > Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
> > Mount-cache hash table entries: 256
> > SMP alternatives: switching to UP code
> > ACPI: Core revision 20070126

Hmm, same date here. It's an Asus P5B-E motherboard.

> > ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> > Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> > 'noapic' kernel parameter
> 
> ACPI or x86 breakage, I guess.
> 
> Did 'noapic' work?

No! The box freezes somewhere after "Freeing unused kernel memory"...

Bisection points to git-x86.patch, though.

git-bisect start
# good: [f05092637dc0d9a3f2249c9b283b973e6e96b7d2] Linux 2.6.24-rc3
git-bisect good f05092637dc0d9a3f2249c9b283b973e6e96b7d2
# bad: [46c8c396d2c87b786a7fac615c289f85a18e53ce] w1-build-fix
git-bisect bad 46c8c396d2c87b786a7fac615c289f85a18e53ce
# bad: [4e22f4852c48e1eddfe04299e78c0456164abe86] frv-move-dma-macros-to-scatterlisth-for-consistency
git-bisect bad 4e22f4852c48e1eddfe04299e78c0456164abe86
# bad: [4e22f4852c48e1eddfe04299e78c0456164abe86] frv-move-dma-macros-to-scatterlisth-for-consistency
git-bisect bad 4e22f4852c48e1eddfe04299e78c0456164abe86
# good: [d5135f31313af2be37d8ccb71e2a42f8e221d8c4] ide-mm-ide-disk-extend-timeout-for-pio-out-commands
git-bisect good d5135f31313af2be37d8ccb71e2a42f8e221d8c4
# good: [6be815e83f506f4c39a46cf59014e29a95c5e6c4] iommu-sg-merging-call-blk_queue_segment_boundary-in-__scsi_alloc_queue
git-bisect good 6be815e83f506f4c39a46cf59014e29a95c5e6c4
# good: [6be815e83f506f4c39a46cf59014e29a95c5e6c4] iommu-sg-merging-call-blk_queue_segment_boundary-in-__scsi_alloc_queue
git-bisect good 6be815e83f506f4c39a46cf59014e29a95c5e6c4
# bad: [c792db6d06114a85e33a27c89e9e979f11b951c4] slub-fix-coding-style-violations
git-bisect bad c792db6d06114a85e33a27c89e9e979f11b951c4
# bad: [c792db6d06114a85e33a27c89e9e979f11b951c4] slub-fix-coding-style-violations
git-bisect bad c792db6d06114a85e33a27c89e9e979f11b951c4
# bad: [76f3939b76ff557f73720b57a16716196f04e407] x86_64-make-sparsemem-vmemmap-the-default-memory-model-v2
git-bisect bad 76f3939b76ff557f73720b57a16716196f04e407
# good: [b8ba611566d8799a979b190d4bb14305ca64ee0e] sis-fb-driver-_ioctl32_conversion-functions-do-not-exist-in-recent-kernels
git-bisect good b8ba611566d8799a979b190d4bb14305ca64ee0e
# good: [e34995928859308d2abef1709332e2b12d36db2f] git-ipwireless_cs
git-bisect good e34995928859308d2abef1709332e2b12d36db2f
# bad: [f520abbbe11bc8253714bcd34aaaf19bdf82189e] git-x86-identify_cpu-fix
git-bisect bad f520abbbe11bc8253714bcd34aaaf19bdf82189e


I honestly tried a fresh #mm from the x86 tree -- the one which ends at commit
70be766db1105c0fc9aed8e954d0c343c1eda067 "x86: Add the RDC machine
specific reboot fixup". FWIW, commit "x86: validate against ACPI motherboard
resources" is innocent. After "x86: make stack size configurable" the damn thing
wouldn't build, and applying fixlets from -mm doesn't help at 3AM.

Again, it's late here, I'll recheck today.


Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-23 Thread Laurent Riffard
On 23.11.2007 12:38, Hannes Reinecke wrote:
> Hannes Reinecke wrote:
>> Laurent Riffard wrote:
>>> On 21.11.2007 23:41, Andrew Morton wrote:
 On Wed, 21 Nov 2007 22:45:22 +0100
 Laurent Riffard <[EMAIL PROTECTED]> wrote:

> On 21.11.2007 05:45, Andrew Morton wrote:
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> Hello, 
>
> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> that a bunch of task are blocked in "D" state, they seem to wait for
> some I/O completion. I can try to hand-copy some data if requested.
>
> I found these messages in dmesg:
>
> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
> EXT3-fs: mounted filesystem with ordered data mode.
> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
> driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sda, sector 16460
> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> ReiserFS: sda7: using ordered data mode
> --
> ReiserFS: sda7: Using r5 hash to sort names
> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sdb, sector 19632
> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sdb, sector 40037363
> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 
> extents:1 across:1048568k
> lp0: using parport0 (interrupt-driven).
>
> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>
> Maybe something is broken in pata_via driver ?
>
 Could be - 
 libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
 and 
 pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
 touch pata_via.c.
>>> None of the above...
>>>
>>> I did a bisection, it spotted git-scsi-misc.patch. 
>>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>>
>>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>>> commits are touching documentation or drivers I don't use. I'll try 
>>> to revert only this one this evening.

I can confirm : reverting commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 
does fix the problem.

>> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error 
>> where
>> I shouldn't. Checking ...
>>
> Ok, found it. We are blocking even special commands (ie requests with PREEMPT 
> not set)
> when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.

Sorry, it's not enough. 2.6.24-rc3-mm1 + your patch still hangs with I/O errors.

-- 
laurent



Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-23 Thread Hannes Reinecke
Hannes Reinecke wrote:
> Laurent Riffard wrote:
>> On 21.11.2007 23:41, Andrew Morton wrote:
>>> On Wed, 21 Nov 2007 22:45:22 +0100
>>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
>>>
 On 21.11.2007 05:45, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
 Hello, 

 My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
 that a bunch of task are blocked in "D" state, they seem to wait for
 some I/O completion. I can try to hand-copy some data if requested.

 I found these messages in dmesg:

 ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
 EXT3-fs: mounted filesystem with ordered data mode.
 sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
 driverbyte=DRIVER_OK,SUGGEST_OK
 end_request: I/O error, dev sda, sector 16460
 ReiserFS: sda7: found reiserfs format "3.6" with standard journal
 ReiserFS: sda7: using ordered data mode
 --
 ReiserFS: sda7: Using r5 hash to sort names
 sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
 driverbyte=DRIVER_OK,SUGGEST_OK
 end_request: I/O error, dev sdb, sector 19632
 sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
 driverbyte=DRIVER_OK,SUGGEST_OK
 end_request: I/O error, dev sdb, sector 40037363
 Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 
 extents:1 across:1048568k
 lp0: using parport0 (interrupt-driven).

 These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.

 Maybe something is broken in pata_via driver ?

>>> Could be - 
>>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>>> and 
>>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>>> touch pata_via.c.
>> None of the above...
>>
>> I did a bisection, it spotted git-scsi-misc.patch. 
>> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
>>
>> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
>> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
>> commits are touching documentation or drivers I don't use. I'll try 
>> to revert only this one this evening.
>>
> Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error 
> where
> I shouldn't. Checking ...
> 
Ok, found it. We are blocking even special commands (i.e. requests with PREEMPT
set) when FAILFAST is set. Which is clearly wrong. The attached patch fixes this.

James, please apply.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
Fix SPI Domain validation

This fixes a thinko in the FAILFAST handling: when we get
a request with FAILFAST set, we still have to evaluate the
PREEMPT flag to decide if this request should be passed through.

Signed-off-by: Hannes Reinecke <[EMAIL PROTECTED]>

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 13e7e09..9ec1566 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1284,13 +1284,15 @@ int scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
 			/*
 			 * If the devices is blocked we defer normal commands.
 			 */
-			if (!(req->cmd_flags & REQ_PREEMPT))
-				ret = BLKPREP_DEFER;
-			/*
-			 * Return failfast requests immediately
-			 */
-			if (req->cmd_flags & REQ_FAILFAST)
-				ret = BLKPREP_KILL;
+			if (!(req->cmd_flags & REQ_PREEMPT)) {
+				/*
+				 * Return failfast requests immediately
+				 */
+				if (req->cmd_flags & REQ_FAILFAST)
+					ret = BLKPREP_KILL;
+				else
+					ret = BLKPREP_DEFER;
+			}
 			break;
 		default:
 			/*
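
As an illustration of the flag ordering this patch changes, here is a minimal standalone sketch. It is not kernel code: the PREP_* values and FLAG_* bits are hypothetical stand-ins for BLKPREP_* and REQ_PREEMPT/REQ_FAILFAST, and it assumes the result starts out as "OK" before the blocked-state check runs.

```c
/* Hypothetical stand-ins for the BLKPREP_* return codes. */
enum prep_result { PREP_OK, PREP_DEFER, PREP_KILL };

/* Hypothetical stand-ins for REQ_PREEMPT and REQ_FAILFAST. */
#define FLAG_PREEMPT  (1u << 0)
#define FLAG_FAILFAST (1u << 1)

/*
 * Ordering before the patch: the FAILFAST test runs last, so it
 * overrides the PREEMPT test and kills special commands too.
 */
enum prep_result prep_blocked_old(unsigned int flags)
{
	enum prep_result ret = PREP_OK;

	if (!(flags & FLAG_PREEMPT))
		ret = PREP_DEFER;
	if (flags & FLAG_FAILFAST)
		ret = PREP_KILL;
	return ret;
}

/*
 * Ordering after the patch: FAILFAST is only looked at for requests
 * that do not carry PREEMPT, so PREEMPT requests always pass.
 */
enum prep_result prep_blocked_new(unsigned int flags)
{
	enum prep_result ret = PREP_OK;

	if (!(flags & FLAG_PREEMPT)) {
		if (flags & FLAG_FAILFAST)
			ret = PREP_KILL;
		else
			ret = PREP_DEFER;
	}
	return ret;
}
```

The two versions disagree only for a request carrying both flags, which the old ordering kills and the new ordering lets through.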


Re: 2.6.24-rc3-mm1

2007-11-23 Thread Andreas Herrmann
On Fri, Nov 23, 2007 at 08:05:44AM +0200, Kirill A. Shutemov wrote:
> On [Fri, 23.11.2007 01:48], Thomas Gleixner wrote:
> > On Thu, 22 Nov 2007, Andrew Morton wrote:
> > 
> > > On Thu, 22 Nov 2007 12:22:05 +0200 "Kirill A. Shutemov" <[EMAIL 
> > > PROTECTED]> wrote:
> > > 
> > > > On x86_64 'uname -m' return 'x86'.  It break many userspace programs. 
> > > > apt
> > > > and rpm for example.
> > > > 
> > > 
> > > Yes, there have been various discussions about this.  I think Sam is 
> > > cooking up
> > > a fix?
> > 
> > http://lkml.org/lkml/2007/11/19/323
> > 
> > I push it Linus wards ASAP.
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 116b03a..7aa1dc6 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -11,10 +11,9 @@ endif
>  $(srctree)/arch/x86/Makefile%: ;
>  
>  ifeq ($(CONFIG_X86_32),y)
> +UTS_MACHINE := i386
>  include $(srctree)/arch/x86/Makefile_32
>  else
> +UTS_MACHINE := x86_64
>  include $(srctree)/arch/x86/Makefile_64
>  endif
> 
> Many programs expect i686 on Pentium II.


Yes, but this is done during boot.
Then the kernel overwrites "i386" to become "i686" for such CPUs.
That is why I've seen "x66" after boot when UTS_MACHINE at build-time
was "x86" with 'make ARCH=x86'.
For more details see:

http://marc.info/?l=linux-kernel&m=119521309415545&w=2
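
A small user-space sketch of why "x86" turns into "x66": it assumes the
boot-time fixup overwrites only the second character of the machine string
with the CPU family digit, capped at 6. The helper name is made up for this
sketch and is not a kernel function.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: overwrite the second character of a uname
 * machine string with the CPU family digit (families above 6 are
 * reported as 6). The string must be writable and at least two
 * characters long.
 */
static void patch_machine_family(char *machine, unsigned int family)
{
	assert(machine != NULL && strlen(machine) >= 2);
	machine[1] = (char)('0' + (family > 6 ? 6 : family));
}
```

Applied to "i386" with family 6 this yields "i686", which is what userspace
expects. Applied to "x86" it yields "x66", because the fixup does not check
what the build-time string was.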


Regards,

Andreas

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-22 Thread Hannes Reinecke
Laurent Riffard wrote:
> Le 21.11.2007 23:41, Andrew Morton a écrit :
>> On Wed, 21 Nov 2007 22:45:22 +0100
>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
>>
>>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>> Hello, 
>>>
>>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>>> that a bunch of task are blocked in "D" state, they seem to wait for
>>> some I/O completion. I can try to hand-copy some data if requested.
>>>
>>> I found these messages in dmesg:
>>>
>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>> EXT3-fs: mounted filesystem with ordered data mode.
>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sda, sector 16460
>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>> ReiserFS: sda7: using ordered data mode
>>> --
>>> ReiserFS: sda7: Using r5 hash to sort names
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 19632
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 40037363
>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 
>>> across:1048568k
>>> lp0: using parport0 (interrupt-driven).
>>>
>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>
>>> Maybe something is broken in pata_via driver ?
>>>
>> Could be - 
>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>> and 
>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>> touch pata_via.c.
> 
> None of the above...
> 
> I did a bisection, it spotted git-scsi-misc.patch. 
> I just run 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> 
> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> commits are touching documentation or drivers I don't use. I'll try 
> to revert only this one this evening.
> 
Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error
where I shouldn't. Checking ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)


Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-22 Thread Laurent Riffard

Le 21.11.2007 23:41, Andrew Morton a écrit :
> On Wed, 21 Nov 2007 22:45:22 +0100
> Laurent Riffard <[EMAIL PROTECTED]> wrote:
> 
>> Le 21.11.2007 05:45, Andrew Morton a écrit :
>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>> Hello, 
>>
>> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
>> that a bunch of task are blocked in "D" state, they seem to wait for
>> some I/O completion. I can try to hand-copy some data if requested.
>>
>> I found these messages in dmesg:
>>
>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>> EXT3-fs: mounted filesystem with ordered data mode.
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sda, sector 16460
>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>> ReiserFS: sda7: using ordered data mode
>> --
>> ReiserFS: sda7: Using r5 hash to sort names
>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sdb, sector 19632
>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sdb, sector 40037363
>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 
>> across:1048568k
>> lp0: using parport0 (interrupt-driven).
>>
>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>
>> Maybe something is broken in pata_via driver ?
>>
> 
> Could be - 
> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> and 
> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> touch pata_via.c.

None of the above...

I did a bisection; it pointed to git-scsi-misc.patch.
I just ran 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.

I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
commits are touching documentation or drivers I don't use. I'll try 
to revert only this one this evening.

-- 
laurent




Re: 2.6.24-rc3-mm1

2007-11-22 Thread Kirill A. Shutemov
On [Fri, 23.11.2007 01:48], Thomas Gleixner wrote:
> On Thu, 22 Nov 2007, Andrew Morton wrote:
> 
> > On Thu, 22 Nov 2007 12:22:05 +0200 "Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:
> > 
> > > On x86_64 'uname -m' return 'x86'.  It break many userspace programs. apt
> > > and rpm for example.
> > > 
> > 
> > Yes, there have been various discussions about this.  I think Sam is cooking up
> > a fix?
> 
> http://lkml.org/lkml/2007/11/19/323
> 
> I push it Linus wards ASAP.
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 116b03a..7aa1dc6 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -11,10 +11,9 @@ endif
 $(srctree)/arch/x86/Makefile%: ;
 
 ifeq ($(CONFIG_X86_32),y)
+UTS_MACHINE := i386
 include $(srctree)/arch/x86/Makefile_32
 else
+UTS_MACHINE := x86_64
 include $(srctree)/arch/x86/Makefile_64
 endif

Many programs expect i686 on Pentium II.

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/




Re: 2.6.24-rc3-mm1

2007-11-22 Thread Gabriel C
Andrew Morton wrote:
> On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <[EMAIL PROTECTED]> wrote:
> 
>> I have some warnings on each SCSI disc:
>>
>>
>> ...
>>
>> [   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   
>> 0109 PQ: 0 ANSI: 3
>> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
>> [   30.724435]  target0:0:0: Beginning Domain Validation
>> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
>> [   30.724572]  target0:0:0: Ending Domain Validation
>> [   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP
>> 0114 PQ: 0 ANSI: 4
>> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
>> [   30.729771]  target0:0:1: Beginning Domain Validation
>> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
>> [   30.729908]  target0:0:1: Ending Domain Validation
>>
> 
> Don't know what would have caused that.  But yes, something is wrong in
> scsi land.

Actually I'm lucky the author didn't fix that FIXME in scsi_transport_spi.c and 
I still can boot ;)

> 
>> no idea whatever this is related but buffered disk reads are 2.XX MB/sec and 
>> the box is somewhat laggy.
>>
>> hdparm -t on sda and sdb reports :
>>
>> /dev/sda:
>>  Timing buffered disk reads:8 MB in  3.26 seconds =   2.46 MB/sec
>>
>> /dev/sdb:
>>  Timing buffered disk reads:8 MB in  3.56 seconds =   2.25 MB/sec
>>
>> My IDE discs are fine.
>>
>> Please let me know if you need my config or any other informations.
>>
> 
> And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
> 

I found the commit which causes these problems; it is in the git-scsi-misc
patch, and reverting it fixes both problems for me.

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d




Re: 2.6.24-rc3-mm1

2007-11-22 Thread Andrew Morton
On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <[EMAIL PROTECTED]> wrote:

> I have some warnings on each SCSI disc:
> 
> 
> ...
> 
> [   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   0109 
> PQ: 0 ANSI: 3
> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
> [   30.724435]  target0:0:0: Beginning Domain Validation
> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
> [   30.724572]  target0:0:0: Ending Domain Validation
> [   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP0114 
> PQ: 0 ANSI: 4
> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
> [   30.729771]  target0:0:1: Beginning Domain Validation
> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
> [   30.729908]  target0:0:1: Ending Domain Validation
> 

Don't know what would have caused that.  But yes, something is wrong in
scsi land.

> 
> no idea whatever this is related but buffered disk reads are 2.XX MB/sec and 
> the box is somewhat laggy.
> 
> hdparm -t on sda and sdb reports :
> 
> /dev/sda:
>  Timing buffered disk reads:8 MB in  3.26 seconds =   2.46 MB/sec
> 
> /dev/sdb:
>  Timing buffered disk reads:8 MB in  3.56 seconds =   2.25 MB/sec
> 
> My IDE discs are fine.
> 
> Please let me know if you need my config or any other informations.
> 

And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.


Re: 2.6.24-rc3-mm1

2007-11-22 Thread Gabriel C
I have some warnings on each SCSI disc:


...

[   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   0109 PQ: 0 ANSI: 3
[   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
[   30.724435]  target0:0:0: Beginning Domain Validation
[   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
[   30.724572]  target0:0:0: Ending Domain Validation
[   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP    0114 PQ: 0 ANSI: 4
[   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
[   30.729771]  target0:0:1: Beginning Domain Validation
[   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
[   30.729908]  target0:0:1: Ending Domain Validation

...

No idea whether this is related, but buffered disk reads are 2.XX MB/sec and
the box is somewhat laggy.

hdparm -t on sda and sdb reports :

/dev/sda:
 Timing buffered disk reads:    8 MB in  3.26 seconds =   2.46 MB/sec

/dev/sdb:
 Timing buffered disk reads:    8 MB in  3.56 seconds =   2.25 MB/sec

My IDE discs are fine.

Please let me know if you need my config or any other information.


Gabriel


Re: 2.6.24-rc3-mm1

2007-11-22 Thread Thomas Gleixner
On Thu, 22 Nov 2007, Andrew Morton wrote:

> On Thu, 22 Nov 2007 12:22:05 +0200 "Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:
> 
> > On x86_64 'uname -m' return 'x86'.  It break many userspace programs. apt
> > and rpm for example.
> > 
> 
> Yes, there have been various discussions about this.  I think Sam is cooking up
> a fix?

http://lkml.org/lkml/2007/11/19/323

I'll push it Linus-wards ASAP.

Thanks,

tglx


Re: 2.6.24-rc3-mm1

2007-11-22 Thread Andrew Morton
On Thu, 22 Nov 2007 12:22:05 +0200 "Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:

> On x86_64 'uname -m' return 'x86'.  It break many userspace programs. apt
> and rpm for example.
> 

Yes, there have been various discussions about this.  I think Sam is cooking up
a fix?


Re: 2.6.24-rc3-mm1

2007-11-22 Thread Kirill A. Shutemov
On x86_64, 'uname -m' returns 'x86'.  It breaks many userspace programs,
apt and rpm for example.

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/




Re: 2.6.24-rc3-mm1: usb mouse doesn't work

2007-11-22 Thread Kirill A. Shutemov
On [Wed, 21.11.2007 14:22], Andrew Morton wrote:
> On Wed, 21 Nov 2007 20:23:46 +0200
> "Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:
> 
> > USB mouse(Logitech M-BT58) doesn't work. TouchPad works.
> > dmesg after rmmod usbcore && modprobe uhci_hcd:
> > 
> > usbcore: registered new interface driver usbfs
> > usbcore: registered new interface driver hub
> > usbcore: registered new device driver usb
> > USB Universal Host Controller Interface driver v3.0
> > ACPI: PCI Interrupt :00:1d.0[A] -> Link [LNKE] -> GSI 10 (level, low)
> > -> IRQ 10
> > PCI: Setting latency timer of device :00:1d.0 to 64
> > uhci_hcd :00:1d.0: UHCI Host Controller
> > uhci_hcd :00:1d.0: new USB bus registered, assigned bus number 1
> > uhci_hcd :00:1d.0: irq 10, io base 0xbf80
> > usb usb1: configuration #1 chosen from 1 choice
> > hub 1-0:1.0: USB hub found
> > hub 1-0:1.0: 2 ports detected
> > usb usb1: new device found, idVendor=, idProduct=
> > usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
> > usb usb1: Product: UHCI Host Controller
> > usb usb1: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> > usb usb1: SerialNumber: :00:1d.0
> > ACPI: PCI Interrupt :00:1d.1[B] -> Link [LNKF] -> GSI 11 (level, low)
> > -> IRQ 11
> > PCI: Setting latency timer of device :00:1d.1 to 64
> > uhci_hcd :00:1d.1: UHCI Host Controller
> > uhci_hcd :00:1d.1: new USB bus registered, assigned bus number 2
> > uhci_hcd :00:1d.1: irq 11, io base 0xbf60
> > usb usb2: configuration #1 chosen from 1 choice
> > hub 2-0:1.0: USB hub found
> > hub 2-0:1.0: 2 ports detected
> > usb usb2: new device found, idVendor=, idProduct=
> > usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
> > usb usb2: Product: UHCI Host Controller
> > usb usb2: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> > usb usb2: SerialNumber: :00:1d.1
> > ACPI: PCI Interrupt :00:1d.2[C] -> Link [LNKG] -> GSI 9 (level, low)
> > -> IRQ 9
> > PCI: Setting latency timer of device :00:1d.2 to 64
> > uhci_hcd :00:1d.2: UHCI Host Controller
> > uhci_hcd :00:1d.2: new USB bus registered, assigned bus number 3
> > uhci_hcd :00:1d.2: irq 9, io base 0xbf40
> > usb usb3: configuration #1 chosen from 1 choice
> > hub 3-0:1.0: USB hub found
> > hub 3-0:1.0: 2 ports detected
> > usb usb3: new device found, idVendor=, idProduct=
> > usb usb3: new device strings: Mfr=3, Product=2, SerialNumber=1
> > usb usb3: Product: UHCI Host Controller
> > usb usb3: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> > usb usb3: SerialNumber: :00:1d.2
> > ACPI: PCI Interrupt :00:1d.3[D] -> Link [LNKH] -> GSI 7 (level, low)
> > -> IRQ 7
> > PCI: Setting latency timer of device :00:1d.3 to 64
> > uhci_hcd :00:1d.3: UHCI Host Controller
> > uhci_hcd :00:1d.3: new USB bus registered, assigned bus number 4
> > uhci_hcd :00:1d.3: irq 7, io base 0xbf20
> > usb usb4: configuration #1 chosen from 1 choice
> > hub 4-0:1.0: USB hub found
> > hub 4-0:1.0: 2 ports detected
> > usb usb4: new device found, idVendor=, idProduct=
> > usb usb4: new device strings: Mfr=3, Product=2, SerialNumber=1
> > usb usb4: Product: UHCI Host Controller
> > usb usb4: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> > usb usb4: SerialNumber: :00:1d.3
> > uhci_hcd :00:1d.3: FGR not stopped yet!
> > 
> 
> I've had some strangenesses with USB lately.  Sometimes running `lsusb'
> makes the USB system notice a newly attached device.

No. But I have new messages in dmesg:

uhci_hcd :00:1d.3: FGR not stopped yet!
uhci_hcd :00:1d.2: FGR not stopped yet!
uhci_hcd :00:1d.1: FGR not stopped yet!
uhci_hcd :00:1d.0: FGR not stopped yet!


> Is that "FGR not stopped yet!" messgae new behaviour?

It is a new message since 2.6.24-rc3. I have never tried the -mm tree before.

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/




Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-22 Thread Kirill A. Shutemov
On [Wed, 21.11.2007 20:33], Torsten Kaiser wrote:
> On Nov 21, 2007 10:29 AM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 21 Nov 2007 14:52:26 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> 
> > wrote:
> >
> > > Andrew Morton wrote:
> > > > On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> 
> > > > wrote:
> > > >> ACPI: Core revision 20070126
> > > >> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> > > >> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using 
> > > >> the 'noapic' kernel parameter
> 
> I seen an identical error.

This bug is also reproducible with qemu.

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/




Re: 2.6.24-rc3-mm1 (sync is slow ?)

2007-11-21 Thread KAMEZAWA Hiroyuki
On Wed, 21 Nov 2007 00:49:09 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Wed, 21 Nov 2007 17:42:15 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> 
> wrote:
> 
> > Hi, Andrew
> > 
> > I got following result in 'sync' command.
> > It was too slow. (memory controller config is off ;)
> > I attaches my .config.
> > ==
> > [2.6.24-rc3-mm1]
> > [EMAIL PROTECTED] ~]$ dd if=/dev/zero of=./tmpfile bs=4096 count=100000
> > 100000+0 records in
> > 100000+0 records out
> > 409600000 bytes (410 MB) copied, 1.46706 seconds, 279 MB/s
> > [EMAIL PROTECTED] ~]$ time sync
> > 
> > real    3m6.440s
> > user    0m0.000s
> > sys     0m0.133s

> Well I wonder how we did that.
> 
> It seems OK here from a quick test (i386, ext3-on-IDE).
> 
> Maybe device driver/block breakage?
> 

I confirmed that this slowdown is caused by git-scsi-misc.patch.
I'm sorry that I can't chase it further; I will be offline this weekend.

This is the scsi_mod information in /proc/modules:
=
scsi_mod 409416 8 mptctl,sg,lpfc,scsi_transport_fc,mptspi,mptscsih,scsi_transport_spi,sd_mod, Live 0xa00202818000
=

What other information should I provide?

Thanks,
-Kame




Re: 2.6.24-rc3-mm1- powerpc link failure

2007-11-21 Thread Stephen Rothwell
On Wed, 21 Nov 2007 13:36:30 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
>
> The kernel build fails on powerpc while linking,

Only for allyesconfig (or maybe some other config that builds a lot of
stuff in).

>   AS  .tmp_kallsyms3.o
>   LD  vmlinux.o
> ld: TOC section size exceeds 64k
> make: *** [vmlinux.o] Error 1
> 
> The patch posted at http://lkml.org/lkml/2007/11/13/414, solves this 
> failure.

However, that patch needs more testing especially to figure out what
performance effects it has.  i.e. not for merging, yet.

-- 
Cheers,
Stephen Rothwell[EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-21 Thread Andrew Morton
On Wed, 21 Nov 2007 22:45:22 +0100
Laurent Riffard <[EMAIL PROTECTED]> wrote:

> Le 21.11.2007 05:45, Andrew Morton a écrit :
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
> 
> Hello, 
> 
> My system hangs shortly after I logged in Gnome desktop. SysRq-W shows
> that a bunch of task are blocked in "D" state, they seem to wait for
> some I/O completion. I can try to hand-copy some data if requested.
> 
> I found these messages in dmesg:
> 
> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
> EXT3-fs: mounted filesystem with ordered data mode.
> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
> driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sda, sector 16460
> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
> ReiserFS: sda7: using ordered data mode
> --
> ReiserFS: sda7: Using r5 hash to sort names
> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sdb, sector 19632
> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
> driverbyte=DRIVER_OK,SUGGEST_OK
> end_request: I/O error, dev sdb, sector 40037363
> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 
> across:1048568k
> lp0: using parport0 (interrupt-driven).
> 
> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
> 
> Maybe something is broken in pata_via driver ?
> 

Could be - 
libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
and pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
touch pata_via.c.


Re: 2.6.24-rc3-mm1

2007-11-21 Thread Andrew Morton
On Wed, 21 Nov 2007 20:35:13 +0200
"Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:

> Symbol init_level4_pgt is needed by nvidia module. Is it really need to 
> unexport it?

It's our clever way of reducing the tester base so we don't get so many
bug reports.


Re: 2.6.24-rc3-mm1: usb mouse doesn't work

2007-11-21 Thread Andrew Morton
On Wed, 21 Nov 2007 20:23:46 +0200
"Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:

> USB mouse(Logitech M-BT58) doesn't work. TouchPad works.
> dmesg after rmmod usbcore && modprobe uhci_hcd:
> 
> usbcore: registered new interface driver usbfs
> usbcore: registered new interface driver hub
> usbcore: registered new device driver usb
> USB Universal Host Controller Interface driver v3.0
> ACPI: PCI Interrupt :00:1d.0[A] -> Link [LNKE] -> GSI 10 (level, low)
> -> IRQ 10
> PCI: Setting latency timer of device :00:1d.0 to 64
> uhci_hcd :00:1d.0: UHCI Host Controller
> uhci_hcd :00:1d.0: new USB bus registered, assigned bus number 1
> uhci_hcd :00:1d.0: irq 10, io base 0xbf80
> usb usb1: configuration #1 chosen from 1 choice
> hub 1-0:1.0: USB hub found
> hub 1-0:1.0: 2 ports detected
> usb usb1: new device found, idVendor=, idProduct=
> usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
> usb usb1: Product: UHCI Host Controller
> usb usb1: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> usb usb1: SerialNumber: :00:1d.0
> ACPI: PCI Interrupt :00:1d.1[B] -> Link [LNKF] -> GSI 11 (level, low)
> -> IRQ 11
> PCI: Setting latency timer of device :00:1d.1 to 64
> uhci_hcd :00:1d.1: UHCI Host Controller
> uhci_hcd :00:1d.1: new USB bus registered, assigned bus number 2
> uhci_hcd :00:1d.1: irq 11, io base 0xbf60
> usb usb2: configuration #1 chosen from 1 choice
> hub 2-0:1.0: USB hub found
> hub 2-0:1.0: 2 ports detected
> usb usb2: new device found, idVendor=, idProduct=
> usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
> usb usb2: Product: UHCI Host Controller
> usb usb2: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> usb usb2: SerialNumber: :00:1d.1
> ACPI: PCI Interrupt :00:1d.2[C] -> Link [LNKG] -> GSI 9 (level, low)
> -> IRQ 9
> PCI: Setting latency timer of device :00:1d.2 to 64
> uhci_hcd :00:1d.2: UHCI Host Controller
> uhci_hcd :00:1d.2: new USB bus registered, assigned bus number 3
> uhci_hcd :00:1d.2: irq 9, io base 0xbf40
> usb usb3: configuration #1 chosen from 1 choice
> hub 3-0:1.0: USB hub found
> hub 3-0:1.0: 2 ports detected
> usb usb3: new device found, idVendor=, idProduct=
> usb usb3: new device strings: Mfr=3, Product=2, SerialNumber=1
> usb usb3: Product: UHCI Host Controller
> usb usb3: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> usb usb3: SerialNumber: :00:1d.2
> ACPI: PCI Interrupt :00:1d.3[D] -> Link [LNKH] -> GSI 7 (level, low)
> -> IRQ 7
> PCI: Setting latency timer of device :00:1d.3 to 64
> uhci_hcd :00:1d.3: UHCI Host Controller
> uhci_hcd :00:1d.3: new USB bus registered, assigned bus number 4
> uhci_hcd :00:1d.3: irq 7, io base 0xbf20
> usb usb4: configuration #1 chosen from 1 choice
> hub 4-0:1.0: USB hub found
> hub 4-0:1.0: 2 ports detected
> usb usb4: new device found, idVendor=, idProduct=
> usb usb4: new device strings: Mfr=3, Product=2, SerialNumber=1
> usb usb4: Product: UHCI Host Controller
> usb usb4: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> usb usb4: SerialNumber: :00:1d.3
> uhci_hcd :00:1d.3: FGR not stopped yet!
> 

I've had some strangenesses with USB lately.  Sometimes running `lsusb'
makes the USB system notice a newly attached device.

Is that "FGR not stopped yet!" message new behaviour?


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-21 Thread Torsten Kaiser
On Nov 21, 2007 8:22 PM, Len Brown <[EMAIL PROTECTED]> wrote:
> On Wednesday 21 November 2007 01:18, Andrew Morton wrote:
> > On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> 
> > wrote:
>
> > > SMP alternatives: switching to UP code
> > > ACPI: Core revision 20070126
> > > ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>
> did previous kernels print this too?

Not since my last BIOS upgrade.

This is from the dmesg logs I still had lying around:
2.6.22-rc6-mm1: No
2.6.23-rc1-mm1, 2.6.23-rc2-mm1, 2.6.23-rc3-mm1: Yes
2.6.23-rc3-mm1 after BIOS upgrade: No
2.6.23-rc4-mm1...2.6.24-rc2-mm1: No

> > > Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> > > 'noapic' kernel parameter
> >
> > ACPI or x86 breakage, I guess.
>
> If you suspect ACPI breakage, then try "acpi=off" or "acpi=noirq".

ACPI doesn't look guilty.
acpi=noirq:
[   39.905884] Freeing SMP alternatives: 28k freed
[   39.910674] ACPI: Core revision 20070126
[   39.916542] ACPI: setting ELCR to 0e20 (from 0c20)
[   39.921855] ExtINT not setup in hardware but reported by MP table
[   39.928244] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[   39.934586] Kernel panic - not syncing: IO-APIC + timer doesn't
work! Try using the 'noapic' kernel parameter
[   39.934587]

acpi=off:
[0.00] Freeing SMP alternatives: 28k freed
[0.00] ExtINT not setup in hardware but reported by MP table
[0.00] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[0.00] Kernel panic - not syncing: IO-APIC + timer doesn't
work! Try using the 'noapic' kernel parameter
[0.00]

Torsten


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-21 Thread Torsten Kaiser
On Nov 21, 2007 10:29 AM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 21 Nov 2007 14:52:26 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
>
> > Andrew Morton wrote:
> > > On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> 
> > > wrote:
> > >> ACPI: Core revision 20070126
> > >> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> > >> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> > >> 'noapic' kernel parameter

I have seen an identical error.

> > > ACPI or x86 breakage, I guess.
> > >
> > > Did 'noapic' work?
> >
> > Passing noapic works,
>
> OK.

Not for me. I get a similar oops, but then the kernel panics

> >  but the kernel oops's
> >
> > [   97.161103] Unable to handle kernel NULL pointer dereference at 
> > 0009 RIP:
> > [   97.193973]  [] cpu_to_allnodes_group+0x69/0x7c
[snip]
> urgh, mess.  Enabling frame pointers might help here.

CONFIG_FRAME_POINTER=y

The oops/panic that happens with noapic:
[   35.866758] Initializing CPU#3
[   35.868769] Stuck ??
[   35.874043] Inquiring remote APIC #3...
[   35.877896] ... APIC #3 ID: 0300
[   35.881523] ... APIC #3 VERSION: 80050010
[   35.885587] ... APIC #3 SPIV: 01ff
[   35.889390] Brought up 1 CPUs
[   35.892375] Unable to handle kernel NULL pointer dereference at
0009 RIP:
[   35.897868]  [] cpu_to_allnodes_group+0x4b/0x60
[   35.906464] PGD 0
[   35.908523] Oops:  [1] SMP
[   35.911757] last sysfs file:
[   35.914740] CPU 0
[   35.916798] Modules linked in:
[   35.919990] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-mm1 #2
[   35.925914] RIP: 0010:[]  []
cpu_to_allnodes_group+0x4b/0x60
[   35.934734] RSP: :81011ff2bdb0  EFLAGS: 00010282
[   35.940053] RAX: 8084d870 RBX: 810001005810 RCX: 0004
[   35.947188] RDX: 0001 RSI: 81011ff26f68 RDI: 81011ff2bdb0
[   35.954323] RBP: 81011ff2bdd0 R08:  R09: 
[   35.961457] R10: 81007ff1c200 R11: 0200 R12: 810001005800
[   35.968592] R13:  R14:  R15: 
[   35.975727] FS:  () GS:807d4000()
knlGS:
[   35.983951] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[   35.989701] CR2: 0009 CR3: 00201000 CR4: 06a0
[   35.996836] DR0:  DR1:  DR2: 
[   36.003971] DR3:  DR6: 0ff0 DR7: 0400
[   36.011105] Process swapper (pid: 1, threadinfo 81011FF2A000,
task 81007FF2A000)
[   36.019191] Stack:   807e8f98
 810001005800
[   36.027373]  81011ff2be80 80230580 8084d640
8084d6e0
[   36.034922]  8084d780 8084d800 
81011ff26f68
[   36.042247] Call Trace:
[   36.044929]  [] build_sched_domains+0x460/0x820
[   36.051701]  [] mutex_lock_nested+0x199/0x2e0
[   36.057624]  [] arch_init_sched_domains+0x51/0x60
[   36.063895]  [] sched_init_smp+0x22/0xe0
[   36.069385]  [] smp_cpus_done+0x25/0x30
[   36.074791]  [] kernel_init+0x109/0x350
[   36.080196]  [] trace_hardirqs_on+0xbf/0x160
[   36.086032]  [] trace_hardirqs_on_thunk+0x35/0x3a
[   36.092303]  [] trace_hardirqs_on+0xbf/0x160
[   36.098141]  [] child_rip+0xa/0x12
[   36.103113]  [] restore_args+0x0/0x30
[   36.108345]  [] kernel_init+0x0/0x350
[   36.113750]  [] child_rip+0x0/0x12
[   36.118722]
[   36.120236] INFO: lockdep is turned off.
[   36.124170]
[   36.124170] Code: 48 03 42 08 48 89 03 48 83 c4 18 89 c8 5b c9 c3
0f 1f 44 00
[   36.133640] RIP  [] cpu_to_allnodes_group+0x4b/0x60
[   36.140116]  RSP 
[   36.143619] CR2: 0009
[   36.146952] Kernel panic - not syncing: Attempted to kill init!

(gdb) list *0x8022fc5b
0x8022fc5b is in cpu_to_allnodes_group (kernel/sched.c:6073).
6068
6069		cpus_and(nodemask, nodemask, *cpu_map);
6070		group = first_cpu(nodemask);
6071
6072		if (sg)
6073			*sg = &per_cpu(sched_group_allnodes, group);
6074		return group;
6075	}
6076
6077	static void init_numa_sched_groups_power(struct sched_group *group_head)

Hope this stack trace is better.
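
For anyone wanting to repeat that lookup without an interactive gdb session, a minimal sketch follows. It assumes a vmlinux built with debug info is available; the function name resolve_addr and the GDB override variable are made up for illustration, while -batch and -ex are standard gdb options.

```shell
#!/bin/sh
# Sketch: resolve a code address from an oops to a source line.
# resolve_addr and the GDB variable are illustrative names, not kernel tooling.
resolve_addr() {
    vmlinux="$1"   # kernel image with debug info (assumed to exist)
    addr="$2"      # address as printed in the oops, e.g. the RIP value
    # -batch makes gdb exit after running the commands given with -ex
    "${GDB:-gdb}" -batch -ex "list *$addr" "$vmlinux"
}
```

Called as `resolve_addr ./vmlinux <address>`, this should print the source lines around the faulting instruction, much like the listing above.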

Torsten
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-21 Thread Len Brown
On Wednesday 21 November 2007 01:18, Andrew Morton wrote:
> On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> > SMP alternatives: switching to UP code
> > ACPI: Core revision 20070126
> > ..MP-BIOS bug: 8254 timer not connected to IO-APIC

did previous kernels print this too?

> > Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> > 'noapic' kernel parameter
> 
> ACPI or x86 breakage, I guess.

If you suspect ACPI breakage, then try "acpi=off" or "acpi=noirq".

thanks,
-Len


Re: 2.6.24-rc3-mm1

2007-11-21 Thread Kirill A. Shutemov
On [Tue, 20.11.2007 22:15], Andrew Morton wrote:
> On Wed, 21 Nov 2007 14:03:34 +0800 "Dave Young" <[EMAIL PROTECTED]> wrote:
> 
> > On Nov 21, 2007 2:00 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > >
> > > On Wed, 21 Nov 2007 13:51:47 +0800 "Dave Young" <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi, andrew
> > > >
> > > > modpost failed for me:
> > > >   MODPOST 360 modules
> > > > ERROR: "empty_zero_page" [drivers/kvm/kvm.ko] undefined!
> > > > make[1]: *** [__modpost] Error 1
> > > > make: *** [modules] Error 2
> > > >
> > >
> > > You're a victim of the hasty unexporting fad.  Which architecture?
> > > x86_64 I guess?
> > >
> > Hi,
> > ia32 instead.
> > 
> 
> oic.  Like this, I guess.
> 
> --- a/arch/x86/kernel/i386_ksyms_32.c~git-x86-i386-export-empty_zero_page
> +++ a/arch/x86/kernel/i386_ksyms_32.c
> @@ -2,6 +2,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  EXPORT_SYMBOL(__down_failed);
>  EXPORT_SYMBOL(__down_failed_interruptible);
> @@ -22,3 +23,4 @@ EXPORT_SYMBOL(__put_user_8);
>  EXPORT_SYMBOL(strstr);
>  
>  EXPORT_SYMBOL(csum_partial);
> +EXPORT_SYMBOL(empty_zero_page);
> _

The init_level4_pgt symbol is needed by the nvidia module. Is it really
necessary to unexport it?

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/




Re: 2.6.24-rc3-mm1: usb mouse doesn't work

2007-11-21 Thread Kirill A. Shutemov
The USB mouse (Logitech M-BT58) doesn't work. The touchpad works.
dmesg after rmmod usbcore && modprobe uhci_hcd:

usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt :00:1d.0[A] -> Link [LNKE] -> GSI 10 (level, low)
-> IRQ 10
PCI: Setting latency timer of device :00:1d.0 to 64
uhci_hcd :00:1d.0: UHCI Host Controller
uhci_hcd :00:1d.0: new USB bus registered, assigned bus number 1
uhci_hcd :00:1d.0: irq 10, io base 0xbf80
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
usb usb1: new device found, idVendor=, idProduct=
usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: UHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
usb usb1: SerialNumber: :00:1d.0
ACPI: PCI Interrupt :00:1d.1[B] -> Link [LNKF] -> GSI 11 (level, low)
-> IRQ 11
PCI: Setting latency timer of device :00:1d.1 to 64
uhci_hcd :00:1d.1: UHCI Host Controller
uhci_hcd :00:1d.1: new USB bus registered, assigned bus number 2
uhci_hcd :00:1d.1: irq 11, io base 0xbf60
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
usb usb2: new device found, idVendor=, idProduct=
usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: UHCI Host Controller
usb usb2: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
usb usb2: SerialNumber: :00:1d.1
ACPI: PCI Interrupt :00:1d.2[C] -> Link [LNKG] -> GSI 9 (level, low)
-> IRQ 9
PCI: Setting latency timer of device :00:1d.2 to 64
uhci_hcd :00:1d.2: UHCI Host Controller
uhci_hcd :00:1d.2: new USB bus registered, assigned bus number 3
uhci_hcd :00:1d.2: irq 9, io base 0xbf40
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
usb usb3: new device found, idVendor=, idProduct=
usb usb3: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb3: Product: UHCI Host Controller
usb usb3: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
usb usb3: SerialNumber: :00:1d.2
ACPI: PCI Interrupt :00:1d.3[D] -> Link [LNKH] -> GSI 7 (level, low)
-> IRQ 7
PCI: Setting latency timer of device :00:1d.3 to 64
uhci_hcd :00:1d.3: UHCI Host Controller
uhci_hcd :00:1d.3: new USB bus registered, assigned bus number 4
uhci_hcd :00:1d.3: irq 7, io base 0xbf20
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
usb usb4: new device found, idVendor=, idProduct=
usb usb4: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb4: Product: UHCI Host Controller
usb usb4: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
usb usb4: SerialNumber: :00:1d.3
uhci_hcd :00:1d.3: FGR not stopped yet!

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/




Re: 2.6.24-rc3-mm1

2007-11-21 Thread Rene Herman

On 21-11-07 07:08, Andrew Morton wrote:

>> fix (for my config ?) is attached.
>>
>> =
>> This was necessary to build.
>>
>> Signed-off-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]>
>>
>>  arch/ia64/lib/Makefile |2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> Index: linux-2.6.24-rc3-mm1/arch/ia64/lib/Makefile
>> ===
>> --- linux-2.6.24-rc3-mm1.orig/arch/ia64/lib/Makefile
>> +++ linux-2.6.24-rc3-mm1/arch/ia64/lib/Makefile
>> @@ -2,7 +2,7 @@
>>  # Makefile for ia64-specific library routines..
>>  #
>>  
>> -obj-y := io.o copy_page-export.o
>> +obj-y := io.o
>>  
>>  lib-y := __divsi3.o __udivsi3.o __modsi3.o __umodsi3.o			\
>> 	__divdi3.o __udivdi3.o __moddi3.o __umoddi3.o   \
>
> erp.  Actually, it should be this:
>
> --- a/arch/ia64/lib/Makefile~ia64-export-copy_page-to-modules-fix-fix
> +++ a/arch/ia64/lib/Makefile
> @@ -2,7 +2,7 @@
>  # Makefile for ia64-specific library routines..
>  #
>  
> -obj-y := io.o copy_page-export.o
> +obj-y := io.o
>  
>  lib-y := __divsi3.o __udivsi3.o __modsi3.o __umodsi3.o			\
> 	__divdi3.o __udivdi3.o __moddi3.o __umoddi3.o   \


Devil's in the invisible details? :-?

Rene.



Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Robert P. J. Day
On Wed, 21 Nov 2007, Avi Kivity wrote:

> headers_check continues to complain.  Is the only recourse to add
> asm/kvm.h for all archs?

that's what's happened with other header files.  see asm-*/auxvec.h,
for example.

rday
--

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Avi Kivity

Avi Kivity wrote:
>>>>>> The make headers_check fails,
>>>>>>
>>>>>>  CHECK   include/linux/usb/gadgetfs.h
>>>>>>  CHECK   include/linux/usb/ch9.h
>>>>>>  CHECK   include/linux/usb/cdc.h
>>>>>>  CHECK   include/linux/usb/audio.h
>>>>>>  CHECK   include/linux/kvm.h
>>>>>> /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires
>>>>>> asm/kvm.h, which does not exist in exported headers
>>>>>
>>>>> hm, works for me, on i386 and x86_64.  What's different over there?
>>>>
>>>> Hi Andrew,
>>>>
>>>> It fails on the powerpc box, with allyesconfig option.
>>>
>>> How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.
>>
>> Is kvm x86 specific? Then move the .h file to asm-x86.
>> Otherwise no good idea...
>
> kvm.h is x86 specific today, but will be s390, ppc, ia64, and x86
> specific tomorrow.
>
> What about having an asm-generic/kvm.h with a nice #error?  Would
> that suit?

headers_check continues to complain.  Is the only recourse to add
asm/kvm.h for all archs?


--
error compiling committee.c: too many arguments to function



Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Avi Kivity

Sam Ravnborg wrote:
> On Wed, Nov 21, 2007 at 10:44:40AM +0200, Avi Kivity wrote:
>> Kamalesh Babulal wrote:
>>> Andrew Morton wrote:
>>>> On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal
>>>> <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> The make headers_check fails,
>>>>>
>>>>>  CHECK   include/linux/usb/gadgetfs.h
>>>>>  CHECK   include/linux/usb/ch9.h
>>>>>  CHECK   include/linux/usb/cdc.h
>>>>>  CHECK   include/linux/usb/audio.h
>>>>>  CHECK   include/linux/kvm.h
>>>>> /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires
>>>>> asm/kvm.h, which does not exist in exported headers
>>>>
>>>> hm, works for me, on i386 and x86_64.  What's different over there?
>>>
>>> Hi Andrew,
>>>
>>> It fails on the powerpc box, with allyesconfig option.
>>
>> How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.
>
> Is kvm x86 specific? Then move the .h file to asm-x86.
> Otherwise no good idea...

kvm.h is x86 specific today, but will be s390, ppc, ia64, and x86
specific tomorrow.

What about having an asm-generic/kvm.h with a nice #error?  Would that
suit?



--
error compiling committee.c: too many arguments to function



Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Sam Ravnborg
On Wed, Nov 21, 2007 at 10:44:40AM +0200, Avi Kivity wrote:
> Kamalesh Babulal wrote:
> >Andrew Morton wrote:
> >  
> >>On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal 
> >><[EMAIL PROTECTED]> wrote:
> >>
> >>
> >>>The make headers_check fails,
> >>>
> >>>  CHECK   include/linux/usb/gadgetfs.h
> >>>  CHECK   include/linux/usb/ch9.h
> >>>  CHECK   include/linux/usb/cdc.h
> >>>  CHECK   include/linux/usb/audio.h
> >>>  CHECK   include/linux/kvm.h
> >>>/root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires 
> >>>asm/kvm.h, which does not exist in exported headers
> >>>  
> >>hm, works for me, on i386 and x86_64.  What's different over there?
> >>
> >Hi Andrew,
> >
> >It fails on the powerpc box, with allyesconfig option.
> >
> >  
> 
> How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.

Is kvm x86 specific? Then move the .h file to asm-x86.
Otherwise no good idea...

Sam


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-21 Thread Kamalesh Babulal
Andrew Morton wrote:
> On Wed, 21 Nov 2007 14:52:26 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
> 
>> Andrew Morton wrote:
>>> On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> 
>>> wrote:
>>>
>>>> Hi Andrew,
>>>>
>>>> Kernel panic's across different architectures like powerpc, x86_64, 
>>> powerpc complains about IO-APICs??
>>>
>>>> Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
>>>> Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
>>>> Mount-cache hash table entries: 256
>>>> SMP alternatives: switching to UP code
>>>> ACPI: Core revision 20070126
>>>> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>>>> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
>>>> 'noapic' kernel parameter
>>> ACPI or x86 breakage, I guess.
>>>
>>> Did 'noapic' work?
>> Hi Andrew,
>>
>> Passing noapic works,
> 
> OK.
> 
>>  but the kernel oops's 
>>
>> [   97.161103] Unable to handle kernel NULL pointer dereference at 
>> 0009 RIP:
>> [   97.193973]  [] cpu_to_allnodes_group+0x69/0x7c
>> [   97.245359] PGD 0
>> [   97.257611] Oops:  [1] SMP
>> [   97.276638] last sysfs file:
>> [   97.294417] CPU 0
>> [   97.306620] Modules linked in:
>> [   97.325066] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-mm1 #1
>> [   97.360514] RIP: 0010:[]  [] 
>> cpu_to_allnodes_group+0x69/0x7c
>> [   97.413287] RSP: :81012fabb650  EFLAGS: 00010286
>> [   97.445363] RAX: 809bb060 RBX: 81012fabb650 RCX: 
>> 00ff
>> [   97.488378] RDX: 0001 RSI: 013e RDI: 
>> 0100
>> [   97.531413] RBP: 81012fabb680 R08: 81012fa88180 R09: 
>> 
>> [   97.574428] R10:  R11:  R12: 
>> 810001005f50
>> [   97.617394] R13:  R14: 81012fa88180 R15: 
>> 810001005f40
>> [   97.660421] FS:  () GS:806c3000() 
>> knlGS:
>> [   97.709327] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
>> [   97.743995] CR2: 0009 CR3: 00201000 CR4: 
>> 06a0
>> [   97.787021] DR0:  DR1:  DR2: 
>> 
>> [   97.830053] DR3:  DR6: 0ff0 DR7: 
>> 0400
>> [   97.873036] Process swapper (pid: 1, threadinfo 81012FABA000, task 
>> 81012FAB8040)
>> [   97.921993] Stack:     
>> 
>> [   97.971056]  810001005f40 81012fabb700 81012fabbdf0 
>> 80235487
>> [   98.016420]     
>> 
>> [   98.060324] Call Trace:
>> [   98.076657]  [] build_sched_domains+0x1e1/0xc19
>> [   98.113383]  [] __kernel_text_address+0x22/0x30
>> [   98.150173]  [] check_chain_key+0x9c/0x15f
>> [   98.184355]  [] mark_lock+0x3b/0x5b3
>> [   98.215406]  [] mark_held_locks+0x4a/0x6a
>> [   98.249027]  [] get_page_from_freelist+0x42a/0x77d
>> [   98.287362]  [] trace_hardirqs_on+0x198/0x1c3
>> [   98.323123]  [] get_page_from_freelist+0x75a/0x77d
>> [   98.361429]  [] mark_lock+0x3b/0x5b3
>> [   98.392427]  [] check_chain_key+0x9c/0x15f
>> [   98.426621]  [] number+0x115/0x21f
>> [   98.456594]  [] __kernel_text_address+0x22/0x30
>> [   98.493362]  [] dump_trace+0x248/0x25d
>> [   98.525493]  [] check_chain_key+0x9c/0x15f
>> [   98.559678]  [] __lock_acquire+0xdee/0xf06
>> [   98.593868]  [] check_chain_key+0x9c/0x15f
>> [   98.628038]  [] check_chain_key+0x9c/0x15f
>> [   98.662225]  [] check_chain_key+0x9c/0x15f
>> [   98.696370]  [] __lock_acquire+0xdee/0xf06
>> [   98.730563]  [] check_chain_key+0x9c/0x15f
>> [   98.764689]  [] mark_lock+0x3b/0x5b3
>> [   98.795767]  [] mark_held_locks+0x4a/0x6a
>> [   98.829432]  [] number+0x115/0x21f
>> [   98.859460]  [] kprobe_flush_task+0x63/0xa9
>> [   98.894166]  [] vsnprintf+0x58f/0x5d5
>> [   98.925739]  [] sprintf+0x68/0x6a
>> [   98.955257]  [] lock_acquire+0x72/0xe0
>> [   98.987363]  [] lock_acquired+0x57/0x1d4
>> [   99.020446]  [] lock_release+0x67/0x21a
>> [   99.053079]  [] check_chain_key+0x9c/0x15f
>> [   99.087261]  [] mark_lock+0x3b/0x5b3
>> [   99.118328]  [] mark_lock+0x3b/0x5b3
>> [   99.149394]  [] arch_init_sched_domains+0x27/0x69
>> [   99.187217]  [] dbg_redzone2+0x2a/0x52
>> [   99.219320]  [] cache_alloc_debugcheck_after+0x16e/0x1cb
>> [   99.260779]  [] kmem_cache_alloc+0x15e/0x182
>> [   99.295944]  [] arch_init_sched_domains+0x5c/0x69
>> [   99.333768]  [] sched_init_smp+0x27/0x113
>> [   99.367400]  [] __bitmap_weight+0x78/0x8d
>> [   99.401090]  [] kernel_init+0x12d/0x315
>> [   99.433718]  [] _spin_unlock_irq+0x2b/0x30
>> [   99.467842]  [] trace_hardirqs_on+0x198/0x1c3
>> [   99.503534]  [] trace_hardirqs_on+0x198/0x1c3
>> [   99.539251]  [] child_rip+0xa/0x12
>> [   99.569234]  [] restore_args+0x0/0x30
>> [   99.600845]  [] kernel_init+0x0/0x315
>> [   99.632426]  [] child_rip+0x0/0x12
>> [   9

Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-21 Thread Andrew Morton
On Wed, 21 Nov 2007 14:52:26 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> > On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> 
> > wrote:
> > 
> >> Hi Andrew,
> >>
> >> Kernel panic's across different architectures like powerpc, x86_64, 
> > 
> > powerpc complains about IO-APICs??
> > 
> >> Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
> >> Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
> >> Mount-cache hash table entries: 256
> >> SMP alternatives: switching to UP code
> >> ACPI: Core revision 20070126
> >> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> >> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> >> 'noapic' kernel parameter
> > 
> > ACPI or x86 breakage, I guess.
> > 
> > Did 'noapic' work?
> Hi Andrew,
> 
> Passing noapic works,

OK.

>  but the kernel oops's 
> 
> [   97.161103] Unable to handle kernel NULL pointer dereference at 
> 0009 RIP:
> [   97.193973]  [] cpu_to_allnodes_group+0x69/0x7c
> [   97.245359] PGD 0
> [   97.257611] Oops:  [1] SMP
> [   97.276638] last sysfs file:
> [   97.294417] CPU 0
> [   97.306620] Modules linked in:
> [   97.325066] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-mm1 #1
> [   97.360514] RIP: 0010:[]  [] 
> cpu_to_allnodes_group+0x69/0x7c
> [   97.413287] RSP: :81012fabb650  EFLAGS: 00010286
> [   97.445363] RAX: 809bb060 RBX: 81012fabb650 RCX: 
> 00ff
> [   97.488378] RDX: 0001 RSI: 013e RDI: 
> 0100
> [   97.531413] RBP: 81012fabb680 R08: 81012fa88180 R09: 
> 
> [   97.574428] R10:  R11:  R12: 
> 810001005f50
> [   97.617394] R13:  R14: 81012fa88180 R15: 
> 810001005f40
> [   97.660421] FS:  () GS:806c3000() 
> knlGS:
> [   97.709327] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
> [   97.743995] CR2: 0009 CR3: 00201000 CR4: 
> 06a0
> [   97.787021] DR0:  DR1:  DR2: 
> 
> [   97.830053] DR3:  DR6: 0ff0 DR7: 
> 0400
> [   97.873036] Process swapper (pid: 1, threadinfo 81012FABA000, task 
> 81012FAB8040)
> [   97.921993] Stack:     
> 
> [   97.971056]  810001005f40 81012fabb700 81012fabbdf0 
> 80235487
> [   98.016420]     
> 
> [   98.060324] Call Trace:
> [   98.076657]  [] build_sched_domains+0x1e1/0xc19
> [   98.113383]  [] __kernel_text_address+0x22/0x30
> [   98.150173]  [] check_chain_key+0x9c/0x15f
> [   98.184355]  [] mark_lock+0x3b/0x5b3
> [   98.215406]  [] mark_held_locks+0x4a/0x6a
> [   98.249027]  [] get_page_from_freelist+0x42a/0x77d
> [   98.287362]  [] trace_hardirqs_on+0x198/0x1c3
> [   98.323123]  [] get_page_from_freelist+0x75a/0x77d
> [   98.361429]  [] mark_lock+0x3b/0x5b3
> [   98.392427]  [] check_chain_key+0x9c/0x15f
> [   98.426621]  [] number+0x115/0x21f
> [   98.456594]  [] __kernel_text_address+0x22/0x30
> [   98.493362]  [] dump_trace+0x248/0x25d
> [   98.525493]  [] check_chain_key+0x9c/0x15f
> [   98.559678]  [] __lock_acquire+0xdee/0xf06
> [   98.593868]  [] check_chain_key+0x9c/0x15f
> [   98.628038]  [] check_chain_key+0x9c/0x15f
> [   98.662225]  [] check_chain_key+0x9c/0x15f
> [   98.696370]  [] __lock_acquire+0xdee/0xf06
> [   98.730563]  [] check_chain_key+0x9c/0x15f
> [   98.764689]  [] mark_lock+0x3b/0x5b3
> [   98.795767]  [] mark_held_locks+0x4a/0x6a
> [   98.829432]  [] number+0x115/0x21f
> [   98.859460]  [] kprobe_flush_task+0x63/0xa9
> [   98.894166]  [] vsnprintf+0x58f/0x5d5
> [   98.925739]  [] sprintf+0x68/0x6a
> [   98.955257]  [] lock_acquire+0x72/0xe0
> [   98.987363]  [] lock_acquired+0x57/0x1d4
> [   99.020446]  [] lock_release+0x67/0x21a
> [   99.053079]  [] check_chain_key+0x9c/0x15f
> [   99.087261]  [] mark_lock+0x3b/0x5b3
> [   99.118328]  [] mark_lock+0x3b/0x5b3
> [   99.149394]  [] arch_init_sched_domains+0x27/0x69
> [   99.187217]  [] dbg_redzone2+0x2a/0x52
> [   99.219320]  [] cache_alloc_debugcheck_after+0x16e/0x1cb
> [   99.260779]  [] kmem_cache_alloc+0x15e/0x182
> [   99.295944]  [] arch_init_sched_domains+0x5c/0x69
> [   99.333768]  [] sched_init_smp+0x27/0x113
> [   99.367400]  [] __bitmap_weight+0x78/0x8d
> [   99.401090]  [] kernel_init+0x12d/0x315
> [   99.433718]  [] _spin_unlock_irq+0x2b/0x30
> [   99.467842]  [] trace_hardirqs_on+0x198/0x1c3
> [   99.503534]  [] trace_hardirqs_on+0x198/0x1c3
> [   99.539251]  [] child_rip+0xa/0x12
> [   99.569234]  [] restore_args+0x0/0x30
> [   99.600845]  [] kernel_init+0x0/0x315
> [   99.632426]  [] child_rip+0x0/0x12
> [   99.662455]
> [   99.671637] INFO: lockdep is turned off.
> [   99.695385]
> [   99.695385] Code: 48 03 42 08 49

Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-21 Thread Kamalesh Babulal
Andrew Morton wrote:
> On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
> 
>> Hi Andrew,
>>
>> Kernel panic's across different architectures like powerpc, x86_64, 
> 
> powerpc complains about IO-APICs??
> 
>> Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
>> Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
>> Mount-cache hash table entries: 256
>> SMP alternatives: switching to UP code
>> ACPI: Core revision 20070126
>> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
>> 'noapic' kernel parameter
> 
> ACPI or x86 breakage, I guess.
> 
> Did 'noapic' work?
Hi Andrew,

Passing noapic works, but the kernel then oopses:

[   97.161103] Unable to handle kernel NULL pointer dereference at 
0009 RIP:
[   97.193973]  [] cpu_to_allnodes_group+0x69/0x7c
[   97.245359] PGD 0
[   97.257611] Oops:  [1] SMP
[   97.276638] last sysfs file:
[   97.294417] CPU 0
[   97.306620] Modules linked in:
[   97.325066] Pid: 1, comm: swapper Not tainted 2.6.24-rc3-mm1 #1
[   97.360514] RIP: 0010:[]  [] 
cpu_to_allnodes_group+0x69/0x7c
[   97.413287] RSP: :81012fabb650  EFLAGS: 00010286
[   97.445363] RAX: 809bb060 RBX: 81012fabb650 RCX: 00ff
[   97.488378] RDX: 0001 RSI: 013e RDI: 0100
[   97.531413] RBP: 81012fabb680 R08: 81012fa88180 R09: 
[   97.574428] R10:  R11:  R12: 810001005f50
[   97.617394] R13:  R14: 81012fa88180 R15: 810001005f40
[   97.660421] FS:  () GS:806c3000() 
knlGS:
[   97.709327] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[   97.743995] CR2: 0009 CR3: 00201000 CR4: 06a0
[   97.787021] DR0:  DR1:  DR2: 
[   97.830053] DR3:  DR6: 0ff0 DR7: 0400
[   97.873036] Process swapper (pid: 1, threadinfo 81012FABA000, task 
81012FAB8040)
[   97.921993] Stack:     

[   97.971056]  810001005f40 81012fabb700 81012fabbdf0 
80235487
[   98.016420]     

[   98.060324] Call Trace:
[   98.076657]  [] build_sched_domains+0x1e1/0xc19
[   98.113383]  [] __kernel_text_address+0x22/0x30
[   98.150173]  [] check_chain_key+0x9c/0x15f
[   98.184355]  [] mark_lock+0x3b/0x5b3
[   98.215406]  [] mark_held_locks+0x4a/0x6a
[   98.249027]  [] get_page_from_freelist+0x42a/0x77d
[   98.287362]  [] trace_hardirqs_on+0x198/0x1c3
[   98.323123]  [] get_page_from_freelist+0x75a/0x77d
[   98.361429]  [] mark_lock+0x3b/0x5b3
[   98.392427]  [] check_chain_key+0x9c/0x15f
[   98.426621]  [] number+0x115/0x21f
[   98.456594]  [] __kernel_text_address+0x22/0x30
[   98.493362]  [] dump_trace+0x248/0x25d
[   98.525493]  [] check_chain_key+0x9c/0x15f
[   98.559678]  [] __lock_acquire+0xdee/0xf06
[   98.593868]  [] check_chain_key+0x9c/0x15f
[   98.628038]  [] check_chain_key+0x9c/0x15f
[   98.662225]  [] check_chain_key+0x9c/0x15f
[   98.696370]  [] __lock_acquire+0xdee/0xf06
[   98.730563]  [] check_chain_key+0x9c/0x15f
[   98.764689]  [] mark_lock+0x3b/0x5b3
[   98.795767]  [] mark_held_locks+0x4a/0x6a
[   98.829432]  [] number+0x115/0x21f
[   98.859460]  [] kprobe_flush_task+0x63/0xa9
[   98.894166]  [] vsnprintf+0x58f/0x5d5
[   98.925739]  [] sprintf+0x68/0x6a
[   98.955257]  [] lock_acquire+0x72/0xe0
[   98.987363]  [] lock_acquired+0x57/0x1d4
[   99.020446]  [] lock_release+0x67/0x21a
[   99.053079]  [] check_chain_key+0x9c/0x15f
[   99.087261]  [] mark_lock+0x3b/0x5b3
[   99.118328]  [] mark_lock+0x3b/0x5b3
[   99.149394]  [] arch_init_sched_domains+0x27/0x69
[   99.187217]  [] dbg_redzone2+0x2a/0x52
[   99.219320]  [] cache_alloc_debugcheck_after+0x16e/0x1cb
[   99.260779]  [] kmem_cache_alloc+0x15e/0x182
[   99.295944]  [] arch_init_sched_domains+0x5c/0x69
[   99.333768]  [] sched_init_smp+0x27/0x113
[   99.367400]  [] __bitmap_weight+0x78/0x8d
[   99.401090]  [] kernel_init+0x12d/0x315
[   99.433718]  [] _spin_unlock_irq+0x2b/0x30
[   99.467842]  [] trace_hardirqs_on+0x198/0x1c3
[   99.503534]  [] trace_hardirqs_on+0x198/0x1c3
[   99.539251]  [] child_rip+0xa/0x12
[   99.569234]  [] restore_args+0x0/0x30
[   99.600845]  [] kernel_init+0x0/0x315
[   99.632426]  [] child_rip+0x0/0x12
[   99.662455]
[   99.671637] INFO: lockdep is turned off.
[   99.695385]
[   99.695385] Code: 48 03 42 08 49 89 04 24 48 83 c4 20 89 c8 5b 41 5c c9 c3 55
[   99.750603] RIP  [] cpu_to_allnodes_group+0x69/0x7c
[   99.789632]  RSP 

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Robert P. J. Day
On Wed, 21 Nov 2007, Andrew Morton wrote:

> On Wed, 21 Nov 2007 03:52:08 -0500 (EST) "Robert P. J. Day" <[EMAIL 
> PROTECTED]> wrote:
... snip ...
> > i'm sure i'm going to humiliate myself for asking this, but shouldn't
> > i be able to reproduce the above by just running:
> >
> >   $ make ARCH=powerpc headers_install/headers_check
> >
> > we've sort of had this discussion before where, IIRC, you should be
> > able to generate the appropriate arch-specific headers without having
> > the corresponding toolchain, no?  so why can't i reproduce that error
> > on my x86 box?
> >
>
> I can.
>
> setenv ARCH powerpc
> make mrproper
> make allmodconfig
> make headers_check

ack.  never mind, i just noticed that this is with the rc3-mm1 tree.
i was confused since, in the latest git tree, there is absolutely *no*
inclusion of  anywhere in the tree, so clearly something
like that has been added in the mm tree.

sorry for the noise.

rday
--


Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Andrew Morton
On Wed, 21 Nov 2007 03:52:08 -0500 (EST) "Robert P. J. Day" <[EMAIL PROTECTED]> 
wrote:

> On Wed, 21 Nov 2007, Avi Kivity wrote:
> 
> > Kamalesh Babulal wrote:
> > > Andrew Morton wrote:
> > >
> > > > On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal
> > > > <[EMAIL PROTECTED]> wrote:
> > > >
> > > >
> > > > > The make headers_check fails,
> > > > >
> > > > >   CHECK   include/linux/usb/gadgetfs.h
> > > > >   CHECK   include/linux/usb/ch9.h
> > > > >   CHECK   include/linux/usb/cdc.h
> > > > >   CHECK   include/linux/usb/audio.h
> > > > >   CHECK   include/linux/kvm.h
> > > > > /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires
> > > > > asm/kvm.h, which does not exist in exported headers
> > > > >
> > > > hm, works for me, on i386 and x86_64.  What's different over there?
> > > >
> > > Hi Andrew,
> > >
> > > It fails on the powerpc box, with allyesconfig option.
> >
> > How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.
> 
> i'm sure i'm going to humiliate myself for asking this, but shouldn't
> i be able to reproduce the above by just running:
> 
>   $ make ARCH=powerpc headers_install/headers_check
> 
> we've sort of had this discussion before where, IIRC, you should be
> able to generate the appropriate arch-specific headers without having
> the corresponding toolchain, no?  so why can't i reproduce that error
> on my x86 box?
> 

I can.

setenv ARCH powerpc
make mrproper
make allmodconfig
make headers_check
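
For Bourne-style shells, where setenv is not available, the same sequence can be sketched as below. This is only an illustration: the function name check_headers, the DRY_RUN switch and the source-tree argument are assumptions, not part of kbuild. Passing ARCH on the make command line is intended to have the same effect as exporting it in the environment.

```shell
#!/bin/sh
# Sketch: run the exported-header check for another architecture.
# check_headers and DRY_RUN are illustrative names, not kbuild features.
check_headers() {
    arch="$1"          # e.g. powerpc
    ksrc="${2:-.}"     # kernel source tree, defaults to the current directory
    for target in mrproper allmodconfig headers_check; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            # Only show what would be run
            echo "make -C $ksrc ARCH=$arch $target"
        else
            make -C "$ksrc" ARCH="$arch" "$target" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 check_headers powerpc` first shows the three make invocations without touching the tree; note that mrproper removes the existing configuration when run for real.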


Re: 2.6.24-rc3-mm1- powerpc link failure

2007-11-21 Thread Kamalesh Babulal
Hi Andrew,

The kernel build fails on powerpc while linking,

  AS  .tmp_kallsyms3.o
  LD  vmlinux.o
ld: TOC section size exceeds 64k
make: *** [vmlinux.o] Error 1

The patch posted at http://lkml.org/lkml/2007/11/13/414, solves this 
failure.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.


Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Robert P. J. Day
On Wed, 21 Nov 2007, Avi Kivity wrote:

> Kamalesh Babulal wrote:
> > Andrew Morton wrote:
> >
> > > On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal
> > > <[EMAIL PROTECTED]> wrote:
> > >
> > >
> > > > The make headers_check fails,
> > > >
> > > >   CHECK   include/linux/usb/gadgetfs.h
> > > >   CHECK   include/linux/usb/ch9.h
> > > >   CHECK   include/linux/usb/cdc.h
> > > >   CHECK   include/linux/usb/audio.h
> > > >   CHECK   include/linux/kvm.h
> > > > /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires
> > > > asm/kvm.h, which does not exist in exported headers
> > > >
> > > hm, works for me, on i386 and x86_64.  What's different over there?
> > >
> > Hi Andrew,
> >
> > It fails on the powerpc box, with allyesconfig option.
>
> How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.

i'm sure i'm going to humiliate myself for asking this, but shouldn't
i be able to reproduce the above by just running:

  $ make ARCH=powerpc headers_install/headers_check

we've sort of had this discussion before where, IIRC, you should be
able to generate the appropriate arch-specific headers without having
the corresponding toolchain, no?  so why can't i reproduce that error
on my x86 box?

rday
--


Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca



Re: 2.6.24-rc3-mm1 (sync is slow ?)

2007-11-21 Thread Andrew Morton
On Wed, 21 Nov 2007 17:42:15 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:

> Hi, Andrew
> 
> I got the following result from the 'sync' command.
> It was too slow. (memory controller config is off ;)
> I attach my .config.
> ==
> [2.6.24-rc3-mm1]
> [EMAIL PROTECTED] ~]$ dd if=/dev/zero of=./tmpfile bs=4096 count=100000
> 100000+0 records in
> 100000+0 records out
> 409600000 bytes (410 MB) copied, 1.46706 seconds, 279 MB/s
> [EMAIL PROTECTED] ~]$ time sync
> 
> real    3m6.440s
> user    0m0.000s
> sys     0m0.133s
> 
> On 2.6.24-rc2-mm1 and 2.6.24-rc3 there was no problem.
> ==
> [2.6.24-rc3]
> [EMAIL PROTECTED] ~]$ dd if=/dev/zero of=tmpfile bs=4096 count=10
> 10+0 records in
> 10+0 records out
> 40960 bytes (410 MB) copied, 2.07717 seconds, 197 MB/s
> [EMAIL PROTECTED] ~]$ time sync
> 
> real    0m9.935s
> user    0m0.001s
> sys     0m0.113s
> 
> [2.6.24-rc3]
> [EMAIL PROTECTED] ~]$ dd if=/dev/zero of=./tmpfile bs=4096 count=100000
> 100000+0 records in
> 100000+0 records out
> 409600000 bytes (410 MB) copied, 1.37147 seconds, 299 MB/s
> [EMAIL PROTECTED] ~]$ time sync[2.6.24-rc2-mm1]
> 
> 
> real    0m11.718s
> user    0m0.000s
> sys     0m0.138s

Well I wonder how we did that.

It seems OK here from a quick test (i386, ext3-on-IDE).

Maybe device driver/block breakage?
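
As a sanity check on the figures quoted above, here is a minimal sketch (not part of the original thread) that recomputes dd's reported rate from the "410 MB" and elapsed-time values, and converts the `time sync` wall-clock readings to seconds so the kernels can be compared. The helper names `throughput_mb_s` and `to_seconds` are hypothetical, and the sketch assumes dd's "MB" means 10^6 bytes.

```shell
#!/bin/sh
# Hypothetical helpers for checking the numbers in the report.

# Recompute a dd-style rate in decimal MB/s (assumes 1 MB = 1,000,000 bytes).
throughput_mb_s() {
    # $1 = bytes copied, $2 = elapsed seconds
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", bytes / secs / 1000000 }'
}

# Convert a bash `time` reading such as "3m6.440s" to seconds.
to_seconds() {
    # $1 = reading in <minutes>m<seconds>s form
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

throughput_mb_s 410000000 1.46706   # prints 279
to_seconds 3m6.440s                 # prints 186.440 (2.6.24-rc3-mm1)
to_seconds 0m9.935s                 # prints 9.935   (2.6.24-rc3)
```

The dd rates agree with the report, so the write into the page cache is not what slowed down. The regression is in `sync` itself, which took roughly 19 times longer on 2.6.24-rc3-mm1 than on 2.6.24-rc3.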


Re: 2.6.24-rc3-mm1 (sync is slow ?)

2007-11-21 Thread KAMEZAWA Hiroyuki
On Wed, 21 Nov 2007 17:42:15 +0900
KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote:

> Hi, Andrew
> 
> I got the following result from the 'sync' command.
> It was too slow. (memory controller config is off ;)
> I have attached my .config.
> ==
> [2.6.24-rc3-mm1]
> [EMAIL PROTECTED] ~]$ dd if=/dev/zero of=./tmpfile bs=4096 count=100000
> 100000+0 records in
> 100000+0 records out
> 409600000 bytes (410 MB) copied, 1.46706 seconds, 279 MB/s
> [EMAIL PROTECTED] ~]$ time sync
> 
> real    3m6.440s
> user    0m0.000s
> sys     0m0.133s
> 
Ah, one of the CPUs shows 100% iowait in 'top' while this is running.

-Kame



Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Avi Kivity

Kamalesh Babulal wrote:
> Andrew Morton wrote:
>> On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
>>
>>> The make headers_check fails,
>>>
>>>   CHECK   include/linux/usb/gadgetfs.h
>>>   CHECK   include/linux/usb/ch9.h
>>>   CHECK   include/linux/usb/cdc.h
>>>   CHECK   include/linux/usb/audio.h
>>>   CHECK   include/linux/kvm.h
>>> /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires asm/kvm.h,
>>> which does not exist in exported headers
>>
>> hm, works for me, on i386 and x86_64.  What's different over there?
>
> Hi Andrew,
>
> It fails on the powerpc box, with allyesconfig option.

How do we fix this?  Export linux/kvm.h only on x86?  Seems ugly.

--
Any sufficiently difficult bug is indistinguishable from a feature.



Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Kamalesh Babulal
Andrew Morton wrote:
> On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
> 
>> The make headers_check fails,
>>
>>   CHECK   include/linux/usb/gadgetfs.h
>>   CHECK   include/linux/usb/ch9.h
>>   CHECK   include/linux/usb/cdc.h
>>   CHECK   include/linux/usb/audio.h
>>   CHECK   include/linux/kvm.h
>> /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires asm/kvm.h, 
>> which does not exist in exported headers
> 
> hm, works for me, on i386 and x86_64.  What's different over there?
Hi Andrew,

It fails on the powerpc box, with allyesconfig option.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.


Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Andrew Morton
On Wed, 21 Nov 2007 13:54:50 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> The make headers_check fails,
> 
>   CHECK   include/linux/usb/gadgetfs.h
>   CHECK   include/linux/usb/ch9.h
>   CHECK   include/linux/usb/cdc.h
>   CHECK   include/linux/usb/audio.h
>   CHECK   include/linux/kvm.h
> /root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires asm/kvm.h, 
> which does not exist in exported headers

hm, works for me, on i386 and x86_64.  What's different over there?


Re: 2.6.24-rc3-mm1 make headers_check fails

2007-11-21 Thread Kamalesh Babulal
Hi Andrew,

The make headers_check fails,

  CHECK   include/linux/usb/gadgetfs.h
  CHECK   include/linux/usb/ch9.h
  CHECK   include/linux/usb/cdc.h
  CHECK   include/linux/usb/audio.h
  CHECK   include/linux/kvm.h
/root/kernels/linux-2.6.24-rc3/usr/include/linux/kvm.h requires asm/kvm.h, 
which does not exist in exported headers
make[2]: *** [/root/kernels/linux-2.6.24-rc3/usr/include/linux/.check.kvm.h] 
Error 1
make[1]: *** [linux] Error 2
make: *** [headers_check] Error 2
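
The failure above boils down to an exported header including another header that was never exported for that architecture. The following is a rough sketch of that kind of check, not the kernel's actual headers_check script. The function name `missing_includes` is hypothetical, and it assumes only `asm/` and `linux/` includes matter and that the exported tree lives under a single root such as `usr/include`.

```shell
#!/bin/sh
# Sketch only: report asm/ and linux/ headers that an exported header
# includes but that are absent from the exported tree.
missing_includes() {
    # $1 = root of the exported headers, $2 = header path relative to it
    root=$1
    hdr=$2
    # Pull the name out of every '#include <...>' line in the header.
    sed -n 's/^[[:space:]]*#[[:space:]]*include[[:space:]]*<\([^>]*\)>.*/\1/p' \
        "$root/$hdr" |
    while read -r inc; do
        case $inc in
            asm/*|linux/*)
                # Complain only if the included header was not exported.
                [ -f "$root/$inc" ] ||
                    echo "$hdr requires $inc, which does not exist in exported headers"
                ;;
        esac
    done
}
```

Run as `missing_includes usr/include linux/kvm.h` against a tree exported for powerpc, a check of this shape would flag `asm/kvm.h`, which matches the report that the header only exists on x86.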


-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.


Re: 2.6.24-rc3-mm1

2007-11-20 Thread Dave Young
On Nov 21, 2007 2:15 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> On Wed, 21 Nov 2007 14:03:34 +0800 "Dave Young" <[EMAIL PROTECTED]> wrote:
>
> > On Nov 21, 2007 2:00 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > >
> > > On Wed, 21 Nov 2007 13:51:47 +0800 "Dave Young" <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi, andrew
> > > >
> > > > modpost failed for me:
> > > >   MODPOST 360 modules
> > > > ERROR: "empty_zero_page" [drivers/kvm/kvm.ko] undefined!
> > > > make[1]: *** [__modpost] Error 1
> > > > make: *** [modules] Error 2
> > > >
> > >
> > > You're a victim of the hasty unexporting fad.  Which architecture?
> > > x86_64 I guess?
> > >
> > Hi,
> > ia32 instead.
> >
>
> oic.  Like this, I guess.
>
> --- a/arch/x86/kernel/i386_ksyms_32.c~git-x86-i386-export-empty_zero_page
> +++ a/arch/x86/kernel/i386_ksyms_32.c
> @@ -2,6 +2,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  EXPORT_SYMBOL(__down_failed);
>  EXPORT_SYMBOL(__down_failed_interruptible);
> @@ -22,3 +23,4 @@ EXPORT_SYMBOL(__put_user_8);
>  EXPORT_SYMBOL(strstr);
>
>  EXPORT_SYMBOL(csum_partial);
> +EXPORT_SYMBOL(empty_zero_page);
> _
>

Yes, passed :)


Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-20 Thread Andrew Morton
On Wed, 21 Nov 2007 11:41:23 +0530 Kamalesh Babulal <[EMAIL PROTECTED]> wrote:

> Hi Andrew,
> 
> Kernel panics across different architectures like powerpc, x86_64, 

powerpc complains about IO-APICs??

> Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
> Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
> Mount-cache hash table entries: 256
> SMP alternatives: switching to UP code
> ACPI: Core revision 20070126
> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
> 'noapic' kernel parameter

ACPI or x86 breakage, I guess.

Did 'noapic' work?


Re: 2.6.24-rc3-mm1

2007-11-20 Thread Andrew Morton
On Wed, 21 Nov 2007 14:03:34 +0800 "Dave Young" <[EMAIL PROTECTED]> wrote:

> On Nov 21, 2007 2:00 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
> >
> > On Wed, 21 Nov 2007 13:51:47 +0800 "Dave Young" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, andrew
> > >
> > > modpost failed for me:
> > >   MODPOST 360 modules
> > > ERROR: "empty_zero_page" [drivers/kvm/kvm.ko] undefined!
> > > make[1]: *** [__modpost] Error 1
> > > make: *** [modules] Error 2
> > >
> >
> > You're a victim of the hasty unexporting fad.  Which architecture?
> > x86_64 I guess?
> >
> Hi,
> ia32 instead.
> 

oic.  Like this, I guess.

--- a/arch/x86/kernel/i386_ksyms_32.c~git-x86-i386-export-empty_zero_page
+++ a/arch/x86/kernel/i386_ksyms_32.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 
 EXPORT_SYMBOL(__down_failed);
 EXPORT_SYMBOL(__down_failed_interruptible);
@@ -22,3 +23,4 @@ EXPORT_SYMBOL(__put_user_8);
 EXPORT_SYMBOL(strstr);
 
 EXPORT_SYMBOL(csum_partial);
+EXPORT_SYMBOL(empty_zero_page);
_
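
For anyone hitting a similar modpost "undefined!" error, a quick way to see whether a symbol is exported is to look it up in the build's Module.symvers. The sketch below is illustrative only: `is_exported` is a hypothetical helper, and it assumes whitespace-separated columns with the symbol name in the second column (CRC, symbol, module, export type), which should be verified against the tree in question.

```shell
#!/bin/sh
# Sketch: succeed if a symbol is listed in a Module.symvers-style file.
# Assumption: the symbol name is the second whitespace-separated column.
is_exported() {
    # $1 = symbol name, $2 = path to the symvers file
    awk -v sym="$1" '$2 == sym { found = 1 } END { exit !found }' "$2"
}
```

For example, `is_exported empty_zero_page Module.symvers || echo "not exported"` would be expected to print "not exported" before the patch above is applied and nothing afterwards.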



Re: 2.6.24-rc3-mm1 - Kernel Panic on IO-APIC

2007-11-20 Thread Kamalesh Babulal
Hi Andrew,

Kernel panics across different architectures like powerpc, x86_64, 

Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
Mount-cache hash table entries: 256
SMP alternatives: switching to UP code
ACPI: Core revision 20070126
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 
'noapic' kernel parameter

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.

