Re: 2.6.13-rc3-mm3

2005-08-08 Thread Christoph Lameter
On Mon, 8 Aug 2005, Richard Purdie wrote: > The following patch (against -mm) cleared the problem up but I'm not > sure how correct it is: Almost. The new entry needs to be made dirty. new_entry is already made young. entry is not. --- Set dirty bit correctly in handle_pte_fault new_entry is

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Dipankar Sarma
On Mon, Aug 08, 2005 at 10:25:59AM -0700, Andrew Morton wrote: > Dipankar Sarma <[EMAIL PROTECTED]> wrote: > > > > > But: IIRC the counters were moved to the ctor/dtor for performance > > > reasons, I'd guess mbligh ran into cache line trashing on the > > > filp_count_lock spinlock with reaim or

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Richard Purdie
I've done a bit of analysis: cmpxchg fail fault mm=c3945b20 vma=c304ad84 addr=402cb000 write=2048 ptep=c2af5b2c pmd=c2bc5008 entry=886c0f7 new=886c077 current=886c077 Note the Dirty bit is set on entry and not new where it probably should be... ptep_cmpxchg(mm, address, pte, entry, new_entry)

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Richard Purdie
On Mon, 2005-08-08 at 09:48 -0700, Christoph Lameter wrote: > Ok. So we cannot set the dirty bit. > > Here is a patch that also prints the pte status immediately before > ptep_cmpxchg. I guess this will show that dirty bit is already set. > > Does the ARM have some hardware capability to set

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Andrew Morton
Dipankar Sarma <[EMAIL PROTECTED]> wrote: > > On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote: > > Dipankar Sarma wrote: > > > > >Hugh, could you please try this with the experimental patch below ? > > >Manfred, is it safe to decrement nr_files in file_free() > > >instead of the

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Christoph Lameter
On Mon, 8 Aug 2005, Russell King wrote: > ARM doesn't have cmpxchg nor does it have hardware access nor dirty > bits. They're simulated in software. Even the cmpxchg is simulated. > What's the problem you're trying to solve? A hang when starting X on ARM with rc4-mm1 which contains the page

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Russell King
On Mon, Aug 08, 2005 at 09:48:22AM -0700, Christoph Lameter wrote: > On Sun, 7 Aug 2005, Richard Purdie wrote: > > > > > We know the the failure case can be identified by the > > > > cmpxchg_fail_flag_update condition being met. Can you provide me with a > > > > patch to dump useful debugging

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Dipankar Sarma
On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote: > Dipankar Sarma wrote: > > >Hugh, could you please try this with the experimental patch below ? > >Manfred, is it safe to decrement nr_files in file_free() > >instead of the destructor ? I can't see any problem. > > > > > > > The

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Christoph Lameter
On Sun, 7 Aug 2005, Richard Purdie wrote: > > > We know the the failure case can be identified by the > > > cmpxchg_fail_flag_update condition being met. Can you provide me with a > > > patch to dump useful debugging information when that occurs? > > Ok, this results in an infinite loop of one

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Manfred Spraul
Dipankar Sarma wrote: Hugh, could you please try this with the experimental patch below ? Manfred, is it safe to decrement nr_files in file_free() instead of the destructor ? I can't see any problem. The ctor/dtor are only called when new objects are created, not on every

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Hugh Dickins
On Mon, 8 Aug 2005, Dipankar Sarma wrote: > On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton forwarded from Hugh: > > > > Subject: two 2.6.13-rc3-mm3 oddities > > > > One time my tmpfs-and-looped-tmpfs-kernel-builds collapsed with lots of > > VFS: file-max

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Dipankar Sarma
I am ccing this to linux-kernel for a wider audience. On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton wrote: > > Subject: two 2.6.13-rc3-mm3 oddities > > Just wanted to record a couple of oddities I noticed with 2.6.13-rc3-mm3 > (maybe there before: I hardly tested -mm1

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Dipankar Sarma
I am ccing this to linux-kernel for a wider audience. On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton wrote: Subject: two 2.6.13-rc3-mm3 oddities Just wanted to record a couple of oddities I noticed with 2.6.13-rc3-mm3 (maybe there before: I hardly tested -mm1 and didn't even

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Hugh Dickins
On Mon, 8 Aug 2005, Dipankar Sarma wrote: On Wed, Aug 03, 2005 at 09:56:44AM +1000, Andrew Morton forwarded from Hugh: Subject: two 2.6.13-rc3-mm3 oddities One time my tmpfs-and-looped-tmpfs-kernel-builds collapsed with lots of VFS: file-max limit 49778 reached messages, which I

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Christoph Lameter
On Sun, 7 Aug 2005, Richard Purdie wrote: We know the the failure case can be identified by the cmpxchg_fail_flag_update condition being met. Can you provide me with a patch to dump useful debugging information when that occurs? Ok, this results in an infinite loop of one message with

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Dipankar Sarma
On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote: Dipankar Sarma wrote: Hugh, could you please try this with the experimental patch below ? Manfred, is it safe to decrement nr_files in file_free() instead of the destructor ? I can't see any problem. The ctor/dtor are

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Russell King
On Mon, Aug 08, 2005 at 09:48:22AM -0700, Christoph Lameter wrote: On Sun, 7 Aug 2005, Richard Purdie wrote: We know the the failure case can be identified by the cmpxchg_fail_flag_update condition being met. Can you provide me with a patch to dump useful debugging information when

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Christoph Lameter
On Mon, 8 Aug 2005, Russell King wrote: ARM doesn't have cmpxchg nor does it have hardware access nor dirty bits. They're simulated in software. Even the cmpxchg is simulated. What's the problem you're trying to solve? A hang when starting X on ARM with rc4-mm1 which contains the page

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Andrew Morton
Dipankar Sarma [EMAIL PROTECTED] wrote: On Mon, Aug 08, 2005 at 06:31:52PM +0200, Manfred Spraul wrote: Dipankar Sarma wrote: Hugh, could you please try this with the experimental patch below ? Manfred, is it safe to decrement nr_files in file_free() instead of the destructor ? I

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Richard Purdie
On Mon, 2005-08-08 at 09:48 -0700, Christoph Lameter wrote: Ok. So we cannot set the dirty bit. Here is a patch that also prints the pte status immediately before ptep_cmpxchg. I guess this will show that dirty bit is already set. Does the ARM have some hardware capability to set dirty

Re: Fw: two 2.6.13-rc3-mm3 oddities

2005-08-08 Thread Dipankar Sarma
On Mon, Aug 08, 2005 at 10:25:59AM -0700, Andrew Morton wrote: Dipankar Sarma [EMAIL PROTECTED] wrote: But: IIRC the counters were moved to the ctor/dtor for performance reasons, I'd guess mbligh ran into cache line trashing on the filp_count_lock spinlock with reaim or something

Re: 2.6.13-rc3-mm3

2005-08-08 Thread Christoph Lameter
On Mon, 8 Aug 2005, Richard Purdie wrote: The following patch (against -mm) cleared the problem up but I'm not sure how correct it is: Almost. The new entry needs to be made dirty. new_entry is already made young. entry is not. --- Set dirty bit correctly in handle_pte_fault new_entry is

Re: 2.6.13-rc3-mm3

2005-08-07 Thread Martin J. Bligh
--"Martin J. Bligh" <[EMAIL PROTECTED]> wrote (on Tuesday, August 02, 2005 21:21:30 -0700): > --"Martin J. Bligh" <[EMAIL PROTECTED]> wrote (on Tuesday, August 02, 2005 > 18:17:33 -0700): >> --Andrew Morton <[EMAIL PROTECTED]> wrote (on Thursday, July 28, 2005 >> 23:10:29 -0700): >> >>>

Re: 2.6.13-rc3-mm3

2005-08-07 Thread Richard Purdie
On Fri, 2005-08-05 at 08:17 -0700, Christoph Lameter wrote: > On Thu, 4 Aug 2005, Richard Purdie wrote: > > > > We know the the failure case can be identified by the > > cmpxchg_fail_flag_update condition being met. Can you provide me with a > > patch to dump useful debugging information when

Re: 2.6.13-rc3-mm3

2005-08-07 Thread Richard Purdie
On Fri, 2005-08-05 at 08:17 -0700, Christoph Lameter wrote: On Thu, 4 Aug 2005, Richard Purdie wrote: We know the the failure case can be identified by the cmpxchg_fail_flag_update condition being met. Can you provide me with a patch to dump useful debugging information when that occurs?

Re: 2.6.13-rc3-mm3

2005-08-07 Thread Martin J. Bligh
--Martin J. Bligh [EMAIL PROTECTED] wrote (on Tuesday, August 02, 2005 21:21:30 -0700): --Martin J. Bligh [EMAIL PROTECTED] wrote (on Tuesday, August 02, 2005 18:17:33 -0700): --Andrew Morton [EMAIL PROTECTED] wrote (on Thursday, July 28, 2005 23:10:29 -0700): Martin J. Bligh [EMAIL

Re: 2.6.13-rc3-mm3

2005-08-05 Thread Christoph Lameter
On Thu, 4 Aug 2005, Richard Purdie wrote: > I'm at a disadvantage here as the linux mm system is one area I've > avoided getting too deeply involved with so far. My knowledge is > therefore limited and I won't know what correct or incorrect behaviour > would look like. > > We know the the

Re: [ACPI] Re: 2.6.13-rc3-mm3

2005-08-05 Thread Michael Thonke
Hello Andrew, Andrew Morton wrote: Michael, I'm assuming that a) this problem remains in those -mm kernels which include git-acpi.patch and that b) the problems are not present in 2.6.13-rc5 or 2.6.13-rc6, yes? a.) I don't have any problems in 2.6.13-rc5-git[1-3] and 2.6.13-rc4-mm1 they

Re: 2.6.13-rc3-mm3

2005-08-05 Thread Christoph Lameter
On Thu, 4 Aug 2005, Richard Purdie wrote: I'm at a disadvantage here as the linux mm system is one area I've avoided getting too deeply involved with so far. My knowledge is therefore limited and I won't know what correct or incorrect behaviour would look like. We know the the failure case

Re: [ACPI] Re: 2.6.13-rc3-mm3

2005-08-04 Thread Andrew Morton
Michael Thonke <[EMAIL PROTECTED]> wrote: > > Moore, Robert schrieb: > > >+ACPI-0287: *** Error: Region SystemMemory(0) has no handler > >+ACPI-0127: *** Error: acpi_load_tables: Could not load namespace: > >AE_NOT_EXIST > >+ACPI-0136: *** Error: acpi_load_tables: Could not load

Re: 2.6.13-rc3-mm3

2005-08-04 Thread Richard Purdie
On Thu, 2005-08-04 at 07:04 -0700, Christoph Lameter wrote: > On Thu, 4 Aug 2005, Richard Purdie wrote: > > > On Wed, 2005-08-03 at 17:19 -0700, Christoph Lameter wrote: > > > Could you try the following patch? I think the problem was that higher > > > addressses were not mappable via the page

Re: 2.6.13-rc3-mm3

2005-08-04 Thread Christoph Lameter
On Thu, 4 Aug 2005, Richard Purdie wrote: > On Wed, 2005-08-03 at 17:19 -0700, Christoph Lameter wrote: > > Could you try the following patch? I think the problem was that higher > > addressses were not mappable via the page fault handler. This patch > > inserts the pmd entry into the pgd as

Re: 2.6.13-rc3-mm3

2005-08-04 Thread Richard Purdie
On Wed, 2005-08-03 at 17:19 -0700, Christoph Lameter wrote: > Could you try the following patch? I think the problem was that higher > addressses were not mappable via the page fault handler. This patch > inserts the pmd entry into the pgd as necessary if the pud level is > folded. I tried

Re: 2.6.13-rc3-mm3

2005-08-04 Thread Richard Purdie
On Wed, 2005-08-03 at 17:19 -0700, Christoph Lameter wrote: Could you try the following patch? I think the problem was that higher addressses were not mappable via the page fault handler. This patch inserts the pmd entry into the pgd as necessary if the pud level is folded. I tried this

Re: 2.6.13-rc3-mm3

2005-08-04 Thread Christoph Lameter
On Thu, 4 Aug 2005, Richard Purdie wrote: On Wed, 2005-08-03 at 17:19 -0700, Christoph Lameter wrote: Could you try the following patch? I think the problem was that higher addressses were not mappable via the page fault handler. This patch inserts the pmd entry into the pgd as necessary

Re: 2.6.13-rc3-mm3

2005-08-04 Thread Richard Purdie
On Thu, 2005-08-04 at 07:04 -0700, Christoph Lameter wrote: On Thu, 4 Aug 2005, Richard Purdie wrote: On Wed, 2005-08-03 at 17:19 -0700, Christoph Lameter wrote: Could you try the following patch? I think the problem was that higher addressses were not mappable via the page fault

Re: [ACPI] Re: 2.6.13-rc3-mm3

2005-08-04 Thread Andrew Morton
Michael Thonke [EMAIL PROTECTED] wrote: Moore, Robert schrieb: +ACPI-0287: *** Error: Region SystemMemory(0) has no handler +ACPI-0127: *** Error: acpi_load_tables: Could not load namespace: AE_NOT_EXIST +ACPI-0136: *** Error: acpi_load_tables: Could not load tables: This

Re: 2.6.13-rc3-mm3

2005-08-03 Thread Christoph Lameter
Could you try the following patch? I think the problem was that higher addressses were not mappable via the page fault handler. This patch inserts the pmd entry into the pgd as necessary if the pud level is folded. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> Index:

Re: 2.6.13-rc3-mm3

2005-08-03 Thread Christoph Lameter
Could you try the following patch? I think the problem was that higher addressses were not mappable via the page fault handler. This patch inserts the pmd entry into the pgd as necessary if the pud level is folded. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] Index:

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Martin J. Bligh
--"Martin J. Bligh" <[EMAIL PROTECTED]> wrote (on Tuesday, August 02, 2005 18:17:33 -0700): > --Andrew Morton <[EMAIL PROTECTED]> wrote (on Thursday, July 28, 2005 > 23:10:29 -0700): > >> "Martin J. Bligh" <[EMAIL PROTECTED]> wrote: >>> >>> NUMA-Q boxes are still crashing on boot with -mm BTW.

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Martin J. Bligh
--Andrew Morton <[EMAIL PROTECTED]> wrote (on Thursday, July 28, 2005 23:10:29 -0700): > "Martin J. Bligh" <[EMAIL PROTECTED]> wrote: >> >> NUMA-Q boxes are still crashing on boot with -mm BTW. Is the thing we >> identified earlier with the sched patches ... >> >>

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Linus Torvalds
On Wed, 3 Aug 2005, Ivan Kokshaysky wrote: > > On Tue, Aug 02, 2005 at 02:21:44PM -0700, Greg KH wrote: > > Nice, care to make up a single patch with these two changes in it? > > Yep, I'll do it shortly, plus some minor additions as separate > patches. Actually, since everybody seems to like

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 02:21:44PM -0700, Greg KH wrote: > Nice, care to make up a single patch with these two changes in it? Yep, I'll do it shortly, plus some minor additions as separate patches. Ivan. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 10:11:40AM -0700, Linus Torvalds wrote: > So I think it would be much easier to just make the change in > "pci_bus_alloc_resource()", and say that if the parent resource that we're > testing starts at some non-zero value, we just use that instead of "min" > when we call

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Greg KH
On Wed, Aug 03, 2005 at 01:13:37AM +0400, Ivan Kokshaysky wrote: > On Tue, Aug 02, 2005 at 10:11:40AM -0700, Linus Torvalds wrote: > > So I think it would be much easier to just make the change in > > "pci_bus_alloc_resource()", and say that if the parent resource that we're > > testing starts at

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Linus Torvalds
On Tue, 2 Aug 2005, Ivan Kokshaysky wrote: > > Right, and this hurts the cardbus as well... > But it should be pretty easy to learn the PCI layer to allocate above > PCIBIOS_MIN_IO _only_ when we allocate on the root bus. > Something like this (completely untested)? I think you'd have to

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 08:48:21AM -0700, Linus Torvalds wrote: > The problem with this is that it only papers over the bug. > > I don't mind trying to allocate at higher addresses per se: we used to > have the starting point be 0x4000 at some point, and that part is fine. > The problem is

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Linus Torvalds
On Tue, 2 Aug 2005, Ivan Kokshaysky wrote: > > Does the patch in appended message fix that? The problem with this is that it only papers over the bug. I don't mind trying to allocate at higher addresses per se: we used to have the starting point be 0x4000 at some point, and that part is

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Manuel Lauss
On Tue, Aug 02, 2005 at 03:40:22PM +0400, Ivan Kokshaysky wrote: > On Tue, Aug 02, 2005 at 12:32:26PM +0200, Manuel Lauss wrote: > > Does not work on -rc4-mm1. The IO-ports pre-reserved message appears, > > though. The 2 io-regions are still located under the "CardBus #03" > > device. Re-Applying

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 12:32:26PM +0200, Manuel Lauss wrote: > Does not work on -rc4-mm1. The IO-ports pre-reserved message appears, > though. The 2 io-regions are still located under the "CardBus #03" > device. Re-Applying > "revert-gregkh-pci-pci-assign-unassigned-resources.patch" makes it >

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Manuel Lauss
On Tue, Aug 02, 2005 at 11:49:28AM +0200, Stelian Pop wrote: > Le lundi 01 août 2005 à 16:37 +0200, Stelian Pop a écrit : > > > > Also, it looks like sonypi really is pretty nasty to probe for, so it's > > > not enough to just say "oh, it's a sony VAIO, let's reserve that region". > > >

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Stelian Pop
Le lundi 01 août 2005 à 16:37 +0200, Stelian Pop a écrit : > > Also, it looks like sonypi really is pretty nasty to probe for, so it's > > not enough to just say "oh, it's a sony VAIO, let's reserve that region". > > Otherwise I'd just suggest adding a "dmi_check_system()" table to > >

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Stelian Pop
Le lundi 01 août 2005 à 16:37 +0200, Stelian Pop a écrit : Also, it looks like sonypi really is pretty nasty to probe for, so it's not enough to just say oh, it's a sony VAIO, let's reserve that region. Otherwise I'd just suggest adding a dmi_check_system() table to

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Manuel Lauss
On Tue, Aug 02, 2005 at 11:49:28AM +0200, Stelian Pop wrote: Le lundi 01 août 2005 à 16:37 +0200, Stelian Pop a écrit : Also, it looks like sonypi really is pretty nasty to probe for, so it's not enough to just say oh, it's a sony VAIO, let's reserve that region. Otherwise I'd

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 12:32:26PM +0200, Manuel Lauss wrote: Does not work on -rc4-mm1. The IO-ports pre-reserved message appears, though. The 2 io-regions are still located under the CardBus #03 device. Re-Applying revert-gregkh-pci-pci-assign-unassigned-resources.patch makes it work again.

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Manuel Lauss
On Tue, Aug 02, 2005 at 03:40:22PM +0400, Ivan Kokshaysky wrote: On Tue, Aug 02, 2005 at 12:32:26PM +0200, Manuel Lauss wrote: Does not work on -rc4-mm1. The IO-ports pre-reserved message appears, though. The 2 io-regions are still located under the CardBus #03 device. Re-Applying

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 08:48:21AM -0700, Linus Torvalds wrote: The problem with this is that it only papers over the bug. I don't mind trying to allocate at higher addresses per se: we used to have the starting point be 0x4000 at some point, and that part is fine. The problem is that

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Linus Torvalds
On Tue, 2 Aug 2005, Ivan Kokshaysky wrote: Right, and this hurts the cardbus as well... But it should be pretty easy to learn the PCI layer to allocate above PCIBIOS_MIN_IO _only_ when we allocate on the root bus. Something like this (completely untested)? I think you'd have to follow the

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Greg KH
On Wed, Aug 03, 2005 at 01:13:37AM +0400, Ivan Kokshaysky wrote: On Tue, Aug 02, 2005 at 10:11:40AM -0700, Linus Torvalds wrote: So I think it would be much easier to just make the change in pci_bus_alloc_resource(), and say that if the parent resource that we're testing starts at some

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 10:11:40AM -0700, Linus Torvalds wrote: So I think it would be much easier to just make the change in pci_bus_alloc_resource(), and say that if the parent resource that we're testing starts at some non-zero value, we just use that instead of min when we call down to

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Ivan Kokshaysky
On Tue, Aug 02, 2005 at 02:21:44PM -0700, Greg KH wrote: Nice, care to make up a single patch with these two changes in it? Yep, I'll do it shortly, plus some minor additions as separate patches. Ivan. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Martin J. Bligh
--Andrew Morton [EMAIL PROTECTED] wrote (on Thursday, July 28, 2005 23:10:29 -0700): Martin J. Bligh [EMAIL PROTECTED] wrote: NUMA-Q boxes are still crashing on boot with -mm BTW. Is the thing we identified earlier with the sched patches ...

Re: 2.6.13-rc3-mm3

2005-08-02 Thread Martin J. Bligh
--Martin J. Bligh [EMAIL PROTECTED] wrote (on Tuesday, August 02, 2005 18:17:33 -0700): --Andrew Morton [EMAIL PROTECTED] wrote (on Thursday, July 28, 2005 23:10:29 -0700): Martin J. Bligh [EMAIL PROTECTED] wrote: NUMA-Q boxes are still crashing on boot with -mm BTW. Is the thing we

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 16:16 -0700, Christoph Lameter wrote: > Hmmm. this should have returned the behavior to normal. Ah. Need to use > new_entry instead of entry. Try this (is there any way that I could get > access to the sytem? I am on IRC (freenode.net nick o-o) or on skype). > > +#ifdef

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Tue, 2 Aug 2005, Richard Purdie wrote: > > + update_mmu_cache(vma, address, entry); > > + lazy_mmu_prot_update(entry); > > +#endif > > This locks the system up after the "INIT: version 2.86 booting" message. > SysRq still responds but that's about it. Hmmm. this should have returned the

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 15:19 -0700, Christoph Lameter wrote: > On Mon, 1 Aug 2005, Richard Purdie wrote: > > That number rapidly increases and so it looks like something is failing > > and looping... > > Maybe we better restore the pte flags changes the way they were if > CONFIG_ATOMIC_TABLE_OPS

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 13:36 -0700, Christoph Lameter wrote: > Could you get me some more information about the hang? A stacktrace would > be useful. I've attached gdb to it and its stuck in memcpy (from glibc). The rest of the trace is junk as glibc's arm memcpy implementation will have

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: > That number rapidly increases and so it looks like something is failing > and looping... Maybe we better restore the pte flags changes the way they were if CONFIG_ATOMIC_TABLE_OPS is not defined. Try this instead. If this works then we need two

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: > cmpxchg_fail_flag_update 1359210189 > > That number rapidly increases and so it looks like something is failing > and looping... That looks like some trouble with the MMU. The time between pte read and write has been shortened through the page fault

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 14:40 -0700, Christoph Lameter wrote: > Can you run kgdb on it to figure out what is going on? Maybe, depending on how easily kgdb cross compiles and how quickly I can learn to use it... > There are some variables in /proc/vmstat that may help: > > spurious_page_faults 0 >

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: > > IP Not changing? Could it be in a loop doing faults for the same memory > > location that you cannot observe with gdb? Or is there some hardware fault > > that has stopped the processor? > > I'm not the worlds most experienced user of gdb but I

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 14:16 -0700, Christoph Lameter wrote: > On Mon, 1 Aug 2005, Richard Purdie wrote: > > I've attached gdb to it and its stuck in memcpy (from glibc). The rest > > of the trace is junk as glibc's arm memcpy implementation will have > > destroyed the frame pointer. The current

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: > On Mon, 2005-08-01 at 13:36 -0700, Christoph Lameter wrote: > > Could you get me some more information about the hang? A stacktrace would > > be useful. > > I've attached gdb to it and its stuck in memcpy (from glibc). The rest > of the trace is junk

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: > > Is this related to the size of the process? Can you do a successful kernel > > compile w/o X? > > Its an embedded device and lacks development tools to test that. I ran > some programs which abuse malloc and the process would quite happily hit > oom

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 09:10 -0700, Christoph Lameter wrote: > On Mon, 1 Aug 2005, Richard Purdie wrote: > > > The system appears to be ok and boots happily to a console but if you > > load any graphical UI, the screen will blank and the process stops > > working (tested with opie and and

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: > The system appears to be ok and boots happily to a console but if you > load any graphical UI, the screen will blank and the process stops > working (tested with opie and and xserver+GPE). You can kill -9 the > process but you can't regain the console

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Bjorn Helgaas
On Friday 29 July 2005 5:17 pm, Andrew Morton wrote: > Khalid Aziz <[EMAIL PROTECTED]> wrote: > > > > Serial console is broken on ia64 on an HP rx2600 machine on > > 2.6.13-rc3-mm3. When kernel is booted up with "console=ttyS,...", no > > output ever appe

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Stelian Pop
[Sorry all for the duplicate, LKML slipped somehow from the CC: line so I'm sending this again] Le dimanche 31 juillet 2005 à 16:22 -0700, Linus Torvalds a écrit : > Also, it looks like sonypi really is pretty nasty to probe for, so it's > not enough to just say "oh, it's a sony VAIO, let's

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Stelian Pop
[Sorry all for the duplicate, LKML slipped somehow from the CC: line so I'm sending this again] Le dimanche 31 juillet 2005 à 16:22 -0700, Linus Torvalds a écrit : Also, it looks like sonypi really is pretty nasty to probe for, so it's not enough to just say oh, it's a sony VAIO, let's

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Bjorn Helgaas
On Friday 29 July 2005 5:17 pm, Andrew Morton wrote: Khalid Aziz [EMAIL PROTECTED] wrote: Serial console is broken on ia64 on an HP rx2600 machine on 2.6.13-rc3-mm3. When kernel is booted up with console=ttyS,..., no output ever appears on the console and system is hung. So I booted

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: The system appears to be ok and boots happily to a console but if you load any graphical UI, the screen will blank and the process stops working (tested with opie and and xserver+GPE). You can kill -9 the process but you can't regain the console

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 09:10 -0700, Christoph Lameter wrote: On Mon, 1 Aug 2005, Richard Purdie wrote: The system appears to be ok and boots happily to a console but if you load any graphical UI, the screen will blank and the process stops working (tested with opie and and xserver+GPE).

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: Is this related to the size of the process? Can you do a successful kernel compile w/o X? Its an embedded device and lacks development tools to test that. I ran some programs which abuse malloc and the process would quite happily hit oom so it

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: On Mon, 2005-08-01 at 13:36 -0700, Christoph Lameter wrote: Could you get me some more information about the hang? A stacktrace would be useful. I've attached gdb to it and its stuck in memcpy (from glibc). The rest of the trace is junk as

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 14:16 -0700, Christoph Lameter wrote: On Mon, 1 Aug 2005, Richard Purdie wrote: I've attached gdb to it and its stuck in memcpy (from glibc). The rest of the trace is junk as glibc's arm memcpy implementation will have destroyed the frame pointer. The current

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: IP Not changing? Could it be in a loop doing faults for the same memory location that you cannot observe with gdb? Or is there some hardware fault that has stopped the processor? I'm not the worlds most experienced user of gdb but I can't see

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 14:40 -0700, Christoph Lameter wrote: Can you run kgdb on it to figure out what is going on? Maybe, depending on how easily kgdb cross compiles and how quickly I can learn to use it... There are some variables in /proc/vmstat that may help: spurious_page_faults 0

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: cmpxchg_fail_flag_update 1359210189 That number rapidly increases and so it looks like something is failing and looping... That looks like some trouble with the MMU. The time between pte read and write has been shortened through the page fault

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Mon, 1 Aug 2005, Richard Purdie wrote: That number rapidly increases and so it looks like something is failing and looping... Maybe we better restore the pte flags changes the way they were if CONFIG_ATOMIC_TABLE_OPS is not defined. Try this instead. If this works then we need two

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 13:36 -0700, Christoph Lameter wrote: Could you get me some more information about the hang? A stacktrace would be useful. I've attached gdb to it and its stuck in memcpy (from glibc). The rest of the trace is junk as glibc's arm memcpy implementation will have destroyed

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 15:19 -0700, Christoph Lameter wrote: On Mon, 1 Aug 2005, Richard Purdie wrote: That number rapidly increases and so it looks like something is failing and looping... Maybe we better restore the pte flags changes the way they were if CONFIG_ATOMIC_TABLE_OPS is not

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Christoph Lameter
On Tue, 2 Aug 2005, Richard Purdie wrote: + update_mmu_cache(vma, address, entry); + lazy_mmu_prot_update(entry); +#endif This locks the system up after the INIT: version 2.86 booting message. SysRq still responds but that's about it. Hmmm. this should have returned the behavior to

Re: 2.6.13-rc3-mm3

2005-08-01 Thread Richard Purdie
On Mon, 2005-08-01 at 16:16 -0700, Christoph Lameter wrote: Hmmm. this should have returned the behavior to normal. Ah. Need to use new_entry instead of entry. Try this (is there any way that I could get access to the sytem? I am on IRC (freenode.net nick o-o) or on skype). +#ifdef

Re: 2.6.13-rc3-mm3

2005-07-31 Thread Richard Purdie
On Thu, 2005-07-28 at 02:58 -0700, Andrew Morton wrote: > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm3/ I'm seeing a problem on ARM with -rc3-mm3 and -rc4-mm1. -rc3-mm2 and -rc4 are fine and looking for the problem reveals the problems start af

Re: 2.6.13-rc3-mm3

2005-07-31 Thread Stelian Pop
Le dimanche 31 juillet 2005 à 11:25 -0700, Linus Torvalds a écrit : > - The SonyPI driver just allocates IO regions in random areas. Those are not really random, the list of IO regions available is given in the ACPI SPIC device specification. The list is hardcoded here because the driver does

Re: 2.6.13-rc3-mm3

2005-07-31 Thread Linus Torvalds
On Sun, 31 Jul 2005, Manuel Lauss wrote: > > Linus Torvalds wrote: > > > > > - The SonyPI driver just allocates IO regions in random areas. It's got a > >list of places to try allocating in, and the 1080 area just happens to > >be the first on the list, and since it's not used by

Re: 2.6.13-rc3-mm3

2005-07-31 Thread Manuel Lauss
Linus Torvalds wrote: > > - The SonyPI driver just allocates IO regions in random areas. It's got a >list of places to try allocating in, and the 1080 area just happens to >be the first on the list, and since it's not used by anything else, it >succeeds (never mind that it's on
