Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Mon, 2009-08-03 at 12:57 -0500, Dave Kleikamp wrote:
> cpu_last_thread_in_core(cpu) is a moving target.  You want something
> like:
>
> 	cpu = cpu_first_thread_in_core(cpu);
> 	last = cpu_last_thread_in_core(cpu);
> 	while (cpu <= last) {
> 		__clear_bit(id, stale_map[cpu]);
> 		cpu++;
> 	}
>
> Or, keeping the for loop:
>
> 	for (cpu = cpu_first_thread_in_core(cpu),
> 	     last = cpu_last_thread_in_core(cpu);
> 	     cpu <= last; cpu++)
> 		__clear_bit(id, stale_map[cpu]);

Yeah, whatever form is good. I had a brain fart and didn't see that at
the end of the loop, cpu would have actually crossed the boundary to the
next core, and so cpu_last_thread_in_core() would change. Just some short
circuit in a neuron somewhere.

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Aug 2, 2009, at 9:03 PM, Michael Ellerman wrote:
> On Sat, 2009-08-01 at 08:29 +1000, Benjamin Herrenschmidt wrote:
>> On Thu, 2009-07-30 at 22:35 -0500, Kumar Gala wrote:
>>>> 	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
>>>> -	__clear_bit(id, stale_map[cpu]);
>>>> +	for (cpu = cpu_first_thread_in_core(cpu);
>>>> +	     cpu <= cpu_last_thread_in_core(cpu); cpu++)
>>>> +		__clear_bit(id, stale_map[cpu]);
>>>>  }
>>>
>>> This looks a bit dodgy.  using 'cpu' as both the loop variable and
>>> what you are computing to determine loop start/end..
>>
>> Hrm... I would have thought that it was still correct... do you see
>> any reason why the above code is wrong?  Because if not, we may be
>> hitting a gcc issue...
>>
>> IE. At loop init, cpu gets clamped down to the first thread in the
>> core, which should be fine.  Then, we compare cpu to the last thread
>> in core for the current cpu, which should always return the same
>> value.
>>
>> So I'm very interested to know what is actually wrong, ie, either I'm
>> just missing something obvious, or you are just pushing a bug under
>> the carpet which could come back and bite us later :-)
>
> for (cpu = cpu_first_thread_in_core(cpu);
>      cpu <= cpu_last_thread_in_core(cpu); cpu++)
> 	__clear_bit(id, stale_map[cpu]);
>
> ==
>
> cpu = cpu_first_thread_in_core(cpu);
> while (cpu <= cpu_last_thread_in_core(cpu)) {
> 	__clear_bit(id, stale_map[cpu]);
> 	cpu++;
> }
>
> cpu = 0
> cpu <= 1
> cpu++ (1)
> cpu <= 1
> cpu++ (2)
> cpu <= 3
> ...

Which is pretty much what I see. In a dual core setup, I get an oops
because we are trying to clear cpu #2 (which clearly doesn't exist):

cpu = 1 (in loop)
clearing 1
clearing 2
OOPS

- k
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Mon, 2009-08-03 at 11:21 -0500, Kumar Gala wrote:
> On Aug 2, 2009, at 9:03 PM, Michael Ellerman wrote:
>> On Sat, 2009-08-01 at 08:29 +1000, Benjamin Herrenschmidt wrote:
>>> On Thu, 2009-07-30 at 22:35 -0500, Kumar Gala wrote:
>>>>> 	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
>>>>> -	__clear_bit(id, stale_map[cpu]);
>>>>> +	for (cpu = cpu_first_thread_in_core(cpu);
>>>>> +	     cpu <= cpu_last_thread_in_core(cpu); cpu++)
>>>>> +		__clear_bit(id, stale_map[cpu]);
>>>>>  }
>>>>
>>>> This looks a bit dodgy.  using 'cpu' as both the loop variable and
>>>> what you are computing to determine loop start/end..
>>>
>>> Hrm... I would have thought that it was still correct... do you see
>>> any reason why the above code is wrong?  Because if not, we may be
>>> hitting a gcc issue...
>>>
>>> IE. At loop init, cpu gets clamped down to the first thread in the
>>> core, which should be fine.  Then, we compare cpu to the last thread
>>> in core for the current cpu, which should always return the same
>>> value.
>>>
>>> So I'm very interested to know what is actually wrong, ie, either I'm
>>> just missing something obvious, or you are just pushing a bug under
>>> the carpet which could come back and bite us later :-)
>>
>> for (cpu = cpu_first_thread_in_core(cpu);
>>      cpu <= cpu_last_thread_in_core(cpu); cpu++)
>> 	__clear_bit(id, stale_map[cpu]);
>>
>> ==
>>
>> cpu = cpu_first_thread_in_core(cpu);
>> while (cpu <= cpu_last_thread_in_core(cpu)) {
>> 	__clear_bit(id, stale_map[cpu]);
>> 	cpu++;
>> }

cpu_last_thread_in_core(cpu) is a moving target.  You want something
like:

	cpu = cpu_first_thread_in_core(cpu);
	last = cpu_last_thread_in_core(cpu);
	while (cpu <= last) {
		__clear_bit(id, stale_map[cpu]);
		cpu++;
	}

>> cpu = 0
>> cpu <= 1
>> cpu++ (1)
>> cpu <= 1
>> cpu++ (2)
>> cpu <= 3
>> ...
>
> Which is pretty much what I see. In a dual core setup, I get an oops
> because we are trying to clear cpu #2 (which clearly doesn't exist):
>
> cpu = 1 (in loop)
> clearing 1
> clearing 2
> OOPS
>
> - k

--
David Kleikamp
IBM Linux Technology Center
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Mon, 2009-08-03 at 12:06 -0500, Dave Kleikamp wrote:
> On Mon, 2009-08-03 at 11:21 -0500, Kumar Gala wrote:
>> On Aug 2, 2009, at 9:03 PM, Michael Ellerman wrote:
>>> for (cpu = cpu_first_thread_in_core(cpu);
>>>      cpu <= cpu_last_thread_in_core(cpu); cpu++)
>>> 	__clear_bit(id, stale_map[cpu]);
>>>
>>> ==
>>>
>>> cpu = cpu_first_thread_in_core(cpu);
>>> while (cpu <= cpu_last_thread_in_core(cpu)) {
>>> 	__clear_bit(id, stale_map[cpu]);
>>> 	cpu++;
>>> }
>
> cpu_last_thread_in_core(cpu) is a moving target.  You want something
> like:
>
> 	cpu = cpu_first_thread_in_core(cpu);
> 	last = cpu_last_thread_in_core(cpu);
> 	while (cpu <= last) {
> 		__clear_bit(id, stale_map[cpu]);
> 		cpu++;
> 	}

Or, keeping the for loop:

	for (cpu = cpu_first_thread_in_core(cpu),
	     last = cpu_last_thread_in_core(cpu);
	     cpu <= last; cpu++)
		__clear_bit(id, stale_map[cpu]);

>>> cpu = 0
>>> cpu <= 1
>>> cpu++ (1)
>>> cpu <= 1
>>> cpu++ (2)
>>> cpu <= 3
>>> ...
>>
>> Which is pretty much what I see. In a dual core setup, I get an oops
>> because we are trying to clear cpu #2 (which clearly doesn't exist):
>>
>> cpu = 1 (in loop)
>> clearing 1
>> clearing 2
>> OOPS
>>
>> - k

--
David Kleikamp
IBM Linux Technology Center
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
> for (cpu = cpu_first_thread_in_core(cpu);
>      cpu <= cpu_last_thread_in_core(cpu); cpu++)
> 	__clear_bit(id, stale_map[cpu]);
>
> ==
>
> cpu = cpu_first_thread_in_core(cpu);
> while (cpu <= cpu_last_thread_in_core(cpu)) {
> 	__clear_bit(id, stale_map[cpu]);
> 	cpu++;
> }
>
> cpu = 0
> cpu <= 1
> cpu++ (1)
> cpu <= 1
> cpu++ (2)
> cpu <= 3
> ...

Ah right, /me takes snow out of his eyes... indeed, the upper bound is
fubar. Hrm. Alright, we'll use a temp.

Cheers,
Ben.
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Sat, 2009-08-01 at 08:29 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2009-07-30 at 22:35 -0500, Kumar Gala wrote:
>>> 	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
>>> -	__clear_bit(id, stale_map[cpu]);
>>> +	for (cpu = cpu_first_thread_in_core(cpu);
>>> +	     cpu <= cpu_last_thread_in_core(cpu); cpu++)
>>> +		__clear_bit(id, stale_map[cpu]);
>>>  }
>>
>> This looks a bit dodgy.  using 'cpu' as both the loop variable and
>> what you are computing to determine loop start/end..
>
> Hrm... I would have thought that it was still correct... do you see
> any reason why the above code is wrong?  Because if not, we may be
> hitting a gcc issue...
>
> IE. At loop init, cpu gets clamped down to the first thread in the
> core, which should be fine.  Then, we compare cpu to the last thread
> in core for the current cpu, which should always return the same
> value.
>
> So I'm very interested to know what is actually wrong, ie, either I'm
> just missing something obvious, or you are just pushing a bug under
> the carpet which could come back and bite us later :-)

for (cpu = cpu_first_thread_in_core(cpu);
     cpu <= cpu_last_thread_in_core(cpu); cpu++)
	__clear_bit(id, stale_map[cpu]);

==

cpu = cpu_first_thread_in_core(cpu);
while (cpu <= cpu_last_thread_in_core(cpu)) {
	__clear_bit(id, stale_map[cpu]);
	cpu++;
}

cpu = 0
cpu <= 1
cpu++ (1)
cpu <= 1
cpu++ (2)
cpu <= 3
...

:)

cheers
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Thu, 2009-07-30 at 22:35 -0500, Kumar Gala wrote:
>> 	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
>> -	__clear_bit(id, stale_map[cpu]);
>> +	for (cpu = cpu_first_thread_in_core(cpu);
>> +	     cpu <= cpu_last_thread_in_core(cpu); cpu++)
>> +		__clear_bit(id, stale_map[cpu]);
>>  }
>
> This looks a bit dodgy.  using 'cpu' as both the loop variable and
> what you are computing to determine loop start/end..

Hrm... I would have thought that it was still correct... do you see any
reason why the above code is wrong?  Because if not, we may be hitting a
gcc issue...

IE. At loop init, cpu gets clamped down to the first thread in the core,
which should be fine.  Then, we compare cpu to the last thread in core
for the current cpu, which should always return the same value.

So I'm very interested to know what is actually wrong, ie, either I'm
just missing something obvious, or you are just pushing a bug under the
carpet which could come back and bite us later :-)

Cheers,
Ben.
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Jul 24, 2009, at 4:15 AM, Benjamin Herrenschmidt wrote:
> The current no hash MMU context management code is written with the
> assumption that one CPU == one TLB.  This is not the case on
> implementations that support HW multithreading, where several linux
> CPUs can share the same TLB.
>
> This adds some basic support for this to our context management and
> our TLB flushing code.  It also cleans up the optional debugging
> output a bit
>
> Signed-off-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
> ---

I'm getting this nice oops on 32-bit book-e SMP (and I'm guessing it's
because of this patch):

Unable to handle kernel paging request for data at address 0x
Faulting instruction address: 0xc0016dac
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=8 MPC8572 DS
Modules linked in:
NIP: c0016dac LR: c0016d58 CTR: 001e
REGS: eed77ce0 TRAP: 0300  Not tainted  (2.6.31-rc4-00442-gdb4c9c5)
MSR: 00021000 <ME,CE>  CR: 24288482  XER: 2000
DEAR: , ESR:
TASK = eecfe140[1581] 'msgctl08' THREAD: eed76000 CPU: 0
GPR00: 0040 eed77d90 eecfe140 0001 c05bf074 c05c0cf4
GPR08: 0003 0002 ff7f 9b05 1004f894 c05bdd24 0001
GPR16: c05ab890 c05c0ce8 c04e0f58 c04da364 c05c c04cfa04
GPR24: 0002 c05c0cd8 0080 ef056380 0017
NIP [c0016dac] switch_mmu_context+0x15c/0x520
LR [c0016d58] switch_mmu_context+0x108/0x520
Call Trace:
[eed77d90] [c0016d58] switch_mmu_context+0x108/0x520 (unreliable)
[eed77df0] [c040efec] schedule+0x2bc/0x800
[eed77e70] [c01b9268] do_msgrcv+0x198/0x420
[eed77ef0] [c01b9520] sys_msgrcv+0x30/0xa0
[eed77f10] [c0003fe8] sys_ipc+0x1a8/0x2c0
[eed77f40] [c00116c4] ret_from_syscall+0x0/0x3c
Instruction dump:
57402834 7c00f850 3920fffe 5d2a003e 397b0010 5500103a 7ceb0214 6000
6000 8167 39080001 38e70004 7c0be82e 7c005038 7c0be92e 8126
---[ end trace 3c4c3106446e1bd8 ]---

- k
Re: [PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
On Jul 30, 2009, at 10:12 PM, Kumar Gala wrote:
> On Jul 24, 2009, at 4:15 AM, Benjamin Herrenschmidt wrote:
>> The current no hash MMU context management code is written with the
>> assumption that one CPU == one TLB.  This is not the case on
>> implementations that support HW multithreading, where several linux
>> CPUs can share the same TLB.
>>
>> This adds some basic support for this to our context management and
>> our TLB flushing code.  It also cleans up the optional debugging
>> output a bit
>>
>> Signed-off-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> ---
>
> I'm getting this nice oops on 32-bit book-e SMP (and I'm guessing it's
> because of this patch):
>
> Unable to handle kernel paging request for data at address 0x
> Faulting instruction address: 0xc0016dac
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=8 MPC8572 DS
> Modules linked in:
> NIP: c0016dac LR: c0016d58 CTR: 001e
> REGS: eed77ce0 TRAP: 0300  Not tainted  (2.6.31-rc4-00442-gdb4c9c5)
> MSR: 00021000 <ME,CE>  CR: 24288482  XER: 2000
> DEAR: , ESR:
> TASK = eecfe140[1581] 'msgctl08' THREAD: eed76000 CPU: 0
> NIP [c0016dac] switch_mmu_context+0x15c/0x520
> LR [c0016d58] switch_mmu_context+0x108/0x520
> Call Trace:
> [eed77d90] [c0016d58] switch_mmu_context+0x108/0x520 (unreliable)
> [eed77df0] [c040efec] schedule+0x2bc/0x800
> [eed77e70] [c01b9268] do_msgrcv+0x198/0x420
> [eed77ef0] [c01b9520] sys_msgrcv+0x30/0xa0
> [eed77f10] [c0003fe8] sys_ipc+0x1a8/0x2c0
> [eed77f40] [c00116c4] ret_from_syscall+0x0/0x3c
> ---[ end trace 3c4c3106446e1bd8 ]---

On Jul 24, 2009, at 4:15 AM, Benjamin Herrenschmidt wrote:

> @@ -247,15 +261,20 @@ void switch_mmu_context(struct mm_struct
>  	 * local TLB for it and unmark it before we use it
>  	 */
>  	if (test_bit(id, stale_map[cpu])) {
> -		pr_devel("[%d] flushing stale context %d for mm @%p !\n",
> -			 cpu, id, next);
> +		pr_hardcont(" | stale flush %d [%d..%d]",
> +			    id, cpu_first_thread_in_core(cpu),
> +			    cpu_last_thread_in_core(cpu));
> +
>  		local_flush_tlb_mm(next);
>
>  		/* XXX This clear should ultimately be part of
>  		 * local_flush_tlb_mm */
> -		__clear_bit(id, stale_map[cpu]);
> +		for (cpu = cpu_first_thread_in_core(cpu);
> +		     cpu <= cpu_last_thread_in_core(cpu); cpu++)
> +			__clear_bit(id, stale_map[cpu]);
>  	}

This looks a bit dodgy.  using 'cpu' as both the loop variable and
what you are computing to determine loop start/end..

Changing this to:

	unsigned int i;
	...
	for (i = cpu_first_thread_in_core(cpu);
	     i <= cpu_last_thread_in_core(cpu); i++)
		__clear_bit(id, stale_map[i]);

seems to clear up the oops.

- k
[PATCH 3/20] powerpc/mm: Add HW threads support to no_hash TLB management
The current no hash MMU context management code is written with the
assumption that one CPU == one TLB.  This is not the case on
implementations that support HW multithreading, where several linux
CPUs can share the same TLB.

This adds some basic support for this to our context management and
our TLB flushing code.  It also cleans up the optional debugging
output a bit

Signed-off-by: Benjamin Herrenschmidt <b...@kernel.crashing.org>
---

 arch/powerpc/include/asm/cputhreads.h |   16 +++++
 arch/powerpc/mm/mmu_context_nohash.c  |   93 ++++++++++++++++++++------
 arch/powerpc/mm/tlb_nohash.c          |   10 ++-
 3 files changed, 86 insertions(+), 33 deletions(-)

--- linux-work.orig/arch/powerpc/mm/mmu_context_nohash.c	2009-07-21 12:43:27.000000000 +1000
+++ linux-work/arch/powerpc/mm/mmu_context_nohash.c	2009-07-21 12:56:16.000000000 +1000
@@ -25,10 +25,20 @@
  *     also clear mm->cpu_vm_mask bits when processes are migrated
  */

-#undef DEBUG
-#define DEBUG_STEAL_ONLY
-#undef DEBUG_MAP_CONSISTENCY
-/*#define DEBUG_CLAMP_LAST_CONTEXT   15 */
+#define DEBUG_MAP_CONSISTENCY
+#define DEBUG_CLAMP_LAST_CONTEXT   31
+//#define DEBUG_HARDER
+
+/* We don't use DEBUG because it tends to be compiled in always nowadays
+ * and this would generate way too much output
+ */
+#ifdef DEBUG_HARDER
+#define pr_hard(args...)	printk(KERN_DEBUG args)
+#define pr_hardcont(args...)	printk(KERN_CONT args)
+#else
+#define pr_hard(args...)	do { } while(0)
+#define pr_hardcont(args...)	do { } while(0)
+#endif

 #include <linux/kernel.h>
 #include <linux/mm.h>
@@ -71,7 +81,7 @@ static DEFINE_SPINLOCK(context_lock);
 static unsigned int steal_context_smp(unsigned int id)
 {
 	struct mm_struct *mm;
-	unsigned int cpu, max;
+	unsigned int cpu, max, i;

 	max = last_context - first_context;
@@ -89,15 +99,22 @@ static unsigned int steal_context_smp(un
 			id = first_context;
 			continue;
 		}
-		pr_devel("[%d] steal context %d from mm @%p\n",
-			 smp_processor_id(), id, mm);
+		pr_hardcont(" | steal %d from 0x%p", id, mm);

 		/* Mark this mm has having no context anymore */
 		mm->context.id = MMU_NO_CONTEXT;

-		/* Mark it stale on all CPUs that used this mm */
-		for_each_cpu(cpu, mm_cpumask(mm))
-			__set_bit(id, stale_map[cpu]);
+		/* Mark it stale on all CPUs that used this mm. For threaded
+		 * implementations, we set it on all threads on each core
+		 * represented in the mask. A future implementation will use
+		 * a core map instead but this will do for now.
+		 */
+		for_each_cpu(cpu, mm_cpumask(mm)) {
+			for (i = cpu_first_thread_in_core(cpu);
+			     i <= cpu_last_thread_in_core(cpu); i++)
+				__set_bit(id, stale_map[i]);
+			cpu = i - 1;
+		}
 		return id;
 	}
@@ -126,7 +143,7 @@ static unsigned int steal_context_up(uns
 	/* Pick up the victim mm */
 	mm = context_mm[id];

-	pr_devel("[%d] steal context %d from mm @%p\n", cpu, id, mm);
+	pr_hardcont(" | steal %d from 0x%p", id, mm);

 	/* Flush the TLB for that context */
 	local_flush_tlb_mm(mm);
@@ -179,19 +196,14 @@ void switch_mmu_context(struct mm_struct
 	/* No lockless fast path .. yet */
 	spin_lock(&context_lock);

-#ifndef DEBUG_STEAL_ONLY
-	pr_devel("[%d] activating context for mm @%p, active=%d, id=%d\n",
-		 cpu, next, next->context.active, next->context.id);
-#endif
+	pr_hard("[%d] activating context for mm @%p, active=%d, id=%d",
+		cpu, next, next->context.active, next->context.id);

 #ifdef CONFIG_SMP
 	/* Mark us active and the previous one not anymore */
 	next->context.active++;
 	if (prev) {
-#ifndef DEBUG_STEAL_ONLY
-		pr_devel(" old context %p active was: %d\n",
-			 prev, prev->context.active);
-#endif
+		pr_hardcont(" (old=0x%p a=%d)", prev, prev->context.active);
 		WARN_ON(prev->context.active < 1);
 		prev->context.active--;
 	}
@@ -201,8 +213,14 @@ void switch_mmu_context(struct mm_struct

 	/* If we already have a valid assigned context, skip all that */
 	id = next->context.id;
-	if (likely(id != MMU_NO_CONTEXT))
+	if (likely(id != MMU_NO_CONTEXT)) {
+#ifdef DEBUG_MAP_CONSISTENCY
+		if (context_mm[id] != next)
+			pr_err("MMU: mm 0x%p has id %d but context_mm[%d] says 0x%p\n",
+			       next, id, id, context_mm[id]);
+#endif
 		goto ctxt_ok;
+	}

 	/* We really don't have a context, let's try to acquire one */
 	id = next_context;
@@ -234,11