[PATCH v3 0/9] Fixes for common mistakes w/ for_each_process and task-mm
Hi all, This is another resend of several task-mm fixes, the bugs I found during LMK code audit. Architectures were traverse the tasklist in an unsafe manner, plus there are a few cases of unsafe access to task-mm in general. There were no objections on the previous resend, and the final words were somewhere along the patches are fine line. In v3: - Dropped a controversal 'Make find_lock_task_mm() sparse-aware' patch; - Reword arm and sh commit messages, per Oleg Nesterov's suggestions; - Added an optimization trick in clear_tasks_mm_cpumask(): take only the rcu read lock, no need for the whole tasklist_lock. Suggested by Peter Zijlstra. In v2: - introduced a small helper in cpu.c: most arches duplicate the same [buggy] code snippet, so it's better to fix it and move the logic into a common function. Thanks, -- Anton Vorontsov Email: cbouatmai...@gmail.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/9] cpu: Introduce clear_tasks_mm_cpumask() helper
Many architectures clear tasks' mm_cpumask like this: read_lock(tasklist_lock); for_each_process(p) { if (p-mm) cpumask_clear_cpu(cpu, mm_cpumask(p-mm)); } read_unlock(tasklist_lock); Depending on the context, the code above may have several problems, such as: 1. Working with task-mm w/o getting mm or grabing the task lock is dangerous as -mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist lock is not enough). 2. Checking for process-mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. This patch implements a small helper function that does things correctly, i.e.: 1. We take the task's lock while whe handle its mm (we can't use get_task_mm()/mmput() pair as mmput() might sleep); 2. To catch exited main thread case, we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held). Also, Per Peter Zijlstra's idea, now we don't grab tasklist_lock in the new helper, instead we take the rcu read lock. We can do this because the function is called after the cpu is taken down and marked offline, so no new tasks will get this cpu set in their mm mask. Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- include/linux/cpu.h |1 + kernel/cpu.c| 26 ++ 2 files changed, 27 insertions(+) diff --git a/include/linux/cpu.h b/include/linux/cpu.h index ee28844..d2ca49f 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -179,6 +179,7 @@ extern void put_online_cpus(void); #define hotcpu_notifier(fn, pri) cpu_notifier(fn, pri) #define register_hotcpu_notifier(nb) register_cpu_notifier(nb) #define unregister_hotcpu_notifier(nb) unregister_cpu_notifier(nb) +void clear_tasks_mm_cpumask(int cpu); int cpu_down(unsigned int cpu); #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE diff --git a/kernel/cpu.c b/kernel/cpu.c index 2060c6e..ecdf499 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -10,6 +10,8 @@ #include linux/sched.h #include linux/unistd.h #include linux/cpu.h +#include linux/oom.h +#include linux/rcupdate.h #include linux/export.h #include linux/kthread.h #include linux/stop_machine.h @@ -171,6 +173,30 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb) } EXPORT_SYMBOL(unregister_cpu_notifier); +void clear_tasks_mm_cpumask(int cpu) +{ + struct task_struct *p; + + /* +* This function is called after the cpu is taken down and marked +* offline, so its not like new tasks will ever get this cpu set in +* their mm mask. -- Peter Zijlstra +* Thus, we may use rcu_read_lock() here, instead of grabbing +* full-fledged tasklist_lock. +*/ + rcu_read_lock(); + for_each_process(p) { + struct task_struct *t; + + t = find_lock_task_mm(p); + if (!t) + continue; + cpumask_clear_cpu(cpu, mm_cpumask(t-mm)); + task_unlock(t); + } + rcu_read_unlock(); +} + static inline void check_for_tasks(int cpu) { struct task_struct *p; -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 2/9] arm: Use clear_tasks_mm_cpumask()
Checking for process-mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To fix this we would need to use find_lock_task_mm(), which would walk up all threads and returns an appropriate task (with task lock held). clear_tasks_mm_cpumask() has this issue fixed, so let's use it. Suggested-by: Oleg Nesterov o...@redhat.com Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/arm/kernel/smp.c |8 +--- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index addbbe8..e824ee4 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -130,7 +130,6 @@ static void percpu_timer_stop(void); int __cpu_disable(void) { unsigned int cpu = smp_processor_id(); - struct task_struct *p; int ret; ret = platform_cpu_disable(cpu); @@ -160,12 +159,7 @@ int __cpu_disable(void) flush_cache_all(); local_flush_tlb_all(); - read_lock(tasklist_lock); - for_each_process(p) { - if (p-mm) - cpumask_clear_cpu(cpu, mm_cpumask(p-mm)); - } - read_unlock(tasklist_lock); + clear_tasks_mm_cpumask(cpu); return 0; } -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 3/9] powerpc: Use clear_tasks_mm_cpumask()
Current CPU hotplug code has some task-mm handling issues: 1. Working with task-mm w/o getting mm or grabing the task lock is dangerous as -mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist lock is not enough). We can't use get_task_mm()/mmput() pair as mmput() might sleep, so we must take the task lock while handle its mm. 2. Checking for process-mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To fix this we would need to use find_lock_task_mm(), which would walk up all threads and returns an appropriate task (with task lock held). clear_tasks_mm_cpumask() has all the issues fixed, so let's use it. Suggested-by: Oleg Nesterov o...@redhat.com Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/powerpc/mm/mmu_context_nohash.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/mm/mmu_context_nohash.c b/arch/powerpc/mm/mmu_context_nohash.c index 5b63bd3..e779642 100644 --- a/arch/powerpc/mm/mmu_context_nohash.c +++ b/arch/powerpc/mm/mmu_context_nohash.c @@ -333,9 +333,7 @@ static int __cpuinit mmu_context_cpu_notify(struct notifier_block *self, unsigned long action, void *hcpu) { unsigned int cpu = (unsigned int)(long)hcpu; -#ifdef CONFIG_HOTPLUG_CPU - struct task_struct *p; -#endif + /* We don't touch CPU 0 map, it's allocated at aboot and kept * around forever */ @@ -358,12 +356,7 @@ static int __cpuinit mmu_context_cpu_notify(struct notifier_block *self, stale_map[cpu] = NULL; /* We also clear the cpu_vm_mask bits of CPUs going away */ - read_lock(tasklist_lock); - for_each_process(p) { - if (p-mm) - cpumask_clear_cpu(cpu, mm_cpumask(p-mm)); - } - read_unlock(tasklist_lock); + clear_tasks_mm_cpumask(cpu); break; #endif /* CONFIG_HOTPLUG_CPU */ } -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 4/9] sh: Use clear_tasks_mm_cpumask()
Checking for process-mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To fix this we would need to use find_lock_task_mm(), which would walk up all threads and returns an appropriate task (with task lock held). clear_tasks_mm_cpumask() has the issue fixed, so let's use it. Suggested-by: Oleg Nesterov o...@redhat.com Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/sh/kernel/smp.c |7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c index eaebdf6..4664f76 100644 --- a/arch/sh/kernel/smp.c +++ b/arch/sh/kernel/smp.c @@ -123,7 +123,6 @@ void native_play_dead(void) int __cpu_disable(void) { unsigned int cpu = smp_processor_id(); - struct task_struct *p; int ret; ret = mp_ops-cpu_disable(cpu); @@ -153,11 +152,7 @@ int __cpu_disable(void) flush_cache_all(); local_flush_tlb_all(); - read_lock(tasklist_lock); - for_each_process(p) - if (p-mm) - cpumask_clear_cpu(cpu, mm_cpumask(p-mm)); - read_unlock(tasklist_lock); + clear_tasks_mm_cpumask(cpu); return 0; } -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 5/9] blackfin: A couple of task-mm handling fixes
The patch fixes two problems: 1. Working with task-mm w/o getting mm or grabing the task lock is dangerous as -mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist lock is not enough). We can't use get_task_mm()/mmput() pair as mmput() might sleep, so we have to take the task lock while handle its mm. 2. Checking for process-mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held). Suggested-by: Oleg Nesterov o...@redhat.com Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/blackfin/kernel/trace.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/arch/blackfin/kernel/trace.c b/arch/blackfin/kernel/trace.c index 44bbf2f..d08f0e3 100644 --- a/arch/blackfin/kernel/trace.c +++ b/arch/blackfin/kernel/trace.c @@ -10,6 +10,8 @@ #include linux/hardirq.h #include linux/thread_info.h #include linux/mm.h +#include linux/oom.h +#include linux/sched.h #include linux/uaccess.h #include linux/module.h #include linux/kallsyms.h @@ -28,7 +30,6 @@ void decode_address(char *buf, unsigned long address) struct task_struct *p; struct mm_struct *mm; unsigned long flags, offset; - unsigned char in_atomic = (bfin_read_IPEND() 0x10) || in_atomic(); struct rb_node *n; #ifdef CONFIG_KALLSYMS @@ -114,15 +115,15 @@ void decode_address(char *buf, unsigned long address) */ write_lock_irqsave(tasklist_lock, flags); for_each_process(p) { - mm = (in_atomic ? p-mm : get_task_mm(p)); - if (!mm) - continue; + struct task_struct *t; - if (!down_read_trylock(mm-mmap_sem)) { - if (!in_atomic) - mmput(mm); + t = find_lock_task_mm(p); + if (!t) continue; - } + + mm = t-mm; + if (!down_read_trylock(mm-mmap_sem)) + goto __continue; for (n = rb_first(mm-mm_rb); n; n = rb_next(n)) { struct vm_area_struct *vma; @@ -131,7 +132,7 @@ void decode_address(char *buf, unsigned long address) if (address = vma-vm_start address vma-vm_end) { char _tmpbuf[256]; - char *name = p-comm; + char *name = t-comm; struct file *file = vma-vm_file; if (file) { @@ -164,8 +165,7 @@ void decode_address(char *buf, unsigned long address) name, vma-vm_start, vma-vm_end); up_read(mm-mmap_sem); - if (!in_atomic) - mmput(mm); + task_unlock(t); if (buf[0] == '\0') sprintf(buf, [ %s ] dynamic memory, name); @@ -175,8 +175,8 @@ void decode_address(char *buf, unsigned long address) } up_read(mm-mmap_sem); - if (!in_atomic) - mmput(mm); +__continue: + task_unlock(t); } /* -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 6/9] blackfin: Fix possible deadlock in decode_address()
Oleg Nesterov found an interesting deadlock possibility: sysrq_showregs_othercpus() does smp_call_function(showacpu) and showacpu() show_stack()-decode_address(). Now suppose that IPI interrupts the task holding read_lock(tasklist). To fix this, blackfin should not grab the write_ variant of the tasklist lock, read_ one is enough. Suggested-by: Oleg Nesterov o...@redhat.com Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/blackfin/kernel/trace.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/blackfin/kernel/trace.c b/arch/blackfin/kernel/trace.c index d08f0e3..f7f7a18 100644 --- a/arch/blackfin/kernel/trace.c +++ b/arch/blackfin/kernel/trace.c @@ -29,7 +29,7 @@ void decode_address(char *buf, unsigned long address) { struct task_struct *p; struct mm_struct *mm; - unsigned long flags, offset; + unsigned long offset; struct rb_node *n; #ifdef CONFIG_KALLSYMS @@ -113,7 +113,7 @@ void decode_address(char *buf, unsigned long address) * mappings of all our processes and see if we can't be a whee * bit more specific */ - write_lock_irqsave(tasklist_lock, flags); + read_lock(tasklist_lock); for_each_process(p) { struct task_struct *t; @@ -186,7 +186,7 @@ __continue: sprintf(buf, /* kernel dynamic memory */); done: - write_unlock_irqrestore(tasklist_lock, flags); + read_unlock(tasklist_lock); } #define EXPAND_LEN ((1 CONFIG_DEBUG_BFIN_HWTRACE_EXPAND_LEN) * 256 - 1) -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 7/9] um: Should hold tasklist_lock while traversing processes
Traversing the tasks requires holding tasklist_lock, otherwise it is unsafe. p.s. However, I'm not sure that calling os_kill_ptraced_process() in the atomic context is correct. It seem to work, but please take a closer look. Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/um/kernel/reboot.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c index 4d93dff..66d754c 100644 --- a/arch/um/kernel/reboot.c +++ b/arch/um/kernel/reboot.c @@ -4,6 +4,7 @@ */ #include linux/sched.h +#include linux/spinlock.h #include linux/slab.h #include kern_util.h #include os.h @@ -22,6 +23,7 @@ static void kill_off_processes(void) struct task_struct *p; int pid; + read_lock(tasklist_lock); for_each_process(p) { if (p-mm == NULL) continue; @@ -29,6 +31,7 @@ static void kill_off_processes(void) pid = p-mm-context.id.u.pid; os_kill_ptraced_process(pid, 1); } + read_unlock(tasklist_lock); } } -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 8/9] um: Fix possible race on task-mm
Checking for task-mm is dangerous as -mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist lock is not enough). We can't use get_task_mm()/mmput() pair as mmput() might sleep, so let's take the task lock while we care about its mm. Note that we should also use find_lock_task_mm() to check all process' threads for a valid mm, but for uml we'll do it in a separate patch. Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/um/kernel/reboot.c |7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c index 66d754c..1411f4e 100644 --- a/arch/um/kernel/reboot.c +++ b/arch/um/kernel/reboot.c @@ -25,10 +25,13 @@ static void kill_off_processes(void) read_lock(tasklist_lock); for_each_process(p) { - if (p-mm == NULL) + task_lock(p); + if (!p-mm) { + task_unlock(p); continue; - + } pid = p-mm-context.id.u.pid; + task_unlock(p); os_kill_ptraced_process(pid, 1); } read_unlock(tasklist_lock); -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 9/9] um: Properly check all process' threads for a live mm
kill_off_processes() might miss a valid process, this is because checking for process-mm is not enough. Process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held). Suggested-by: Oleg Nesterov o...@redhat.com Signed-off-by: Anton Vorontsov anton.voront...@linaro.org --- arch/um/kernel/reboot.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c index 1411f4e..3d15243 100644 --- a/arch/um/kernel/reboot.c +++ b/arch/um/kernel/reboot.c @@ -6,6 +6,7 @@ #include linux/sched.h #include linux/spinlock.h #include linux/slab.h +#include linux/oom.h #include kern_util.h #include os.h #include skas.h @@ -25,13 +26,13 @@ static void kill_off_processes(void) read_lock(tasklist_lock); for_each_process(p) { - task_lock(p); - if (!p-mm) { - task_unlock(p); + struct task_struct *t; + + t = find_lock_task_mm(p); + if (!t) continue; - } - pid = p-mm-context.id.u.pid; - task_unlock(p); + pid = t-mm-context.id.u.pid; + task_unlock(t); os_kill_ptraced_process(pid, 1); } read_unlock(tasklist_lock); -- 1.7.9.2 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH][2/3][RFC] TDM Framework
-Original Message- From: Aggrwal Poonam-B10812 Sent: Saturday, March 10, 2012 6:27 PM To: linuxppc-dev@lists.ozlabs.org Cc: Singh Sandeep-B37400; Aggrwal Poonam-B10812 Subject: [PATCH][2/3][RFC] TDM Framework Any feedback on this patchset? From: Sandeep Singh sand...@freescale.com TDM Framework is an attempt to provide a platform independent layer which can offer a standard interface for TDM access to different client modules. Beneath, the framework layer can house different types of TDM drivers to handle various TDM devices, the hardware intricacies of the devices being completely taken care by TDM drivers. This framework layer will allow any type of TDM device to hook with it. For example Freescale controller as on MPC8315, UCC based TDM controller, or any other controller. The main functions of this Framework are: - provides interface to TDM clients to access TDM functionalities. - provides standard interface for TDM drivers to hook with the framework. - handles various data handling stuff and buffer management. In future this Framework will be extended to provide Interface for Line control devices also. For example SLIC, E1/T1 Framers etc. Limitations/Future Work --- 1. Presently the framework supports only Single Port channelised mode. 2. Also the configurability options are limited which will be extended later on. 3. Only kernel mode TDM clients are supported currently. Support for User mode clients will be added later. Signed-off-by: Sandeep Singh sand...@freescale.com Signed-off-by: Poonam Aggrwal poonam.aggr...@freescale.com --- A couple of todos' are left in the patch, we are working on it and will be addressed in the updated patch set. drivers/Kconfig |1 + drivers/Makefile|1 + drivers/tdm/Kconfig | 25 + drivers/tdm/tdm-core.c | 1146 +++ include/linux/mod_devicetable.h | 11 + include/linux/tdm.h | 347 6 files changed, 1531 insertions(+), 0 deletions(-) create mode 100644 drivers/tdm/Kconfig create mode 100644 drivers/tdm/tdm-core.c create mode 100644 include/linux/tdm.h diff --git a/drivers/Kconfig b/drivers/Kconfig index ad6c1eb..25f7f5b 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -130,4 +130,5 @@ source drivers/virt/Kconfig source drivers/net/dpa/NetCommSw/Kconfig +source drivers/tdm/Kconfig endmenu diff --git a/drivers/Makefile b/drivers/Makefile index cd546eb..362b5ed 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -102,6 +102,7 @@ obj-$(CONFIG_INFINIBAND) += infiniband/ obj-$(CONFIG_SGI_SN) += sn/ obj-y+= firmware/ obj-$(CONFIG_CRYPTO) += crypto/ +obj-$(CONFIG_TDM)+= tdm/ obj-$(CONFIG_SUPERH) += sh/ obj-$(CONFIG_ARCH_SHMOBILE) += sh/ ifndef CONFIG_ARCH_USES_GETTIMEOFFSET diff --git a/drivers/tdm/Kconfig b/drivers/tdm/Kconfig new file mode 100644 index 000..8db2b05 --- /dev/null +++ b/drivers/tdm/Kconfig @@ -0,0 +1,25 @@ +# +# TDM subsystem configuration +# + +menuconfig TDM + tristate TDM support + ---help--- + More information is contained in the directory file:Documentation/tdm/, + especially in the file called summary there. + If you want TDM support, you should say Y here and also to the + specific driver for your bus adapter(s) below. + + This TDM support can also be built as a module. If so, the module + will be called tdm-core. + +if TDM + +config TDM_DEBUG_CORE + bool TDM Core debugging messages + help + Say Y here if you want the TDM core to produce a bunch of debug + messages to the system log. Select this if you are having a + problem with TDM support and want to see more of what is going on. + +endif # TDM diff --git a/drivers/tdm/tdm-core.c b/drivers/tdm/tdm-core.c new file mode 100644 index 000..cdda260 --- /dev/null +++ b/drivers/tdm/tdm-core.c @@ -0,0 +1,1146 @@ +/* driver/tdm/tdm-core.c + * + * Copyright (C) 2012 Freescale Semiconductor, Inc, All rights reserved. + * + * TDM core is the interface between TDM clients and TDM devices. + * It is also intended to serve as an interface for line controld + * devices later on. + * + * Author:Hemant Agrawal hem...@freescale.com + * Rajesh Gumasta rajesh.guma...@freescale.com + * + * Modified by Sandeep Kr Singh sand...@freescale.com + * Poonam Aggarwal poonam.aggar...@freescale.com + * 1. Added framework based initialization of device. + * 2. All the init/run time configuration is now done by framework. + * 3. Added channel level operations. + * + * This program is free software; you can redistribute it and/or +modify it + * under the terms of the GNU General Public License as published by +the + * Free Software Foundation; either
Re: [PATCH 5/9] dmaengine: provide a common function for completing a dma descriptor
On Mon, Apr 23, 2012 at 06:40:06PM +0900, Boojin Kim wrote: I met a problem on DMA cyclic mode (DMA_CYCLIC) for sound playback. Kernel BUG occurs during DMA transfer with DMA cyclic mode. This patch makes the cookies into zero. But, cookies should be kept during cyclic mode because cyclic mode re-uses the cookies. The protection is there to prevent cookies being accidentally re-used. If you're running a cyclic transfer, even then you shouldn't be completing the same cookie time and time again - I think Vinod also concurs with this. I think our preference is for cyclic transfers to entire remain uncompleted, or to get a new cookie each time they allegedly complete. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: PowerPC radeon KMS - is it possible?
On Fre, 2012-04-20 at 18:14 +0200, Gerhard Pircher wrote: Von: Michel Dänzer mic...@daenzer.net On Fre, 2012-04-20 at 13:15 +0200, Gerhard Pircher wrote: Von: Michel Dänzer mic...@daenzer.net On Don, 2012-04-19 at 13:48 +0200, Gerhard Pircher wrote: The former case is an explanation, why I see data corruption with my AGPGART driver (more or less a copy of the uninorth driver) on my non-coherent platform. There are no cache flushes done for writes to already mapped pages. As I said, the radeon driver always maps AGP memory uncacheable for the CPU, so no such CPU cache flushes should be necessary. I know. We also discussed this topic over two years ago. :-) What I didn't understand yet is how this uncacheable memory is allocated (well, I never took the time to look at this again). The functions in ttm_page_alloc.c seem to allocate normal cacheable memory and try to set the page flags with set_pages_array_uc(), which is more or less a no-op on powerpc. ttm_page_alloc_dma.c on the other side is only used with SWIOTLB!? [...] Could it be that the memory is finally mapped uncacheable by radeon_bo_kmap()/ ttm_bo_kmap()/..some other TTM functions../vmap()? Yeah, AFAICT, basically ttm_io_prot() defines the mapping attributes, and vmap() is used to enforce them for kernel mappings. Okay, that sounds like the approach used by arch/powerpc/mm/dma- noncoherent.c in my (green) ears. What about the PCIGART mode? Is the driver free to use cached memory in this mode? Yes, it assumes PCI(e) GART to be CPU cache coherent. [5.878809] [drm:radeon_test_moves] *ERROR* Incorrect VRAM-GTT copy 0: Got 0xf1416ec0, expected 0xf1570ec0 (VRAM map 0xf148-0xf158) [...] The VRAM-GTT copy totally puzzles me, as it returns a wrong page address, but the offset is fine!? Maybe it's still the values from the GTT-VRAM test, i.e. either the GPU writes didn't make it to the memory mapped into the AGP GART (some AGP bridges are known to have issues with that) or the CPU doesn't see it. What is the workaround for such an AGP bridge? If there is one at all... The only workaround short of not using AGP would be not doing GPU-AGP transfers, but that's not implemented or even planned at all. BTW, does your driver set cant_use_aperture, or is the linear aperture accessible by the CPU? The driver sets cant_use_aperture. I couldn't get it working at all without it. Does the hardware really not allow the CPU to access the linear aperture though? Because if it does, I wonder if that might be more reliable. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Debian, X and DRI developer ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 5/9] dmaengine: provide a common function for completing a dma descriptor
On Mon, 2012-04-23 at 10:50 +0100, Russell King - ARM Linux wrote: On Mon, Apr 23, 2012 at 06:40:06PM +0900, Boojin Kim wrote: I met a problem on DMA cyclic mode (DMA_CYCLIC) for sound playback. Kernel BUG occurs during DMA transfer with DMA cyclic mode. This patch makes the cookies into zero. But, cookies should be kept during cyclic mode because cyclic mode re-uses the cookies. The protection is there to prevent cookies being accidentally re-used. If you're running a cyclic transfer, even then you shouldn't be completing the same cookie time and time again - I think Vinod also concurs with this. Right :) I recently committed patch for imx-dma which doesn't mark the cyclic descriptor as complete. Descriptor represents a transaction and makes no sense to complete t if the transaction is still continuing. I think our preference is for cyclic transfers to entire remain uncompleted, or to get a new cookie each time they allegedly complete. No it is not complete. Cyclic never completes, it aborts when user wants. The notification interrupt is for updating the counters/notifying (timestamp/periods elapsed in sound), and shouldn't be used for anything else -- ~Vinod ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH 5/9] dmaengine: provide a common function for completing a dma descriptor
Russell King - ARM Linux wrote: Sent: Wednesday, March 07, 2012 7:35 AM To: Dan Williams; Vinod Koul Cc: Stephen Warren; Linus Walleij; Srinidhi Kasagar; Li Yang; linuxppc-dev@lists.ozlabs.org; linux-arm-ker...@lists.infradead.org Subject: [PATCH 5/9] dmaengine: provide a common function for completing a dma descriptor Provide a common function to do the cookie mechanics for completing a DMA descriptor. Dear Russell, I met a problem on DMA cyclic mode (DMA_CYCLIC) for sound playback. Kernel BUG occurs during DMA transfer with DMA cyclic mode. This patch makes the cookies into zero. But, cookies should be kept during cyclic mode because cyclic mode re-uses the cookies. So, Error occurs on BUG_ON(tx-cookie DMA_MIN_COOKIE) lines on dma_cookie_complete(). Please see following error. [ cut here ] kernel BUG at drivers/dma/dmaengine.h:53! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0Tainted: GW (3.4.0-rc2-00596-gc7d7a63 #5) PC is at pl330_tasklet+0x58c/0x59c LR is at _raw_spin_lock_irqsave+0x10/0x14 pc : [c0238a84]lr : [c052a3f4]psr: 68000193 sp : c06efde0 ip : fp : c06efe4c r10: r9 : r8 : c06efe18 r7 : d881b414 r6 : d881b410 r5 : d881b450 r4 : 0003 r3 : d8814bcc r2 : d8814b60 r1 : r0 : d8814b60 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel Control: 10c5387d Table: 57eb806a DAC: 0015 I think the completing a dma descriptor without setting zero to cookies is required for cyclic mode. Do I make new macro or modify dma_cookie_complete() for it? Thanks Boojin. Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk --- drivers/dma/amba-pl08x.c|2 +- drivers/dma/at_hdmac.c |2 +- drivers/dma/coh901318.c |2 +- drivers/dma/dmaengine.h | 18 ++ drivers/dma/dw_dmac.c |2 +- drivers/dma/ep93xx_dma.c|2 +- drivers/dma/fsldma.c|2 +- drivers/dma/imx-dma.c |2 +- drivers/dma/imx-sdma.c |2 +- drivers/dma/intel_mid_dma.c |2 +- drivers/dma/ioat/dma.c |3 +-- drivers/dma/ioat/dma_v2.c |3 +-- drivers/dma/ioat/dma_v3.c |3 +-- drivers/dma/ipu/ipu_idmac.c |2 +- drivers/dma/mxs-dma.c |2 +- drivers/dma/pl330.c |2 +- drivers/dma/ste_dma40.c |2 +- drivers/dma/timb_dma.c |2 +- drivers/dma/txx9dmac.c |2 +- 19 files changed, 36 insertions(+), 21 deletions(-) diff --git a/drivers/dma/amba-pl08x.c b/drivers/dma/amba-pl08x.c index 5996386..f0888c1 100644 --- a/drivers/dma/amba-pl08x.c +++ b/drivers/dma/amba-pl08x.c @@ -1540,7 +1540,7 @@ static void pl08x_tasklet(unsigned long data) if (txd) { /* Update last completed */ - plchan-chan.completed_cookie = txd-tx.cookie; + dma_cookie_complete(txd-tx); } /* If a new descriptor is queued, set it up plchan-at is NULL here */ diff --git a/drivers/dma/at_hdmac.c b/drivers/dma/at_hdmac.c index df47e7d..b282630 100644 --- a/drivers/dma/at_hdmac.c +++ b/drivers/dma/at_hdmac.c @@ -249,7 +249,7 @@ atc_chain_complete(struct at_dma_chan *atchan, struct at_desc *desc) dev_vdbg(chan2dev(atchan-chan_common), descriptor %u complete\n, txd-cookie); - atchan-chan_common.completed_cookie = txd-cookie; + dma_cookie_complete(txd); /* move children to free_list */ list_splice_init(desc-tx_list, atchan-free_list); diff --git a/drivers/dma/coh901318.c b/drivers/dma/coh901318.c index 843a1a3..24837d7 100644 --- a/drivers/dma/coh901318.c +++ b/drivers/dma/coh901318.c @@ -691,7 +691,7 @@ static void dma_tasklet(unsigned long data) callback_param = cohd_fin-desc.callback_param; /* sign this job as completed on the channel */ - cohc-chan.completed_cookie = cohd_fin-desc.cookie; + dma_cookie_complete(cohd_fin-desc); /* release the lli allocation and remove the descriptor */ coh901318_lli_free(cohc-base-pool, cohd_fin-lli); diff --git a/drivers/dma/dmaengine.h b/drivers/dma/dmaengine.h index 7692c86..47e0997 100644 --- a/drivers/dma/dmaengine.h +++ b/drivers/dma/dmaengine.h @@ -5,6 +5,7 @@ #ifndef DMAENGINE_H #define DMAENGINE_H +#include linux/bug.h #include linux/dmaengine.h /** @@ -27,4 +28,21 @@ static inline dma_cookie_t dma_cookie_assign(struct dma_async_tx_descriptor *tx) return cookie; } +/** + * dma_cookie_complete - complete a descriptor + * @tx: descriptor to complete + * + * Mark this descriptor complete by updating the channels completed + * cookie marker. Zero the descriptors cookie to prevent accidental + * repeated completions. + * + * Note: caller is expected to hold a lock to prevent concurrency. + */ +static inline void dma_cookie_complete(struct dma_async_tx_descriptor *tx) +{ + BUG_ON(tx-cookie DMA_MIN_COOKIE);
RE: [PATCH 5/9] dmaengine: provide a common function for completing a dma descriptor
Vinod Koul wrote: Sent: Monday, April 23, 2012 7:01 PM To: Russell King - ARM Linux Cc: 'Stephen Warren'; 'Linus Walleij'; 'Srinidhi Kasagar'; Boojin Kim; 'Dan Williams'; 'Li Yang'; linuxppc-dev@lists.ozlabs.org; linux-arm-ker...@lists.infradead.org Subject: Re: [PATCH 5/9] dmaengine: provide a common function for completing a dma descriptor On Mon, 2012-04-23 at 10:50 +0100, Russell King - ARM Linux wrote: On Mon, Apr 23, 2012 at 06:40:06PM +0900, Boojin Kim wrote: I met a problem on DMA cyclic mode (DMA_CYCLIC) for sound playback. Kernel BUG occurs during DMA transfer with DMA cyclic mode. This patch makes the cookies into zero. But, cookies should be kept during cyclic mode because cyclic mode re-uses the cookies. The protection is there to prevent cookies being accidentally re-used. If you're running a cyclic transfer, even then you shouldn't be completing the same cookie time and time again - I think Vinod also concurs with this. Right :) I recently committed patch for imx-dma which doesn't mark the cyclic descriptor as complete. Descriptor represents a transaction and makes no sense to complete t if the transaction is still continuing. Dear Vinod, you already fixed it. :) thanks. And I have other question. (Actually, It doesn't relate to this patch.) I met the DMA probing fail problem on Linux 3.4. It's because the return value on regulator_get() is changed from ENODEV to EPROBE_DEFER in case not to supply a vcore regulator. So, I try to change the check value about the return value of regulator_get() in amba_get_enable_vcore()from ENODEV to EPROBE_DEFER. How about it ? Do you already fix it too? Thanks, Boojin I think our preference is for cyclic transfers to entire remain uncompleted, or to get a new cookie each time they allegedly complete. No it is not complete. Cyclic never completes, it aborts when user wants. The notification interrupt is for updating the counters/notifying (timestamp/periods elapsed in sound), and shouldn't be used for anything else -- ~Vinod ___ linux-arm-kernel mailing list linux-arm-ker...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 5/9] dmaengine: provide a common function for completing a dma descriptor
On Mon, Apr 23, 2012 at 08:06:36PM +0900, Boojin Kim wrote: I met the DMA probing fail problem on Linux 3.4. It's because the return value on regulator_get() is changed from ENODEV to EPROBE_DEFER in case not to supply a vcore regulator. So, I try to change the check value about the return value of regulator_get() in amba_get_enable_vcore()from ENODEV to EPROBE_DEFER. How about it ? Do you already fix it too? A fix has already been committed and sent upstream - about a week ago. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 7/9] um: Should hold tasklist_lock while traversing processes
On 23.04.2012 09:09, Anton Vorontsov wrote: Traversing the tasks requires holding tasklist_lock, otherwise it is unsafe. p.s. However, I'm not sure that calling os_kill_ptraced_process() in the atomic context is correct. It seem to work, but please take a closer look. Signed-off-by: Anton Vorontsovanton.voront...@linaro.org --- You forgot my Ack and I've already explained why os_kill_ptraced_process() is fine. Thanks, //richard ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 7/9] um: Should hold tasklist_lock while traversing processes
On Mon, Apr 23, 2012 at 04:57:54PM +0200, Richard Weinberger wrote: On 23.04.2012 09:09, Anton Vorontsov wrote: Traversing the tasks requires holding tasklist_lock, otherwise it is unsafe. p.s. However, I'm not sure that calling os_kill_ptraced_process() in the atomic context is correct. It seem to work, but please take a closer look. Signed-off-by: Anton Vorontsovanton.voront...@linaro.org --- You forgot my Ack and I've already explained why os_kill_ptraced_process() is fine. Ouch, sorry! -- Anton Vorontsov Email: cbouatmai...@gmail.com ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: pci node question
-Original Message- From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org] Sent: Friday, April 20, 2012 4:11 PM To: Kumar Gala Cc: Yoder Stuart-B08248; linuxppc-dev@lists.ozlabs.org Subject: Re: pci node question On Fri, 2012-04-20 at 13:53 -0500, Kumar Gala wrote: On Apr 20, 2012, at 1:37 PM, Yoder Stuart-B08248 wrote: There was refactoring change a while back that moved the interrupt map down into the virtual pci bridge. example: 42 /* controller at 0x20 */ 43 pci0 { 44 compatible = fsl,p2041-pcie, fsl,qoriq-pcie-v2.2; 45 device_type = pci; 46 #size-cells = 2; 47 #address-cells = 3; 48 bus-range = 0x0 0xff; 49 clock-frequency = ; 50 interrupts = 16 2 1 15; 51 pcie@0 { 52 reg = 0 0 0 0 0; 53 #interrupt-cells = 1; 54 #size-cells = 2; 55 #address-cells = 3; 56 device_type = pci; 57 interrupts = 16 2 1 15; 58 interrupt-map-mask = 0xf800 0 0 7; 59 interrupt-map = 60 /* IDSEL 0x0 */ 61 0 0 1 mpic 40 1 0 0 62 0 0 2 mpic 1 1 0 0 63 0 0 3 mpic 2 1 0 0 64 0 0 4 mpic 3 1 0 0 65 ; 66 }; 67 }; Why was the interrupt-map moved here? Its been a while, but I think i moved it down because of which node is used for interrupt handling in linux. Yeah, it would swizzle if going accross the virtual P2P bridge. On the other hand, if it's PCIe, the children of the vP2P should always be at device 0 and thus the swizzling should be a NOP no ? So we -could- leave it in the PHB unless I'm mistaken... On the other hand, we probably want to fix the mapping code to handle things like SR-IOV PFs with different device numbers... do those get swizzled ? In that case we might want to keep things down the virtual bridge. Do we really need the error interrupt specified twice? I put it twice because it has multiple purposes, one has to do with interrupts defined by the PCI spec vs ones defined via FSL controller. Why is there a zeroed out reg property: reg = 0 0 0 0 0 ?? scratching my head, what happens if you remove it? If you remove it, the kernel will fail to match the vP2P with the corresponding struct pci_dev It's not zeroed out. It contains a valid bus/dev/fn ... of 0,0,0 :-) Hmmm... I vaguely understand what you are saying. But why is the length zero, that is what doesn't make sense on the surface? We really should put a comment on that reg property-- can you suggest a short comment that would capture what the reg with all zeros is for? ...I'll submit a patch. Thanks, Stuart ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: PowerPC radeon KMS - is it possible?
Original-Nachricht Datum: Mon, 23 Apr 2012 11:56:06 +0200 Von: Michel Dänzer mic...@daenzer.net An: Gerhard Pircher gerhard_pirc...@gmx.net CC: linuxppc-dev@lists.ozlabs.org Betreff: Re: PowerPC radeon KMS - is it possible? On Fre, 2012-04-20 at 18:14 +0200, Gerhard Pircher wrote: Von: Michel Dänzer mic...@daenzer.net On Fre, 2012-04-20 at 13:15 +0200, Gerhard Pircher wrote: Von: Michel Dänzer mic...@daenzer.net On Don, 2012-04-19 at 13:48 +0200, Gerhard Pircher wrote: The former case is an explanation, why I see data corruption with my AGPGART driver (more or less a copy of the uninorth driver) on my non-coherent platform. There are no cache flushes done for writes to already mapped pages. As I said, the radeon driver always maps AGP memory uncacheable for the CPU, so no such CPU cache flushes should be necessary. I know. We also discussed this topic over two years ago. :-) What I didn't understand yet is how this uncacheable memory is allocated (well, I never took the time to look at this again). The functions in ttm_page_alloc.c seem to allocate normal cacheable memory and try to set the page flags with set_pages_array_uc(), which is more or less a no-op on powerpc. ttm_page_alloc_dma.c on the other side is only used with SWIOTLB!? [...] Could it be that the memory is finally mapped uncacheable by radeon_bo_kmap()/ ttm_bo_kmap()/..some other TTM functions../vmap()? Yeah, AFAICT, basically ttm_io_prot() defines the mapping attributes, and vmap() is used to enforce them for kernel mappings. Okay, that sounds like the approach used by arch/powerpc/mm/dma- noncoherent.c in my (green) ears. What about the PCIGART mode? Is the driver free to use cached memory in this mode? Yes, it assumes PCI(e) GART to be CPU cache coherent. Okay. I guess it should be possible to modify it so that it makes use of uncacheable memory - just for testing!? PCIGART was working somehow on my platform up to the ~2.6.39 kernel, i.e. I could login to GNOME and open a program until the machine locked-up. :-) [5.878809] [drm:radeon_test_moves] *ERROR* Incorrect VRAM-GTT copy 0: Got 0xf1416ec0, expected 0xf1570ec0 (VRAM map 0xf148-0xf158) [...] The VRAM-GTT copy totally puzzles me, as it returns a wrong page address, but the offset is fine!? Maybe it's still the values from the GTT-VRAM test, i.e. either the GPU writes didn't make it to the memory mapped into the AGP That's indeed the case. I changed the code so that gtt_map is reinitialized with some magic value before the VRAM-GTT copy and the debug output shows that the GPU writes don't make it to the memory - or they are going to the wrong memory location, but that's harder to track. GART (some AGP bridges are known to have issues with that) or the CPU doesn't see it. What is the workaround for such an AGP bridge? If there is one at all... The only workaround short of not using AGP would be not doing GPU-AGP transfers, but that's not implemented or even planned at all. Okay. BTW, does your driver set cant_use_aperture, or is the linear aperture accessible by the CPU? The driver sets cant_use_aperture. I couldn't get it working at all without it. Does the hardware really not allow the CPU to access the linear aperture though? Because if it does, I wonder if that might be more reliable. I'm afraid no, but I could try it out again. BTW: I see that the uninorth driver defines needs_scratch_page. What is this actually good for? Thanks! Gerhard -- Empfehlen Sie GMX DSL Ihren Freunden und Bekannten und wir belohnen Sie mit bis zu 50,- Euro! https://freundschaftswerbung.gmx.de ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] powerpc/mpic: Fix confusion between hw_irq and virq
On Fri, 20 Apr 2012 13:29:34 +1000, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: mpic_is_ipi() takes a virq and immediately converts it to a hw_irq. However, one of the two call sites calls it with a ... hw_irq. The other call site also happens to have the hw_irq at hand, so let's change it to just take that as an argument. Also change mpic_is_tm() for consistency. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org Looks good to me. Are you pushing these up to Linus for v3.4? g. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: pci node question
On Mon, 2012-04-23 at 15:48 +, Yoder Stuart-B08248 wrote: Hmmm... I vaguely understand what you are saying. But why is the length zero, that is what doesn't make sense on the surface? Ah indeed, on would think that we could represent the real size of the config space there... but I've always seen this as 0 in real OF implementations as well, at least the G5 device-tree I'm looking at now has it that way. We really should put a comment on that reg property-- can you suggest a short comment that would capture what the reg with all zeros is for? ...I'll submit a patch. Hrm.. config address of the device: 0,0,0 ? Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] powerpc/mpic: Fix confusion between hw_irq and virq
On Mon, 2012-04-23 at 11:32 -0600, Grant Likely wrote: On Fri, 20 Apr 2012 13:29:34 +1000, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: mpic_is_ipi() takes a virq and immediately converts it to a hw_irq. However, one of the two call sites calls it with a ... hw_irq. The other call site also happens to have the hw_irq at hand, so let's change it to just take that as an argument. Also change mpic_is_tm() for consistency. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org Looks good to me. Are you pushing these up to Linus for v3.4? Already did :-) Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: pci node question
-Original Message- From: Benjamin Herrenschmidt [mailto:b...@kernel.crashing.org] Sent: Monday, April 23, 2012 3:22 PM To: Yoder Stuart-B08248 Cc: Kumar Gala; linuxppc-dev@lists.ozlabs.org Subject: RE: pci node question On Mon, 2012-04-23 at 15:48 +, Yoder Stuart-B08248 wrote: Hmmm... I vaguely understand what you are saying. But why is the length zero, that is what doesn't make sense on the surface? Ah indeed, on would think that we could represent the real size of the config space there... but I've always seen this as 0 in real OF implementations as well, at least the G5 device-tree I'm looking at now has it that way. We really should put a comment on that reg property-- can you suggest a short comment that would capture what the reg with all zeros is for? ...I'll submit a patch. Hrm.. config address of the device: 0,0,0 ? How about: /* config space address of the device 0,0,0. OF convention is that the size of config space is 0. */ ...is that accurate? Stuart ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: pci node question
On Mon, 2012-04-23 at 20:32 +, Yoder Stuart-B08248 wrote: /* config space address of the device 0,0,0. OF convention is that the size of config space is 0. */ ...is that accurate? Dunno about OF convention here, but it's what I've seen done. In any case, anything with a PCI device in the .dts will have that, I don't know whether that's worth putting such a large comment in every .dts around... Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Get rid of PowerPC users of NR_IRQS
Hi Ben, Here's the 2nd version of the NR_IRQS cleanup patches for PowerPC. I've addressed your comments and compile tested them, but I've not tried to boot. If they check out for you then they should probably go into v3.4. g. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v2 1/2] powerpc/8xx: Fix NR_IRQ bugs and refactor 8xx interrupt controller
The mpc8xx driver uses a reference to NR_IRQS that is buggy. It uses NR_IRQs for the array size of the ppc_cached_irq_mask bitmap, but NR_IRQs could be smaller than the number of hardware irqs that ppc_cached_irq_mask tracks. Also, while fixing that problem, it became apparent that the interrupt controller only supports 32 interrupt numbers, but it is written as if it supports multiple register banks which is more complicated. This patch pulls out the buggy reference to NR_IRQs and fixes the size of the ppc_cached_irq_mask to match the number of HW irqs. It also drops the now-unnecessary code since ppc_cached_irq_mask is no longer an array. v2: - Rename ppc_cached_irq_mask to mpc8xx_cached_irq_mask - Drop duplicate forward declaration Signed-off-by: Grant Likely grant.lik...@secretlab.ca Cc: Benjamin Herrenschmidt b...@kernel.crashing.org --- arch/powerpc/sysdev/mpc8xx_pic.c | 61 +- 1 file changed, 20 insertions(+), 41 deletions(-) diff --git a/arch/powerpc/sysdev/mpc8xx_pic.c b/arch/powerpc/sysdev/mpc8xx_pic.c index d5f5416..b724622 100644 --- a/arch/powerpc/sysdev/mpc8xx_pic.c +++ b/arch/powerpc/sysdev/mpc8xx_pic.c @@ -18,69 +18,45 @@ extern int cpm_get_irq(struct pt_regs *regs); static struct irq_domain *mpc8xx_pic_host; -#define NR_MASK_WORDS ((NR_IRQS + 31) / 32) -static unsigned long ppc_cached_irq_mask[NR_MASK_WORDS]; +static unsigned long mpc8xx_cached_irq_mask; static sysconf8xx_t __iomem *siu_reg; -int cpm_get_irq(struct pt_regs *regs); +static inline unsigned long mpc8xx_irqd_to_bit(struct irq_data *d) +{ + return 0x8000 irqd_to_hwirq(d); +} static void mpc8xx_unmask_irq(struct irq_data *d) { - int bit, word; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - word = irq_nr 5; - - ppc_cached_irq_mask[word] |= (1 (31-bit)); - out_be32(siu_reg-sc_simask, ppc_cached_irq_mask[word]); + mpc8xx_cached_irq_mask |= mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_simask, mpc8xx_cached_irq_mask); } static void mpc8xx_mask_irq(struct irq_data *d) { - int bit, word; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - word = irq_nr 5; - - ppc_cached_irq_mask[word] = ~(1 (31-bit)); - out_be32(siu_reg-sc_simask, ppc_cached_irq_mask[word]); + mpc8xx_cached_irq_mask = ~mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_simask, mpc8xx_cached_irq_mask); } static void mpc8xx_ack(struct irq_data *d) { - int bit; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - out_be32(siu_reg-sc_sipend, 1 (31-bit)); + out_be32(siu_reg-sc_sipend, mpc8xx_irqd_to_bit(d)); } static void mpc8xx_end_irq(struct irq_data *d) { - int bit, word; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - word = irq_nr 5; - - ppc_cached_irq_mask[word] |= (1 (31-bit)); - out_be32(siu_reg-sc_simask, ppc_cached_irq_mask[word]); + mpc8xx_cached_irq_mask |= mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_simask, mpc8xx_cached_irq_mask); } static int mpc8xx_set_irq_type(struct irq_data *d, unsigned int flow_type) { - if (flow_type IRQ_TYPE_EDGE_FALLING) { - irq_hw_number_t hw = (unsigned int)irqd_to_hwirq(d); + /* only external IRQ senses are programmable */ + if ((flow_type IRQ_TYPE_EDGE_FALLING) !(irqd_to_hwirq(d) 1)) { unsigned int siel = in_be32(siu_reg-sc_siel); - - /* only external IRQ senses are programmable */ - if ((hw 1) == 0) { - siel |= (0x8000 hw); - out_be32(siu_reg-sc_siel, siel); - __irq_set_handler_locked(d-irq, handle_edge_irq); - } + siel |= mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_siel, siel); + __irq_set_handler_locked(d-irq, handle_edge_irq); } return 0; } @@ -132,6 +108,9 @@ static int mpc8xx_pic_host_xlate(struct irq_domain *h, struct device_node *ct, IRQ_TYPE_EDGE_FALLING, }; + if (intspec[0] 0x1f) + return 0; + *out_hwirq = intspec[0]; if (intsize 1 intspec[1] 4) *out_flags = map_pic_senses[intspec[1]]; -- 1.7.9.5 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH v2 2/2] irqdomain/powerpc: Fix broken NR_IRQ references
The switch from using irq_map to irq_alloc_desc*() for managing irq number allocations introduced new bugs in some of the powerpc interrupt code. Several functions rely on the value of NR_IRQS to determine the maximum irq number that could get allocated. However, with sparse_irq and using irq_alloc_desc*() the maximum possible irq number is now specified with 'nr_irqs' which may be a number larger than NR_IRQS. This has caused breakage on powermac when CONFIG_NR_IRQS is set to 32. This patch removes most of the direct references to NR_IRQS in the powerpc code and replaces them with either a nr_irqs reference or by using the common for_each_irq_desc() macro. The powerpc-specific for_each_irq() macro is removed at the same time. Also, the Cell axon_msi driver is refactored to remove the global build assumption on the size of NR_IRQS and instead add a limit to the maximum irq number when calling irq_domain_add_nomap(). v2: Reinstate comment on 65536 irq limit deleted in v1. Signed-off-by: Grant Likely grant.lik...@secretlab.ca Cc: Benjamin Herrenschmidt b...@kernel.crashing.org --- arch/powerpc/include/asm/irq.h |4 arch/powerpc/kernel/irq.c|6 +- arch/powerpc/kernel/machine_kexec.c |7 ++- arch/powerpc/platforms/cell/axon_msi.c |8 +++- arch/powerpc/platforms/cell/beat_interrupt.c |2 +- arch/powerpc/platforms/powermac/pic.c|6 +++--- arch/powerpc/sysdev/cpm2_pic.c |3 +-- arch/powerpc/sysdev/xics/xics-common.c |7 +++ 8 files changed, 14 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h index e648af9..0e40843 100644 --- a/arch/powerpc/include/asm/irq.h +++ b/arch/powerpc/include/asm/irq.h @@ -18,10 +18,6 @@ #include linux/atomic.h -/* Define a way to iterate across irqs. */ -#define for_each_irq(i) \ - for ((i) = 0; (i) NR_IRQS; ++(i)) - extern atomic_t ppc_n_lost_interrupts; /* This number is used when no interrupt has been assigned */ diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index 5ec1b23..43eb74f 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -330,14 +330,10 @@ void migrate_irqs(void) alloc_cpumask_var(mask, GFP_KERNEL); - for_each_irq(irq) { + for_each_irq_desc(irq, desc) { struct irq_data *data; struct irq_chip *chip; - desc = irq_to_desc(irq); - if (!desc) - continue; - data = irq_desc_get_irq_data(desc); if (irqd_is_per_cpu(data)) continue; diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c index c957b12..5df 100644 --- a/arch/powerpc/kernel/machine_kexec.c +++ b/arch/powerpc/kernel/machine_kexec.c @@ -23,14 +23,11 @@ void machine_kexec_mask_interrupts(void) { unsigned int i; + struct irq_desc *desc; - for_each_irq(i) { - struct irq_desc *desc = irq_to_desc(i); + for_each_irq_desc(i, desc) { struct irq_chip *chip; - if (!desc) - continue; - chip = irq_desc_get_chip(desc); if (!chip) continue; diff --git a/arch/powerpc/platforms/cell/axon_msi.c b/arch/powerpc/platforms/cell/axon_msi.c index d09f3e8..85825b5 100644 --- a/arch/powerpc/platforms/cell/axon_msi.c +++ b/arch/powerpc/platforms/cell/axon_msi.c @@ -114,7 +114,7 @@ static void axon_msi_cascade(unsigned int irq, struct irq_desc *desc) pr_devel(axon_msi: woff %x roff %x msi %x\n, write_offset, msic-read_offset, msi); - if (msi NR_IRQS irq_get_chip_data(msi) == msic) { + if (msi nr_irqs irq_get_chip_data(msi) == msic) { generic_handle_irq(msi); msic-fifo_virt[idx] = cpu_to_le32(0x); } else { @@ -276,9 +276,6 @@ static int axon_msi_setup_msi_irqs(struct pci_dev *dev, int nvec, int type) if (rc) return rc; - /* We rely on being able to stash a virq in a u16 */ - BUILD_BUG_ON(NR_IRQS 65536); - list_for_each_entry(entry, dev-msi_list, list) { virq = irq_create_direct_mapping(msic-irq_domain); if (virq == NO_IRQ) { @@ -392,7 +389,8 @@ static int axon_msi_probe(struct platform_device *device) } memset(msic-fifo_virt, 0xff, MSIC_FIFO_SIZE_BYTES); - msic-irq_domain = irq_domain_add_nomap(dn, 0, msic_host_ops, msic); + /* We rely on being able to stash a virq in a u16, so limit irqs to 65536 */ + msic-irq_domain = irq_domain_add_nomap(dn, 65536, msic_host_ops, msic); if (!msic-irq_domain) { printk(KERN_ERR axon_msi: couldn't allocate irq_domain for
[PATCH v2 0/2] Get rid of PowerPC users of NR_IRQS
Oops. Flubbed the subject on the below email. This is the cover letter for the NR_IRQS patches, so the subject should have included [PATCH v2 0/2] On Mon, Apr 23, 2012 at 4:30 PM, Grant Likely grant.lik...@secretlab.ca wrote: Hi Ben, Here's the 2nd version of the NR_IRQS cleanup patches for PowerPC. I've addressed your comments and compile tested them, but I've not tried to boot. If they check out for you then they should probably go into v3.4. g. -- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH][2/3][RFC] TDM Framework
On 03/10/2012 06:57 AM, Poonam Aggrwal wrote: diff --git a/drivers/Kconfig b/drivers/Kconfig index ad6c1eb..25f7f5b 100644 --- a/drivers/Kconfig +++ b/drivers/Kconfig @@ -130,4 +130,5 @@ source drivers/virt/Kconfig source drivers/net/dpa/NetCommSw/Kconfig +source drivers/tdm/Kconfig endmenu When posting patches to this list, please ensure that they are based on an upstream Linux kernel and not a Freescale BSP kernel. +if TDM + +config TDM_DEBUG_CORE + bool TDM Core debugging messages + help + Say Y here if you want the TDM core to produce a bunch of debug + messages to the system log. Select this if you are having a + problem with TDM support and want to see more of what is going on. + +endif # TDM Please use the normal kernel mechanisms for controlling debug messages. diff --git a/drivers/tdm/tdm-core.c b/drivers/tdm/tdm-core.c new file mode 100644 index 000..cdda260 --- /dev/null +++ b/drivers/tdm/tdm-core.c @@ -0,0 +1,1146 @@ +/* driver/tdm/tdm-core.c + * + * Copyright (C) 2012 Freescale Semiconductor, Inc, All rights reserved. + * + * TDM core is the interface between TDM clients and TDM devices. + * It is also intended to serve as an interface for line controld + * devices later on. + * + * Author:Hemant Agrawal hem...@freescale.com + * Rajesh Gumasta rajesh.guma...@freescale.com + * + * Modified by Sandeep Kr Singh sand...@freescale.com + * Poonam Aggarwal poonam.aggar...@freescale.com + * 1. Added framework based initialization of device. + * 2. All the init/run time configuration is now done by framework. + * 3. Added channel level operations. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +/* if read write debug required */ +#undef TDM_CORE_DEBUG + +#include linux/module.h +#include linux/kernel.h +#include linux/errno.h +#include linux/slab.h +#include linux/tdm.h +#include linux/init.h +#include linux/idr.h +#include linux/mutex.h +#include linux/completion.h +#include linux/hardirq.h +#include linux/irqflags.h +#include linux/list.h +#include linux/uaccess.h +#include linux/io.h +#include device/tdm_fsl.h What is a reference to tdm_fsl.h doing in the presumably hardware-independent tdm-core.c? +static DEFINE_MUTEX(tdm_core_lock); +static DEFINE_IDR(tdm_adapter_idr); +/* List of TDM adapters registered with TDM framework */ +LIST_HEAD(adapter_list); + +/* List of TDM clients registered with TDM framework */ +LIST_HEAD(driver_list); + +/* In case the previous data is not fetched by the client driver, the + * de-interleaving function will discard the old data and rewrite the + * new data */ +static int use_latest_tdm_data = 1; Describe when one would want to change this, and provide a better way to change it (e.g. module parameter or sysfs knob). /* * Linux kernel * multi-line comment style * is like this. */ +/* this tasklet is created for each adapter instance */ +static void tdm_data_tasklet_fn(unsigned long); + +/* tries to match client driver with the adapter */ +static int tdm_device_match(struct tdm_driver *driver, struct tdm_adapter *adap) +{ + /* match on an id table if there is one */ + if (driver-id_table driver-id_table-name[0]) { + if (!(strcmp(driver-id_table-name, adap-name))) + return (int)driver-id_table; + } + return TDM_E_OK; +} s/TDM_E_OK/0/g +static int tdm_attach_driver_adap(struct tdm_driver *driver, + struct tdm_adapter *adap) +{ + /* if driver is already attached to any other adapter, return*/ + if (driver-adapter (driver-adapter != adap)) + return TDM_E_OK; Why can't one driver service multiple adapters? How would multiple drivers service one adapter? Are there sub-devices that need individual controlling? + driver-adapter = adap; + + if (driver-attach_adapter) { + if (driver-attach_adapter(adap) 0) + /* We ignore the return code; if it fails, too bad */ + pr_err(attach_adapter failed for driver [%s]\n, + driver-name); Why ignore errors? +/* Detach client driver and adapter */ +static int tdm_detach_driver_adap(struct
Re: [PATCH v2 1/2] powerpc/8xx: Fix NR_IRQ bugs and refactor 8xx interrupt controller
On Mon, 2012-04-23 at 16:30 -0600, Grant Likely wrote: The mpc8xx driver uses a reference to NR_IRQS that is buggy. It uses NR_IRQs for the array size of the ppc_cached_irq_mask bitmap, but NR_IRQs could be smaller than the number of hardware irqs that ppc_cached_irq_mask tracks. Joakim, any chance you can test this ASAP ? :-) Thanks ! Cheers, Ben. Also, while fixing that problem, it became apparent that the interrupt controller only supports 32 interrupt numbers, but it is written as if it supports multiple register banks which is more complicated. This patch pulls out the buggy reference to NR_IRQs and fixes the size of the ppc_cached_irq_mask to match the number of HW irqs. It also drops the now-unnecessary code since ppc_cached_irq_mask is no longer an array. v2: - Rename ppc_cached_irq_mask to mpc8xx_cached_irq_mask - Drop duplicate forward declaration Signed-off-by: Grant Likely grant.lik...@secretlab.ca Cc: Benjamin Herrenschmidt b...@kernel.crashing.org --- arch/powerpc/sysdev/mpc8xx_pic.c | 61 +- 1 file changed, 20 insertions(+), 41 deletions(-) diff --git a/arch/powerpc/sysdev/mpc8xx_pic.c b/arch/powerpc/sysdev/mpc8xx_pic.c index d5f5416..b724622 100644 --- a/arch/powerpc/sysdev/mpc8xx_pic.c +++ b/arch/powerpc/sysdev/mpc8xx_pic.c @@ -18,69 +18,45 @@ extern int cpm_get_irq(struct pt_regs *regs); static struct irq_domain *mpc8xx_pic_host; -#define NR_MASK_WORDS ((NR_IRQS + 31) / 32) -static unsigned long ppc_cached_irq_mask[NR_MASK_WORDS]; +static unsigned long mpc8xx_cached_irq_mask; static sysconf8xx_t __iomem *siu_reg; -int cpm_get_irq(struct pt_regs *regs); +static inline unsigned long mpc8xx_irqd_to_bit(struct irq_data *d) +{ + return 0x8000 irqd_to_hwirq(d); +} static void mpc8xx_unmask_irq(struct irq_data *d) { - int bit, word; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - word = irq_nr 5; - - ppc_cached_irq_mask[word] |= (1 (31-bit)); - out_be32(siu_reg-sc_simask, ppc_cached_irq_mask[word]); + mpc8xx_cached_irq_mask |= mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_simask, mpc8xx_cached_irq_mask); } static void mpc8xx_mask_irq(struct irq_data *d) { - int bit, word; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - word = irq_nr 5; - - ppc_cached_irq_mask[word] = ~(1 (31-bit)); - out_be32(siu_reg-sc_simask, ppc_cached_irq_mask[word]); + mpc8xx_cached_irq_mask = ~mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_simask, mpc8xx_cached_irq_mask); } static void mpc8xx_ack(struct irq_data *d) { - int bit; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - out_be32(siu_reg-sc_sipend, 1 (31-bit)); + out_be32(siu_reg-sc_sipend, mpc8xx_irqd_to_bit(d)); } static void mpc8xx_end_irq(struct irq_data *d) { - int bit, word; - unsigned int irq_nr = (unsigned int)irqd_to_hwirq(d); - - bit = irq_nr 0x1f; - word = irq_nr 5; - - ppc_cached_irq_mask[word] |= (1 (31-bit)); - out_be32(siu_reg-sc_simask, ppc_cached_irq_mask[word]); + mpc8xx_cached_irq_mask |= mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_simask, mpc8xx_cached_irq_mask); } static int mpc8xx_set_irq_type(struct irq_data *d, unsigned int flow_type) { - if (flow_type IRQ_TYPE_EDGE_FALLING) { - irq_hw_number_t hw = (unsigned int)irqd_to_hwirq(d); + /* only external IRQ senses are programmable */ + if ((flow_type IRQ_TYPE_EDGE_FALLING) !(irqd_to_hwirq(d) 1)) { unsigned int siel = in_be32(siu_reg-sc_siel); - - /* only external IRQ senses are programmable */ - if ((hw 1) == 0) { - siel |= (0x8000 hw); - out_be32(siu_reg-sc_siel, siel); - __irq_set_handler_locked(d-irq, handle_edge_irq); - } + siel |= mpc8xx_irqd_to_bit(d); + out_be32(siu_reg-sc_siel, siel); + __irq_set_handler_locked(d-irq, handle_edge_irq); } return 0; } @@ -132,6 +108,9 @@ static int mpc8xx_pic_host_xlate(struct irq_domain *h, struct device_node *ct, IRQ_TYPE_EDGE_FALLING, }; + if (intspec[0] 0x1f) + return 0; + *out_hwirq = intspec[0]; if (intsize 1 intspec[1] 4) *out_flags = map_pic_senses[intspec[1]]; ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev