Re: [RFC 1/3] add support for exporting symbols from .S files
Hi Arnd,

On Mon, 11 Aug 2008 16:18:07 +0200 Arnd Bergmann [EMAIL PROTECTED] wrote:
>
> +#ifdef CONFIG_MODULES
> +.macro __EXPORT_SYMBOL sym section symtab strtab
> +	.section \section,a,@progbits
> +	.type \symtab, @object
> +	.ifeq BITS_PER_LONG-32
> +	.align 3
> +\symtab:
> +	.long \sym
> +	.long \strtab
> +	.else
> +	.align 4

This won't be portable across architectures as .align is sometimes in bytes and sometimes a power of two. You can use .balign or .p2align portably on gas, though.

-- 
Cheers,
Stephen Rothwell [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
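To make the difference concrete, here is a minimal C sketch of the two interpretations mentioned in the reply: a byte count (what .balign always takes) and a power-of-two exponent (what .p2align always takes). The helper names balign_up and p2align_up are hypothetical and exist only for this illustration; they are not kernel or gas APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a multiple of `bytes`, as a `.balign bytes` request would. */
static uint64_t balign_up(uint64_t addr, uint64_t bytes)
{
	/* This sketch only handles power-of-two byte counts. */
	assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
	return (addr + bytes - 1) & ~(bytes - 1);
}

/* Round addr up to a 2^log2 boundary, as a `.p2align log2` request would. */
static uint64_t p2align_up(uint64_t addr, unsigned int log2)
{
	return balign_up(addr, (uint64_t)1 << log2);
}
```

Under this model, p2align_up(addr, 3) and balign_up(addr, 8) request the same 8-byte boundary, which is why a bare ".align 3" is ambiguous: it means one or the other depending on the target.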
[PATCH] powerpc: Fix loss of vdso on fork on 32bit
When we fork, init_new_context() improperly resets the vdso_base of the new context to 0. That means that the new process loses access to the vdso for signal trampolines.

The initialization should be unnecessary anyway as the context on a fresh mm should be 0 in the first place and binfmt_elf will initialize that value for a newly loaded process.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]

 arch/powerpc/include/asm/mmu_context.h |    1 -
 1 file changed, 1 deletion(-)

--- linux-work.orig/arch/powerpc/include/asm/mmu_context.h	2008-08-12 17:01:06.0 +1000
+++ linux-work/arch/powerpc/include/asm/mmu_context.h	2008-08-12 17:01:08.0 +1000
@@ -147,7 +147,6 @@ static inline void get_mmu_context(struc
 static inline int init_new_context(struct task_struct *t, struct mm_struct *mm)
 {
 	mm->context.id = NO_CONTEXT;
-	mm->context.vdso_base = 0;
 	return 0;
 }
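The effect described above can be modelled in a few lines of C. This is only a toy sketch: it assumes the child's mm starts out as a copy of the parent's before the context hook runs, and struct mm, toy_fork and the NO_CONTEXT value are stand-ins invented for the illustration, not the kernel's definitions.

```c
#include <assert.h>

#define NO_CONTEXT (~0UL)	/* placeholder value for this sketch */

struct mm_context {
	unsigned long id;
	unsigned long vdso_base;
};

struct mm {
	struct mm_context context;
};

/* Behaviour before the patch: clobbers the inherited vdso_base. */
static int init_new_context_old(struct mm *mm)
{
	mm->context.id = NO_CONTEXT;
	mm->context.vdso_base = 0;
	return 0;
}

/* Behaviour after the patch: only the context id is reset. */
static int init_new_context_new(struct mm *mm)
{
	mm->context.id = NO_CONTEXT;
	return 0;
}

/* Toy fork: copy the parent's mm, then run the context init hook. */
static struct mm toy_fork(const struct mm *parent, int (*init)(struct mm *))
{
	struct mm child = *parent;

	init(&child);
	return child;
}
```

With the old hook the child ends up with vdso_base == 0 even though the parent had a valid mapping; with the new hook the inherited value survives the fork.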
[PATCH 2/5] Build files needed for relocation
My scripts made a wrong commit, so I posted the wrong patches. I am posting patches 3, 4 and 5 again. Sorry for the inconvenience.
[PATCH 2/5] Build files needed for relocation
Build files needed for relocation

This patch builds the vmlinux file with relocation sections and contents so that the relocs user-space program can extract the required relocation offsets. The final relocatable vmlinux kernel is packed as follows: the earlier part of the relocation-apply code, vmlinux, then the rest of the relocation-apply code.

TODO: The relocatable vmlinux image is built in arch/powerpc/boot as vmlinux.reloc, but it should be built in the top-level directory of the kernel source as vmlinux instead of vmlinux.reloc.

Signed-off-by: Mohan Kumar M [EMAIL PROTECTED]
---
 arch/powerpc/Kconfig                |   15 ++--
 arch/powerpc/Makefile               |    9 ++-
 arch/powerpc/boot/Makefile          |   39 --
 arch/powerpc/boot/vmlinux.lds.S     |   28 +
 arch/powerpc/boot/vmlinux.reloc.scr |    8 +++
 5 files changed, 91 insertions(+), 8 deletions(-)
 create mode 100644 arch/powerpc/boot/vmlinux.lds.S
 create mode 100644 arch/powerpc/boot/vmlinux.reloc.scr

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 63c9caf..b992bc1 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -332,6 +332,15 @@ config CRASH_DUMP
 
 	  Don't change this unless you know what you are doing.
 
+config RELOCATABLE_PPC64
+	bool "Build a relocatable kernel (EXPERIMENTAL)"
+	depends on PPC_MULTIPLATFORM && PPC64 && CRASH_DUMP && EXPERIMENTAL
+	help
+	  Build a kernel suitable for use as regular kernel and kdump capture
+	  kernel.
+
+	  Don't change this unless you know what you are doing.
+
 config PHYP_DUMP
 	bool "Hypervisor-assisted dump (EXPERIMENTAL)"
 	depends on PPC_PSERIES && EXPERIMENTAL
@@ -694,7 +703,7 @@ config LOWMEM_SIZE
 	default 0x3000
 
 config RELOCATABLE
-	bool "Build a relocatable kernel (EXPERIMENTAL)"
+	bool "Build relocatable kernel (EXPERIMENTAL)"
 	depends on EXPERIMENTAL && ADVANCED_OPTIONS && FLATMEM && FSL_BOOKE
 	help
 	  This builds a kernel image that is capable of running at the
@@ -814,11 +823,11 @@ config PAGE_OFFSET
 	default 0xc000
 config KERNEL_START
 	hex
-	default 0xc200 if CRASH_DUMP
+	default 0xc200 if CRASH_DUMP && !RELOCATABLE_PPC64
 	default 0xc000
 config PHYSICAL_START
 	hex
-	default 0x0200 if CRASH_DUMP
+	default 0x0200 if CRASH_DUMP && !RELOCATABLE_PPC64
 	default 0x
 endif
 
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 9155c93..1bfdeea 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -63,7 +63,7 @@ override CC += -m$(CONFIG_WORD_SIZE)
 override AR	:= GNUTARGET=elf$(CONFIG_WORD_SIZE)-powerpc $(AR)
 endif
 
-LDFLAGS_vmlinux	:= -Bstatic
+LDFLAGS_vmlinux	:= --emit-relocs
 
 CFLAGS-$(CONFIG_PPC64)	:= -mminimal-toc -mtraceback=none -mcall-aixdesc
 CFLAGS-$(CONFIG_PPC32)	:= -ffixed-r2 -mmultiple
@@ -146,11 +146,16 @@ core-$(CONFIG_KVM)	+= arch/powerpc/kvm/
 drivers-$(CONFIG_OPROFILE)	+= arch/powerpc/oprofile/
 
 # Default to zImage, override when needed
+
+ifneq ($(CONFIG_RELOCATABLE_PPC64),y)
 all: zImage
+else
+all: zImage vmlinux.reloc
+endif
 
 CPPFLAGS_vmlinux.lds	:= -Upowerpc
 
-BOOT_TARGETS = zImage zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
+BOOT_TARGETS = zImage vmlinux.reloc zImage.initrd uImage zImage% dtbImage% treeImage.% cuImage.% simpleImage.%
 
 PHONY += $(BOOT_TARGETS)
 
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 14174aa..a67a701 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -17,7 +17,7 @@
 # CROSS32_COMPILE is setup as a prefix just like CROSS_COMPILE
 # in the toplevel makefile.
 
-all: $(obj)/zImage
+all: $(obj)/zImage $(obj)/vmlinux.reloc
 
 BOOTCFLAGS	:= -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
 		 -fno-strict-aliasing -Os -msoft-float -pipe \
@@ -122,18 +122,51 @@ $(patsubst %.S,%.o, $(filter %.S, $(src-boot))): %.o: %.S FORCE
 $(obj)/wrapper.a: $(obj-wlib) FORCE
 	$(call if_changed,bootar)
 
-hostprogs-y	:= addnote addRamDisk hack-coff mktree dtc
+hostprogs-y	:= addnote addRamDisk hack-coff mktree dtc relocs
 targets		+= $(patsubst $(obj)/%,%,$(obj-boot) wrapper.a)
 extra-y		:= $(obj)/wrapper.a $(obj-plat) $(obj)/empty.o \
 		   $(obj)/zImage.lds $(obj)/zImage.coff.lds $(obj)/zImage.ps3.lds
 
+ifeq ($(CONFIG_RELOCATABLE_PPC64),y)
+extra-y		+= $(obj)/vmlinux.lds
+endif
+
 dtstree		:= $(srctree)/$(src)/dts
 
 wrapper		:=$(srctree)/$(src)/wrapper
-wrapperbits	:= $(extra-y) $(addprefix $(obj)/,addnote hack-coff mktree dtc) \
+wrapperbits	:= $(extra-y) $(addprefix $(obj)/,addnote hack-coff mktree dtc relocs) \
 			$(wrapper) FORCE
 
+ifeq ($(CONFIG_RELOCATABLE_PPC64),y)
+
+targets +=
[PATCH 3/5] Apply relocation
Apply relocation

This code is a wrapper around the regular kernel. It checks whether the kernel is loaded at 32MB. If it is not loaded at 32MB, the image is treated as a regular kernel and control is given to the kernel immediately. If the kernel is loaded at 32MB, the wrapper applies the relocation delta to each offset in the list that was generated and appended by patches 1 and 2. After all offsets are updated, control is given to the relocatable kernel.

Signed-off-by: Mohan Kumar M [EMAIL PROTECTED]
---
 arch/powerpc/boot/reloc_apply.S |  242 +++
 1 files changed, 242 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/boot/reloc_apply.S

diff --git a/arch/powerpc/boot/reloc_apply.S b/arch/powerpc/boot/reloc_apply.S
new file mode 100644
index 000..1886890
--- /dev/null
+++ b/arch/powerpc/boot/reloc_apply.S
@@ -0,0 +1,242 @@
+/*
+ * Written by Mohan Kumar M ([EMAIL PROTECTED]) for 64bit PowerPC
+ *
+ * This file contains the low-level support and setup for the
+ * relocatable support for PPC64 kernel
+ *
+ * Copyright (C) IBM Corporation, 2008
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <asm/ppc_asm.h>
+
+#define RELOC_DELTA 0x4200
+
+#define LOADADDR(rn,name) \
+	lis	rn,[EMAIL PROTECTED];	\
+	ori	rn,rn,[EMAIL PROTECTED];	\
+	rldicr	rn,rn,32,31;		\
+	oris	rn,rn,[EMAIL PROTECTED];	\
+	ori	rn,rn,[EMAIL PROTECTED]
+
+/*
+ * Layout of vmlinux.reloc file
+ *	Minimal part of relocation applying code +
+ *	vmlinux +
+ *	Rest of the relocation applying code
+ */
+
+.section .text.head
+
+.globl start_wrap
+start_wrap:
+	/* Get relocation offset in r15 */
+	bl	1f
+1:	mflr	r15
+	LOAD_REG_IMMEDIATE(r16,1b)
+	subf	r15,r16,r15
+
+	LOAD_REG_IMMEDIATE(r17, _reloc)
+	add	r17,r17,r15
+	mtctr	r17
+	bctr		/* Jump to start_reloc in section .text.reloc */
+
+/* Secondary cpus spin code */
+. = 0x60
+	/* Get relocation offset in r15 */
+	bl	1f
+1:	mflr	r15
+	LOAD_REG_IMMEDIATE(r16,1b)
+	subf	r15,r16,r15
+
+	LOADADDR(r18, __spinloop)
+	add	r18,r18,r15
+100:	ld	r19,0(r18)
+	cmpwi	0,r19,1
+	bne	100b
+
+	LOAD_REG_IMMEDIATE(r17, _reloc)
+	add	r17,r17,r15
+	addi	r17,r17,0x60
+	mtctr	r17
+	/* Jump to start_reloc + 0x60 in section .text.reloc */
+	bctr
+
+/*
+ * Layout of vmlinux.reloc file
+ *	Minimal part of relocation applying code +
+ *	vmlinux +
+ *	Rest of the relocation applying code
+ */
+
+
+.section .text.reloc
+
+start_reloc:
+	b	master
+
+.org start_reloc + 0x60
+	LOADADDR(r18, __spinloop)
+	add	r18,r18,r15
+100:	ld	r19,0(r18)
+	cmpwi	0,r19,2
+	bne	100b
+
+	/* Now vmlinux is at _head */
+	LOAD_REG_IMMEDIATE(r17, _head)
+	add	r17,r17,r15
+	addi	r17,r17,0x60
+	mtctr	r17
+	bctr
+
+master:
+	LOAD_REG_IMMEDIATE(r16, output_len)
+	add	r16,r16,r15
+
+	/*
+	 * Load the delimiter to distinguish between different relocation
+	 * types
+	 */
+	LOAD_REG_IMMEDIATE(r24, __delimiter)
+	add	r24,r24,r15
+	ld	r24,0(r24)
+
+	LOAD_REG_IMMEDIATE(r17, _head)
+	LOAD_REG_IMMEDIATE(r21, _ehead)
+	sub	r21,r21,r17	/* Number of bytes in head section */
+
+	sub	r16,r16,r21	/* Original output_len */
+
+	/* Destination address */
+	LOAD_REG_IMMEDIATE(r17, _head) /* KERNELBASE */
+	add	r17,r17,r15
+
+	/* Source address */
+	LOAD_REG_IMMEDIATE(r18, _text) /* Regular vmlinux */
+	add	r18,r18,r15
+
+	/* Number of bytes to copy */
+	LOAD_REG_IMMEDIATE(r19, _etext)
+	add	r19,r19,r15
+	sub	r19,r19,r18
+
+	/* Move cpu spin code to text.reloc section */
+	LOADADDR(r23, __spinloop)
+	add	r23,r23,r15
+	li	r25,1
+	stw	r25,0(r23)
+
+	/* Copy vmlinux code to physical address 0 */
+	bl	.copy /* copy(_head, _text, _etext-_text) */
+
+	/*
+	 * If its not running at 32MB, assume it to be a normal kernel.
+	 * Copy the vmlinux code to KERNELBASE and jump to KERNELBASE
+	 */
+	LOAD_REG_IMMEDIATE(r21, RELOC_DELTA)
+	cmpd	r15,r21
+	beq	apply_relocation
+	li	r6,0
+	b	skip_apply
+apply_relocation:
+
+	/* Kernel is running at 32MB */
+	mr	r22,r15
+	xor	r23,r23,r23
+	addi	r23,r23,16
+	srw	r22,r22,r23
+
+	li	r25,0
+
+
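The core step in the description, adding the relocation delta to each offset in the list, can be sketched in C as follows. This is a simplified illustration, not the patch's algorithm: it assumes every listed offset refers to a full 64-bit word, whereas the assembly loads a delimiter to tell several relocation types apart. The function name apply_relocs and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Add `delta` to the 64-bit word stored at each listed byte offset
 * inside `image`. memcpy is used so unaligned offsets are safe.
 */
static void apply_relocs(unsigned char *image, const uint64_t *offsets,
			 size_t count, uint64_t delta)
{
	for (size_t i = 0; i < count; i++) {
		uint64_t value;

		memcpy(&value, image + offsets[i], sizeof(value));
		value += delta;
		memcpy(image + offsets[i], &value, sizeof(value));
	}
}
```

Words that are not named in the offset list are left untouched, which is the property the relocs host program has to guarantee when it builds the list.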
[PATCH 4/5] Relocation support
Relocation support

Add relocatable kernel support, such as avoiding the copy of the vmlinux image to the compile address and adding the relocation delta to absolute symbol references.

ld does not provide relocation entries for the .got section, and the user-space relocation extraction program cannot process @got entries. So use the LOAD_REG_IMMEDIATE macro instead of the LOAD_REG_ADDR macro for the relocatable kernel.

Signed-off-by: Mohan Kumar M [EMAIL PROTECTED]
---
 arch/powerpc/include/asm/ppc_asm.h     |    4 ++
 arch/powerpc/include/asm/prom.h        |    2 +
 arch/powerpc/include/asm/sections.h    |    4 ++-
 arch/powerpc/include/asm/system.h      |    5 +++
 arch/powerpc/kernel/head_64.S          |   53 ++-
 arch/powerpc/kernel/machine_kexec_64.c |    4 +-
 arch/powerpc/kernel/misc.S             |   40 +++-
 arch/powerpc/kernel/prom.c             |   13 +++-
 arch/powerpc/kernel/prom_init.c        |   27 +---
 arch/powerpc/kernel/prom_init_check.sh |    2 +-
 arch/powerpc/kernel/setup_64.c         |    5 +--
 arch/powerpc/mm/hash_low_64.S          |   12 +++
 arch/powerpc/mm/init_64.c              |    7 ++--
 arch/powerpc/mm/mem.c                  |    3 +-
 arch/powerpc/mm/slb_low.S              |    4 ++
 15 files changed, 158 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 0966899..2309ad0 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -295,8 +295,12 @@ n:
 	oris	(reg),(reg),(expr)@h;	\
 	ori	(reg),(reg),(expr)@l;
 
+#ifdef CONFIG_RELOCATABLE_PPC64
+#define LOAD_REG_ADDR(reg,name)	LOAD_REG_IMMEDIATE(reg,name)
+#else
 #define LOAD_REG_ADDR(reg,name)	\
 	ld	(reg),[EMAIL PROTECTED](r2)
+#endif
 
 #define LOAD_REG_ADDRBASE(reg,name)	LOAD_REG_ADDR(reg,name)
 #define ADDROFF(name)	0
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index eb3bd2e..4d7aa4f 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -39,6 +39,8 @@
 
 #define OF_DT_VERSION	0x10
 
+extern unsigned long reloc_delta, kernel_base;
+
 /*
  * This is what gets passed to the kernel by prom_init or kexec
  *
diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 916018e..f19dab3 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -7,10 +7,12 @@
 #ifdef __powerpc64__
 
 extern char _end[];
+extern unsigned long kernel_base;
 
 static inline int in_kernel_text(unsigned long addr)
 {
-	if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end)
+	if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end
+		+ kernel_base)
 		return 1;
 
 	return 0;
diff --git a/arch/powerpc/include/asm/system.h b/arch/powerpc/include/asm/system.h
index d6648c1..065c830 100644
--- a/arch/powerpc/include/asm/system.h
+++ b/arch/powerpc/include/asm/system.h
@@ -537,6 +537,11 @@ extern unsigned long add_reloc_offset(unsigned long);
 extern void reloc_got2(unsigned long);
 
 #define PTRRELOC(x)	((typeof(x)) add_reloc_offset((unsigned long)(x)))
+#ifdef CONFIG_PPC64
+#define RELOC(x)	(*PTRRELOC(&(x)))
+#else
+#define RELOC(x)	(x)
+#endif
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 extern void account_system_vtime(struct task_struct *);
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index cc8fb47..6274686 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -102,6 +102,12 @@ __secondary_hold_acknowledge:
 	.llong hvReleaseData-KERNELBASE
 #endif /* CONFIG_PPC_ISERIES */
 
+#ifdef CONFIG_RELOCATABLE_PPC64
+	/* Used as static variable to initialize the reloc_delta */
+__initialized:
+	.long 0x0
+#endif
+
 	. = 0x60
 /*
  * The following code is used to hold secondary processors
@@ -1248,6 +1254,38 @@ _STATIC(__mmu_off)
  *
  */
 _GLOBAL(__start_initialization_multiplatform)
+#ifdef CONFIG_RELOCATABLE_PPC64
+	mr	r21,r3
+	mr	r22,r4
+	mr	r23,r5
+	bl	.reloc_offset
+	mr	r26,r3
+	mr	r3,r21
+	mr	r4,r22
+	mr	r5,r23
+
+	LOAD_REG_IMMEDIATE(r27, __initialized)
+	add	r27,r26,r27
+	ld	r7,0(r27)
+	cmpdi	r7,0
+	bne	4f
+
+	li	r7,1
+	stw	r7,0(r27)
+
+	cmpdi	r6,0
+	beq	4f
+	LOAD_REG_IMMEDIATE(r27, reloc_delta)
+	add	r27,r27,r26
+	std	r6,0(r27)
+
+	LOAD_REG_IMMEDIATE(r27, KERNELBASE)
+	add	r7,r6,r27
+	LOAD_REG_IMMEDIATE(r27, kernel_base)
+	add	r27,r27,r26
+	std	r7,0(r27)
+4:
+#endif
 	/*
 	 * Are we booted from a PROM Of-type
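The one-shot initialisation added to __start_initialization_multiplatform can be read as the following C model. It is a sketch under stated assumptions: the delta argument stands in for the value that arrives in r6 (presumably set by the wrapper from patch 3), the KERNELBASE value is an assumed placeholder, and the function name record_reloc_delta is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed link-time base for this sketch only. */
#define KERNELBASE 0xc000000000000000ULL

static uint64_t reloc_delta;	/* 0 means "not relocated" */
static uint64_t kernel_base;
static int initialized;		/* plays the role of __initialized */

/* `delta` stands in for the value passed in r6. */
static void record_reloc_delta(uint64_t delta)
{
	if (initialized)
		return;		/* only the first call records anything */
	initialized = 1;

	if (delta == 0)
		return;		/* running at the link address */

	reloc_delta = delta;
	kernel_base = KERNELBASE + delta;
}
```

In this model a later call with a different value has no effect, mirroring the "bne 4f" taken when __initialized is already non-zero.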
[PATCH 5/5] Relocation support for kdump kernel
Relocation support for kdump kernel

Add relocatable kernel support for the kdump kernel path.

Signed-off-by: Mohan Kumar M [EMAIL PROTECTED]
---
 arch/powerpc/kernel/crash_dump.c       |   19 +++
 arch/powerpc/kernel/iommu.c            |    7 +--
 arch/powerpc/kernel/machine_kexec.c    |    6 ++
 arch/powerpc/mm/hash_utils_64.c        |    5 +++--
 arch/powerpc/platforms/pseries/iommu.c |    5 -
 5 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/crash_dump.c b/arch/powerpc/kernel/crash_dump.c
index e0debcc..58354b8 100644
--- a/arch/powerpc/kernel/crash_dump.c
+++ b/arch/powerpc/kernel/crash_dump.c
@@ -29,7 +29,12 @@
 
 void __init reserve_kdump_trampoline(void)
 {
+#ifdef CONFIG_RELOCATABLE_PPC64
+	if (RELOC(reloc_delta))
+		lmb_reserve(0, KDUMP_RESERVE_LIMIT);
+#else
 	lmb_reserve(0, KDUMP_RESERVE_LIMIT);
+#endif
 }
 
 static void __init create_trampoline(unsigned long addr)
@@ -45,7 +50,11 @@ static void __init create_trampoline(unsigned long addr)
 	 * two instructions it doesn't require any registers.
 	 */
 	patch_instruction(p, PPC_NOP_INSTR);
+#ifndef CONFIG_RELOCATABLE_PPC64
 	patch_branch(++p, addr + PHYSICAL_START, 0);
+#else
+	patch_branch(++p, addr + (RELOC(reloc_delta) & 0xfff), 0);
+#endif
 }
 
 void __init setup_kdump_trampoline(void)
@@ -54,13 +63,23 @@ void __init setup_kdump_trampoline(void)
 
 	DBG(" -> setup_kdump_trampoline()\n");
 
+#ifdef CONFIG_RELOCATABLE_PPC64
+	if (!RELOC(reloc_delta))
+		return;
+#endif
+
 	for (i = KDUMP_TRAMPOLINE_START; i < KDUMP_TRAMPOLINE_END; i += 8) {
 		create_trampoline(i);
 	}
 
 #ifdef CONFIG_PPC_PSERIES
+#ifndef CONFIG_RELOCATABLE_PPC64
 	create_trampoline(__pa(system_reset_fwnmi) - PHYSICAL_START);
 	create_trampoline(__pa(machine_check_fwnmi) - PHYSICAL_START);
+#else
+	create_trampoline(__pa(system_reset_fwnmi) - RELOC(reloc_delta));
+	create_trampoline(__pa(machine_check_fwnmi) - RELOC(reloc_delta));
+#endif
 #endif /* CONFIG_PPC_PSERIES */
 
 	DBG(" <- setup_kdump_trampoline()\n");
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 550a193..9ae7657 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -494,7 +494,7 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
 	spin_lock_init(&tbl->it_lock);
 
 #ifdef CONFIG_CRASH_DUMP
-	if (ppc_md.tce_get) {
+	if (reloc_delta && ppc_md.tce_get) {
 		unsigned long index;
 		unsigned long tceval;
 		unsigned long tcecount = 0;
@@ -520,7 +520,10 @@ struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
 				index < tbl->it_size; index++)
 				__clear_bit(index, tbl->it_map);
 		}
-	}
+	} else
+		/* Clear the hardware table in case firmware left allocations
+		   in it */
+		ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
 #else
 	/* Clear the hardware table in case firmware left allocations in it */
 	ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index aab7688..75dc6af 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -67,6 +67,12 @@ void __init reserve_crashkernel(void)
 	unsigned long long crash_size, crash_base;
 	int ret;
 
+#ifdef CONFIG_RELOCATABLE_PPC64
+	/* Return if its kdump kernel */
+	if (reloc_delta)
+		return;
+#endif
+
 	/* this is necessary because of lmb_phys_mem_size() */
 	lmb_analyze();
 
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 5ce5a4d..29474e9 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -677,8 +677,9 @@ void __init htab_initialize(void)
 			continue;
 		}
 #endif /* CONFIG_U3_DART */
-		BUG_ON(htab_bolt_mapping(base, base + size, __pa(base),
-			mode_rw, mmu_linear_psize, mmu_kernel_ssize));
+		BUG_ON(htab_bolt_mapping(base + kernel_base, base + size,
+			__pa(base) + kernel_base, mode_rw, mmu_linear_psize,
+			mmu_kernel_ssize));
 	}
 
 	/*
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index a8c4466..480341b 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -291,7 +291,10 @@ static void iommu_table_setparms(struct pci_controller *phb,
 
 	tbl->it_base = (unsigned long)__va(*basep);
 
-#ifndef CONFIG_CRASH_DUMP
+#ifdef CONFIG_CRASH_DUMP
+	if (!reloc_delta)
+
Re: Does Dev Tree WORK with [EMAIL PROTECTED] #address/size = 2/1
I recommend you look at 2/2 instead of 2/1. 2/1 is just a degenerate case and it doesn't really get you much value.

- k
RE: Does Dev Tree WORK with [EMAIL PROTECTED] #address/size = 2/1
Thank you... I will take that recommendation. I will try that and get some results before I complete my response to Becky this morning.

T

-----Original Message-----
From: Kumar Gala [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 12, 2008 8:42 AM
To: Morrison, Tom
Cc: ppc-dev list; Becky Bruce; Paul Mackerras
Subject: Re: Does Dev Tree WORK with [EMAIL PROTECTED] #address/size = 2/1

I recommend you look at 2/2 instead of 2/1. 2/1 is just a degenerate case and it doesn't really get you much value.

- k
[PATCH 0/5] ib/ehca: Fix stability issues
Hi Roland,

the following patchset contains four small fixes and one bigger patch (5/5) for addressing some ehca issues we found during cluster test.

[1/5] update qp_state on cached modify_qp()
[2/5] rename goto label in ehca_poll_cq_one()
[3/5] repoll on invalid opcode instead of returning success
[4/5] check idr_find() return value
[5/5] discard double CQE for one WR

They all apply on top of 2.6.27-rc1. If possible, we would like to get them into 2.6.27.

Regards,
Alexander Schmidt
[PATCH 2/5] ib/ehca: rename goto label
Rename the poll_cq_one_read_cqe goto label to repoll, which describes what it actually does.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -589,7 +589,7 @@ static inline int ehca_poll_cq_one(struc
 	struct ehca_qp *my_qp;
 	int cqe_count = 0, is_error;
 
-poll_cq_one_read_cqe:
+repoll:
 	cqe = (struct ehca_cqe *)
 		ipz_qeit_get_inc_valid(&my_cq->ipz_queue);
 	if (!cqe) {
@@ -617,7 +617,7 @@ poll_cq_one_read_cqe:
 			ehca_dmp(cqe, 64, "cq_num=%x qp_num=%x",
 				 my_cq->cq_number, cqe->local_qp_number);
 			/* ignore this purged cqe */
-			goto poll_cq_one_read_cqe;
+			goto repoll;
 		}
 		spin_lock_irqsave(&qp->spinlock_s, flags);
 		purgeflag = qp->sqerr_purgeflag;
@@ -636,7 +636,7 @@ poll_cq_one_read_cqe:
 				 * that caused sqe and turn off purge flag
 				 */
 				qp->sqerr_purgeflag = 0;
-				goto poll_cq_one_read_cqe;
+				goto repoll;
 			}
 		}
 
[PATCH 5/5] ib/ehca: discard double CQE for one WR
Under rare circumstances, the ehca hardware might erroneously generate two CQEs for the same WQE, which does not comply with the IB spec and will cause unpredictable errors like memory being freed twice. To avoid this problem, the driver needs to detect the second CQE and discard it.

For this purpose, introduce an array holding as many elements as the SQ of the QP, called sq_map. Each sq_map entry stores a "reported" flag for one WQE in the SQ. When a work request is posted to the SQ, the respective "reported" flag is set to zero. After the arrival of a CQE, the flag is set to 1, which makes it possible to detect the occurrence of a second CQE.

The mapping between WQE / CQE and the corresponding sq_map element is implemented by replacing the lowest 16 bits of the wr_id with the index in the queue map. The original 16 bits are stored in the sq_map entry and are restored when the CQE is passed to the application.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    9 +
 drivers/infiniband/hw/ehca/ehca_qes.h     |    1
 drivers/infiniband/hw/ehca/ehca_qp.c      |   34 +-
 drivers/infiniband/hw/ehca/ehca_reqs.c    |   54 +++---
 4 files changed, 78 insertions(+), 20 deletions(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -139,6 +139,7 @@ static void trace_send_wr_ud(const struc
 static inline int ehca_write_swqe(struct ehca_qp *qp,
 				  struct ehca_wqe *wqe_p,
 				  const struct ib_send_wr *send_wr,
+				  u32 sq_map_idx,
 				  int hidden)
 {
 	u32 idx;
@@ -157,7 +158,11 @@ static inline int ehca_write_swqe(struct
 	/* clear wqe header until sglist */
 	memset(wqe_p, 0, offsetof(struct ehca_wqe, u.ud_av.sg_list));
 
-	wqe_p->work_request_id = send_wr->wr_id;
+	wqe_p->work_request_id = send_wr->wr_id & ~QMAP_IDX_MASK;
+	wqe_p->work_request_id |= sq_map_idx & QMAP_IDX_MASK;
+
+	qp->sq_map[sq_map_idx].app_wr_id = send_wr->wr_id & QMAP_IDX_MASK;
+	qp->sq_map[sq_map_idx].reported = 0;
 
 	switch (send_wr->opcode) {
 	case IB_WR_SEND:
@@ -381,6 +386,7 @@ static inline int post_one_send(struct e
 {
 	struct ehca_wqe *wqe_p;
 	int ret;
+	u32 sq_map_idx;
 	u64 start_offset = my_qp->ipz_squeue.current_q_offset;
 
 	/* get pointer next to free WQE */
@@ -393,8 +399,15 @@ static inline int post_one_send(struct e
 			 "qp_num=%x", my_qp->ib_qp.qp_num);
 		return -ENOMEM;
 	}
+
+	/*
+	 * Get the index of the WQE in the send queue. The same index is used
+	 * for writing into the sq_map.
+	 */
+	sq_map_idx = start_offset / my_qp->ipz_squeue.qe_size;
+
 	/* write a SEND WQE into the QUEUE */
-	ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, hidden);
+	ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, sq_map_idx, hidden);
 	/*
 	 * if something failed,
 	 * reset the free entry pointer to the start value
@@ -654,8 +667,34 @@ repoll:
 			 my_cq, my_cq->cq_number);
 	}
 
-	/* we got a completion! */
-	wc->wr_id = cqe->work_request_id;
+	read_lock(&ehca_qp_idr_lock);
+	my_qp = idr_find(&ehca_qp_idr, cqe->qp_token);
+	read_unlock(&ehca_qp_idr_lock);
+	if (!my_qp)
+		goto repoll;
+	wc->qp = &my_qp->ib_qp;
+
+	if (!(cqe->w_completion_flags & WC_SEND_RECEIVE_BIT)) {
+		struct ehca_qmap_entry *qmap_entry;
+		/*
+		 * We got a send completion and need to restore the original
+		 * wr_id.
+		 */
+		qmap_entry = &my_qp->sq_map[cqe->work_request_id &
+					    QMAP_IDX_MASK];
+
+		if (qmap_entry->reported) {
+			ehca_warn(cq->device, "Double cqe on qp_num=%#x",
+				  my_qp->real_qp_num);
+			/* found a double cqe, discard it and read next one */
+			goto repoll;
+		}
+		wc->wr_id = cqe->work_request_id & ~QMAP_IDX_MASK;
+		wc->wr_id |= qmap_entry->app_wr_id;
+		qmap_entry->reported = 1;
+	} else
+		/* We got a receive completion. */
+		wc->wr_id = cqe->work_request_id;
 
 	/* eval ib_wc_opcode */
 	wc->opcode = ib_wc_opcode[cqe->optype]-1;
@@ -678,13 +717,6 @@ repoll:
 	} else
 		wc->status = IB_WC_SUCCESS;
 
-	read_lock(&ehca_qp_idr_lock);
-	my_qp = idr_find(&ehca_qp_idr, cqe->qp_token);
-	read_unlock(&ehca_qp_idr_lock);
-	if (!my_qp)
-		goto repoll;
-	wc->qp = &my_qp->ib_qp;
-
 	wc->byte_len = cqe->nr_bytes_transferred;
 	wc->pkey_index = cqe->pkey_index;
 	wc->slid =
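The wr_id mapping described in the commit message can be tried out in isolation. The following is a minimal user-space sketch, not the driver code: the QMAP_IDX_MASK value is assumed from the "lowest 16 bits" wording, and struct qmap_entry, encode_wr_id and decode_wr_id are names invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define QMAP_IDX_MASK 0xFFFFULL	/* assumed: lowest 16 bits of the wr_id */

struct qmap_entry {
	uint16_t app_wr_id;	/* original low 16 bits of the wr_id */
	uint16_t reported;	/* 1 once a CQE was seen for this WQE */
};

/* On post: stash the low 16 bits, put the queue index in their place. */
static uint64_t encode_wr_id(struct qmap_entry *map, uint32_t idx,
			     uint64_t wr_id)
{
	map[idx].app_wr_id = (uint16_t)(wr_id & QMAP_IDX_MASK);
	map[idx].reported = 0;
	return (wr_id & ~QMAP_IDX_MASK) | (idx & QMAP_IDX_MASK);
}

/*
 * On completion: restore the original wr_id and return 0, or return -1
 * if a CQE was already reported for this entry (a double CQE).
 */
static int decode_wr_id(struct qmap_entry *map, uint64_t cqe_wr_id,
			uint64_t *wr_id)
{
	struct qmap_entry *entry = &map[cqe_wr_id & QMAP_IDX_MASK];

	if (entry->reported)
		return -1;
	*wr_id = (cqe_wr_id & ~QMAP_IDX_MASK) | entry->app_wr_id;
	entry->reported = 1;
	return 0;
}
```

Decoding the same hardware wr_id a second time fails, which corresponds to the "goto repoll" taken on a double CQE; posting a new work request at the same index clears the flag again.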
[PATCH 3/5] ib/ehca: repoll on invalid opcode
When the ehca driver detects an invalid opcode in a CQE, it currently passes the CQE to the application and returns with success. This patch changes the CQE handling to discard CQEs with invalid opcodes and to continue reading the next CQE from the CQ.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -667,7 +667,7 @@ repoll:
 		ehca_dmp(cqe, 64, "ehca_cq=%p cq_num=%x",
 			 my_cq, my_cq->cq_number);
 		/* update also queue adder to throw away this entry!!! */
-		goto poll_cq_one_exit0;
+		goto repoll;
 	}
 
 	/* eval ib_wc_status */
[PATCH 4/5] ib/ehca: check idr_find() return value
The idr_find() function may fail when trying to get the QP that is associated with a CQE, e.g. when a QP has been destroyed between the generation of a CQE and the poll request for it. Consequently, the return value of idr_find() must be checked and the CQE must be discarded when the QP cannot be found.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -680,8 +680,10 @@ repoll:
 
 	read_lock(&ehca_qp_idr_lock);
 	my_qp = idr_find(&ehca_qp_idr, cqe->qp_token);
-	wc->qp = &my_qp->ib_qp;
 	read_unlock(&ehca_qp_idr_lock);
+	if (!my_qp)
+		goto repoll;
+	wc->qp = &my_qp->ib_qp;
 
 	wc->byte_len = cqe->nr_bytes_transferred;
 	wc->pkey_index = cqe->pkey_index;
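The pattern in this patch, look the QP up, check for NULL, and move on to the next CQE if the QP is gone, can be sketched outside the kernel like this. Everything here is a toy model: lookup_qp stands in for idr_find() on a plain array, and the struct layouts and the poll_one helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct qp {
	int qp_num;
};

struct cqe {
	size_t qp_token;
};

/* Stand-in for idr_find(): returns NULL when the token has no QP. */
static struct qp *lookup_qp(struct qp **table, size_t n, size_t token)
{
	return token < n ? table[token] : NULL;
}

/*
 * Return the next CQE whose QP still exists, or NULL when the queue is
 * drained. CQEs that refer to a destroyed QP are silently discarded.
 */
static const struct cqe *poll_one(const struct cqe *queue, size_t count,
				  size_t *pos, struct qp **table, size_t n,
				  struct qp **qp_out)
{
	while (*pos < count) {
		const struct cqe *cqe = &queue[(*pos)++];
		struct qp *qp = lookup_qp(table, n, cqe->qp_token);

		if (!qp)
			continue;	/* QP vanished: discard CQE, repoll */
		*qp_out = qp;
		return cqe;
	}
	return NULL;
}
```

The important ordering is the same as in the diff: the QP pointer is only dereferenced after the NULL check, never before it.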
[PATCH 1/5 try2] ib/ehca: update qp_state on cached modify_qp()
Since the introduction of the port auto-detect mode for ehca, calls to modify_qp() may be cached in the device driver when the ports are not activated yet. When a modify_qp() call is cached, the qp state remains untouched until the port is activated, which leaves the qp in the reset state. In the reset state, however, posting SQ WQEs is not allowed, which confuses applications like ib_mad.

The solution for this problem is to immediately set the qp state as requested by modify_qp(), even when the call is cached.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_qp.c |   14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_qp.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_qp.c
@@ -1534,8 +1534,6 @@ static int internal_modify_qp(struct ib_
 	if (attr_mask & IB_QP_QKEY)
 		my_qp->qkey = attr->qkey;
 
-	my_qp->state = qp_new_state;
-
 modify_qp_exit2:
 	if (squeue_locked) { /* this means: sqe -> rts */
 		spin_unlock_irqrestore(&my_qp->spinlock_s, flags);
@@ -1551,6 +1549,8 @@ modify_qp_exit1:
 int ehca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask,
 		   struct ib_udata *udata)
 {
+	int ret = 0;
+
 	struct ehca_shca *shca = container_of(ibqp->device, struct ehca_shca,
 					      ib_device);
 	struct ehca_qp *my_qp = container_of(ibqp, struct ehca_qp, ib_qp);
@@ -1597,12 +1597,18 @@ int ehca_modify_qp(struct ib_qp *ibqp, s
 				 attr->qp_state, my_qp->init_attr.port_num,
 				 ibqp->qp_type);
 			spin_unlock_irqrestore(&sport->mod_sqp_lock, flags);
-			return 0;
+			goto out;
 		}
 		spin_unlock_irqrestore(&sport->mod_sqp_lock, flags);
 	}
 
-	return internal_modify_qp(ibqp, attr, attr_mask, 0);
+	ret = internal_modify_qp(ibqp, attr, attr_mask, 0);
+
+out:
+	if ((ret == 0) && (attr_mask & IB_QP_STATE))
+		my_qp->state = attr->qp_state;
+
+	return ret;
 }
 
 void ehca_recover_sqp(struct ib_qp *sqp)
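The control flow introduced here, where both the cached path and the normal path fall through to a common exit that records the requested state, can be modelled as below. This is a sketch only: struct toy_qp, the ATTR_STATE flag (standing in for IB_QP_STATE) and the always-successful internal_modify helper are assumptions made for the illustration.

```c
#include <assert.h>

enum qp_state { QPS_RESET, QPS_INIT, QPS_RTR, QPS_RTS };

#define ATTR_STATE 0x1	/* stands in for IB_QP_STATE in this sketch */

struct toy_qp {
	enum qp_state state;
	int port_active;	/* 0: port not activated, calls get cached */
	int cached_calls;
};

/* Pretend the hardware call always succeeds. */
static int internal_modify(struct toy_qp *qp)
{
	(void)qp;
	return 0;
}

static int toy_modify_qp(struct toy_qp *qp, enum qp_state new_state,
			 int attr_mask)
{
	int ret = 0;

	if (!qp->port_active) {
		qp->cached_calls++;	/* remember the call for later */
		goto out;
	}

	ret = internal_modify(qp);

out:
	/* Record the requested state even when the call was only cached. */
	if (ret == 0 && (attr_mask & ATTR_STATE))
		qp->state = new_state;

	return ret;
}
```

With the port inactive, a request for QPS_INIT is cached but the software state still moves out of reset, so a consumer checking the state before posting sees what it asked for.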
[PATCH 0/5 try2] ib/ehca: Fix stability issues
Hi Roland,

Sorry, the first set was mangled because of a broken mailer, so here it is again, double checked...

the following patchset contains four small fixes and one bigger patch (5/5) for addressing some ehca issues we found during cluster test.

[1/5] update qp_state on cached modify_qp()
[2/5] rename goto label in ehca_poll_cq_one()
[3/5] repoll on invalid opcode instead of returning success
[4/5] check idr_find() return value
[5/5] discard double CQE for one WR

They all apply on top of 2.6.27-rc1. If possible, we would like to get them into 2.6.27.

Regards,
Alexander Schmidt
[PATCH 4/5 try2] ib/ehca: check idr_find() return value
The idr_find() function may fail when trying to get the QP that is associated with a CQE, e.g. when a QP has been destroyed between the generation of a CQE and the poll request for it. Consequently, the return value of idr_find() must be checked and the CQE must be discarded when the QP cannot be found.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -680,8 +680,10 @@ repoll:
 
 	read_lock(&ehca_qp_idr_lock);
 	my_qp = idr_find(&ehca_qp_idr, cqe->qp_token);
-	wc->qp = &my_qp->ib_qp;
 	read_unlock(&ehca_qp_idr_lock);
+	if (!my_qp)
+		goto repoll;
+	wc->qp = &my_qp->ib_qp;
 
 	wc->byte_len = cqe->nr_bytes_transferred;
 	wc->pkey_index = cqe->pkey_index;
[PATCH 3/5 try2] ib/ehca: repoll on invalid opcode
When the ehca driver detects an invalid opcode in a CQE, it currently passes the CQE to the application and returns with success. This patch changes the CQE handling to discard CQEs with invalid opcodes and to continue reading the next CQE from the CQ.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -667,7 +667,7 @@ repoll:
 		ehca_dmp(cqe, 64, "ehca_cq=%p cq_num=%x",
 			 my_cq, my_cq->cq_number);
 		/* update also queue adder to throw away this entry!!! */
-		goto poll_cq_one_exit0;
+		goto repoll;
 	}
 
 	/* eval ib_wc_status */
[PATCH 5/5 try2] ib/ehca: discard double CQE for one WR
Under rare circumstances, the ehca hardware might erroneously generate two CQEs for the same WQE, which does not comply with the IB spec and will cause unpredictable errors like memory being freed twice. To avoid this problem, the driver needs to detect the second CQE and discard it.

For this purpose, introduce an array holding as many elements as the SQ of the QP, called sq_map. Each sq_map entry stores a "reported" flag for one WQE in the SQ. When a work request is posted to the SQ, the respective "reported" flag is set to zero. After the arrival of a CQE, the flag is set to 1, which makes it possible to detect the occurrence of a second CQE.

The mapping between WQE / CQE and the corresponding sq_map element is implemented by replacing the lowest 16 bits of the wr_id with the index in the queue map. The original 16 bits are stored in the sq_map entry and are restored when the CQE is passed to the application.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_classes.h |    9 +
 drivers/infiniband/hw/ehca/ehca_qes.h     |    1
 drivers/infiniband/hw/ehca/ehca_qp.c      |   34 +-
 drivers/infiniband/hw/ehca/ehca_reqs.c    |   54 +++---
 4 files changed, 78 insertions(+), 20 deletions(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -139,6 +139,7 @@ static void trace_send_wr_ud(const struc
 static inline int ehca_write_swqe(struct ehca_qp *qp,
 				  struct ehca_wqe *wqe_p,
 				  const struct ib_send_wr *send_wr,
+				  u32 sq_map_idx,
 				  int hidden)
 {
 	u32 idx;
@@ -157,7 +158,11 @@ static inline int ehca_write_swqe(struct
 	/* clear wqe header until sglist */
 	memset(wqe_p, 0, offsetof(struct ehca_wqe, u.ud_av.sg_list));
 
-	wqe_p->work_request_id = send_wr->wr_id;
+	wqe_p->work_request_id = send_wr->wr_id & ~QMAP_IDX_MASK;
+	wqe_p->work_request_id |= sq_map_idx & QMAP_IDX_MASK;
+
+	qp->sq_map[sq_map_idx].app_wr_id = send_wr->wr_id & QMAP_IDX_MASK;
+	qp->sq_map[sq_map_idx].reported = 0;
 
 	switch (send_wr->opcode) {
 	case IB_WR_SEND:
@@ -381,6 +386,7 @@ static inline int post_one_send(struct e
 {
 	struct ehca_wqe *wqe_p;
 	int ret;
+	u32 sq_map_idx;
 	u64 start_offset = my_qp->ipz_squeue.current_q_offset;
 
 	/* get pointer next to free WQE */
@@ -393,8 +399,15 @@ static inline int post_one_send(struct e
 			 "qp_num=%x", my_qp->ib_qp.qp_num);
 		return -ENOMEM;
 	}
+
+	/*
+	 * Get the index of the WQE in the send queue. The same index is used
+	 * for writing into the sq_map.
+	 */
+	sq_map_idx = start_offset / my_qp->ipz_squeue.qe_size;
+
 	/* write a SEND WQE into the QUEUE */
-	ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, hidden);
+	ret = ehca_write_swqe(my_qp, wqe_p, cur_send_wr, sq_map_idx, hidden);
 	/*
 	 * if something failed,
 	 * reset the free entry pointer to the start value
@@ -654,8 +667,34 @@ repoll:
 			 my_cq, my_cq->cq_number);
 	}
 
-	/* we got a completion! */
-	wc->wr_id = cqe->work_request_id;
+	read_lock(&ehca_qp_idr_lock);
+	my_qp = idr_find(&ehca_qp_idr, cqe->qp_token);
+	read_unlock(&ehca_qp_idr_lock);
+	if (!my_qp)
+		goto repoll;
+	wc->qp = &my_qp->ib_qp;
+
+	if (!(cqe->w_completion_flags & WC_SEND_RECEIVE_BIT)) {
+		struct ehca_qmap_entry *qmap_entry;
+		/*
+		 * We got a send completion and need to restore the original
+		 * wr_id.
+		 */
+		qmap_entry = &my_qp->sq_map[cqe->work_request_id &
+					     QMAP_IDX_MASK];
+
+		if (qmap_entry->reported) {
+			ehca_warn(cq->device, "Double cqe on qp_num=%#x",
+				  my_qp->real_qp_num);
+			/* found a double cqe, discard it and read next one */
+			goto repoll;
+		}
+		wc->wr_id = cqe->work_request_id & ~QMAP_IDX_MASK;
+		wc->wr_id |= qmap_entry->app_wr_id;
+		qmap_entry->reported = 1;
+	} else
+		/* We got a receive completion. */
+		wc->wr_id = cqe->work_request_id;
 
 	/* eval ib_wc_opcode */
 	wc->opcode = ib_wc_opcode[cqe->optype]-1;
@@ -678,13 +717,6 @@ repoll:
 	} else
 		wc->status = IB_WC_SUCCESS;
 
-	read_lock(&ehca_qp_idr_lock);
-	my_qp = idr_find(&ehca_qp_idr, cqe->qp_token);
-	read_unlock(&ehca_qp_idr_lock);
-	if (!my_qp)
-		goto repoll;
-	wc->qp = &my_qp->ib_qp;
-
 	wc->byte_len = cqe->nr_bytes_transferred;
 	wc->pkey_index = cqe->pkey_index;
 	wc->slid =
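For readers following the sq_map idea in the 5/5 description, here is a small user-space sketch of the bookkeeping. It is not the driver code: the queue depth SQ_DEPTH, the struct layout, and the helper names post_wr() and complete_wr() are assumptions made up for illustration. Only the scheme itself (the low 16 bits of wr_id carry the map index, the original bits are parked in the map entry, and a "reported" flag catches a second completion) is taken from the patch description.

```c
#include <stdbool.h>
#include <stdint.h>

#define QMAP_IDX_MASK 0xFFFFULL /* lowest 16 bits of wr_id carry the index */
#define SQ_DEPTH      8u        /* hypothetical send queue depth */

struct qmap_entry {
	uint16_t app_wr_id; /* original low 16 bits of the caller's wr_id */
	bool     reported;  /* set once a completion has been delivered */
};

static struct qmap_entry sq_map[SQ_DEPTH];

/*
 * Posting: park the caller's low 16 bits in the map entry, clear the
 * flag, and return the id the hardware would see (index in the low bits).
 * The caller must pass idx < SQ_DEPTH.
 */
static uint64_t post_wr(uint64_t wr_id, uint32_t idx)
{
	sq_map[idx].app_wr_id = (uint16_t)(wr_id & QMAP_IDX_MASK);
	sq_map[idx].reported = false;
	return (wr_id & ~QMAP_IDX_MASK) | (idx & QMAP_IDX_MASK);
}

/*
 * Completion: returns false if the entry was already reported (a double
 * CQE) or the index is out of range; otherwise restores the original
 * wr_id into *wr_id and marks the entry as reported.
 */
static bool complete_wr(uint64_t hw_wr_id, uint64_t *wr_id)
{
	uint64_t idx = hw_wr_id & QMAP_IDX_MASK;
	struct qmap_entry *e;

	if (idx >= SQ_DEPTH)
		return false;
	e = &sq_map[idx];
	if (e->reported)
		return false; /* duplicate: the poll loop would "goto repoll" */
	*wr_id = (hw_wr_id & ~QMAP_IDX_MASK) | e->app_wr_id;
	e->reported = true;
	return true;
}
```

The design point worth noting is that the application never sees the substituted index: the upper 48 bits travel through the hardware untouched and the lower 16 are put back before the completion is handed over.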
[PATCH 2/5 try2] ib/ehca: rename goto label
Rename the poll_cq_one_read_cqe goto label to what it actually does, repoll.

Signed-off-by: Alexander Schmidt [EMAIL PROTECTED]
---
 drivers/infiniband/hw/ehca/ehca_reqs.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- infiniband.git.orig/drivers/infiniband/hw/ehca/ehca_reqs.c
+++ infiniband.git/drivers/infiniband/hw/ehca/ehca_reqs.c
@@ -589,7 +589,7 @@ static inline int ehca_poll_cq_one(struc
 	struct ehca_qp *my_qp;
 	int cqe_count = 0, is_error;
 
-poll_cq_one_read_cqe:
+repoll:
 	cqe = (struct ehca_cqe *)
 		ipz_qeit_get_inc_valid(&my_cq->ipz_queue);
 	if (!cqe) {
@@ -617,7 +617,7 @@ poll_cq_one_read_cqe:
 			ehca_dmp(cqe, 64, "cq_num=%x qp_num=%x",
 				 my_cq->cq_number, cqe->local_qp_number);
 			/* ignore this purged cqe */
-			goto poll_cq_one_read_cqe;
+			goto repoll;
 		}
 		spin_lock_irqsave(&qp->spinlock_s, flags);
 		purgeflag = qp->sqerr_purgeflag;
@@ -636,7 +636,7 @@ poll_cq_one_read_cqe:
 			 * that caused sqe and turn off purge flag
 			 */
 			qp->sqerr_purgeflag = 0;
-			goto poll_cq_one_read_cqe;
+			goto repoll;
 		}
 	}
Re: [PATCH] pata_of_platform: fix no irq handling
Benjamin Herrenschmidt wrote: 1. IDE status read does not work. (But am I understand correctly that IDE works well if IRQ is unspecified? Then this is hardly an issue.) 2. IDE interrupt comes when it should not. I'd recommend to use oscilloscope to find out what is happening there, that is, if the drive actually deasserts its irq line after status read. If so, than this could be a PIC problem. What is the platform on which you're observing the issue, btw? Another possibility is that you got the wrong interrupt number in the device-tree... Ben. The platform is the AMCC Sequoia board. We've built a little adapter to connect a compact flash card to the processor bus. I believe the interrupt selection in the device tree is correct, and I've checked over the u-boot settings for the IRQ line (active high, level sensitive). I've also tried edge-sensitive but it doesn't make a difference. When u-boot queries the CF card, we see the IRQ pulse as expected, but when the kernel runs, we see the IRQ go high and stay there, which the kernel naturally treats as a stuck interrupt. The other oddity is that we see a single diagnostic failure on startup: ata1.00: Drive reports diagnostics failure. This may indicate a drive ata1.00: fault or invalid emulation. Contact drive vendor for information. That is strange, because if we manually do the soft reset from u-boot, we see the ATA feature byte return 0x01, which means success. When the kernel does the soft reset, we see a 0x00, which means failure. You would think it is timing related, but a logic analyzer trace shows reasonable timing. We need to wire up a better test rig, so I don't want folks on this list to waste any time on it. I'll report back if I learn anything of general interest. With the interrupt disabled in the device tree, and ignoring the diagnostics failure, the drive actually works. I'm able to mount a filesystem, read files from it, etc. So, the drive is fully functional, just without using interrupts. 
Therefore, I believe most everything is correct - byte lanes, read/write signaling, timing, etc. Curious. Steve
Re: [RFC 1/3] add support for exporting symbols from .S files
This makes it possible to export symbols from assembly files, instead of having to export them through an extra ksyms.c file.

Signed-off-by: Arnd Bergmann [EMAIL PROTECTED]
---
On Tuesday 12 August 2008, Stephen Rothwell wrote:
> This won't be portable across architectures as .align is sometimes in
> bytes and sometimes a power of two. You can use .balign or .p2align
> portably on gas, though.

Ok, this version uses the .balign as suggested by rusty, and also fixes building with modversions turned on, which did not work in the first version.

	Arnd

--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -1,5 +1,7 @@
 #ifndef _LINUX_MODULE_H
 #define _LINUX_MODULE_H
+
+#ifndef __ASSEMBLY__
 /*
  * Dynamic loading of modules into the kernel.
  *
@@ -605,4 +607,72 @@ static inline void module_remove_modinfo_attrs(struct module *mod)
 
 #define __MODULE_STRING(x) __stringify(x)
 
+#else /* __ASSEMBLY__ */
+#include <asm/types.h>
+
+#ifdef CONFIG_MODULES
+.macro __EXPORT_SYMBOL sym section symtab strtab crctab crc
+	.section \section, "a", @progbits
+	.type \symtab, @object
+	.balign BITS_PER_LONG/8
+\symtab:
+	.ifeq BITS_PER_LONG - 32
+	.long \sym
+	.long \strtab
+	.else
+	.quad \sym
+	.quad \strtab
+	.endif
+	.size \symtab, . - \symtab
+	.previous
+
+	.section __ksymtab_strings, "a", @progbits
+	.type \strtab, @object
+\strtab:
+	.string "\sym"
+	.size \strtab, . - \strtab
+	.previous
+
+#ifdef CONFIG_MODVERSIONS
+	/*
+	 * Modversions doesn't work with assembly files,
+	 * so insert a dummy CRC.
+	 */
+	.section __kcrctab, "a", @progbits
+	.balign 4
+	.type \crctab, @object
+\crctab:
+	.long \crc
+	.size \crctab, . - \crctab
+	.set \crc, 0
+	.previous
+#endif
+	.endm
+
+#define EXPORT_SYMBOL(sym) \
+__EXPORT_SYMBOL sym, __ksymtab, __ksymtab_ ## sym, \
+	__kstrtab_ ## sym, __kcrctab_ ## sym, __crc_ ## sym
+#define EXPORT_SYMBOL_GPL(sym) \
+__EXPORT_SYMBOL sym, __ksymtab_gpl, __ksymtab_ ## sym, \
+	__kstrtab_ ## sym, __kcrctab_ ## sym, __crc_ ## sym
+#define EXPORT_SYMBOL_GPL_FUTURE(sym) \
+__EXPORT_SYMBOL sym, __ksymtab_gpl_future, __ksymtab_ ## sym, \
+	__kstrtab_ ## sym, __kcrctab_ ## sym, __crc_ ## sym
+#define EXPORT_UNUSED_SYMBOL(sym) \
+__EXPORT_SYMBOL sym, __ksymtab_unused, __ksymtab_ ## sym, \
+	__kstrtab_ ## sym, __kcrctab_ ## sym, __crc_ ## sym
+#define EXPORT_UNUSED_SYMBOL_GPL(sym) \
+__EXPORT_SYMBOL sym, __ksymtab_unused_gpl, __ksymtab_ ## sym, \
+	__kstrtab_ ## sym, __kcrctab_ ## sym, __crc_ ## sym
+
+#else /* CONFIG_MODULES... */
+#define EXPORT_SYMBOL(sym)
+#define EXPORT_SYMBOL_GPL(sym)
+#define EXPORT_SYMBOL_GPL_FUTURE(sym)
+#define EXPORT_UNUSED_SYMBOL(sym)
+#define EXPORT_UNUSED_SYMBOL_GPL(sym)
+#endif /* !CONFIG_MODULES... */
+
+#endif /* __ASSEMBLY__ */
+
 #endif /* _LINUX_MODULE_H */
Re: [PATCH] pata_of_platform: fix no irq handling
On Tue, Aug 12, 2008 at 10:00:40AM -0400, Steven A. Falco wrote: Benjamin Herrenschmidt wrote: 1. IDE status read does not work. (But am I understand correctly that IDE works well if IRQ is unspecified? Then this is hardly an issue.) 2. IDE interrupt comes when it should not. I'd recommend to use oscilloscope to find out what is happening there, that is, if the drive actually deasserts its irq line after status read. If so, than this could be a PIC problem. What is the platform on which you're observing the issue, btw? Another possibility is that you got the wrong interrupt number in the device-tree... Ben. The platform is the AMCC Sequoia board. We've built a little adapter to connect a compact flash card to the processor bus. I believe the interrupt selection in the device tree is correct, and I've checked over the u-boot settings for the IRQ line (active high, level sensitive). IDE IRQs are active-low. -- Anton Vorontsov email: [EMAIL PROTECTED] irc://irc.freenode.net/bd2 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] pata_of_platform: fix no irq handling
On Tuesday 12 August 2008, Anton Vorontsov wrote:
> > > Another possibility is that you got the wrong interrupt number in the
> > > device-tree...
> > >
> > > Ben.
> >
> > The platform is the AMCC Sequoia board. We've built a little adapter to
> > connect a compact flash card to the processor bus. I believe the
> > interrupt selection in the device tree is correct, and I've checked over
> > the u-boot settings for the IRQ line (active high, level sensitive).
>
> IDE IRQs are active-low.

IIRC, the CompactFlash interrupt is active-high.

Best regards,
Stefan
Re: [PATCH] pata_of_platform: fix no irq handling
Anton Vorontsov wrote: 1. IDE status read does not work. (But am I understand correctly that IDE works well if IRQ is unspecified? Then this is hardly an issue.) 2. IDE interrupt comes when it should not. I'd recommend to use oscilloscope to find out what is happening there, that is, if the drive actually deasserts its irq line after status read. If so, than this could be a PIC problem. What is the platform on which you're observing the issue, btw? Another possibility is that you got the wrong interrupt number in the device-tree... Ben. The platform is the AMCC Sequoia board. We've built a little adapter to connect a compact flash card to the processor bus. I believe the interrupt selection in the device tree is correct, and I've checked over the u-boot settings for the IRQ line (active high, level sensitive). IDE IRQs are active-low. Only on the PCI and only in the native mode. Natively, the IDE INTRQ signal is active-high, rising edge triggering, as on ISA. You seem to have an invertor somewhere, if it's not a PCI chip... WBR, Sergei ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] pata_of_platform: fix no irq handling
On Tue, Aug 12, 2008 at 06:18:42PM +0400, Sergei Shtylyov wrote: Anton Vorontsov wrote: 1. IDE status read does not work. (But am I understand correctly that IDE works well if IRQ is unspecified? Then this is hardly an issue.) 2. IDE interrupt comes when it should not. I'd recommend to use oscilloscope to find out what is happening there, that is, if the drive actually deasserts its irq line after status read. If so, than this could be a PIC problem. What is the platform on which you're observing the issue, btw? Another possibility is that you got the wrong interrupt number in the device-tree... Ben. The platform is the AMCC Sequoia board. We've built a little adapter to connect a compact flash card to the processor bus. I believe the interrupt selection in the device tree is correct, and I've checked over the u-boot settings for the IRQ line (active high, level sensitive). IDE IRQs are active-low. Only on the PCI and only in the native mode. Natively, the IDE INTRQ signal is active-high, rising edge triggering, as on ISA. You seem to have an invertor somewhere, if it's not a PCI chip... Ugh. Right you are, as always. I've just looked into mpc8349emitx schematics, there is indeed an inverter on the irq line. CF in True IDE mode is active-high, sorry. -- Anton Vorontsov email: [EMAIL PROTECTED] irc://irc.freenode.net/bd2 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
RE: Does Dev Tree WORK with [EMAIL PROTECTED] #address/size = 2/1
Thank you Becky (and Kumar) for all the informationand help! To answer your questions, yes, we are using 4GB++ of memory (and plan more in the near future). But, for the initial bring up, I reduced the memory to 2Gig. Further, I have modified u-boot to NOT modify the memory reg properties (see below my snippet)... Question: what other 'devices' does u-boot put down that I care about I have modified u-boot to put the correct memory structure? Also, are you saying there are additional patches for prom parsing code for this to work right - or are you talking about in general for the 4Gig memory?? Here is a dts snippet with some of the interesting parts: /{ model = MPC8548_CHEETAH; compatible = MPC8548_CHEETAH; #address-cells = 2; #size-cells = 2; memory { device_type = memory; reg = 0 0 8000; // 2 GIG @ 0x0 }; [EMAIL PROTECTED] { #address-cells = 1; #size-cells = 1; #interrupt-cells = 2; device_type = soc; ranges = 1000 c 6df0 000ff000; reg = 000C 6df0 0 001000; // CCSRBAR . [EMAIL PROTECTED] { device_type = serial; compatible = ns16550; reg = 4500 100; // reg base, size clock-frequency = 0; // should we fill in in uboot? interrupts = 2a 2; interrupt-parent = mpic; }; [EMAIL PROTECTED] { device_type = serial; compatible = ns16550; reg = 4600 100; // reg base, size clock-frequency = 0; // should we fill in in uboot? interrupts = 2a 2; interrupt-parent = mpic; }; . }; Now, it looks like I am successfully parsing and translating the address to the expected address for the default stdout (0xc_6df0_4600)! FWIW, I have identified that another problem with the code configuring the CCSRBAR (and am sure I'll figure that one out soon because we have a working solution in the arch/ppc directory). While I have your expert attention, I'd like to have you comment about what potentially could be right/wrong with my definitions for the pci express settings... a) Do I put those ranges in the ranges for the parent soc device (also)? 
b) Do the below correctly define a 2 Gig PCI memory Window starting at 0xC_6F00_ (to 0xC_EF00_) and PCI IO 16M Window starting at 0xC_6E00_ (to 0xC_6F00_)? - /* PCI Express */ [EMAIL PROTECTED] { compatible = fsl,mpc8548-pcie; device_type = pci; #interrupt-cells = 1; #size-cells = 2; #address-cells = 3; reg = a000 1000; bus-range = 0 ff; ranges = 0200 0 c 6f00 c 6f00 0 8000 0100 0 000c 6E00 0 0100; - Thank you for all your help/comments... Sincerely, Tom Morrison -Original Message- From: Becky Bruce [mailto:[EMAIL PROTECTED] Sent: Monday, August 11, 2008 6:26 PM To: Morrison, Tom Cc: linuxppc-dev@ozlabs.org; Paul Mackerras Subject: Re: Does Dev Tree WORK with [EMAIL PROTECTED] #address/size = 2/1 On Aug 11, 2008, at 4:37 PM, Morrison, Tom wrote: I am sorry, but I've butted my head against a tree for over a week and some things just aren't making sense...especially how the prom parse code is working to exact / resolve physical addresses to then ioremap... a) Setup, I have a working MPC8548E board using 2.6.23.8 (ARCH=ppc) with PHYS/PTE_64BIT enabled (with the proper patches)). So, how much RAM are you trying to use? If it's 4GB+, there are still patches you need that haven't been released yet but should be out in the next week or so (I'm in the middle of pulling everything up to top-of-tree and re-testing). b) Goal: we want to move our board to a generic 2.6.23 version that Freescale has produced to support the MPC8572DS (and eventually use that kernel to build our BSP for our next board). c) I have successfully gotten to work a pure '32-bit' dev tree (PHYS/PTE_64BIT not defined - and the CCSRBAR @ 0xE000_) The #address/#size 1/1 is '1' (as well as the #size is '1') d) I have modified this dev tree to support the 36bit mode addressing (with the CCSRBAR and PCIExpress defined to in the last 4Gig (instead of the default first 4Gig - e.g.: 0xC_E000_)... 
e) I then set the soc to have #address-cells to '2' (#size-cells is '1') (and added the additional values accordingly to the ranges... I'm not sure you've done the right thing here. In many cases, you don't actually need to modify the size/addr cells in the soc node - those just
Re: Does Dev Tree WORK with [EMAIL PROTECTED] #address/size = 2/1
On Aug 12, 2008, at 10:14 AM, Morrison, Tom wrote: Thank you Becky (and Kumar) for all the informationand help! To answer your questions, yes, we are using 4GB++ of memory (and plan more in the near future). But, for the initial bring up, I reduced the memory to 2Gig. Further, I have modified u-boot to NOT modify the memory reg properties (see below my snippet)... Question: what other 'devices' does u-boot put down that I care about I have modified u-boot to put the correct memory structure? If you have 4GB of RAM, you need to move *all* devices in u-boot above 4GB. The config file for your board should have all the information you need about what's out there. Also, are you saying there are additional patches for prom parsing code for this to work right - or are you talking about in general for the 4Gig memory?? I was talking about in general for 4GB. The prom parsing code should work already. The issue with having 4GB is that there are some devices that only recognize 32 bits of physical address. These devices will require bounce buffering a la swiotlb in order to do dmas when the dma buffer is somewhere above about 3.5GB (that number depends on your board - the reason you can't see the whole 4GB of space is that some portion of the PCI space is reserved for PCI-IO). I'm currently working through some issues that popped up when I updated to TOT and picked up the dma_attrs changes. I expect to have these patches out within a week or so. Here is a dts snippet with some of the interesting parts: /{ model = MPC8548_CHEETAH; compatible = MPC8548_CHEETAH; #address-cells = 2; #size-cells = 2; memory { device_type = memory; reg = 0 0 8000; // 2 GIG @ 0x0 }; [EMAIL PROTECTED] { #address-cells = 1; #size-cells = 1; #interrupt-cells = 2; device_type = soc; ranges = 1000 c 6df0 000ff000; reg = 000C 6df0 0 001000; // CCSRBAR . [EMAIL PROTECTED] { device_type = serial; compatible = ns16550; reg = 4500 100; // reg base, size clock-frequency = 0;// should we fill in in uboot? 
interrupts = 2a 2; interrupt-parent = mpic; }; [EMAIL PROTECTED] { device_type = serial; compatible = ns16550; reg = 4600 100; // reg base, size clock-frequency = 0;// should we fill in in uboot? interrupts = 2a 2; interrupt-parent = mpic; }; . }; Now, it looks like I am successfully parsing and translating the address to the expected address for the default stdout (0xc_6df0_4600)! Yeah, this looks OK (other than the fact that you're using an old .dts format). FWIW, I have identified that another problem with the code configuring the CCSRBAR (and am sure I'll figure that one out soon because we have a working solution in the arch/ppc directory). FWIW, I moved CCSRBAR in u-boot. While I have your expert attention, I'd like to have you comment about what potentially could be right/wrong with my definitions for the pci express settings... a) Do I put those ranges in the ranges for the parent soc device (also)? b) Do the below correctly define a 2 Gig PCI memory Window starting at 0xC_6F00_ (to 0xC_EF00_) and PCI IO 16M Window starting at 0xC_6E00_ (to 0xC_6F00_)? - /* PCI Express */ [EMAIL PROTECTED] { compatible = fsl,mpc8548-pcie; device_type = pci; #interrupt-cells = 1; #size-cells = 2; #address-cells = 3; reg = a000 1000; bus-range = 0 ff; ranges = 0200 0 c 6f00 c 6f00 0 8000 0100 0 000c 6E00 0 0100; This doesn't look right to me, assuming your pcie device is at the same tree level as mine. What's the parent node and what are address- cells and size-cells in the parent? I'm basically at next-to-top level, under the platform node. Here's an example from my working tree: model = MPC8641HPCN; compatible = mpc86xx; #address-cells = 2; #size-cells = 2; ... pci0: [EMAIL PROTECTED] { cell-index = 0; compatible = fsl,mpc8641-pcie; device_type = pci; #interrupt-cells = 1; #size-cells = 2; #address-cells = 3; reg = 0x0f 0xf8008000 0x0 0x1000; bus-range = 0x0 0xff; ranges = 0x0200 0x0 0x8000 0x0f 0x8000 0x0
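Becky's remark about devices that decode only 32 bits of physical address boils down to a range check: if any byte of a DMA buffer lies above what the device can reach, the transfer has to go through a bounce buffer. The following is a rough sketch of that check only, not kernel code. needs_bounce() and its parameters are hypothetical names for illustration; in the kernel the actual work is done by swiotlb, and the reachable limit depends on the board (the "about 3.5GB" figure mentioned above).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if the buffer [addr, addr + len) is not fully reachable
 * by a device whose highest addressable physical byte is dma_limit.
 * An empty buffer never needs bouncing; an address wrap always does.
 */
static bool needs_bounce(uint64_t addr, uint64_t len, uint64_t dma_limit)
{
	uint64_t last;

	if (len == 0)
		return false;
	last = addr + len - 1;
	if (last < addr) /* address arithmetic wrapped around */
		return true;
	return last > dma_limit;
}
```

For a plain 32-bit device dma_limit would be 0xFFFFFFFF. A board that reserves part of that space for PCI I/O would pass a lower value, which is why the threshold ends up below 4GB.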
[PATCH 2/2] rtc: bunch of drivers: fix 'no irq' case handling
This patch fixes bunch of irq checking misuses.

Most drivers were getting irq via platform_get_irq(), which returns -ENXIO or r->start. Platforms may specify r->start = 0 to emphasize 'no irq' case, and drivers should handle this correctly.

rtc-cmos.c is special. It is using PNP and platform bindings. Hopefully nobody is using PNP IRQ 0 for RTC. So the changes should be safe.

Also, rtc-sh.c was using platform_get_irq, and stored a result into an unsigned type, then was checking for < 0. This is fixed now.

Signed-off-by: Anton Vorontsov [EMAIL PROTECTED]
---

Unlike the first patch, this one is untested. Please review carefully.

 drivers/rtc/rtc-at32ap700x.c |    2 +-
 drivers/rtc/rtc-cmos.c       |    4 ++--
 drivers/rtc/rtc-ds1511.c     |   18 +-
 drivers/rtc/rtc-ds1553.c     |   14 +++---
 drivers/rtc/rtc-m48t59.c     |    2 +-
 drivers/rtc/rtc-sh.c         |   10 ++
 drivers/rtc/rtc-stk17ta8.c   |   14 +++---
 drivers/rtc/rtc-vr41xx.c     |    4 ++--
 8 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/drivers/rtc/rtc-at32ap700x.c b/drivers/rtc/rtc-at32ap700x.c
index 90b9a65..e327f6f 100644
--- a/drivers/rtc/rtc-at32ap700x.c
+++ b/drivers/rtc/rtc-at32ap700x.c
@@ -222,7 +222,7 @@ static int __init at32_rtc_probe(struct platform_device *pdev)
 	}
 
 	irq = platform_get_irq(pdev, 0);
-	if (irq < 0) {
+	if (irq <= 0) {
 		dev_dbg(&pdev->dev, "could not get irq\n");
 		ret = -ENXIO;
 		goto out;
diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index 6ea349a..2dea444 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -57,8 +57,8 @@ struct cmos_rtc {
 	u8			century;
 };
 
-/* both platform and pnp busses use negative numbers for invalid irqs */
-#define is_valid_irq(n)		((n) >= 0)
+/* both platform and pnp busses use positive numbers for valid irqs */
+#define is_valid_irq(n)		((n) > 0)
 
 static const char driver_name[] = "rtc_cmos";
 
diff --git a/drivers/rtc/rtc-ds1511.c b/drivers/rtc/rtc-ds1511.c
index 0f0d27d..434b045 100644
--- a/drivers/rtc/rtc-ds1511.c
+++ b/drivers/rtc/rtc-ds1511.c
@@ -326,9 +326,9 @@ ds1511_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 	struct platform_device *pdev = to_platform_device(dev);
 	struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
 
-	if (pdata->irq < 0) {
+	if (pdata->irq <= 0)
 		return -EINVAL;
-	}
+
 	pdata->alrm_mday = alrm->time.tm_mday;
 	pdata->alrm_hour = alrm->time.tm_hour;
 	pdata->alrm_min = alrm->time.tm_min;
@@ -346,9 +346,9 @@ ds1511_rtc_read_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 	struct platform_device *pdev = to_platform_device(dev);
 	struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
 
-	if (pdata->irq < 0) {
+	if (pdata->irq <= 0)
 		return -EINVAL;
-	}
+
 	alrm->time.tm_mday = pdata->alrm_mday < 0 ? 0 : pdata->alrm_mday;
 	alrm->time.tm_hour = pdata->alrm_hour < 0 ? 0 : pdata->alrm_hour;
 	alrm->time.tm_min = pdata->alrm_min < 0 ? 0 : pdata->alrm_min;
@@ -385,7 +385,7 @@ ds1511_rtc_release(struct device *dev)
 	struct platform_device *pdev = to_platform_device(dev);
 	struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
 
-	if (pdata->irq >= 0) {
+	if (pdata->irq > 0) {
 		pdata->irqen = 0;
 		ds1511_rtc_update_alarm(pdata);
 	}
@@ -397,7 +397,7 @@ ds1511_rtc_ioctl(struct device *dev, unsigned int cmd, unsigned long arg)
 	struct platform_device *pdev = to_platform_device(dev);
 	struct rtc_plat_data *pdata = platform_get_drvdata(pdev);
 
-	if (pdata->irq < 0) {
+	if (pdata->irq <= 0) {
 		return -ENOIOCTLCMD; /* fall back into rtc-dev's emulation */
 	}
 	switch (cmd) {
@@ -559,7 +559,7 @@ ds1511_rtc_probe(struct platform_device *pdev)
 	 * if the platform has an interrupt in mind for this device,
 	 * then by all means, set it
 	 */
-	if (pdata->irq >= 0) {
+	if (pdata->irq > 0) {
 		rtc_read(RTC_CMD1);
 		if (request_irq(pdata->irq, ds1511_interrupt,
 			IRQF_DISABLED | IRQF_SHARED, pdev->name, pdev) < 0) {
@@ -586,7 +586,7 @@ ds1511_rtc_probe(struct platform_device *pdev)
 	if (pdata->rtc) {
 		rtc_device_unregister(pdata->rtc);
 	}
-	if (pdata->irq >= 0) {
+	if (pdata->irq > 0) {
 		free_irq(pdata->irq, pdev);
 	}
 	if (ds1511_base) {
@@ -609,7 +609,7 @@ ds1511_rtc_remove(struct platform_device *pdev)
 	sysfs_remove_bin_file(&pdev->dev.kobj, &ds1511_nvram_attr);
 	rtc_device_unregister(pdata->rtc);
 	pdata->rtc = NULL;
-	if (pdata->irq >= 0) {
+	if (pdata->irq > 0) {
 		/*
 		 * disable the alarm interrupt
 		 */
diff --git
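The two pitfalls named in the description (treating 0 as a valid irq, and storing a possibly negative return value in an unsigned variable) can be reproduced outside the kernel. The sketch below is only an illustration under stated assumptions: has_irq() and has_irq_buggy() are made-up helper names, and the int argument stands in for whatever platform_get_irq() returned (a positive irq number, 0 for "no irq", or -ENXIO).

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Check in the spirit of the patch: only strictly positive values count
 * as a usable irq; 0 ("no irq") and negative errno values are rejected.
 */
static bool has_irq(int ret)
{
	return ret > 0;
}

/*
 * Buggy pattern: the return value is stored in an unsigned variable.
 * -ENXIO wraps to a large positive number, so a later "negative?" test
 * on this variable can never fire and the error is taken for a valid
 * irq.
 */
static bool has_irq_buggy(int ret)
{
	unsigned int irq = (unsigned int)ret; /* sign information is lost */

	return irq != 0;
}
```

Keeping the value in a signed int until it has been validated, as the patch description says was done for rtc-sh.c, avoids the second problem entirely.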
RE: [PATCH]: [MPC5200] Add ATA DMA support
Hi Tim,

Continuing the discussion on the mailing list...

Looking at the original patch, I don't understand why you had to duplicate the bestcomm data structures and functions. The only apparent difference is that you have a minimal data length of 2 bytes instead of 1. Does this make any difference, as the bd_size will be filled with the correct length value anyway?

Moreover, what is the difference between bcom_submit_next_buffer() and bcom_submit_next_buffer2()? The same goes for bcom_retrieve_buffer() and bcom_retrieve_buffer2(). Why are these functions implemented differently?

Best regards,
Daniel Schnell.
Re: [PATCH 5/5 try2] ib/ehca: discard double CQE for one WR
thanks, applied all 5.
[PATCH 1/1] powerpc: Fix vio_bus_probe oops on probe error
When CMO is enabled and booted on a non CMO system and the VIO device's probe function fails, an oops can result since vio_cmo_bus_remove is called when it should not.

cpu 0x0: Vector: 300 (Data Access) at [ce13b3d0]
    pc: c0020d34: .vio_cmo_bus_remove+0xc0/0x1f4
    lr: c0020ca4: .vio_cmo_bus_remove+0x30/0x1f4
    sp: ce13b650
   msr: 80009032
   dar: 0
 dsisr: 4000
  current = 0xce0566c0
  paca    = 0xc06f9b80
    pid   = 2428, comm = modprobe
enter ? for help
[ce13b6e0] c0021d94 .vio_bus_probe+0x2f8/0x33c
[ce13b7a0] c029fc88 .driver_probe_device+0x13c/0x200
[ce13b830] c029fdac .__driver_attach+0x60/0xa4
[ce13b8c0] c029f050 .bus_for_each_dev+0x80/0xd8
[ce13b980] c029f9ec .driver_attach+0x28/0x40
[ce13ba00] c029f630 .bus_add_driver+0xd4/0x284
[ce13baa0] c02a01bc .driver_register+0xc4/0x198
[ce13bb50] c002168c .vio_register_driver+0x40/0x5c
[ce13bbe0] d03b3f1c .ibmvfc_module_init+0x70/0x109c [ibmvfc]
[ce13bc70] c00acf08 .sys_init_module+0x184c/0x1a10
[ce13be30] c0008748 syscall_exit+0x0/0x40

Signed-off-by: Brian King [EMAIL PROTECTED]
---

 arch/powerpc/kernel/vio.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/powerpc/kernel/vio.c~powerpc_vio_bus_probe_error arch/powerpc/kernel/vio.c
--- linux-2.6/arch/powerpc/kernel/vio.c~powerpc_vio_bus_probe_error 2008-08-12 13:43:02.0 -0500
+++ linux-2.6-bjking1/arch/powerpc/kernel/vio.c 2008-08-12 13:43:56.0 -0500
@@ -1113,7 +1113,7 @@ static int vio_bus_probe(struct device *
 				return error;
 		}
 		error = viodrv->probe(viodev, id);
-		if (error)
+		if (error && firmware_has_feature(FW_FEATURE_CMO))
 			vio_cmo_bus_remove(viodev);
 	}
 
_
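The shape of this fix, undoing a setup step only when that step was actually performed, can be shown with a toy model. Everything in the sketch is hypothetical: struct fake_dev, cmo_enabled (standing in for firmware_has_feature(FW_FEATURE_CMO)), bus_probe() and the probe callbacks are invented for illustration and are not the vio bus code.

```c
#include <stdbool.h>

struct fake_dev {
	bool cmo_ready; /* CMO bookkeeping has been set up for this device */
	int  removes;   /* how many times the CMO teardown ran */
};

/* Stand-in for firmware_has_feature(FW_FEATURE_CMO). */
static bool cmo_enabled;

static int probe_ok(struct fake_dev *d)   { (void)d; return 0; }
static int probe_fail(struct fake_dev *d) { (void)d; return -1; }

/*
 * Model of a bus probe: CMO setup happens only when the feature is
 * present, so the teardown on driver probe failure must be guarded by
 * the same condition. Without the guard, teardown would run against
 * state that was never initialized.
 */
static int bus_probe(struct fake_dev *d, int (*drv_probe)(struct fake_dev *))
{
	int error;

	if (cmo_enabled)
		d->cmo_ready = true;

	error = drv_probe(d);
	if (error && cmo_enabled) {
		d->cmo_ready = false;
		d->removes++;
	}
	return error;
}
```

With cmo_enabled false, a failing driver probe leaves removes at zero, which mirrors the non-CMO system in the report.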
Re: [RFC/PATCH 1/3] powerpc: add ioremap_bat() function for setting up BAT translated IO regions.
On Thu, Aug 07, 2008 at 07:04:04PM -0500, Kumar Gala wrote: mem_init_done isn't a good indication. We can do page tables when it's 0, we would have to use a separate mem_preinit_done or something :-) I initially also though about a flag to ioremap_prot to be honest. But it does obfuscate the normal ioremap code path and if there's a flag, that means that callers know the difference and thus may as well call a separate function, don't you think ? I'm ok with exposing a separate function as far as the API goes.. I'm not ok with duplicating the logic of __ioremap(). Turns out there is very little actual duplication of code with __ioremap(). The checks for p_mapped_by_* are the same, but all the alignment checks are different because different boundaries are used. I attempted to break things down to a common function, but there is not a lot there. But I will add a function to manage modification of ioremap_bot. g. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
[PATCH] gianfar: Call gfar_halt_nodisable() from gfar_halt().
gfar_halt() was factored out into halting and disabling by commit d87eb12785c14de1586e3bad86ca2c0991300339, as the suspend() method only wants to do the former. However, the call to gfar_halt_nodisable() from gfar_halt() apparently got lost during the patch respin process. This adds it back.

Signed-off-by: Scott Wood [EMAIL PROTECTED]
---
 drivers/net/gianfar.c |    6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index ca6cf6e..999d691 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -134,9 +134,7 @@ static int gfar_process_frame(struct net_device *dev, struct sk_buff *skb, int l
 static void gfar_vlan_rx_register(struct net_device *netdev,
 				struct vlan_group *grp);
 void gfar_halt(struct net_device *dev);
-#ifdef CONFIG_PM
 static void gfar_halt_nodisable(struct net_device *dev);
-#endif
 void gfar_start(struct net_device *dev);
 static void gfar_clear_exact_match(struct net_device *dev);
 static void gfar_set_mac_for_addr(struct net_device *dev, int num, u8 *addr);
@@ -631,7 +629,6 @@ static void init_registers(struct net_device *dev)
 }
 
 
-#ifdef CONFIG_PM
 /* Halt the receive and transmit queues */
 static void gfar_halt_nodisable(struct net_device *dev)
 {
@@ -657,7 +654,6 @@ static void gfar_halt_nodisable(struct net_device *dev)
 			cpu_relax();
 	}
 }
-#endif
 
 /* Halt the receive and transmit queues */
 void gfar_halt(struct net_device *dev)
@@ -666,6 +662,8 @@ void gfar_halt(struct net_device *dev)
 	struct gfar __iomem *regs = priv->regs;
 	u32 tempval;
 
+	gfar_halt_nodisable(dev);
+
 	/* Disable Rx and Tx */
 	tempval = gfar_read(&regs->maccfg1);
 	tempval &= ~(MACCFG1_RX_EN | MACCFG1_TX_EN);
-- 
1.5.6.rc1.6.gc53ad.dirty
Re: [PATCH] powerpc: some vmlinux files that should be ignored
In message [EMAIL PROTECTED] you wrote:

A couple of vmlinux output files that should be ignored: vmlinux.strip and vmlinux.lds.

Signed-off-by: Sean MacLennan [EMAIL PROTECTED]
---
diff --git a/.gitignore b/.gitignore

This will need to go to LKML.

Mikey

index 869e1a3..f7e924a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -32,6 +32,7 @@
 tags
 TAGS
 vmlinux
+vmlinux.strip
 System.map
 Module.markers
 Module.symvers
diff --git a/arch/powerpc/kernel/.gitignore b/arch/powerpc/kernel/.gitignore
new file mode 100644
index 000..c5f676c
--- /dev/null
+++ b/arch/powerpc/kernel/.gitignore
@@ -0,0 +1 @@
+vmlinux.lds
[PATCH] remove redundant sysfs_remove_file calls for cache info
When removing a directory, the sysfs core takes care of removing files in the directory (see sysfs_remove_dir()). So when we are about to delete a kobject (and thus cause its sysfs directory to be removed), we don't have to explicitly remove the files attached to it, although it's harmless to do so.

Signed-off-by: Nathan Lynch [EMAIL PROTECTED]
---
 arch/powerpc/kernel/sysfs.c |   11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 56d172d..12058db 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -641,16 +641,9 @@ static void remove_cache_info(struct sys_device *sysdev)
 	int cpu = sysdev->id;
 
 	cache_desc = per_cpu(cache_desc, cpu);
-	if (cache_desc != NULL) {
-		sysfs_remove_file(&cache_desc->kobj, &cache_size_attr.attr);
-		sysfs_remove_file(&cache_desc->kobj, &cache_line_size_attr.attr);
-		sysfs_remove_file(&cache_desc->kobj, &cache_type_attr.attr);
-		sysfs_remove_file(&cache_desc->kobj, &cache_level_attr.attr);
-		sysfs_remove_file(&cache_desc->kobj, &cache_nr_sets_attr.attr);
-		sysfs_remove_file(&cache_desc->kobj, &cache_assoc_attr.attr);
-
+	if (cache_desc != NULL)
 		kobject_put(&cache_desc->kobj);
-	}
+
 	cache_toplevel = per_cpu(cache_toplevel, cpu);
 	if (cache_toplevel != NULL)
 		kobject_put(cache_toplevel);
-- 
1.5.5
[PATCH 4/5] powerpc: Make the 64-bit kernel as a position-independent executable
This implements CONFIG_RELOCATABLE for 64-bit by building the kernel as a position-independent executable (PIE). This involves processing the dynamic relocations in the image in the early stages of booting, even if the kernel is being run at the address it is linked at, since the linker does not necessarily fill in words in the image for which there are dynamic relocations.

The dynamic relocations are processed by a new function relocate(addr), where the addr parameter is the virtual address where the image will be run. In fact we call it twice; once before calling prom_init, and again when starting the main kernel. This means that reloc_offset() returns 0 in prom_init (since it has been relocated to the address it is running at), which necessitated a few adjustments.

The relocate() function currently only handles R_PPC64_RELATIVE relocs, which are very simple to process (and the linker puts them all first in the dynamic relocation section, and tells us how many of them there are). Currently we only get R_PPC64_RELATIVE relocs, plus one R_PPC64_NONE reloc which we can ignore, plus some relocs against weak undefined symbols (e.g. mach_iseries, mach_powermac) which we can also ignore. Ideally we would have a little program to check that we hadn't inadvertently ended up with any other relocs.

This also changes __va and __pa to use an equivalent definition that is simpler. With the relocatable kernel, PAGE_OFFSET and MEMORY_START are constants (for 64-bit) whereas PHYSICAL_START is a variable (and KERNELBASE ideally should be too, but isn't yet).

With this, relocatable kernels still copy themselves down to physical address 0 and run there.
Signed-off-by: Paul Mackerras [EMAIL PROTECTED]
---

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 63c9caf..5a5cf3f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -809,6 +809,19 @@ config PIN_TLB
 endmenu
 
 if PPC64
+config RELOCATABLE
+	bool "Build a relocatable kernel"
+	help
+	  This builds a kernel image that is capable of running anywhere
+	  in the RMA (real memory area) at any 16k-aligned base address.
+	  The kernel is linked as a position-independent executable (PIE)
+	  and contains dynamic relocations which are processed early
+	  in the bootup process.
+
+	  One use is for the kexec on panic case where the recovery kernel
+	  must live at a different physical address than the primary
+	  kernel.
+
 config PAGE_OFFSET
 	hex
 	default 0xc000
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 9155c93..9e5a53f 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -63,7 +63,8 @@ override CC += -m$(CONFIG_WORD_SIZE)
 override AR	:= GNUTARGET=elf$(CONFIG_WORD_SIZE)-powerpc $(AR)
 endif
 
-LDFLAGS_vmlinux	:= -Bstatic
+LDFLAGS_vmlinux-$(CONFIG_PPC64)$(CONFIG_RELOCATABLE) := -pie
+LDFLAGS_vmlinux	:= -Bstatic $(LDFLAGS_vmlinux-yy)
 
 CFLAGS-$(CONFIG_PPC64)	:= -mminimal-toc -mtraceback=none -mcall-aixdesc
 CFLAGS-$(CONFIG_PPC32)	:= -ffixed-r2 -mmultiple
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 14174aa..9109e1f 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -310,8 +310,11 @@ $(obj)/dtbImage.%: vmlinux $(wrapperbits) $(obj)/%.dtb
 $(obj)/vmlinux.strip: vmlinux
 	$(STRIP) -s -R .comment $< -o $@
 
+# The iseries hypervisor won't take an ET_DYN executable, so this
+# changes the type (byte 17) in the file to ET_EXEC (2).
 $(obj)/zImage.iseries: vmlinux
 	$(STRIP) -s -R .comment $< -o $@
+	printf "\x02" | dd of=$@ conv=notrunc bs=1 seek=17
 
 $(obj)/uImage: vmlinux $(wrapperbits)
 	$(call if_changed,wrap,uboot)
diff --git a/arch/powerpc/boot/elf_util.c b/arch/powerpc/boot/elf_util.c
index 7454aa4..1567a0c 100644
--- a/arch/powerpc/boot/elf_util.c
+++ b/arch/powerpc/boot/elf_util.c
@@ -27,7 +27,8 @@ int parse_elf64(void *hdr, struct elf_info *info)
 	      elf64->e_ident[EI_MAG3]  == ELFMAG3	&&
 	      elf64->e_ident[EI_CLASS] == ELFCLASS64	&&
 	      elf64->e_ident[EI_DATA]  == ELFDATA2MSB	&&
-	      elf64->e_type            == ET_EXEC	&&
+	      (elf64->e_type           == ET_EXEC ||
+	       elf64->e_type           == ET_DYN)	&&
 	      elf64->e_machine         == EM_PPC64))
 		return 0;
 
@@ -58,7 +59,8 @@ int parse_elf32(void *hdr, struct elf_info *info)
 	      elf32->e_ident[EI_MAG3]  == ELFMAG3	&&
 	      elf32->e_ident[EI_CLASS] == ELFCLASS32	&&
 	      elf32->e_ident[EI_DATA]  == ELFDATA2MSB	&&
-	      elf32->e_type            == ET_EXEC	&&
+	      (elf32->e_type           == ET_EXEC ||
+	       elf32->e_type           == ET_DYN)	&&
 	      elf32->e_machine         == EM_PPC))
 		return 0;
 
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index
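A note on the zImage.iseries rule in the boot Makefile hunk of this patch: e_type is a 16-bit field at byte offset 16 of the ELF header, so in a big-endian image byte 17 holds its low-order byte, and writing 2 there turns ET_DYN (3) into ET_EXEC (2). Below is a minimal C sketch of the same one-byte patch applied to an in-memory copy of the header. The helper names are made up for illustration and are not part of the patch, and the sketch assumes the high byte of e_type is already zero.

```c
#include <stddef.h>
#include <stdint.h>

#define ET_EXEC 2	/* executable file */
#define ET_DYN  3	/* shared object, or PIE */

/* e_type occupies bytes 16..17 of the ELF header, right after e_ident[16]. */
#define E_TYPE_OFFSET 16

/* Read e_type from a big-endian (ELFDATA2MSB) ELF header. */
static uint16_t elf_be_type(const uint8_t *hdr)
{
	return (uint16_t)((hdr[E_TYPE_OFFSET] << 8) | hdr[E_TYPE_OFFSET + 1]);
}

/*
 * Hypothetical helper: overwrite byte 17 with ET_EXEC, which is what the
 * printf | dd pipeline in the Makefile rule does to the output file.
 * Returns 0 on success, -1 if the buffer is too short to hold e_type.
 */
static int force_et_exec(uint8_t *hdr, size_t len)
{
	if (len < E_TYPE_OFFSET + 2)
		return -1;
	hdr[E_TYPE_OFFSET + 1] = ET_EXEC;
	return 0;
}
```

Only the low byte is touched, so a header whose e_type had a non-zero high byte would not end up as ET_EXEC; for ET_DYN that does not arise.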
[PATCH 2/5] powerpc: Make it possible to move the interrupt handlers away from the kernel
This changes the way that the exception prologs transfer control to the handlers in 64-bit kernels with the aim of making it possible to have the prologs separate from the main body of the kernel. Now, instead of computing the address of the handler by taking the top 32 bits of the paca address (to get the 0xc000 part) and ORing in something in the bottom 16 bits, we get the base address of the kernel by doing a load from the paca and add an offset.

This also replaces an mfmsr and an ori to compute the MSR value for the handler with a load from the paca. That makes it unnecessary to have a separate version of EXCEPTION_PROLOG_PSERIES that forces 64-bit mode.

We can no longer use direct branches in the exception prolog code, which means that the SLB miss handlers can't branch directly to .slb_miss_realmode any more. Instead we have to compute the address and do an indirect branch. Since the secondary CPUs on pSeries start execution in the first 0x100 bytes of real memory and then have to get to wherever the kernel is, we can't use a direct branch to get there. Instead this changes __secondary_hold_spinloop from a flag to a function pointer. When it is set to a non-NULL value, the secondary CPUs jump to the function pointed to by that value.

Finally this eliminates one code difference between 32-bit and 64-bit by making __secondary_hold be the text address of the secondary CPU spinloop rather than a function descriptor for it.

Signed-off-by: Paul Mackerras [EMAIL PROTECTED]
---

diff --git a/arch/powerpc/include/asm/exception.h b/arch/powerpc/include/asm/exception.h
index 329148b..d3d4534 100644
--- a/arch/powerpc/include/asm/exception.h
+++ b/arch/powerpc/include/asm/exception.h
@@ -53,14 +53,8 @@
  * low halfword of the address, but for Kdump we need the whole low
  * word.
  */
-#ifdef CONFIG_CRASH_DUMP
 #define LOAD_HANDLER(reg, label)					\
-	oris	reg,reg,(label)@h;	/* virt addr of handler ... */	\
-	ori	reg,reg,(label)@l;	/* .. and the rest */
-#else
-#define LOAD_HANDLER(reg, label)					\
-	ori	reg,reg,(label)@l;	/* virt addr of handler ... */
-#endif
+	addi	reg,reg,(label)-_stext;	/* virt addr of handler ... */
 
 #define EXCEPTION_PROLOG_1(area)				\
 	mfspr	r13,SPRN_SPRG3;		/* get paca address into r13 */	\
@@ -72,37 +66,12 @@
 	std	r9,area+EX_R13(r13);	\
 	mfcr	r9
 
-/*
- * Equal to EXCEPTION_PROLOG_PSERIES, except that it forces 64bit mode.
- * The firmware calls the registered system_reset_fwnmi and
- * machine_check_fwnmi handlers in 32bit mode if the cpu happens to run
- * a 32bit application at the time of the event.
- * This firmware bug is present on POWER4 and JS20.
- */
-#define EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(area, label)		\
-	EXCEPTION_PROLOG_1(area);					\
-	clrrdi	r12,r13,32;	/* get high part of label */	\
-	mfmsr	r10;							\
-	/* force 64bit mode */						\
-	li	r11,5;		/* MSR_SF_LG|MSR_ISF_LG */	\
-	rldimi	r10,r11,61,0;	/* insert into top 3 bits */	\
-	/* done 64bit mode */						\
-	mfspr	r11,SPRN_SRR0;	/* save SRR0 */			\
-	LOAD_HANDLER(r12,label)						\
-	ori	r10,r10,MSR_IR|MSR_DR|MSR_RI;				\
-	mtspr	SPRN_SRR0,r12;						\
-	mfspr	r12,SPRN_SRR1;	/* and SRR1 */			\
-	mtspr	SPRN_SRR1,r10;						\
-	rfid;								\
-	b	.	/* prevent speculative execution */
-
 #define EXCEPTION_PROLOG_PSERIES(area, label)				\
 	EXCEPTION_PROLOG_1(area);					\
-	clrrdi	r12,r13,32;	/* get high part of label */	\
-	mfmsr	r10;							\
+	ld	r12,PACAKBASE(r13);	/* get high part of label */	\
+	ld	r10,PACAKMSR(r13);	/* get MSR value for kernel */	\
 	mfspr	r11,SPRN_SRR0;	/* save SRR0 */			\
 	LOAD_HANDLER(r12,label)						\
-	ori	r10,r10,MSR_IR|MSR_DR|MSR_RI;				\
 	mtspr	SPRN_SRR0,r12;						\
 	mfspr	r12,SPRN_SRR1;	/* and SRR1 */			\
 	mtspr	SPRN_SRR1,r10;						\
@@ -210,11 +179,10 @@ label##_pSeries:
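To make the address arithmetic in this description concrete, here is a small C model of the two ways of forming the handler address. It is only a sketch of what the commit message says: the function names are invented for illustration, and the old variant models the non-kdump LOAD_HANDLER (a single ori of the low 16 bits).

```c
#include <stdint.h>

/*
 * Old scheme: keep the top 32 bits of the paca address (clrrdi r12,r13,32)
 * and OR in the low 16 bits of the handler's link-time address.
 */
static uint64_t handler_addr_old(uint64_t paca, uint64_t label)
{
	return (paca & ~0xffffffffULL) | (label & 0xffffULL);
}

/*
 * New scheme: load the kernel base from the paca and add the handler's
 * offset from _stext (addi reg,reg,(label)-_stext).
 */
static uint64_t handler_addr_new(uint64_t kernel_base, uint64_t label,
				 uint64_t stext)
{
	return kernel_base + (label - stext);
}
```

The old form only yields the right address when the kernel runs at its link address; the new form follows whatever base the paca holds. Since addi takes a signed 16-bit immediate, the offset (label)-_stext has to stay below 0x8000 for this to assemble.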
[PATCH 5/5] powerpc: Run relocatable kernel where it's loaded
This demonstrates that the relocatable kernel doesn't have to run at real address 0. It only copies the interrupt vectors down and leaves the rest of the kernel where it was loaded, and runs it there.

This is mostly just a proof of concept, since it doesn't do anything to ensure that the kernel base address is 16kB-aligned, and we probably want to move the kernel down to 0 in most cases (except for kdump kernels) anyway.

Signed-off-by: Paul Mackerras [EMAIL PROTECTED]
---

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index abb3bfe..fdb8565 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -1377,6 +1377,7 @@ _STATIC(__after_prom_start)
 	/* process relocations for the final address of the kernel */
 	lis	r25,[EMAIL PROTECTED]	/* compute virtual base of kernel */
 	sldi	r25,r25,32
+	add	r25,r25,r26
 	mr	r3,r25
 	bl	.relocate
 #endif
@@ -1391,10 +1392,13 @@ _STATIC(__after_prom_start)
 	li	r3,0			/* target addr */
 	mr.	r4,r26			/* In some cases the loader may */
 	beq	9f			/* have already put us at zero */
-	lis	r5,(copy_to_here - _stext)@ha
-	addi	r5,r5,(copy_to_here - _stext)@l /* # bytes of memory to copy */
 	li	r6,0x100		/* Start offset, the first 0x100 */
 					/* bytes were copied earlier. */
+#ifdef CONFIG_RELOCATABLE
+	li	r5,__end_interrupts - _stext	/* just copy interrupts */
+#else
+	lis	r5,(copy_to_here - _stext)@ha
+	addi	r5,r5,(copy_to_here - _stext)@l /* # bytes of memory to copy */
 
 	bl	.copy_and_flush		/* copy the first n bytes */
 					/* this includes the code being */
@@ -1404,15 +1408,16 @@ _STATIC(__after_prom_start)
 	mtctr	r8
 	bctr
 
+p_end:	.llong	_end - _stext
+
 4:	/* Now copy the rest of the kernel up to _end */
 	addis	r5,r26,(p_end - _stext)@ha
 	ld	r5,(p_end - _stext)@l(r5)	/* get _end */
+#endif
 	bl	.copy_and_flush		/* copy the rest */
 
 9:	b	.start_here_multiplatform
 
-p_end:	.llong	_end - _stext
-
 /*
  * Copy routine used to copy the kernel to start at physical address 0
  * and flush and invalidate the caches as needed.
[PATCH 3/5] powerpc: Only use LOAD_REG_IMMEDIATE for constants on 64-bit
Using LOAD_REG_IMMEDIATE to get the address of kernel symbols generates 5 instructions where LOAD_REG_ADDR can do it in one, and will generate R_PPC64_ADDR16_* relocations in the output when we get to building the kernel as a position-independent executable, which we'd rather not have to handle. This changes various bits of assembly code to use LOAD_REG_ADDR when we need to get the address of a symbol, or to use suitable position-independent code for cases where we can't access the TOC for various reasons, or if we're not running at the address we were linked at.

It also cleans up a few minor things; there's no reason to save and restore SRR0/1 around RTAS calls, __mmu_off can get the return address from LR more conveniently than the caller can supply it in R4 (and we already assume elsewhere that EA == RA if the MMU is on in early boot), and enable_64b_mode was using 5 instructions where 2 would do.

Signed-off-by: Paul Mackerras [EMAIL PROTECTED]
---

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 0966899..c4a029c 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -268,7 +268,7 @@ n:
  *   Loads the value of the constant expression 'expr' into register 'rn'
  *   using immediate instructions only.  Use this when it's important not
  *   to reference other data (i.e. on ppc64 when the TOC pointer is not
- *   valid).
+ *   valid) and when 'expr' is a constant or absolute address.
  *
  * LOAD_REG_ADDR(rn, name)
  *   Loads the address of label 'name' into register 'rn'.  Use this when
diff --git a/arch/powerpc/kernel/cpu_setup_ppc970.S b/arch/powerpc/kernel/cpu_setup_ppc970.S
index bf118c3..27f2507 100644
--- a/arch/powerpc/kernel/cpu_setup_ppc970.S
+++ b/arch/powerpc/kernel/cpu_setup_ppc970.S
@@ -110,7 +110,7 @@ load_hids:
 	isync
 
 	/* Save away cpu state */
-	LOAD_REG_IMMEDIATE(r5,cpu_state_storage)
+	LOAD_REG_ADDR(r5,cpu_state_storage)
 
 	/* Save HID0,1,4 and 5 */
 	mfspr	r3,SPRN_HID0
@@ -134,7 +134,7 @@ _GLOBAL(__restore_cpu_ppc970)
 	rldicl.	r0,r0,4,63
 	beqlr
 
-	LOAD_REG_IMMEDIATE(r5,cpu_state_storage)
+	LOAD_REG_ADDR(r5,cpu_state_storage)
 	/* Before accessing memory, we make sure rm_ci is clear */
 	li	r0,0
 	mfspr	r3,SPRN_HID4
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2d802e9..5a8619f 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -685,10 +685,6 @@ _GLOBAL(enter_rtas)
 	std	r7,_DAR(r1)
 	mfdsisr	r8
 	std	r8,_DSISR(r1)
-	mfsrr0	r9
-	std	r9,_SRR0(r1)
-	mfsrr1	r10
-	std	r10,_SRR1(r1)
 
 	/* Temporary workaround to clear CR until RTAS can be modified to
 	 * ignore all bits.
@@ -749,6 +745,10 @@ _STATIC(rtas_return_loc)
 	mfspr	r4,SPRN_SPRG3		/* Get PACA */
 	clrldi	r4,r4,2			/* convert to realmode address */
 
+	bcl	20,31,$+4
+0:	mflr	r3
+	ld	r3,(1f-0b)(r3)		/* get .rtas_restore_regs */
+
 	mfmsr	r6
 	li	r0,MSR_RI
 	andc	r6,r6,r0
@@ -756,7 +756,6 @@ _STATIC(rtas_return_loc)
 	mtmsrd	r6
 
 	ld	r1,PACAR1(r4)		/* Restore our SP */
-	LOAD_REG_IMMEDIATE(r3,.rtas_restore_regs)
 	ld	r4,PACASAVEDMSR(r4)	/* Restore our MSR */
 
 	mtspr	SPRN_SRR0,r3
@@ -764,6 +763,8 @@ _STATIC(rtas_return_loc)
 	rfid
 	b	.	/* prevent speculative execution */
 
+1:	.llong	.rtas_restore_regs
+
 _STATIC(rtas_restore_regs)
 	/* relocation is on at this point */
 	REST_GPR(2, r1)			/* Restore the TOC */
@@ -783,10 +784,6 @@ _STATIC(rtas_restore_regs)
 	mtdar	r7
 	ld	r8,_DSISR(r1)
 	mtdsisr	r8
-	ld	r9,_SRR0(r1)
-	mtsrr0	r9
-	ld	r10,_SRR1(r1)
-	mtsrr1	r10
 
 	addi	r1,r1,RTAS_FRAME_SIZE	/* Unstack our frame */
 	ld	r0,16(r1)		/* get return address */
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 229ccd1..afbd530 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -128,11 +128,11 @@ __secondary_hold:
 	/* Tell the master cpu we're here */
 	/* Relocation is off & we are located at an address less */
 	/* than 0x100, so only need to grab low order offset. */
-	std	r24,[EMAIL PROTECTED](0)
+	std	r24,__secondary_hold_acknowledge-_stext(0)
 	sync
 
 	/* All secondary cpus wait here until told to start. */
-100:	ld	r4,[EMAIL PROTECTED](0)
+100:	ld	r4,__secondary_hold_spinloop-_stext(0)
 	cmpdi	0,r4,0
 	beq	100b
 
@@ -1216,11 +1216,14 @@ _GLOBAL(generic_secondary_smp_init)
 	/* turn on 64-bit mode */
 	bl	.enable_64b_mode
+	/* get the
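For anyone wondering where the "5 instructions" and the R_PPC64_ADDR16_* relocations mentioned in this patch come from: a 64-bit value loaded with immediates has to be assembled from four 16-bit pieces, and when the value is a symbol address each piece is a field the linker must fill in. The C model below is only a sketch of that idea; the instruction names in the comments follow the usual lis/ori/rldicr/oris/ori sequence, and the function name is made up.

```c
#include <stdint.h>

/*
 * Model of building a 64-bit value from four 16-bit immediates.
 * Each halfword corresponds to one relocatable field when 'expr'
 * is a symbol address (highest, higher, high and low 16 bits).
 */
static uint64_t load_reg_immediate_model(uint64_t expr)
{
	uint16_t highest = (uint16_t)(expr >> 48);
	uint16_t higher  = (uint16_t)(expr >> 32);
	uint16_t high    = (uint16_t)(expr >> 16);
	uint16_t low     = (uint16_t)expr;
	uint64_t reg;

	reg = (uint64_t)highest << 16;	/* lis: any sign extension is shifted out below */
	reg |= higher;			/* ori */
	reg <<= 32;			/* rldicr reg,reg,32,31 */
	reg |= (uint64_t)high << 16;	/* oris */
	reg |= low;			/* ori */
	return reg;
}
```

LOAD_REG_ADDR avoids all four fields by loading the address from the TOC, which is why the patch prefers it wherever the TOC pointer is valid.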
[PATCH 1/5] powerpc: Move interrupt handler code to the beginning of head_64.S
This rearranges head_64.S so that we have all the first-level exception prologs together starting at 0x100, followed by all the second-level handlers that are invoked from the first-level prologs, followed by other code. This doesn't make any functional change but will make following changes for relocatable kernel support easier.

Signed-off-by: Paul Mackerras [EMAIL PROTECTED]
---

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index cc8fb47..27935d1 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -325,16 +325,32 @@ do_stab_bolted_pSeries:
 	mfspr	r12,SPRN_SPRG2
 	EXCEPTION_PROLOG_PSERIES(PACA_EXSLB, .do_stab_bolted)
 
+#ifdef CONFIG_PPC_PSERIES
+/*
+ * Vectors for the FWNMI option.  Share common code.
+ */
+	.globl system_reset_fwnmi
+	.align 7
+system_reset_fwnmi:
+	HMT_MEDIUM
+	mtspr	SPRN_SPRG1,r13		/* save r13 */
+	EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common)
+
+	.globl machine_check_fwnmi
+	.align 7
+machine_check_fwnmi:
+	HMT_MEDIUM
+	mtspr	SPRN_SPRG1,r13		/* save r13 */
+	EXCEPTION_PROLOG_PSERIES(PACA_EXMC, machine_check_common)
+
+#endif /* CONFIG_PPC_PSERIES */
+
+#ifdef __DISABLED__
 /*
- * We have some room here we use that to put
- * the peries slb miss user trampoline code so it's reasonably
- * away from slb_miss_user_common to avoid problems with rfid
- *
  * This is used for when the SLB miss handler has to go virtual,
  * which doesn't happen for now anymore but will once we re-implement
  * dynamic VSIDs for shared page tables
  */
-#ifdef __DISABLED__
 slb_miss_user_pseries:
 	std	r10,PACA_EXGEN+EX_R10(r13)
 	std	r11,PACA_EXGEN+EX_R11(r13)
@@ -357,25 +373,14 @@ slb_miss_user_pseries:
 	b	.	/* prevent spec. execution */
 #endif /* __DISABLED__ */
 
-#ifdef CONFIG_PPC_PSERIES
+	.align 7
+	.globl __end_interrupts
+__end_interrupts:
+
 /*
- * Vectors for the FWNMI option.  Share common code.
+ * Code from here down to __end_handlers is invoked from the
+ * exception prologs above.
  */
-	.globl system_reset_fwnmi
-	.align 7
-system_reset_fwnmi:
-	HMT_MEDIUM
-	mtspr	SPRN_SPRG1,r13		/* save r13 */
-	EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(PACA_EXGEN, system_reset_common)
-
-	.globl machine_check_fwnmi
-	.align 7
-machine_check_fwnmi:
-	HMT_MEDIUM
-	mtspr	SPRN_SPRG1,r13		/* save r13 */
-	EXCEPTION_PROLOG_PSERIES_FORCE_64BIT(PACA_EXMC, machine_check_common)
-
-#endif /* CONFIG_PPC_PSERIES */
 
 /*** Common interrupt handlers ***/
 
@@ -457,65 +462,6 @@ bad_stack:
 	b	1b
 
 /*
- * Return from an exception with minimal checks.
- * The caller is assumed to have done EXCEPTION_PROLOG_COMMON.
- * If interrupts have been enabled, or anything has been
- * done that might have changed the scheduling status of
- * any task or sent any task a signal, you should use
- * ret_from_except or ret_from_except_lite instead of this.
- */
-fast_exc_return_irq:			/* restores irq state too */
-	ld	r3,SOFTE(r1)
-	TRACE_AND_RESTORE_IRQ(r3);
-	ld	r12,_MSR(r1)
-	rldicl	r4,r12,49,63		/* get MSR_EE to LSB */
-	stb	r4,PACAHARDIRQEN(r13)	/* restore paca->hard_enabled */
-	b	1f
-
-	.globl	fast_exception_return
-fast_exception_return:
-	ld	r12,_MSR(r1)
-1:	ld	r11,_NIP(r1)
-	andi.	r3,r12,MSR_RI		/* check if RI is set */
-	beq-	unrecov_fer
-
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING
-	andi.	r3,r12,MSR_PR
-	beq	2f
-	ACCOUNT_CPU_USER_EXIT(r3, r4)
-2:
-#endif
-
-	ld	r3,_CCR(r1)
-	ld	r4,_LINK(r1)
-	ld	r5,_CTR(r1)
-	ld	r6,_XER(r1)
-	mtcr	r3
-	mtlr	r4
-	mtctr	r5
-	mtxer	r6
-	REST_GPR(0, r1)
-	REST_8GPRS(2, r1)
-
-	mfmsr	r10
-	rldicl	r10,r10,48,1		/* clear EE */
-	rldicr	r10,r10,16,61		/* clear RI (LE is 0 already) */
-	mtmsrd	r10,1
-
-	mtspr	SPRN_SRR1,r12
-	mtspr	SPRN_SRR0,r11
-	REST_4GPRS(10, r1)
-	ld	r1,GPR1(r1)
-	rfid
-	b	.	/* prevent speculative execution */
-
-unrecov_fer:
-	bl	.save_nvgprs
-1:	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	.unrecoverable_exception
-	b	1b
-
-/*
  * Here r13 points to the paca, r9 contains the saved CR,
  * SRR0 and SRR1 are saved in r11 and r12,
  * r9 - r13 are saved in paca->exgen.
@@ -766,6 +712,85 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 	bl	.altivec_unavailable_exception
 	b	.ret_from_except
 
+	.align	7
+	.globl vsx_unavailable_common
+vsx_unavailable_common:
+	EXCEPTION_PROLOG_COMMON(0xf40, PACA_EXGEN)
+#ifdef CONFIG_VSX
+BEGIN_FTR_SECTION
+
[PATCH 0/5] Relocatable 64-bit kernel using linker PIE support
The following series of patches implement support for a relocatable kernel by building it as a position-independent executable (PIE). When the linker is given the -pie flag, it creates an executable that contains dynamic relocations which can be used to relocate the image at boot time for any desired base address. This patch series adds a CONFIG_RELOCATABLE config option for 64-bit which links the kernel with -pie and arranges to process the relocations in early boot.

With the first 4 patches applied, a relocatable kernel will still copy itself down to real address 0. The last patch changes things so that a relocatable kernel will run wherever it was loaded. This last patch is pretty much just a proof of concept since it doesn't do anything to ensure appropriate alignment of the base address (the base address needs to be 16kB aligned). We probably want to work out whether we are a kdump kernel and run in-place if so, or copy down to 0 if not.

Paul.
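As a rough illustration of what "processing the relocations in early boot" amounts to for relative relocations, here is a user-space C sketch. It is not the relocate() routine from the series (that one is assembly and is not shown in this thread); the struct mirrors the Elf64_Rela layout, the function and parameter names are invented, and the patched word is stored in host byte order for simplicity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_PPC64_NONE      0
#define R_PPC64_RELATIVE 22

struct rela64 {			/* same layout as Elf64_Rela */
	uint64_t r_offset;	/* link-time address of the word to patch */
	uint64_t r_info;	/* symbol index (high 32 bits), type (low 32 bits) */
	int64_t  r_addend;	/* link-time value of that word */
};

/*
 * Hypothetical helper: apply every R_PPC64_RELATIVE entry to an in-memory
 * image that was linked at link_addr and will run at run_addr. Entries of
 * any other type, or pointing outside the image, are skipped.
 * Returns the number of words patched.
 */
static size_t apply_relative_relocs(uint8_t *image, size_t image_len,
				    uint64_t link_addr, uint64_t run_addr,
				    const struct rela64 *rela, size_t count)
{
	uint64_t delta = run_addr - link_addr;
	size_t applied = 0;

	for (size_t i = 0; i < count; i++) {
		uint32_t type = (uint32_t)(rela[i].r_info & 0xffffffffu);
		uint64_t off = rela[i].r_offset - link_addr;
		uint64_t value;

		if (type != R_PPC64_RELATIVE)
			continue;
		if (off > image_len || image_len - off < sizeof(value))
			continue;
		value = (uint64_t)rela[i].r_addend + delta;
		memcpy(image + off, &value, sizeof(value));
		applied++;
	}
	return applied;
}
```

Note that the word is written even when delta is 0, which matches the remark in patch 4/5 that the relocations have to be processed even when the kernel runs at its link address.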
[RFC/PATCH v2] powerpc: add ioremap_early() function for mapping IO regions before MMU_init()
From: Grant Likely [EMAIL PROTECTED]

ioremap_early() is useful for things like mapping SoC internal memory-mapped registers and early text because it allows mappings to devices to be set up early in the boot process where they are needed, and the mappings persist after the MMU is configured.

Without ioremap_early(), setting up the MMU would cause the early text mappings to get lost and most likely result in a kernel panic on the next attempt at output.

Signed-off-by: Grant Likely [EMAIL PROTECTED]
---

I've made changes based on the comments I've received. I tried to share code between __ioremap() and ioremap_early(), but when it came down to writing the code, there was very little that was actually common. Most of it was around access to the ioremap_bot variable, but due to the different alignments, the code ended up being different anyway.

I've not made any attempt to have this routine work after mem_init() time. I don't think the use case justifies the extra code and I think I want to enforce mapping with BATs (or other large region methods) to be performed before smaller ioremaps() to maximize the performance gains. If the BATs are mapped first, then many smaller ioremaps() get to use them 'for free'.

Comments?
g.

 arch/powerpc/kernel/setup_32.c   |    4 +
 arch/powerpc/mm/init_32.c        |    7 --
 arch/powerpc/mm/mmu_decl.h       |    7 +-
 arch/powerpc/mm/pgtable_32.c     |   75
 arch/powerpc/mm/ppc_mmu_32.c     |  140 --
 arch/powerpc/sysdev/cpm_common.c |    2 -
 include/asm-powerpc/io.h         |    8 ++
 7 files changed, 209 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index 066e65c..822ae7e 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -40,6 +40,7 @@
 #include <asm/udbg.h>
 
 #include "setup.h"
+#include <mm/mmu_decl.h>
 
 #define DBG(fmt...)
 
@@ -113,6 +114,9 @@ notrace unsigned long __init early_init(unsigned long dt_ptr)
  */
 notrace void __init machine_init(unsigned long dt_ptr, unsigned long phys)
 {
+	/* Get ready to allocate IO virtual address regions */
+	ioremap_init();
+
 	/* Enable early debugging if any specified (see udbg.h) */
 	udbg_early_init();
 
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 388ceda..a3d9b4e 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -169,13 +169,6 @@ void __init MMU_init(void)
 		ppc_md.progress("MMU:mapin", 0x301);
 	mapin_ram();
 
-#ifdef CONFIG_HIGHMEM
-	ioremap_base = PKMAP_BASE;
-#else
-	ioremap_base = 0xfe00UL;	/* for now, could be 0xf000 */
-#endif /* CONFIG_HIGHMEM */
-	ioremap_bot = ioremap_base;
-
 	/* Map in I/O resources */
 	if (ppc_md.progress)
 		ppc_md.progress("MMU:setio", 0x302);
diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h
index fab3cfa..3c951d5 100644
--- a/arch/powerpc/mm/mmu_decl.h
+++ b/arch/powerpc/mm/mmu_decl.h
@@ -29,11 +29,14 @@ extern void hash_preload(struct mm_struct *mm, unsigned long ea,
 #ifdef CONFIG_PPC32
 extern void mapin_ram(void);
 extern int map_page(unsigned long va, phys_addr_t pa, int flags);
-extern void setbat(int index, unsigned long virt, phys_addr_t phys,
-		   unsigned int size, int flags);
+extern int setbat(unsigned long virt, phys_addr_t phys, unsigned int size,
+		  int flags);
+extern int loadbat(unsigned long virt, phys_addr_t phys, unsigned int size,
+		   int flags);
 extern void settlbcam(int index, unsigned long virt, phys_addr_t phys,
 		      unsigned int size, int flags, unsigned int pid);
 extern void invalidate_tlbcam_entry(int index);
+extern void ioremap_init(void); /* called by machine_init() */
 
 extern int __map_without_bats;
 extern unsigned long ioremap_base;
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 2001abd..40820fa 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -55,8 +55,6 @@ extern void hash_page_sync(void);
 #ifdef HAVE_BATS
 extern phys_addr_t v_mapped_by_bats(unsigned long va);
 extern unsigned long p_mapped_by_bats(phys_addr_t pa);
-void setbat(int index, unsigned long virt, phys_addr_t phys,
-	    unsigned int size, int flags);
 
 #else /* !HAVE_BATS */
 #define v_mapped_by_bats(x)	(0UL)
@@ -142,6 +140,21 @@ void pte_free(struct mm_struct *mm, pgtable_t ptepage)
 	__free_page(ptepage);
 }
 
+/**
+ * ioremap_init - setup ioremap address range
+ */
+void __init ioremap_init(void)
+{
+	if (ioremap_base)
+		return;
+#ifdef CONFIG_HIGHMEM
+	ioremap_base = PKMAP_BASE;
+#else
+	ioremap_base = 0xfe00UL;	/* for now, could be 0xf000 */
+#endif
+	ioremap_bot = ioremap_base;
+}
+
 void __iomem *
 ioremap(phys_addr_t addr, unsigned long size)
 {
@@ -265,6 +278,64 @@ void
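The remark in this posting that the shared code is mostly "access to the ioremap_bot variable" with differing alignments can be pictured as a top-down allocator of virtual address space. The C sketch below is only a model under that assumption: the names carry a _sketch suffix or are invented, the alignment argument stands in for either page alignment or the larger power-of-two alignment a BAT mapping would need, and none of it is taken from the truncated remainder of the patch.

```c
#include <stdint.h>

static uintptr_t ioremap_bot_sketch;	/* lowest address handed out so far */

/* Mirrors the idea of ioremap_init(): set the base once, ignore later calls. */
static void ioremap_init_sketch(uintptr_t base)
{
	if (ioremap_bot_sketch)
		return;			/* already initialised */
	ioremap_bot_sketch = base;
}

/*
 * Hypothetical helper: carve 'size' bytes off the bottom of the region,
 * rounding the result down to 'align' (which must be a power of two).
 * Returns 0 if the region is not initialised or the request does not fit.
 */
static uintptr_t ioremap_alloc_virt(uintptr_t size, uintptr_t align)
{
	uintptr_t v;

	if (!ioremap_bot_sketch || size == 0 || size > ioremap_bot_sketch)
		return 0;
	v = (ioremap_bot_sketch - size) & ~(align - 1);
	ioremap_bot_sketch = v;
	return v;
}
```

A single helper like this is one way to "manage modification of ioremap_bot" from both callers while leaving each caller's own alignment checks where they are.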
Re: [PATCH 5121 pci 1/3] powerpc: 83xx: pci: Remove need for get_immrbase from mpc83xx_add_bridge.
On Thu, Aug 07, 2008 at 11:36:25AM -0600, John Rigby wrote:

Modify mpc83xx_add_bridge to get config space register base address from the device tree instead of immr + hardcoded offset.

83xx pci nodes have these changes:
register properties now contain two address length tuples:
First is the pci bridge register base, this has always been there.
Second is the config base, this is new.
The primary pci bus should have the primary property.

These are documented in Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt

Looks mostly good to me. I only have one comment on the device tree binding...

diff --git a/Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt b/Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt
new file mode 100644
index 000..51214a0
--- /dev/null
+++ b/Documentation/powerpc/dts-bindings/fsl/83xx-512x-pci.txt
@@ -0,0 +1,43 @@
+* Freescale 83xx and 512x PCI bridges
+
+Freescale 83xx and 512x SOCs include the same pci bridge core.
+
+83xx/512x specific notes:
+- reg: should contain two address length tuples
+    The first is for the internal pci bridge registers
+    The second is for the pci config space access registers
+- primary:
+    This property should be present for the primary pci bridge

Can you use something like 'fsl,primary-pci-bridge' instead? 'primary' is a little too generic for my taste. Also, the purpose of identifying one of the PCI bridges as primary should be documented. (This is me pushing against encoding Linux internal implementation details into the device tree; I suspect that 'primary' doesn't belong in the device tree at all.)
Re: [PATCH 5121 pci 2/3] powerpc: 5121: Add PCI support.
On Thu, Aug 07, 2008 at 11:36:26AM -0600, John Rigby wrote:

Uses mpc83xx_add_bridge in fsl_pci.c

Adds second register tuple to pci node register property as previously done for 83xx device trees in a previous patch.

Signed-off-by: John Rigby [EMAIL PROTECTED]

Looks good to me.

Acked-by: Grant Likely [EMAIL PROTECTED]

I'll pick this one up once 1/3 is sorted out.

g.
Re: [PATCH 5121 pci 3/3] powerpc: pci: 5121: Hide pci bridge.
On Thu, Aug 07, 2008 at 11:36:27AM -0600, John Rigby wrote:

The class of the MPC5121 pci host bridge is PCI_CLASS_BRIDGE_OTHER while other freescale host bridges have class set to PCI_CLASS_PROCESSOR_POWERPC.

This patch makes fixup_hide_host_resource_fsl match PCI_CLASS_BRIDGE_OTHER in addition to PCI_CLASS_PROCESSOR_POWERPC.

Signed-off-by: John Rigby [EMAIL PROTECTED]

I think this is okay, but it might need to be more conservative. I'm not the PCI expert. Kumar, thoughts?

g.

---
 arch/powerpc/kernel/pci_32.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/pci_32.c b/arch/powerpc/kernel/pci_32.c
index 88db4ff..162c3a8 100644
--- a/arch/powerpc/kernel/pci_32.c
+++ b/arch/powerpc/kernel/pci_32.c
@@ -54,11 +54,12 @@ LIST_HEAD(hose_list);
 static int pci_bus_count;
 
 static void
-fixup_hide_host_resource_fsl(struct pci_dev* dev)
+fixup_hide_host_resource_fsl(struct pci_dev *dev)
 {
 	int i, class = dev->class >> 8;
 
-	if ((class == PCI_CLASS_PROCESSOR_POWERPC) &&
+	if ((class == PCI_CLASS_PROCESSOR_POWERPC ||
+	     class == PCI_CLASS_BRIDGE_OTHER) &&
 		(dev->hdr_type == PCI_HEADER_TYPE_NORMAL) &&
 		(dev->bus->parent == NULL)) {
 		for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
-- 
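On the "might need to be more conservative" point, it may help to see the widened match in isolation. The following stand-alone C sketch restates the condition from the patch with a made-up struct in place of struct pci_dev; the three constants are the usual values from the PCI headers, and on_root_bus stands in for the dev->bus->parent == NULL test.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_BRIDGE_OTHER		0x0680
#define PCI_CLASS_PROCESSOR_POWERPC	0x0b20
#define PCI_HEADER_TYPE_NORMAL		0

/* Hypothetical stand-in for the few struct pci_dev fields the fixup reads. */
struct pci_dev_sketch {
	uint32_t class;		/* 24 bits: base class, sub-class, prog-if */
	uint8_t hdr_type;
	bool on_root_bus;	/* models dev->bus->parent == NULL */
};

/* True when the fixup would hide this device's resources. */
static bool is_fsl_host_bridge(const struct pci_dev_sketch *dev)
{
	uint32_t class = dev->class >> 8;	/* drop the prog-if byte */

	return (class == PCI_CLASS_PROCESSOR_POWERPC ||
		class == PCI_CLASS_BRIDGE_OTHER) &&
	       dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
	       dev->on_root_bus;
}
```

As written, any header-type-0 device of class "other bridge" sitting on a root bus would match, not just the MPC5121 host bridge, which is presumably the kind of over-matching the review comment is worried about.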
Re: [PATCH 3/5] Apply relocation
Mohan Kumar M writes:

This code is a wrapper around regular kernel. This checks whether the kernel is loaded at 32MB, if it's not loaded at 32MB, it's treated as a regular kernel and the control is given to the kernel immediately. If the kernel is loaded at 32MB, it applies relocation delta to each offset in the list which was generated and appended by patch 1 and 2. After updating all offsets, control is given to the relocatable kernel.

In patch 1, you output the addresses for three kinds of relocations, but here you only seem to handle two (R_PPC64_ADDR64 and R_PPC64_ADDR16_HI, I assume). How does that work?

In general with this patch series, I would like to have seen much more detailed patch descriptions. I think most of these patches could have used 4 or 5 paragraphs of description (or more if you like) telling us things such as why you handle the particular relocations you do and not others.

Paul.
Re: [PATCH 4/5] Relocation support
Mohan Kumar M writes:

> Add relocatable kernel support like avoiding copying the vmlinux
> image to compile address, adding relocation delta to the absolute
> symbol references etc.
>
> ld does not provide relocation entries for .got section, and the user
> space relocation extraction program can not process @got entries. So
> use LOAD_REG_IMMEDIATE macro instead of LOAD_REG_ADDR macro for
> relocatable kernel.

I think this is a symptom of the more general problem that
--emit-relocs doesn't actually give us all of the relocations we need.
That, and the fact that the relevant code paths in ld are more widely
used and better tested, is why I would prefer to build the kernel as a
position-independent executable.

>  static inline int in_kernel_text(unsigned long addr)
>  {
> -	if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end)
> +	if (addr >= (unsigned long)_stext && addr < (unsigned long)__init_end
> +		+ kernel_base)

Your patch adds kernel_base to some addresses but not to all of them,
so your patch description should have told us why you added it in
those places and not others. If you tell us the general principle
you're following (even if it seems obvious to you) it will be useful
to people chasing bugs or adding new code later on, or even just
trying to understand what the code does.

> -	RELOC(alloc_bottom) = PAGE_ALIGN((unsigned long)&RELOC(_end) + 0x4000);
> +#ifndef CONFIG_RELOCATABLE_PPC64
> +	RELOC(alloc_bottom) = PAGE_ALIGN((unsigned long)&RELOC(_end) + 0x4000);
> +#else
> +	RELOC(alloc_bottom) = PAGE_ALIGN((unsigned long)&RELOC(_end) + 0x4000 +
> +			RELOC(reloc_delta));
> +#endif

Ifdefs in code inside a function are frowned upon in the Linux kernel.
Try to find an alternative way to do this, such as ensuring that
reloc_delta is 0 when CONFIG_RELOCATABLE_PPC64 is not set.

Also it's not clear (to me at least) why you need to add reloc_delta
in the relocatable case.

> +#ifndef CONFIG_RELOCATABLE_PPC64
>  	unsigned long *spinloop
>  		= (void *) LOW_ADDR(__secondary_hold_spinloop);
>  	unsigned long *acknowledge
>  		= (void *) LOW_ADDR(__secondary_hold_acknowledge);
> +#else
> +	unsigned long *spinloop
> +		= (void *) __secondary_hold_spinloop;
> +	unsigned long *acknowledge
> +		= (void *) __secondary_hold_acknowledge;
> +#endif

This also needs some explanation. (Put it in the patch description or
in a comment in the code, not in a reply to this mail. :)

> +#ifndef CONFIG_RELOCATABLE_PPC64
>  	ld	r4,[EMAIL PROTECTED](2)
> +#else
> +	LOAD_REG_IMMEDIATE(r4,htab_hash_mask)
> +#endif
>  	ld	r27,0(r4)	/* htab_hash_mask -> r27 */

Here and in the other similar places, I would prefer you just changed
it to LOAD_REG_ADDR and not have any ifdef.

Paul.
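The suggestion above, to make reloc_delta read as 0 when CONFIG_RELOCATABLE_PPC64 is not set so the function body needs no ifdef, could look roughly like this. It is a sketch only: `kernel_reloc_delta()`, `compute_alloc_bottom()` and the `*_SKETCH` macros are hypothetical names, the page size is assumed to be 4KB, and the RELOC() handling that prom_init.c needs is left out.

```c
/*
 * Hypothetical helper, not from the patch: in a kernel tree this would
 * live in a header, so the only #ifdef sits outside any function body.
 */
#ifdef CONFIG_RELOCATABLE_PPC64
extern unsigned long reloc_delta;  /* hypothetical: set by early boot code */
static inline unsigned long kernel_reloc_delta(void) { return reloc_delta; }
#else
static inline unsigned long kernel_reloc_delta(void) { return 0; }
#endif

/* Stand-ins for the kernel's PAGE_SIZE and PAGE_ALIGN (4KB pages assumed). */
#define PAGE_SIZE_SKETCH 4096UL
#define PAGE_ALIGN_SKETCH(x) \
    (((x) + PAGE_SIZE_SKETCH - 1) & ~(PAGE_SIZE_SKETCH - 1))

/* One code path for both configurations; the delta is 0 when not relocatable. */
static unsigned long compute_alloc_bottom(unsigned long end_addr)
{
    return PAGE_ALIGN_SKETCH(end_addr + 0x4000 + kernel_reloc_delta());
}
```

With the option unset, the compiler can fold the constant 0 away, so the non-relocatable build should end up with the same code as before. Whether the delta belongs in this calculation at all is the separate question raised in the reply, and the sketch does not settle it.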
Re: [RFC/PATCH 2/3] of: add of_lookup_stdout() utility function
On Thu, Aug 07, 2008 at 04:12:54PM +1000, David Gibson wrote:
> On Wed, Aug 06, 2008 at 11:46:47AM -0500, Timur Tabi wrote:
> > On Wed, Aug 6, 2008 at 1:02 AM, Grant Likely [EMAIL PROTECTED] wrote:
> > > From: Grant Likely [EMAIL PROTECTED]
> > >
> > > of_lookup_stdout() is useful for figuring out what device to use
> > > as output for early boot progress messages. It returns the node
> > > pointed to by the linux,stdout-path property in the chosen node.
> >
> > I thought linux,stdout-path is deprecated and we're supposed to be
> > using the aliases node instead?
>
> During the ePAPR process this idea came up - standardising a 'stdout'
> alias that would replace linux,stdout-path in chosen. However that
> was done in ignorance of the history of the linux,stdout-path
> property and its connection to the stdout ihandle in chosen. In any
> case, the proposed 'stdout' alias didn't make the final cut for
> ePAPR, so how to address this for flat-tree systems is still an open
> question.

So, seeing as settling on a way to determine stdout is still up in the
air, it probably makes sense to condense that code down to a single
authoritative function so that changes in this area are contained in
one place.

For now, I'll stick with decoding linux,stdout-path, and on Sparc
decoding the ihandle, with the expectation that there will be further
refinements to be made.

g.
Re: [PATCH]: [MPC5200] Add ATA DMA support
On Tue, Aug 12, 2008 at 6:30 PM, Daniel Schnell [EMAIL PROTECTED] wrote:
> Hi Tim,
>
> Continuing the discussion on the mailing list ...
>
> Looking at the original patch I don't understand why you had to
> duplicate the bestcomm data structures and functions. The only
> apparent difference is that you have a minimal data length of 2 bytes
> instead of 1. Does this make any difference, as the bd_size will be
> filled with the correct length value anyway?

The new version of the patch does this the correct way. I just haven't
backported this to the 2.6.24 version that I sent you.

Tim