[git pull] Please pull powerpc.git next branch
Hi Linus !

So from the depth of frozen Minnesota, here's the powerpc pull
request for 3.9. It has a few interesting highlights, in addition
to the usual bunch of bug fixes, minor updates, embedded device tree
updates and new boards:

 - Hand tuned asm implementation of SHA1 (by Paulus & Michael Ellerman)

 - Support for Doorbell interrupts on Power8 (kind of fast
   thread-thread IPIs) by Ian Munsie

 - Long overdue cleanup of the way we handle relocation of our open
   firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard

 - Support for saving/restoring & context switching the PPR (Processor
   Priority Register) on server processors that support it. This allows
   the kernel to preserve thread priorities established by userspace.
   By Haren Myneni.

 - DAWR (new watchpoint facility) support on Power8 by Michael Neuling

 - Ability to change the DSCR (Data Stream Control Register) which
   controls cache prefetching on a running process via ptrace by
   Alexey Kardashevskiy

 - Support for context switching the TAR register on Power8 (new branch
   target register meant to be used by some new specific userspace perf
   event interrupt facility which is yet to be enabled) by Ian Munsie.

 - Improve preservation of the CFAR register (which captures the
   origin of a branch) on various exception conditions by Paulus.

 - Move the Bestcomm DMA driver from arch powerpc to drivers/dma where
   it belongs by Philippe De Muyter

 - Support for Transactional Memory on Power8 by Michael Neuling
   (based on original work by Matt Evans). For those curious about
   the feature, the patch contains a pretty good description.

Cheers,
Ben.

The following changes since commit 689dfa894c57842a05bf6dc9f97e6bb71ec5f386:

  powerpc: Max next_tb to prevent from replaying timer interrupt (2013-01-29 10:18:16 +1100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git next

for you to fetch changes up to 8520e443aa56cc157b015205ea53e7b9fc831291:

  powerpc/kexec: Disable hard IRQ before kexec (2013-02-24 03:49:28 +1100)

Alexey Kardashevskiy (1):
      powerpc: Add DSCR support to ptrace

Anatolij Gustschin (11):
      powerpc/mpc5121: add common .dtsi and use it in mpc5121ads.dts
      powerpc/mpc5121: pdm360ng.dts: use common mpc5121.dtsi
      mpc5121: remove obsolete cell-index property from PSC clock code
      mpc5121: don't check PSC ac97 using node name
      powerpc/512x: initialize clocks before bus probing
      drivers/video: fsl-diu-fb: fix pixel formats for 24 and 16 bpp
      drivers/video: fsl-diu-fb: fix bugs in interrupt handling
      powerpc/512x: add function for chip select parameter configuration
      powerpc/mpc512x: fix noderef sparse warnings
      powerpc/mpc512x: fix sparce warnings for non static symbols
      powerpc/mpc5xxx: fix sparse warning for non static symbol

Anshuman Khandual (1):
      powerpc/perf: Change PMU flag representation from decimal to hex

Anton Blanchard (7):
      powerpc: Relocate prom_init.c on 64bit
      powerpc: Remove RELOC() macro
      powerpc: Build kernel with -mcmodel=medium
      powerpc: Run savedefconfig over pseries, ppc64 and ppc64e defconfig
      powerpc: Cleanup NLS config options on pseries, ppc64 and ppc64e defconfig
      powerpc: Enable devtmpfs, EFI partition support and tmpfs ACLs on pseries, ppc64 and ppc64e defconfig
      powerpc: Avoid load of static chain register when calling nested functions through a pointer on 64bit

Benjamin Collins (1):
      powerpc: Add support for CTS-1000 GPIO controlled system poweroff

Benjamin Herrenschmidt (4):
      powerpc: Make room in exception vector area
      Merge branch 'merge' into next
      Merge remote-tracking branch 'kumar/next' into next
      Merge remote-tracking branch 'agust/next' into next

Chris Freehill (1):
      powerpc/perf: Add stalled-cycles events

Cody P Schafer (1):
      powerpc/mm: Eliminate unneeded for_each_memblock

Daniel Borkmann (1):
      powerpc: fix ics_rtas_init and start_secondary section mismatch

David Woodhouse (1):
      powerpc: Enable ARCH_USE_BUILTIN_BSWAP

Geoff Levand (4):
      powerpc/ps3: Add macro PS3_VERBOSE_RESULT
      powerpc/ps3: Increase verbosity of htab errors
      powerpc/ps3: Refresh ps3_defconfig
      powerpc: Move boot_paca into early_setup

Gerlando Falauto (2):
      powerpc/83xx: refactor mpc8360e quirk for kmeter1
      powerpc/83xx: apply mpc8360e quirk for kmeter1 only when par_io is present

Gernot Vormayr (1):
      powerpc/dts/virtex440: Add ethernet phy to virtex440-ml507 board

Grant Likely (2):
      powerpc/5200: Add Lite5200 on-board LEDs as devices
      powerpc/5200: Use the gpt* labels to simplify mpc5200 dts files

Haren Myneni (6):
      powerpc: Move branch instruction from ACCOUNT_CPU_USER_ENTRY to caller
      powerpc: Enable PPR save/restore
      powerpc: Increase exceptions arrays in paca struct to save PPR
      powerpc: Defi

Re: [RFC PATCH -V2 05/21] powerpc: Reduce PTE table memory wastage
Paul Mackerras writes:

> On Thu, Feb 21, 2013 at 10:17:12PM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V"
>>
>> We now have PTE page consuming only 2K of the 64K page.This is in order to
>
> In fact the PTE page together with the hash table indexes occupies 4k,
> doesn't it?  The comments in the code are similarly confusing since
> they talk about 2k but actually allocate 4k.
>
>> facilitate transparent huge page support, which works much better if our PMDs
>> cover 16MB instead of 256MB.
>>
>> Inorder to reduce the wastage, we now have multiple PTE page fragment
>   ^ In order (two words)
>
>> from the same PTE page.
>
> A patch like this needs a more complete description and explanation
> than you have given.  For instance, you could mention that the code
> that you're adding for the 32-bit and non-64k cases are just copies of
> the previously generic code from pgalloc.h (actually, this movement
> might be something that could be split out as a separate patch).
> Also, you should describe in outline how you keep a list of pages that
> aren't fully allocated and have a bitmap of which 4k sections are in
> use, and also how your scheme interacts with RCU.

Will do.

>
> [snip]
>
>> +#ifdef CONFIG_PPC_64K_PAGES
>> +/*
>> + * we support 15 fragments per PTE page. This is limited by how many
>
> Why only 15?  Don't we get 16 fragments per page?
>

That was one of the details I wanted to come back to and look at closely
before posting, but I missed it in the excitement of getting this all
working :). ._mapcount is a signed value, and hence I was not sure
whether setting the top bit would have any impact on how we deal with
the page in other parts of the VM.

>> + * bits we can pack in page->_mapcount. We use the first half for
>> + * tracking the usage for rcu page table free.
>
> What does "first" mean?  The high half or the low half?
>

The high half.

>> +unsigned long *page_table_alloc(struct mm_struct *mm, unsigned long vmaddr)
>> +{
>> +	struct page *page;
>> +	unsigned int mask, bit;
>> +	unsigned long *table;
>> +
>> +	/* Allocate fragments of a 4K page as 1K/2K page table */
>
> A 4k page?  Do you mean a 64k page?  And what is 1K to do with
> anything?
>

That should be dropped completely; it was cut and pasted from the s390
code :)

>> +#ifdef CONFIG_SMP
>> +static void __page_table_free_rcu(void *table)
>> +{
>> +	unsigned int bit;
>> +	struct page *page;
>> +	/*
>> +	 * this is a PTE page free 2K page table
>> +	 * fragment of a 64K page.
>> +	 */
>> +	page = virt_to_page(table);
>> +	bit = 1 << ((__pa(table) & ~PAGE_MASK) / PTE_FRAG_SIZE);
>> +	bit <<= FRAG_MASK_BITS;
>> +	/*
>> +	 * clear the higher half and if nobody used the page in
>> +	 * between, even lower half would be zero.
>> +	 */
>> +	if (atomic_xor_bits(&page->_mapcount, bit) == 0) {
>> +		pgtable_page_dtor(page);
>> +		atomic_set(&page->_mapcount, -1);
>> +		__free_page(page);
>> +	}
>> +}
>> +
>> +static void page_table_free_rcu(struct mmu_gather *tlb, unsigned long *table)
>> +{
>> +	struct page *page;
>> +	struct mm_struct *mm;
>> +	unsigned int bit, mask;
>> +
>> +	mm = tlb->mm;
>> +	/* Free 2K page table fragment of a 64K page */
>> +	page = virt_to_page(table);
>> +	bit = 1 << ((__pa(table) & ~PAGE_MASK) / PTE_FRAG_SIZE);
>> +	spin_lock(&mm->page_table_lock);
>> +	/*
>> +	 * stash the actual mask in higher half, and clear the lower half
>> +	 * and selectively, add remove from pgtable list
>> +	 */
>> +	mask = atomic_xor_bits(&page->_mapcount, bit | (bit << FRAG_MASK_BITS));
>> +	if (!(mask & FRAG_MASK))
>> +		list_del(&page->lru);
>> +	else {
>> +		/*
>> +		 * Add the page table page to pgtable_list so that
>> +		 * the free fragment can be used by the next alloc
>> +		 */
>> +		list_del_init(&page->lru);
>> +		list_add_tail(&page->lru, &mm->context.pgtable_list);
>> +	}
>> +	spin_unlock(&mm->page_table_lock);
>> +	tlb_remove_table(tlb, table);
>> +}

-aneesh

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

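The two-step free discussed in this thread (flip a fragment's bit from the low half of the mask to the high half, then clear the high bit once the RCU grace period has passed) can be sketched in user space roughly as below. This is only an illustration under stated assumptions, not the code from the patch: `struct frag_page`, `xor_bits`, `frag_alloc`, `frag_free_deferred` and `frag_free_final` are hypothetical names, the mask is an unsigned 32-bit word rather than the signed `page->_mapcount`, 16 fragments of 4K are assumed, and no locking, atomics or RCU are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: a 64K page split into 4K fragments (2K PTEs + 2K hash indexes). */
#define PAGE_SIZE_64K   (64u * 1024u)
#define PTE_FRAG_SIZE   (4u * 1024u)
#define FRAG_MASK_BITS  (PAGE_SIZE_64K / PTE_FRAG_SIZE)   /* 16 fragments */
#define FRAG_MASK       ((1u << FRAG_MASK_BITS) - 1u)     /* low half: 0xffff */

/* Hypothetical stand-in for the word kept in page->_mapcount. */
struct frag_page {
    uint32_t mask;  /* low half: in use; high half: waiting for the deferred free */
};

/* Non-atomic stand-in for atomic_xor_bits(): flip bits, return the new value. */
static uint32_t xor_bits(struct frag_page *p, uint32_t bits)
{
    p->mask ^= bits;
    return p->mask;
}

/* Low-half bit of the fragment containing byte offset 'off' within the page. */
static uint32_t frag_bit(uint32_t off)
{
    return 1u << ((off & (PAGE_SIZE_64K - 1u)) / PTE_FRAG_SIZE);
}

/* Allocate the lowest free fragment; return its byte offset, or -1 if full. */
static long frag_alloc(struct frag_page *p)
{
    /* A fragment is unavailable if its bit is set in either half. */
    uint32_t busy = (p->mask | (p->mask >> FRAG_MASK_BITS)) & FRAG_MASK;

    for (uint32_t i = 0; i < FRAG_MASK_BITS; i++) {
        if (!(busy & (1u << i))) {
            xor_bits(p, 1u << i);
            return (long)(i * PTE_FRAG_SIZE);
        }
    }
    return -1;
}

/* Step 1: clear the low-half bit and set the matching high-half bit. */
static uint32_t frag_free_deferred(struct frag_page *p, uint32_t off)
{
    uint32_t bit = frag_bit(off);

    return xor_bits(p, bit | (bit << FRAG_MASK_BITS));
}

/* Step 2 (after the grace period): clear the high-half bit.
 * Returns 1 when the whole mask is zero, i.e. the page itself can go. */
static int frag_free_final(struct frag_page *p, uint32_t off)
{
    uint32_t bit = frag_bit(off) << FRAG_MASK_BITS;

    return xor_bits(p, bit) == 0;
}
```

With these assumptions, a fragment whose free has been deferred is not handed out again until the final step runs, and the page becomes releasable only when both halves of the mask are empty. Using all 16 bits of each half sets the top bit of the word, which is the signedness concern raised in the reply above; the sketch sidesteps it by using an unsigned type.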
Re: [RFC PATCH -V2 03/21] powerpc: Don't hard code the size of pte page
Paul Mackerras writes:

> On Thu, Feb 21, 2013 at 10:17:10PM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V"
>>
>> USE PTRS_PER_PTE to indicate the size of pte page.
>>
>> Signed-off-by: Aneesh Kumar K.V
>> powerpc: Don't hard code the size of pte page
>>
>> USE PTRS_PER_PTE to indicate the size of pte page.
>>
>> Signed-off-by: Aneesh Kumar K.V
>
> Description and signoff are duplicated.  Description could be more
> informative, for example - why would we want to do this?
>
>> +/*
>> + * hidx is in the second half of the page table. We use the
>> + * 8 bytes per each pte entry.
>
> The casual reader probably wouldn't know what "hidx" is.  The comment
> needs at least to use a better name than "hidx".

How about:

+/*
+ * We save the slot number & secondary bit in the second half of the
+ * PTE page. We use the 8 bytes per each pte entry.
+ */

-aneesh
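To make the "second half of the PTE page" layout concrete, here is a minimal user-space sketch. It assumes 256 eight-byte PTEs per fragment (the 2K figure mentioned in the related thread) followed by one eight-byte slot word per PTE; the simplified `pte_t` typedef and the helpers `pte_frag_alloc` and `pte_to_hidx_word` are hypothetical names for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed value: 2K of 8-byte PTEs gives 256 entries per fragment. */
#define PTRS_PER_PTE 256

/* Simplified stand-in for the kernel's pte_t. */
typedef uint64_t pte_t;

/*
 * Allocate one zeroed fragment: PTRS_PER_PTE PTEs in the first half,
 * then PTRS_PER_PTE 8-byte hash slot words in the second half.
 */
static pte_t *pte_frag_alloc(void)
{
    return calloc(2 * PTRS_PER_PTE, sizeof(pte_t));
}

/*
 * Given a pointer to the i-th PTE, return the i-th slot word.
 * Adding PTRS_PER_PTE entries skips exactly one half of the fragment,
 * so no hard-coded byte size is needed.
 */
static uint64_t *pte_to_hidx_word(pte_t *ptep)
{
    return (uint64_t *)(ptep + PTRS_PER_PTE);
}
```

Expressing the offset in units of `PTRS_PER_PTE` rather than a literal byte count means the lookup stays correct if the number of PTEs per fragment changes.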