Re: [PATCH v2 0/4] dma-mapping: Define dma_{alloc,free}_attrs() for all archs

2013-05-23 Thread Catalin Marinas
On Thu, May 23, 2013 at 03:47:13AM +0100, Damian Hobson-Garcia wrote:
> Hi Catalin,
> On 2013/05/22 18:47, Catalin Marinas wrote:
> > On Wed, May 22, 2013 at 03:37:17AM +0100, Damian Hobson-Garcia wrote:
> >> Hello,
> >> On 2013/04/30 12:01, Damian Hobson-Garcia wrote:
> >>> Most architectures that define CONFIG_HAVE_DMA=y, have implementations for
> >>> both dma_alloc_attrs() and dma_free_attrs().  All architectures that do
> >>> not define CONFIG_HAVE_DMA also have both of these definitions provided by
> >>> dma-mapping-broken.h.
> > 
> > BTW, shouldn't this be called CONFIG_HAVE_DMA_ATTRS?
> 
> CONFIG_HAVE_DMA_ATTRS is currently used to enable the functions to
> set/get the DMA attribute values. Poking through the headers, it looks
> like the struct dma_attrs is defined regardless of the
> CONFIG_HAVE_DMA_ATTRS setting, so in that respect
> we always seem to "have" DMA attributes (if we have DMA), but they may
> not always be meaningful (i.e. set to some value).

My point was about the commit log - grep'ing the kernel for
CONFIG_HAVE_DMA did not return anything.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] mm/kmemleak.c: Merge the consecutive scan-areas.

2013-05-24 Thread Catalin Marinas
On Tue, May 14, 2013 at 12:49:51PM +0100, majianpeng wrote:
> If the scan-areas are adjacent, they can be merged in order to reduce memory.

Have you found any significant reduction in the memory size?

What we miss though is removing an area (and I found a use-case for it).

> +	hlist_for_each_entry(area, &object->area_list, node) {
> +		if (ptr + size == area->start) {
> +			area->start = ptr;
> +			area->size += size;
> +			goto out_unlock;
> +		} else if (ptr == area->start + area->size) {
> +			area->size += size;
> +			goto out_unlock;

I prefer to keep 'goto' only for the error path. You could add a 'bool
merged' and another 'if' block for area allocation.
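
A minimal user-space sketch of the 'bool merged' shape suggested above. The
struct and function names are hypothetical stand-ins, not the kmemleak code,
and locking is left out entirely:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kmemleak scan area; not the kernel struct. */
struct scan_area {
	unsigned long start;
	size_t size;
	struct scan_area *next;
};

/*
 * Add [ptr, ptr + size) to the list, merging with an adjacent area when
 * possible. Returns 0 on success, -1 if the allocation fails.
 */
static int add_scan_area(struct scan_area **head, unsigned long ptr, size_t size)
{
	struct scan_area *area;
	bool merged = false;

	for (area = *head; area; area = area->next) {
		if (ptr + size == area->start) {
			/* New block ends where this area begins: grow downwards. */
			area->start = ptr;
			area->size += size;
			merged = true;
			break;
		}
		if (ptr == area->start + area->size) {
			/* New block starts where this area ends: grow upwards. */
			area->size += size;
			merged = true;
			break;
		}
	}

	/* Allocation happens in its own block, so no goto is needed. */
	if (!merged) {
		area = malloc(sizeof(*area));
		if (!area)
			return -1;
		area->start = ptr;
		area->size = size;
		area->next = *head;
		*head = area;
	}
	return 0;
}
```

With this shape a 'goto' would only be needed for a real error path, such as
the failed allocation.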

I'll pick the other two patches.

Thanks.

-- 
Catalin


Re: [PATCH] include/asm-generic/pci.h: include generic "pci-dma-compat.h"

2013-06-27 Thread Catalin Marinas
On Thu, Jun 27, 2013 at 05:03:25AM +0100, Chen Gang wrote:
> If an architecture needs the generic "pci.h", it also needs the generic
> "pci-dma-compat.h", so include it from asm-generic directly.
> 
> Without it, arm64 and m32r can currently fail to compile.
> 
> The related error (with allmodconfig):
> 
>   drivers/media/usb/b2c2/flexcop-usb.c: In function ‘flexcop_usb_transfer_exit’:
>   drivers/media/usb/b2c2/flexcop-usb.c:393:3: error: implicit declaration of function ‘pci_free_consistent’ [-Werror=implicit-function-declaration]
>   drivers/media/usb/b2c2/flexcop-usb.c: In function ‘flexcop_usb_transfer_init’:
>   drivers/media/usb/b2c2/flexcop-usb.c:410:2: error: implicit declaration of function ‘pci_alloc_consistent’ [-Werror=implicit-function-declaration]
>   drivers/media/usb/b2c2/flexcop-usb.c:410:21: warning: assignment makes pointer from integer without a cast [enabled by default]
>   cc1: some warnings being treated as errors
> 
> 
> Signed-off-by: Chen Gang 

From the arm64 perspective:

Acked-by: Catalin Marinas 

(but make sure it doesn't break other archs)


Re: [PATCH v7 0/6] xen/arm/arm64: CONFIG_PARAVIRT and stolen ticks accounting

2013-06-28 Thread Catalin Marinas
On Fri, Jun 28, 2013 at 04:58:40PM +0100, Konrad Rzeszutek Wilk wrote:
> On Fri, Jun 28, 2013 at 12:19:54PM +0100, Stefano Stabellini wrote:
> > Hi all,
> > this patch series introduces stolen ticks accounting for Xen on ARM and
> > ARM64.
> > Stolen ticks are clocksource ticks that have been "stolen" from the cpu,
> > typically because Linux is running in a virtual machine and the vcpu has
> > been descheduled.
> > To account for these ticks we introduce CONFIG_PARAVIRT and pv_time_ops
> > so that we can make use of:
> > 
> > kernel/sched/cputime.c:steal_account_process_tick
> > 
> > 
> > Stefano Stabellini (6):
> >   xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c
> >   kernel: missing include in cputime.c
> >   arm: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops
> >   arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops
> >   core: remove ifdef CONFIG_PARAVIRT
> >   xen/arm: account for stolen ticks
> > 
> >  arch/arm/Kconfig                  |   20
> >  arch/arm/include/asm/paravirt.h   |   20
> >  arch/arm/kernel/Makefile          |    1 +
> >  arch/arm/kernel/paravirt.c        |   25 ++
> >  arch/arm/xen/enlighten.c          |   21 +
> >  arch/arm64/Kconfig                |   20
> >  arch/arm64/include/asm/paravirt.h |   20
> >  arch/arm64/kernel/Makefile        |    1 +
> >  arch/arm64/kernel/paravirt.c      |   25 ++
> >  arch/ia64/xen/time.c              |   48 +++-
> >  arch/x86/xen/time.c               |   76 +--
> 
> This is going to hit some of the patches that David
> has sent to tglx, I think. You might want to try to rebase on top
> of them (tip/time/for-xen, or something like that ) when they
> are ready.
> 
> But for the Xen generic maintainer I am OK with these changes
> so you can stick Acked-by on them.
> 
> Are you thinking to push them yourself or via the arm64 maintainer?

Once the core Xen support is pushed via the arm64 tree (queued for
3.11-rc1), I'm happy for the subsequent Xen patches to go directly
(similarly for KVM). But it's -rc7 now and I'm not taking any more
patches for the upcoming merging window (unless they are fixes).

-- 
Catalin


[GIT PULL] arm64 patches for 3.11

2013-07-01 Thread Catalin Marinas
Hi Linus,

Please pull the arm64-upstream tag as below.

For arm64 huge pages support, there are x86 changes moving part of
arch/x86/mm/hugetlbpage.c into mm/hugetlb.c to be re-used by arm64. They
have been acked by akpm
(http://marc.info/?l=linux-mm&m=137115843813005&w=2) and I'm merging
them through the arm64 tree to avoid dependency issues.

There is also a trivial merge conflict with 3.10 in
include/uapi/linux/kvm.h (with the KVM_REG_MIPS define).

Thanks.


The following changes since commit d683b96b072dc4680fc74964eca77e6a23d1fa6e:

  Linux 3.10-rc4 (2013-06-02 17:11:17 +0900)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git tags/arm64-upstream

for you to fetch changes up to aa729dccb5e8dfbc78e2e235b8754d6acccee731:

  Merge branch 'for-next/hugepages' of git://git.linaro.org/people/stevecapper/linux into upstream-hugepages (2013-07-01 11:20:58 +0100)



Main features:
- KVM and Xen ports to AArch64
- Hugetlbfs and transparent huge pages support for arm64
- Applied Micro X-Gene Kconfig entry and dts file
- Cache flushing improvements

------------
Catalin Marinas (7):
  arm64: Avoid cache flushing in flush_dcache_page()
  arm64: Do not flush the D-cache for anonymous pages
  arm64: Remove __flush_dcache_page()
  arm64: spinlock: retry trylock operation if strex fails on free lock
  Merge tag 'xen-arm64-3.1-tag' of git://git.kernel.org/.../sstabellini/xen into upstream
  Merge branch 'kvm-arm64/kvm-for-3.11' of git://git.kernel.org/.../maz/arm-platforms into upstream
  Merge branch 'for-next/hugepages' of git://git.linaro.org/people/stevecapper/linux into upstream-hugepages

Chen Gang (1):
  arm64: kernel: compiling issue, need delete read_current_timer()

Damian Hobson-Garcia (1):
  arm64: Provide default implementation for dma_{alloc,free}_attrs

Kyle McMartin (1):
  arm64/Makefile: provide vdso_install target

Marc Zyngier (33):
  arm64: KVM: define HYP and Stage-2 translation page flags
  arm64: KVM: HYP mode idmap support
  arm64: KVM: EL2 register definitions
  arm64: KVM: system register definitions for 64bit guests
  arm64: KVM: Basic ESR_EL2 helpers and vcpu register access
  arm64: KVM: fault injection into a guest
  arm64: KVM: architecture specific MMU backend
  arm64: KVM: user space interface
  arm64: KVM: system register handling
  arm64: KVM: CPU specific system registers handling
  arm64: KVM: virtual CPU reset
  arm64: KVM: kvm_arch and kvm_vcpu_arch definitions
  arm64: KVM: MMIO access backend
  arm64: KVM: guest one-reg interface
  arm64: KVM: hypervisor initialization code
  arm64: KVM: HYP mode world switch implementation
  arm64: KVM: Exit handling
  arm64: KVM: Plug the VGIC
  ARM: KVM: timer: allow DT matching for ARMv8 cores
  arm64: KVM: Plug the arch timer
  arm64: KVM: PSCI implementation
  arm64: KVM: Build system integration
  arm64: KVM: define 32bit specific registers
  arm64: KVM: 32bit GP register access
  arm64: KVM: 32bit conditional execution emulation
  arm64: KVM: 32bit handling of coprocessor traps
  arm64: KVM: CPU specific 32bit coprocessor access
  arm64: KVM: 32bit specific register world switch
  arm64: KVM: 32bit guest fault injection
  arm64: KVM: enable initialization of a 32bit vcpu
  arm64: KVM: userspace API documentation
  arm64: KVM: MAINTAINERS update
  arm64: KVM: document kernel object mappings in HYP

Stefano Stabellini (6):
  arm/xen: define xen_remap as ioremap_cached
  arm64/xen: introduce asm/xen header files on arm64
  arm64/xen: implement ioremap_cached on arm64
  arm64/xen: use XEN_IO_PROTO_ABI_ARM on ARM64
  arm64/xen: introduce CONFIG_XEN and hypercall.S on ARM64
  MAINTAINERS: add myself as arm64/xen maintainer

Steve Capper (11):
  mm: hugetlb: Copy huge_pmd_share from x86 to mm.
  x86: mm: Remove x86 version of huge_pmd_share.
  mm: hugetlb: Copy general hugetlb code from x86 to mm.
  x86: mm: Remove general hugetlb code from x86.
  mm: thp: Correct the HPAGE_PMD_ORDER check.
  ARM64: mm: Restore memblock limit when map_mem finished.
  ARM64: mm: Make PAGE_NONE pages read only and no-execute.
  ARM64: mm: Move PTE_PROT_NONE bit.
  ARM64: mm: HugeTLB support.
  ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
  ARM64: mm: THP support.

Vinayak Kale (4):
  arm64: Add Kconfig option for APM X-Gene SOC family
  arm64: Enable APM X-Gene SOC family in the defconfig
  arm64: Add defines for APM ARMv8 implementation
  arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board

Will Deacon (5):
  arm64: mm: don't bother inva

Re: [PATCH v2] xen/arm and xen/arm64: update xen_restart after ff701306cd49

2013-07-23 Thread Catalin Marinas
On Mon, Jul 22, 2013 at 10:47:44AM +0100, Stefano Stabellini wrote:
> Commit ff701306cd49 (arm64: use common reboot infrastructure) changes
> the prototype of arm_pm_restart.  Update xen_restart accordingly.
> 
> Signed-off-by: Stefano Stabellini 

You may even want to quote 7b6d864b48d9 (reboot: arm: change reboot_mode
to use enum reboot_mode) since you are changing a file under arch/arm ;)

-- 
Catalin


Re: [PATCH v2 1/3] drivers: clocksource: configure event stream for ARM arch timer

2013-07-23 Thread Catalin Marinas
On Mon, Jul 22, 2013 at 12:21:20PM +0100, Sudeep KarkadaNagesha wrote:
> From: Will Deacon 
> 
> The ARM architected timer can generate events (used for waking up
> CPUs executing the wfe instruction) at a frequency represented as a
> power-of-2 divisor of the clock rate.
> 
> This patch configures the event stream, aiming for a period of 100us
> between events. This can be used to implement wfe-based timeouts for
> userspace locking implementations.
...
> --- a/include/clocksource/arm_arch_timer.h
> +++ b/include/clocksource/arm_arch_timer.h
> @@ -29,6 +29,8 @@
>  #define ARCH_TIMER_PHYS_ACCESS   0
>  #define ARCH_TIMER_VIRT_ACCESS   1
>  
> +#define ARCH_TIMER_EVT_STREAM_FREQ   10000   /* 100us */

BTW, if user-space starts using this, it will become an ABI. Is this the
right frequency?

In addition, do we want to expose this via hwcap? Something like
HWCAP_EVSTR100US?

-- 
Catalin


Re: [PATCH v2 1/3] drivers: clocksource: configure event stream for ARM arch timer

2013-07-23 Thread Catalin Marinas
On Tue, Jul 23, 2013 at 11:33:33AM +0100, Will Deacon wrote:
> On Tue, Jul 23, 2013 at 11:23:34AM +0100, Catalin Marinas wrote:
> > On Mon, Jul 22, 2013 at 12:21:20PM +0100, Sudeep KarkadaNagesha wrote:
> > > From: Will Deacon 
> > > 
> > > The ARM architected timer can generate events (used for waking up
> > > CPUs executing the wfe instruction) at a frequency represented as a
> > > power-of-2 divisor of the clock rate.
> > > 
> > > This patch configures the event stream, aiming for a period of 100us
> > > between events. This can be used to implement wfe-based timeouts for
> > > userspace locking implementations.
> > ...
> > > --- a/include/clocksource/arm_arch_timer.h
> > > +++ b/include/clocksource/arm_arch_timer.h
> > > @@ -29,6 +29,8 @@
> > >  #define ARCH_TIMER_PHYS_ACCESS   0
> > >  #define ARCH_TIMER_VIRT_ACCESS   1
> > >  
> > > +#define ARCH_TIMER_EVT_STREAM_FREQ   10000   /* 100us */
> > 
> > BTW, if user-space starts using this, it will become an ABI. Is this the
> > right frequency?
> 
> It doesn't quite become ABI; not all platforms will use the architected
> timers and not all timers can support an arbitrary frequency. The best we
> can do is calculate something as close to the target value as possible.

ABI in the sense that if it is available and advertised by the kernel as
such, people may use it.

> I spoke to both tools developers and some HSA driver guys about the frequency,
> and this is what ended up being suggested.
> 
> > In addition, do we want to expose this via hwcap? Something like
> > HWCAP_EVSTR100US?
> 
> Hmm, maybe, but we don't want people to try and use this for any accurate
> time measurements, so I wouldn't include the period.

Definitely not for accurate time but some user-space may find the delay
too small or too large. I'm fine without specifying the period, maybe
add a comment in the kernel like /* currently 100us */.
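
To make the "as close to the target value as possible" point concrete, here is
a rough sketch of picking a power-of-2 divisor for a 100us (10 kHz) target.
The helper name, the limit of 16 candidate divisors and the example clock rate
are assumptions for illustration; this is not the code from the patch:

```c
#include <stdint.h>

/*
 * Pick the exponent k such that rate_hz / 2^k is as close as possible to
 * target_hz. Hypothetical helper for illustration, not the kernel code.
 */
static unsigned int evt_stream_exponent(uint32_t rate_hz, uint32_t target_hz)
{
	unsigned int k, best = 0;
	uint32_t best_err = UINT32_MAX;

	/* Assumption: only 16 power-of-2 divisors are selectable. */
	for (k = 0; k < 16; k++) {
		uint32_t freq = rate_hz >> k;
		uint32_t err = freq > target_hz ? freq - target_hz
						: target_hz - freq;

		if (err < best_err) {
			best_err = err;
			best = k;
		}
	}
	return best;
}
```

For an assumed 24 MHz timer this selects k = 11, i.e. roughly 11.7 kHz or an
85us period rather than exactly 100us, which is why user space cannot rely on
the period being precise.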

-- 
Catalin


Re: [Suggestion] ARM64:kernel: compiling issue for early_console.

2013-03-27 Thread Catalin Marinas
On Wed, Mar 27, 2013 at 11:44:03AM +0000, Chen Gang wrote:
>   the error message:
> arch/arm64/kernel/early_printk.c: At top level:
> arch/arm64/kernel/early_printk.c:98:23: error: conflicting types for ‘early_console’
> In file included from arch/arm64/kernel/early_printk.c:20:0:
> include/linux/console.h:145:24: note: previous declaration of ‘early_console’ was here
> make[1]: *** [arch/arm64/kernel/early_printk.o] Error 1
> make: *** [arch/arm64/kernel] Error 2

Is this in linux-next? Mainline seems fine.

I saw some patches from tglx on unifying the various early printk
implementations, though not sure whether it's those patches causing it
(in which case arm64 needs to be updated as well).

-- 
Catalin


Re: [RFC PATCH v3 3/6] sched: pack small tasks

2013-03-27 Thread Catalin Marinas
On Wed, Mar 27, 2013 at 05:18:53PM +0000, Nicolas Pitre wrote:
> On Wed, 27 Mar 2013, Catalin Marinas wrote:
> 
> > So if the above works, the scheduler guys can mandate that little CPUs
> > are always first and for ARM it would be a matter of getting the right
> > CPU topology in the DT (independent of what hw vendors think of CPU
> > topology) and booting Linux on CPU 4 etc.
> 
> Just a note about that: if the scheduler mandates little CPUs first, 
> that should _not_ have any implications on the DT content.  DT is not 
> about encoding Linux specific implementation details.  It is simple 
> enough to tweak the CPU logical map at run time when enumerating CPUs.

You are right, though a simpler way (hack) to tweak the cpu_logical_map
is to change the DT ;).

But the problem is that the kernel doesn't know which CPU is big and
which is little, unless you specify this in some way via the DT. It can
be the cpu nodes order or some other means.
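
As a rough illustration of tweaking the logical map at enumeration time, here
is a sketch that numbers little CPUs first once the kernel has been told which
is which. The struct, the 'is_little' flag and the function name are all
hypothetical; how that flag would be obtained is exactly the open question
above:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU description gathered while enumerating CPUs. */
struct cpu_desc {
	unsigned int hwid;	/* hardware CPU identifier */
	bool is_little;		/* assumed to come from DT or similar */
};

/*
 * Fill logical_map[] so that little CPUs get the lowest logical numbers.
 * The relative order inside each class is preserved.
 */
static void build_logical_map(const struct cpu_desc *cpus, size_t n,
			      unsigned int *logical_map)
{
	size_t i, next = 0;

	for (i = 0; i < n; i++)
		if (cpus[i].is_little)
			logical_map[next++] = cpus[i].hwid;

	for (i = 0; i < n; i++)
		if (!cpus[i].is_little)
			logical_map[next++] = cpus[i].hwid;
}
```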

-- 
Catalin


[GIT PULL] arm64 patches for 3.9

2013-03-28 Thread Catalin Marinas
Hi Linus,

Please pull the arm64 fix below. Thanks.

The following changes since commit a937536b868b8369b98967929045f1df54234323:

  Linux 3.9-rc3 (2013-03-17 15:59:32 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git tags/arm64-fixes

for you to fetch changes up to d17cfb34dc5eb527b98448f3999aac52311d438b:

  ARM64: early_printk: Fix check for CONFIG_ARM64_64K_PAGES (2013-03-25 17:59:57 +0000)


Fix IS_ENABLED() usage typo (missing CONFIG_ prefix).
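
For readers wondering why such a typo goes unnoticed by the compiler, here is
a simplified, self-contained sketch modeled on the preprocessor trick behind
IS_ENABLED(). The macro and function names are local to this example, not the
kernel's, and the option is defined by hand instead of by Kconfig:

```c
/* Expands to "0," only when pasted with the token 1. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val

/*
 * If x is defined to 1, the placeholder inserts an extra argument and the
 * second argument becomes 1. Otherwise the pasted token is junk, no extra
 * argument appears, and the second argument is 0.
 */
#define IS_DEFINED_1(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_0(val) IS_DEFINED_1(ARG_PLACEHOLDER_##val)
#define IS_DEFINED(x) IS_DEFINED_0(x)

/* Stand-in for what the build system would generate. */
#define CONFIG_ARM64_64K_PAGES 1

static int uses_64k_pages_correct(void)
{
	return IS_DEFINED(CONFIG_ARM64_64K_PAGES);
}

static int uses_64k_pages_typo(void)
{
	/* Missing CONFIG_ prefix: compiles cleanly but is always 0. */
	return IS_DEFINED(ARM64_64K_PAGES);
}
```

An unknown symbol simply evaluates to 0, so the typo silently disables the
code path instead of producing a build error.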


Ben Hutchings (1):
  ARM64: early_printk: Fix check for CONFIG_ARM64_64K_PAGES

 arch/arm64/mm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


Re: [PATCH v6 0/3] perf: AARCH64 arch support

2014-03-12 Thread Catalin Marinas
On Wed, Mar 12, 2014 at 11:19:48AM +0100, Jean Pihet wrote:
> Gentle ping on this series? Which tree is it supposed to land in?

Since it's only arm64 stuff and Will already acked the series, I'm going
to merge it via the arm64 tree. Consider it applied (for the upcoming
merging window).

-- 
Catalin


Re: [PATCH] perf: ARM64: wire up perf_regs and unwind support

2014-03-12 Thread Catalin Marinas
On Wed, Mar 12, 2014 at 01:42:51PM +0100, Jean Pihet wrote:
> Hi,
> 
> On 13 February 2014 18:06, Jean Pihet  wrote:
> > This patch hooks in the perf_regs and libunwind code for ARM64.
> > The tools/perf/arch/arm64 is created; it contains the arch specific
> > code for DWARF unwinding.
> >
> > Signed-off-by: Jean Pihet 
> > Acked-by: Will Deacon 
> 
> Ping on this patch. Can this one go into the perf tree?

I'm happy to take it as well if there are no objections from the perf
maintainers.

-- 
Catalin


Re: [tip:perf/core] ARM64, perf: Add support for perf registers API

2014-03-13 Thread Catalin Marinas
Hi Ingo,

On Thu, Mar 13, 2014 at 12:27:45AM -0700, tip-bot for Jean Pihet wrote:
> Commit-ID:  1acfb01a43db9d8cde2d4c1d51746bae0b46b06b
> Gitweb: http://git.kernel.org/tip/1acfb01a43db9d8cde2d4c1d51746bae0b46b06b
> Author: Jean Pihet 
> AuthorDate: Mon, 3 Feb 2014 19:18:27 +0100
> Committer:  Ingo Molnar 
> CommitDate: Wed, 12 Mar 2014 13:45:28 +0100
> 
> ARM64, perf: Add support for perf registers API
> 
> This patch implements the functions required for the perf
> registers API, allowing the perf tool to interface kernel
> register dumps with libunwind in order to provide userspace
> backtracing. Compat mode is also supported.
> 
> Only the general purpose user space registers are exported,
> i.e.:  PERF_REG_ARM_X0,
>  ...
>  PERF_REG_ARM_X28,
>  PERF_REG_ARM_FP,
>  PERF_REG_ARM_LR,
>  PERF_REG_ARM_SP,
>  PERF_REG_ARM_PC
> and not the PERF_REG_ARM_V* registers.
> 
> Signed-off-by: Jean Pihet 
> Acked-by: Will Deacon 
> Cc: Arnaldo 
> Cc: Jiri Olsa 
> Cc: patc...@linaro.org
> Cc: linaro-ker...@lists.linaro.org
> Cc: linux-arm-ker...@lists.infradead.org
> Link: http://lkml.kernel.org/r/1391451509-31265-2-git-send-email-jean.pi...@linaro.org
> Signed-off-by: Ingo Molnar 
> ---
>  arch/arm64/Kconfig                      |  2 ++
>  arch/arm64/include/asm/ptrace.h         |  1 +
>  arch/arm64/include/uapi/asm/Kbuild      |  1 +
>  arch/arm64/include/uapi/asm/perf_regs.h | 40 ++
>  arch/arm64/kernel/Makefile              |  1 +
>  arch/arm64/kernel/perf_regs.c           | 44 +
>  6 files changed, 89 insertions(+)

As I replied to Jean yesterday, I have already merged the arch/arm64 patches
in the arm64 tree. Is it too late to drop them from tip (and avoid potential
conflicts)?

But please take the perf tools patch (tools/perf/arch/arm64/...).

Thanks.

-- 
Catalin


Re: [PATCH] perf: ARM64: wire up perf_regs and unwind support

2014-03-13 Thread Catalin Marinas
On Thu, Mar 13, 2014 at 10:12:02AM +0000, Will Deacon wrote:
> On Wed, Mar 12, 2014 at 05:45:20PM +0000, Catalin Marinas wrote:
> > On Wed, Mar 12, 2014 at 01:42:51PM +0100, Jean Pihet wrote:
> > > On 13 February 2014 18:06, Jean Pihet  wrote:
> > > > This patch hooks in the perf_regs and libunwind code for ARM64.
> > > > The tools/perf/arch/arm64 is created; it contains the arch specific
> > > > code for DWARF unwinding.
> > > >
> > > > Signed-off-by: Jean Pihet 
> > > > Acked-by: Will Deacon 
> > > 
> > > Ping on this patch. Can this one go into the perf tree?
> > 
> > I'm happy to take it as well if there are no objections from the perf
> > maintainers.
> 
> I object :) Last time I took a patch for perf tools (via the arm tree) it
> ended in a conflict from hell and we vowed not to take tools patches via
> arch trees again.

Works for me ;)

-- 
Catalin


Re: [GIT PULL] generic early_ioremap support

2014-03-13 Thread Catalin Marinas
On Wed, Mar 12, 2014 at 07:53:48PM -0700, Andrew Morton wrote:
> On Wed, 12 Mar 2014 22:29:48 -0400 Mark Salter  wrote:
> > Could you add this series into the -mm tree for v3.15?
> > 
> > The following changes since commit c3bebc71c4bcdafa24b506adf0c1de3c1f77e2e0:
> > 
> >   Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2014-03-04 08:44:32 -0800)
> > 
> > are available in the git repository at:
> > 
> > 
> >   git://github.com/mosalter/linux.git tags/for-v3.15
> > 
> > for you to fetch changes up to b27e0658d90c63dc2696eca44f7701a903cb13c5:
> > 
> >   doc/kernel-parameters.txt: add early_ioremap_debug (2014-03-09 12:53:50 -0400)
> > 
> > 
> > generic early_ioremap support
> 
> Spose so.  I was hoping the x86 and arm people might do it.  Has there
> been sufficient feedback from those parties?

Both x86 and arm64 people acked these patches and we were wondering how
they should get in. Since they touch mm/, Mark thought you could take
them.

-- 
Catalin


Re: [RFC PATCH] arm64: Fix __addr_ok and __range_ok macros

2014-03-13 Thread Catalin Marinas
On Wed, Mar 05, 2014 at 10:41:28PM +0000, Christopher Covington wrote:
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -66,12 +66,12 @@ static inline void set_fs(mm_segment_t fs)
>  #define segment_eq(a,b)  ((a) == (b))
>  
>  /*
> - * Return 1 if addr < current->addr_limit, 0 otherwise.
> + * Return 1 if addr <= current->addr_limit, 0 otherwise.
>   */
>  #define __addr_ok(addr)						\
>  ({   \
>   unsigned long flag; \
> - asm("cmp %1, %0; cset %0, lo"   \
> + asm("cmp %1, %0; cset %0, ls"   \
>   : "=&r" (flag)  \
>   : "r" (addr), "0" (current_thread_info()->addr_limit)   \
>   : "cc");\

As Will said, this doesn't look right. Why do you need TASK_SIZE_64 to
be valid?

> @@ -83,7 +83,7 @@ static inline void set_fs(mm_segment_t fs)
>   * Returns 1 if the range is valid, 0 otherwise.
>   *
>   * This is equivalent to the following test:
> - * (u65)addr + (u65)size < (u65)current->addr_limit
> + * (u65)addr + (u65)size <= current->addr_limit
>   *
>   * This needs 65-bit arithmetic.
>   */
> @@ -91,7 +91,7 @@ static inline void set_fs(mm_segment_t fs)
>  ({   \
>   unsigned long flag, roksum; \
>   __chk_user_ptr(addr);   \
> - asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc" \
> + asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls" \
>   : "=&r" (flag), "=&r" (roksum)  \
>   : "1" (addr), "Ir" (size),  \
> "r" (current_thread_info()->addr_limit)   \

Just trying to understand: if adds does not set the C flag, we go on and
do the ccmp. If addr + size <= addr_limit, "cset ls" sets the flag
variable. If addr + size actually sets the C flag, we need to make sure
that "cset ls" doesn't trigger, which would mean to set C flag and clear
Z flag. So why do you change the ccmp flags from #2 to #3? It looks to
me like #2 is enough.

-- 
Catalin


Re: [tip:perf/core] ARM64, perf: Add support for perf registers API

2014-03-13 Thread Catalin Marinas
On Thu, Mar 13, 2014 at 11:01:21AM +0000, Ingo Molnar wrote:
> * Catalin Marinas  wrote:
> > On Thu, Mar 13, 2014 at 12:27:45AM -0700, tip-bot for Jean Pihet wrote:
> > > Commit-ID:  1acfb01a43db9d8cde2d4c1d51746bae0b46b06b
> > > Gitweb: http://git.kernel.org/tip/1acfb01a43db9d8cde2d4c1d51746bae0b46b06b
> > > Author: Jean Pihet 
> > > AuthorDate: Mon, 3 Feb 2014 19:18:27 +0100
> > > Committer:  Ingo Molnar 
> > > CommitDate: Wed, 12 Mar 2014 13:45:28 +0100
> > > 
> > > ARM64, perf: Add support for perf registers API
> > > 
> > > This patch implements the functions required for the perf
> > > registers API, allowing the perf tool to interface kernel
> > > register dumps with libunwind in order to provide userspace
> > > backtracing. Compat mode is also supported.
> > > 
> > > Only the general purpose user space registers are exported,
> > > i.e.:  PERF_REG_ARM_X0,
> > >  ...
> > >  PERF_REG_ARM_X28,
> > >  PERF_REG_ARM_FP,
> > >  PERF_REG_ARM_LR,
> > >  PERF_REG_ARM_SP,
> > >  PERF_REG_ARM_PC
> > > and not the PERF_REG_ARM_V* registers.
> > > 
> > > Signed-off-by: Jean Pihet 
> > > Acked-by: Will Deacon 
> > > Cc: Arnaldo 
> > > Cc: Jiri Olsa 
> > > Cc: patc...@linaro.org
> > > Cc: linaro-ker...@lists.linaro.org
> > > Cc: linux-arm-ker...@lists.infradead.org
> > > Link: http://lkml.kernel.org/r/1391451509-31265-2-git-send-email-jean.pi...@linaro.org
> > > Signed-off-by: Ingo Molnar 
> > > ---
> > >  arch/arm64/Kconfig                      |  2 ++
> > >  arch/arm64/include/asm/ptrace.h         |  1 +
> > >  arch/arm64/include/uapi/asm/Kbuild      |  1 +
> > >  arch/arm64/include/uapi/asm/perf_regs.h | 40 ++
> > >  arch/arm64/kernel/Makefile              |  1 +
> > >  arch/arm64/kernel/perf_regs.c           | 44 +
> > >  6 files changed, 89 insertions(+)
> > 
> > As I replied to Jean yesterday, I have already merged the arch/arm64 patches
> > in the arm64 tree. Is it too late to drop them from tip (and avoid potential
> > conflicts)?
> > 
> > But please take the perf tools patch (tools/perf/arch/arm64/...).
> 
> Well, I merged them so that the tooling bits can be tested on top of 
> that and merged as well.

That's why I asked if I should take the tooling bits but Will objected ;).

> Anyway, I dropped them, they were still the tail of tip:perf/core.

Thanks. I assume you still take the tooling patch?

-- 
Catalin


Re: [PATCH 1/3] kmemleak: allow freeing internal objects after disabling kmemleak

2014-03-13 Thread Catalin Marinas
On Thu, Mar 13, 2014 at 06:47:46AM +0000, Li Zefan wrote:
> +Freeing kmemleak internal objects
> +---------------------------------
> +
> +To allow access to previously found memory leaks even when an error fatal
> +to kmemleak happens, internal kmemleak objects won't be freed when kmemleak
> +is disabled, and those objects may occupy a large part of physical
> +memory.
> +
> +If you want to make sure they're freed before disabling kmemleak:
> +
> +  # echo scan=off > /sys/kernel/debug/kmemleak
> +  # echo off > /sys/kernel/debug/kmemleak

I would actually change the code to do a stop_scan_thread() as part of
the "off" handling so that scan=off is not required (we can't put it as
part of the kmemleak_disable because we need scan_mutex held).

-- 
Catalin


Re: [RFC PATCH] arm64: Fix __addr_ok and __range_ok macros

2014-03-13 Thread Catalin Marinas
On Thu, Mar 13, 2014 at 01:41:01PM +0000, Christopher Covington wrote:
> On 03/13/2014 07:20 AM, Catalin Marinas wrote:
> > On Wed, Mar 05, 2014 at 10:41:28PM +0000, Christopher Covington wrote:
> >> @@ -83,7 +83,7 @@ static inline void set_fs(mm_segment_t fs)
> >>   * Returns 1 if the range is valid, 0 otherwise.
> >>   *
> >>   * This is equivalent to the following test:
> >> - * (u65)addr + (u65)size < (u65)current->addr_limit
> >> + * (u65)addr + (u65)size <= current->addr_limit
> >>   *
> >>   * This needs 65-bit arithmetic.
> >>   */
> >> @@ -91,7 +91,7 @@ static inline void set_fs(mm_segment_t fs)
> >>  ({									\
> >>unsigned long flag, roksum; \
> >>__chk_user_ptr(addr);   \
> >> -  asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc" \
> >> +  asm("adds %1, %1, %3; ccmp %1, %4, #3, cc; cset %0, ls" \
> >>: "=&r" (flag), "=&r" (roksum)  \
> >>: "1" (addr), "Ir" (size),  \
> >>  "r" (current_thread_info()->addr_limit)   \
> > 
> > Just trying to understand: if adds does not set the C flag, we go on and
> > do the ccmp. If addr + size <= addr_limit, "cset ls" sets the flag
> > variable. If addr + size actually sets the C flag, we need to make sure
> > that "cset ls" doesn't trigger, which would mean to set C flag and clear
> > Z flag. So why do you change the ccmp flags from #2 to #3? It looks to
> > me like #2 is enough.
> 
> #2 is indeed sufficient. I'll respin using it.
> 
> I think Will's suggested approach could also work but I figure since I've
> taken the time to understand the assembly I might as well fix the problem
> there rather than adding another step in the calculation for developers and
> compilers to parse. (I don't know if this code is performance critical, but I
> nevertheless wanted to see how the compiler handled Will's approach.
> Unfortunately my initial implementation resulted in unaligned opcode errors
> and I haven't yet dug in.)

If it's only one condition change, I would prefer the inline asm fix. I
haven't done any benchmarks with a C-only implementation to assess the
impact.

For __addr_ok() I think the compiler should be good enough as we don't
need 65-bit arithmetic, but we can leave it as it is.
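
For illustration only (this is not the code under review): a minimal C
sketch of the "addr + size <= addr_limit" test without inline asm. It
assumes a compiler that provides __builtin_add_overflow (GCC 5+ or Clang);
the name range_ok_sketch is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if addr + size <= limit when evaluated without wraparound. */
static int range_ok_sketch(uint64_t addr, uint64_t size, uint64_t limit)
{
	uint64_t end;

	/* A carry out of bit 63 means the 65-bit sum exceeds any 64-bit limit. */
	if (__builtin_add_overflow(addr, size, &end))
		return 0;

	return end <= limit;
}
```

The overflow check plays the role of the C flag from "adds": once the
addition carries, the comparison against the limit must not succeed.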

-- 
Catalin


Re: [PATCH v3 1/3] asm-generic: Add generic seccomp.h for secure computing mode 1

2014-03-14 Thread Catalin Marinas
On Thu, Mar 13, 2014 at 10:17:01AM +, AKASHI Takahiro wrote:
> Those values (__NR_seccomp_*) are used solely in secure_computing()
> to identify mode 1 system calls. If compat system calls have different
> syscall numbers, asm/seccomp.h may override them.
> 
> Signed-off-by: AKASHI Takahiro 
> ---
>  include/asm-generic/seccomp.h |   28 
>  1 file changed, 28 insertions(+)
>  create mode 100644 include/asm-generic/seccomp.h

I think you need an Ack from Arnd on this patch. The other patches in
this series look ok but they depend on the ftrace patches, so we'll have
to sort those out first.

-- 
Catalin


Re: [PATCH v7 3/3] arm64: Add architecture support for PCI

2014-03-14 Thread Catalin Marinas
On Fri, Mar 14, 2014 at 03:34:18PM +, Liviu Dudau wrote:
> --- /dev/null
> +++ b/arch/arm64/kernel/pci.c
[...]
> +int pci_register_io_range(phys_addr_t address, resource_size_t size)
[...]
> +unsigned long pci_address_to_pio(phys_addr_t address)
[...]
> +void pcibios_fixup_bus(struct pci_bus *bus)
[ actually most of this file ]

Maybe it was raised before already but can we have __weak generic
definitions of these functions? They don't seem to be arm64 specific in
any way.

-- 
Catalin


Re: [PATCH v7 3/3] arm64: Add architecture support for PCI

2014-03-17 Thread Catalin Marinas
On Fri, Mar 14, 2014 at 06:05:27PM +, Liviu Dudau wrote:
> On Fri, Mar 14, 2014 at 05:38:08PM +, Arnd Bergmann wrote:
> > On Friday 14 March 2014, Catalin Marinas wrote:
> > > On Fri, Mar 14, 2014 at 03:34:18PM +, Liviu Dudau wrote:
> > > > --- /dev/null
> > > > +++ b/arch/arm64/kernel/pci.c
> > > [...]
> > > > +int pci_register_io_range(phys_addr_t address, resource_size_t size)
> > > [...]
> > > > +unsigned long pci_address_to_pio(phys_addr_t address)
> > > [...]
> > > > +void pcibios_fixup_bus(struct pci_bus *bus)
> > > [ actually most of this file ]
> > > 
> > > Maybe it was raised before already but can we have __weak generic
> > > definitions of these functions? They don't seem to be arm64 specific in
> > > any way.
[...]
> Catalin, if you are happy to ask for ACKs from all arch maintainers that might get
> affected by our custom version of pci_address_to_pio() before you can pull PCI support
> for arm64 then I can propose a new patchset.

You don't need to change the other architectures; that's the point of a
__weak definition: any architecture providing its own version overrides
it automatically. If you want, you can even add a GENERIC_PCI (or
whatever) config option that is only selected by arm64 for the time being.
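
As a rough illustration of the mechanism (a user-space sketch, not the
proposed kernel code): the kernel's __weak expands to
__attribute__((weak)). The names demo_phys_addr_t and demo_address_to_pio
are hypothetical stand-ins, and the fallback return value is an assumption.

```c
#include <stdint.h>

typedef uint64_t demo_phys_addr_t;	/* hypothetical stand-in for phys_addr_t */

/*
 * Generic fallback. Being weak, it is used only when no other object
 * file supplies a normal (strong) definition of the same symbol.
 */
__attribute__((weak)) unsigned long demo_address_to_pio(demo_phys_addr_t address)
{
	(void)address;
	return (unsigned long)-1;	/* assumed "no I/O range registered" value */
}
```

An architecture wanting different behaviour would define
demo_address_to_pio() without the attribute in its own file, and the
linker would pick that one with no change to the generic code.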

-- 
Catalin


Re: [PATCH v2 11/15] arm64: add EFI stub

2014-03-18 Thread Catalin Marinas
On Thu, Mar 13, 2014 at 10:47:04PM +, Leif Lindholm wrote:
> --- /dev/null
> +++ b/arch/arm64/kernel/efi-entry.S
> @@ -0,0 +1,93 @@
> +/*
> + * EFI entry point.
> + *
> + * Copyright (C) 2013 Red Hat, Inc.
> + * Author: Mark Salter 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + */
> +#include 
> +#include 
> +
> +#include 
> +
> +#define EFI_LOAD_ERROR 0x8001
> +
> +   __INIT
> +
> +   /*
> +* We arrive here from the EFI boot manager with:
> +*
> +** MMU on with identity-mapped RAM.
> +** Icache and Dcache on
> +*
> +* We will most likely be running from some place other than where
> +* we want to be. The kernel image wants to be placed at TEXT_OFFSET
> +* from start of RAM.
> +*/
> +ENTRY(efi_stub_entry)
> +   stp x29, x30, [sp, #-32]!
> +
> +   /*
> +* Call efi_entry to do the real work.
> +* x0 and x1 are already set up by firmware. Current runtime
> +* address of image is calculated and passed via *image_addr.
> +*
> +* unsigned long efi_entry(void *handle,
> +* efi_system_table_t *sys_table,
> +* unsigned long *image_addr) ;
> +*/
> +   adrp x8, _text
> +   add x8, x8, #:lo12:_text
> +   add x2, sp, 16
> +   str x8, [x2]
> +   bl  efi_entry
> +   cmn x0, #1
> +   b.eq efi_load_fail
> +
> +   /*
> +* efi_entry() will have relocated the kernel image if necessary
> +* and we return here with device tree address in x0 and the kernel
> +* entry point stored at *image_addr. Save those values in registers
> +* which are preserved by __flush_dcache_all.
> +*/
> +   ldr x1, [sp, #16]
> +   mov x20, x0
> +   mov x21, x1
> +
> +   /* Turn off Dcache and MMU */
> +   mrs x0, CurrentEL
> +   cmp x0, #PSR_MODE_EL2t
> +   ccmp x0, #PSR_MODE_EL2h, #0x4, ne
> +   b.ne 1f
> +   mrs x0, sctlr_el2
> +   bic x0, x0, #1 << 0 // clear SCTLR.M
> +   bic x0, x0, #1 << 2 // clear SCTLR.C
> +   msr sctlr_el2, x0
> +   isb
> +   b   2f
> +1:
> +   mrs x0, sctlr_el1
> +   bic x0, x0, #1 << 0 // clear SCTLR.M
> +   bic x0, x0, #1 << 2 // clear SCTLR.C
> +   msr sctlr_el1, x0
> +   isb
> +2:
> +   bl  __flush_dcache_all

In linux-next I'm pushing a patch which no longer exports the
__flush_dcache_all function. The reason is that it doesn't really work
if you have a (not fully transparent) external cache like on the Applied
Micro boards. There are other issues when running as a guest as well.

If you know exactly what needs to be flushed here, can you use a range
(MVA) operation?

> diff --git a/arch/arm64/kernel/efi-stub.c b/arch/arm64/kernel/efi-stub.c
> new file mode 100644
> index 000..bf30913
> --- /dev/null
> +++ b/arch/arm64/kernel/efi-stub.c
> @@ -0,0 +1,83 @@
> +/*
> + * linux/arch/arm/boot/compressed/efi-stub.c

Nitpick: arch/arm64/... But we don't really need to write the file name
here, I use a smart editor that tells me which file I'm viewing ;).
Better write a one-line summary of what this file is about.

> + *
> + * Copyright (C) 2013, 2014 Linaro Ltd;  
> + *
> + * This file implements the EFI boot stub for the arm64 kernel.
> + * Adapted from ARM version by Mark Salter 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/*
> + * EFI function call wrappers. These are not required for arm/arm64, but
> + * wrappers are required for X86 to convert between ABIs. These wrappers are
> + * provided to allow code sharing between X86 and other architectures. Since
> + * these wrappers directly invoke the EFI function pointer, the function
> + * pointer type must be properly defined, which is not the case for X86. One
> + * advantage of this is it allows for type checking of arguments, which is not
> + * possible with the X86 wrappers.
> + */
> +#define efi_call_phys0(f)  f()
> +#define efi_call_phys1(f, a1)  f(a1)
> +#define efi_call_phys2(f, a1, a2)  f(a1, a2)
> +#define efi_call_phys3(f, a1, a2, a3)  f(a1, a2, a3)
> +#define efi_call_phys4(f, a1, a2, a3, a4)  f(a1, a2, a3, a4)
> +#define efi_call_phys5(f, a1, a2, a3, a4, a5)  f(a1, a2, a3, a4, a5)
> +
> +/*
> + * AArch64 requires the DTB to be 8-byte aligned in the first 512MiB from
> + * start of kernel and may not cross a 2MiB boundary. We set alignment

Re: [PATCH v2 13/15] arm64: add EFI runtime services

2014-03-18 Thread Catalin Marinas
On Thu, Mar 13, 2014 at 10:47:06PM +, Leif Lindholm wrote:
> --- /dev/null
> +++ b/arch/arm64/kernel/efi.c
[...]
> +/*
> + * Called from setup_arch with interrupts disabled.
> + */
> +void __init efi_enter_virtual_mode(void)
[...]
> --- a/init/main.c
> +++ b/init/main.c
> @@ -902,6 +902,10 @@ static noinline void __init kernel_init_freeable(void)
> smp_prepare_cpus(setup_max_cpus);
> 
> do_pre_smp_initcalls();
> +
> +   if (IS_ENABLED(CONFIG_ARM64) && efi_enabled(EFI_BOOT))
> +   efi_enter_virtual_mode();

The comment for the efi_enter_virtual_mode() function says "called from
setup_arch with interrupts disabled". Neither of these is true for the
call above (and I would really prefer an arch call to this arm64
conditional call in init/main.c).

-- 
Catalin


Re: [PATCH v2 0/7] of: setup dma parameters using dma-ranges and dma-coherent

2014-04-22 Thread Catalin Marinas
On Sat, Apr 19, 2014 at 05:25:28PM +0100, Thomas Petazzoni wrote:
> On Sat, 19 Apr 2014 10:32:45 -0400, Santosh Shilimkar wrote:
> > Here is an updated version of [2] based on discussion. Series introduces
> > support for setting up dma parameters based on device tree properties
> > like 'dma-ranges' and 'dma-coherent' and also update to ARM 32 bit port.
> > Earlier version of the same series is here [1].
> > 
> > The 'dma-ranges' helps to take care of few DMAable system memory restrictions
> > by use of dma_pfn_offset which we maintain now per device. Arch code then
> > uses it for dma address translations for such cases. We update the
> > dma_pfn_offset accordingly during the DT device creation process. The
> > 'dma-coherent' property is used to setup arch's coherent dma_ops.
> > 
> > After some off-list discussion with RMK and Arnd, I have now dropped the
> > controversial dma_mask setup code from the series which actually isn't blocking
> > me as such. Considering rest of the parts of the series are already aligned,
> > am hoping to get this version merged for 3.16 merge window.
> > 
> > We agreed in last discussion that drivers have the ultimate
> > responsibility to setup the correct dma mask but then we need to have some
> > means to see if bus can support what driver has requested for a case where
> > driver request for bigger mask than what bus supports. I can follow up on
> > the mask topic if we have broken drivers.
> 
> I am not sure whether there is an intersection or not, but I wanted to
> mention that the mvebu platform (in mach-mvebu) supports hardware I/O
> coherency, which makes it a coherent DMA platform. However, we are not
> able to use arm_coherent_dma_ops for this platform, because when a
> transfer is being made DMA_FROM_DEVICE, at the end of the transfer, we
> need to perform an I/O barrier to wait for the snooping unit to
> complete its coherency work. So we're coherent, but not with
> arm_coherent_dma_ops: we have our own dma operation implementation (see
> arch/arm/mach-mvebu/coherency.c).

Ordering between I/O, DMA and CPU memory accesses is the reason we added
rmb() to the readl() macro. The mvebu ops solve the DMA streaming case
but not the dma_alloc() buffers case where you no longer have a change
of ownership between device and CPU. We could handle this via per-SoC
__io*mb() barriers as function pointers with a bit of overhead (though
we already do an outer_sync() for wmb()).
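
To make the function-pointer idea concrete, here is a hedged user-space
sketch. All demo_* names are hypothetical, __sync_synchronize() merely
stands in for the architectural rmb(), and the real hook would have to be
whatever the SoC needs to wait for its snooping unit.

```c
#include <stddef.h>

/* Hypothetical per-SoC hook; NULL means no extra work is needed. */
static void (*demo_soc_io_sync)(void);

static inline void demo_iormb(void)
{
	__sync_synchronize();		/* stand-in for the architectural rmb() */
	if (demo_soc_io_sync)
		demo_soc_io_sync();	/* SoC-specific completion wait */
}

/* readl()-like accessor: device read followed by the I/O read barrier. */
static inline unsigned int demo_readl(const volatile unsigned int *addr)
{
	unsigned int val = *addr;

	demo_iormb();
	return val;
}

/* Example hook that only counts how often it ran. */
static unsigned int demo_sync_calls;

static void demo_counting_sync(void)
{
	demo_sync_calls++;
}
```

The overhead mentioned above is the extra indirect call on every
accessor once a platform installs a hook; platforms leaving the pointer
NULL only pay for the test.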

-- 
Catalin


Re: [PATCH v2 0/7] of: setup dma parameters using dma-ranges and dma-coherent

2014-04-22 Thread Catalin Marinas
On Tue, Apr 22, 2014 at 04:02:19PM +0100, Arnd Bergmann wrote:
> On Saturday 19 April 2014, Thomas Petazzoni wrote:
> > I am not sure whether there is an intersection or not, but I wanted to
> > mention that the mvebu platform (in mach-mvebu) supports hardware I/O
> > coherency, which makes it a coherent DMA platform. However, we are not
> > able to use arm_coherent_dma_ops for this platform, because when a
> > transfer is being made DMA_FROM_DEVICE, at the end of the transfer, we
> > need to perform an I/O barrier to wait for the snooping unit to
> > complete its coherency work. So we're coherent, but not with
> > arm_coherent_dma_ops: we have our own dma operation implementation (see
> > arch/arm/mach-mvebu/coherency.c).
> 
> I had completely missed the fact that this support was merged already.
> 
> It's an interesting question if this should actually be called
> 'coherent' or not. It's certainly more coherent than without that
> support, but then again, you still can't rely on incoming data to
> be visible after a readl() from the device has returned or an MSI
> interrupt has been delivered, which is what we normally expect.
> 
> In particular, it means you can't really use arm_coherent_dma_alloc(),
> which is a shame, since that is a significant performance overhead.

It should still work if __io*mb() macros do the extra work specific to
mvebu (similar to the L2x0 outer_sync(), though this was for
non-cacheable DMA buffers).

> I would hope we can find a way to avoid the platform notifiers for
> mvebu as well and come up with a generic way to express this
> 'semi-coherent' mode. I believe x-gene has a similar issue, and
> I wouldn't be surprised if there are others like this.

The solution is for the snooping unit to detect the DSB instruction
(which is propagated outside the CPU) and wait for the completion of the
coherency work (but we need more information from the hardware guys).

-- 
Catalin


Re: [PATCH v2 0/7] of: setup dma parameters using dma-ranges and dma-coherent

2014-04-22 Thread Catalin Marinas
On Tue, Apr 22, 2014 at 04:30:36PM +0100, Rob Herring wrote:
> On Tue, Apr 22, 2014 at 10:25 AM, Catalin Marinas
>  wrote:
> > On Tue, Apr 22, 2014 at 04:02:19PM +0100, Arnd Bergmann wrote:
> >> On Saturday 19 April 2014, Thomas Petazzoni wrote:
> 
> [...]
> 
> >> I would hope we can find a way to avoid the platform notifiers for
> >> mvebu as well and come up with a generic way to express this
> >> 'semi-coherent' mode. I believe x-gene has a similar issue, and
> >> I wouldn't be surprised if there are others like this.
> >
> > The solution is for the snooping unit to detect the DSB instruction
> > (which is propagated outside the CPU) and wait for the completion of the
> > coherency work (but we need more information from the hardware guys).
> 
> If the solution was fixing broken h/w, we'd all be retired (or h/w
> designers). :)

At least they could admit it's a hardware bug and hopefully won't make the
same mistake in the future (wishful thinking ;)).

-- 
Catalin


Re: [PATCH 04/15] arm64: __NR_compat_syscalls fix

2014-04-22 Thread Catalin Marinas
On Fri, Apr 11, 2014 at 11:25:40AM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi 
> 
> Signed-off-by: Miklos Szeredi 
> Cc: Catalin Marinas 
> ---
>  arch/arm64/include/asm/unistd32.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
> index bb8eb8a78e67..faa0e1ce59df 100644
> --- a/arch/arm64/include/asm/unistd32.h
> +++ b/arch/arm64/include/asm/unistd32.h
> @@ -404,7 +404,7 @@ __SYSCALL(379, sys_finit_module)
>  __SYSCALL(380, sys_sched_setattr)
>  __SYSCALL(381, sys_sched_getattr)
>  
> -#define __NR_compat_syscalls 379
> +#define __NR_compat_syscalls 382

I picked up this patch, together with a Cc: stable and longer comment.
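
For readers wondering why the count matters, a small sketch, assuming a
table indexed by syscall number (the demo_* names are hypothetical and the
entries are plain strings rather than function pointers):

```c
#include <stddef.h>

#define DEMO_NR_SYSCALLS 382	/* highest syscall number (381) + 1 */

/* Table indexed by syscall number; unlisted slots are NULL. */
static const char *const demo_syscall_table[DEMO_NR_SYSCALLS] = {
	[379] = "finit_module",
	[380] = "sched_setattr",
	[381] = "sched_getattr",
};

/* Bounds-checked lookup, rejecting numbers at or above the count. */
static const char *demo_syscall_name(unsigned int nr)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return NULL;
	return demo_syscall_table[nr];
}
```

With the count left at 379, the three entries above would no longer fit
in the array and any bounds check against the count would reject them.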

What's your plan with the other patches? Will you submit them as a series,
or would you like the arch maintainers to pick them up?

Thanks.

-- 
Catalin


Re: [PATCH 05/15] arm64: add renameat2 syscall

2014-04-22 Thread Catalin Marinas
On Fri, Apr 11, 2014 at 11:25:41AM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi 
> 
> Signed-off-by: Miklos Szeredi 
> Cc: Catalin Marinas 

Acked-by: Catalin Marinas 


Re: [PATCH 04/15] arm64: __NR_compat_syscalls fix

2014-04-23 Thread Catalin Marinas
On Wed, Apr 23, 2014 at 09:46:40AM +0100, Miklos Szeredi wrote:
> On Tue, Apr 22, 2014 at 6:58 PM, Catalin Marinas
>  wrote:
> > What's your plan with the other patches? Will you submit them as a series,
> > or would you like the arch maintainers to pick them up?
> 
> Either is OK for me.  I'll collect ACKs for those which the maintainer
> doesn't want to submit, and I'll submit them in one batch.
> 
> But if you have patches anyway, please feel free to pick this up as well.

OK, I took the renameat2 wiring as well. Thanks.

-- 
Catalin


Re: [PATCH 6/8] tty/serial: pl011: add generic earlycon support

2014-04-23 Thread Catalin Marinas
On Wed, Apr 16, 2014 at 11:14:28PM +0100, Rob Herring wrote:
> On Mon, Mar 24, 2014 at 6:28 AM, Catalin Marinas
>  wrote:
> > On Fri, Mar 21, 2014 at 09:08:46PM +, Rob Herring wrote:
> >> From: Rob Herring 
> >>
> >> Add earlycon support for the pl011 serial port. This allows enabling
> >> the pl011 for console when early_params are processed. This is based
> >> on the arm64 earlyprintk support and is intended to replace it.
> >>
> >> Signed-off-by: Rob Herring 
> >> Cc: Russell King 
> >> Cc: Greg Kroah-Hartman 
> >> Cc: Jiri Slaby 
> >> ---
> >>  Documentation/kernel-parameters.txt |  5 +++--
> >>  drivers/tty/serial/amba-pl011.c | 30 +-
> >>  2 files changed, 32 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> >> index 5ce8b7a..81bdd52 100644
> >> --- a/Documentation/kernel-parameters.txt
> >> +++ b/Documentation/kernel-parameters.txt
> >> @@ -887,8 +887,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> >>   uart[8250],io,[,options]
> >>   uart[8250],mmio,[,options]
> >>   uart[8250],mmio32,[,options]
> >> - Start an early, polled-mode console on the 8250/16550
> >> - UART at the specified I/O port or MMIO address.
> >> + pl011,
> >> + Start an early, polled-mode console on a serial port
> >> + at the specified I/O port or MMIO address. 8250
> >>   MMIO inter-register address stride is either 8-bit
> >>   (mmio) or 32-bit (mmio32).
> >>   The options are the same as for ttyS, above.
> >
> > I think the last line is a bit misleading. Or did you intend to leave it
> > with the uart[8250] parameter? See below:
> 
> How about this (excuse the gmail lack of tabs):
> 
> earlycon= [KNL] Output early console device and options.
> 
>  uart[8250],io,[,options]
>  uart[8250],mmio,[,options]
>  uart[8250],mmio32,[,options]
>   Start an early, polled-mode console on an 8250 serial
>   port at the specified I/O port or MMIO address. 8250
>   MMIO inter-register address stride is either 8-bit
>   (mmio) or 32-bit (mmio32).
> 
>   The options are the same as for ttyS, above.
> 
>  pl011,
>   Start an early, polled-mode console on a pl011 serial
>   port at the specified address. The pl011 serial port
>   must already be setup and configured. Options are not
>   yet supported.
> 
>  smh Use ARM semihosting calls for early console.

It looks fine.

> >> diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
> >> index d4eda24..4227c0a 100644
> >> --- a/drivers/tty/serial/amba-pl011.c
> >> +++ b/drivers/tty/serial/amba-pl011.c
> > [...]
> >> +static int __init pl011_early_console_setup(struct earlycon_device *device,
> >> + const char *opt)
> >> +{
> >> + if (!device->port.membase)
> >> + return -ENODEV;
> >> +
> >> + device->con->write = pl011_early_write;
> >> + return 0;
> >> +}
> >> +EARLYCON_DECLARE(pl011, pl011_early_console_setup);
> >
> > Here we expect the PL011 to be already initialised by the boot loader
> > and the kernel continues using the same settings. So maybe clarify this
> > in the pl011 kernel parameter doc and we can add proper configuration
> > using a separate patch.
> 
> Enabling and setup would not be too hard, but either baud rate will
> always have to be configured or we'll have to specify the input clock
> rate too. The 8250 driver basically does the former or assumes a fixed
> clock.
> 
> Adding any setup will also break any non-pl011 based SBSA compliant
> uart since the configuration registers are not standardized. I guess
> we can add "sbsauart" when/if that happens.

I think for now we can assume that the pl011 is initialised at the
right baud rate prior to starting the kernel.

-- 
Catalin


Re: [PATCH] arm64: fixmap: fix missing sub-page offset for earlyprintk

2014-04-29 Thread Catalin Marinas
On Tue, Apr 29, 2014 at 12:51:18AM +0100, Rob Herring wrote:
> On Mon, Apr 28, 2014 at 1:50 PM, Marc Zyngier  wrote:
> > Commit d57c33c5daa4 (add generic fixmap.h) added (among other
> > similar things) set_fixmap_io to deal with early ioremap of devices.
> >
> > More recently, commit bf4b558eba92 (arm64: add early_ioremap support)
> > converted the arm64 earlyprintk to use set_fixmap_io. A side effect of
> > this conversion is that my virtual machines have stopped booting when
> > I pass "earlyprintk=uart8250-8bit,0x3f8" to the guest kernel.
> >
> > Turns out that the new earlyprintk code doesn't care at all about
> > sub-page offsets, and just assumes that the earlyprintk device will
> > be page-aligned. Obviously, that doesn't play well with the above example.
> >
> > Further investigation shows that set_fixmap_io uses __set_fixmap instead
> > of __set_fixmap_offset. A fix is to introduce a set_fixmap_offset_io that
> > uses the latter, and to remove the superfluous call to fix_to_virt
> > (which only returns the value that set_fixmap_io has already given us).
> >
> > With this applied, my VMs are back in business. Tested on a Cortex-A57
> > platform with kvmtool as platform emulation.
> >
> > Cc: Mark Salter 
> > Cc: Catalin Marinas 
> > Cc: Will Deacon 
> > Signed-off-by: Marc Zyngier 
> > ---
> >  arch/arm64/kernel/early_printk.c | 6 ++
> >  include/asm-generic/fixmap.h | 3 +++
> >  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> This will be fixed already in 3.16 with my earlycon series[1] if this
> is not taken for 3.15.

I'd like to take this for 3.15 as it's currently broken. I just need
another ack from Arnd on the generic header change.

-- 
Catalin


Re: [PATCH 1/3] arm64: adjust el0_sync so that a function can be called

2014-04-29 Thread Catalin Marinas
On Sun, Apr 27, 2014 at 08:44:12PM +0100, Larry Bassel wrote:
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 39ac630..eda7755 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
[...]
> @@ -421,28 +421,30 @@ el0_da:
>   /*
>* Data abort handling
>*/
> - mrs x0, far_el1
> - bic x0, x0, #(0xff << 56)
>   disable_step x1
>   isb
>   enable_dbg
>   // enable interrupts before calling the main handler
>   enable_irq
> + mrs x0, far_el1
> + bic x0, x0, #(0xff << 56)
>   mov x1, x25
>   mov x2, sp
> + adr lr, ret_from_exception
>   b   do_mem_abort

Reading far_el1 after enable_dbg and enable_irq is racy: we can no
longer guarantee its value in the original data abort context.
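
A deterministic toy model of the problem, for illustration only: a plain
variable stands in for FAR_EL1 and a function call stands in for any
exception that may be taken once debug exceptions and interrupts are
unmasked. All demo_* names are hypothetical.

```c
#include <stdint.h>

static uint64_t demo_far;	/* stand-in for FAR_EL1 */

/* Models an exception taken after unmasking; it overwrites the register. */
static void demo_nested_exception(uint64_t addr)
{
	demo_far = addr;
}

/* Original ordering: latch the fault address, then unmask. */
static uint64_t demo_read_then_unmask(uint64_t nested_addr)
{
	uint64_t far = demo_far & ~(0xffULL << 56);	/* strip the tag byte */

	demo_nested_exception(nested_addr);
	return far;
}

/* Patched ordering: unmask first, read the register afterwards. */
static uint64_t demo_unmask_then_read(uint64_t nested_addr)
{
	demo_nested_exception(nested_addr);
	return demo_far & ~(0xffULL << 56);
}
```

In the first ordering the handler sees the address of the original
abort; in the second it may see whatever the later exception left behind.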

>  el0_ia:
>   /*
>* Instruction abort handling
>*/
> - mrs x0, far_el1
>   disable_step x1
>   isb
>   enable_dbg
>   // enable interrupts before calling the main handler
>   enable_irq
> + mrs x0, far_el1
>   orr x1, x25, #1 << 24   // use reserved ISS bit for instruction aborts
>   mov x2, sp
> + adr lr, ret_from_exception
>   b   do_mem_abort
>  el0_fpsimd_acc:

Same here.

-- 
Catalin


Re: [PATCH 3/3] arm64: enable context tracking

2014-04-29 Thread Catalin Marinas
On Sun, Apr 27, 2014 at 08:44:14PM +0100, Larry Bassel wrote:
> Make calls to ct_user_enter when the kernel is exited
> and ct_user_exit when the kernel is entered (in el0_da,
> el0_ia, el0_svc, el0_irq).
> 
> These macros expand to function calls which will only work
> properly if el0_sync and related code has been rearranged
> (in a previous patch of this series).
> 
> The calls to ct_user_exit are made after hw debugging has been
> enabled (enable_dbg).
> 
> The call to ct_user_enter is made at the end of the kernel_exit
> macro.
> 
> Signed-off-by: Kevin Hilman 
> Signed-off-by: Larry Bassel 

You could actually merge this patch with 2/3.

-- 
Catalin


Re: [PATCH v2 00/10] arm64: UEFI support

2014-04-29 Thread Catalin Marinas
On Fri, Apr 25, 2014 at 05:09:04PM +0100, Leif Lindholm wrote:
> This set adds support for UEFI to the arm64 port - a stub loader, as
> well as runtime services support for efivars.
> 
> It depends on some core EFI patches currently in linux-next.

The patches look fine to me, they've been through several rounds of
review already. How do we propose these get merged as the series
contains both generic and arm64 patches? And there are dependencies
already in linux-next.

Are the EFI patches in -next pulled from some non-rebaseable branch?

-- 
Catalin


Re: [PATCH v2 0/7] Generic serial earlycon

2014-04-29 Thread Catalin Marinas
On Fri, Apr 18, 2014 at 11:19:53PM +0100, Rob Herring wrote:
> Rob Herring (7):
>   x86: move FIX_EARLYCON_MEM kconfig into x86
>   tty/serial: add generic serial earlycon
>   tty/serial: convert 8250 to generic earlycon
>   tty/serial: pl011: add generic earlycon support
>   tty/serial: add arm/arm64 semihosting earlycon
>   arm64: enable FIX_EARLYCON_MEM kconfig
>   arm64: remove arch specific earlyprintk

The series looks fine, you can add:

Acked-by: Catalin Marinas 

BTW, are you merging all of them via some other tree, or would you prefer
me to take the arm64-specific patches?


Re: [PATCH v2 00/10] arm64: UEFI support

2014-04-29 Thread Catalin Marinas
On Tue, Apr 29, 2014 at 12:43:56PM +0100, Matt Fleming wrote:
> (Pulling in Peter and Stephen)
> 
> On Tue, 29 Apr, at 11:28:17AM, Catalin Marinas wrote:
> > 
> > The patches look fine to me, they've been through several rounds of
> > review already. How do we propose these get merged as the series
> > contains both generic and arm64 patches? And there are dependencies
> > already in linux-next.
> > 
> > Are the EFI patches in -next pulled from some non-rebaseable branch?
> 
> Peter suggested a plan when he took the generic EFI stuff that's in tip
> (and hence currently in linux-next),
> 
>   It doesn't hurt to inform Stephen, although I think it will simply fall
>   out automatically since he uses git to merge and git will recognize the
>   graph.
> 
>   During the merge window, it means they should not push their patches
>   until Linus has accepted the precondition patches from the tip tree.
>   Since Ingo and I try to push most of the tip tree as early as possible
>   in the merge window, this is usually not a problem.
> 
> So we currently have the prerequisites in tip/x86/efi, and assuming that
> this 10-patch series gets merged into a single branch somewhere, things
> should work automatically for linux-next.
> 
> It may be prudent to negotiate a plan now for when the merge window
> opens because, as Peter mentions above, the stuff in tip/x86/efi needs
> to be merged by Linus first to avoid build breakage with the arm64
> stuff.

Waiting for tip/x86/efi to be merged first is not a problem. We
also need a stable base for testing the arm64 UEFI series, so I assume
this series can be based on tip/x86/efi (would such a branch be rebased
before hitting mainline?).

Given that Leif's series contains both generic efi and arm64 patches,
what's your preference for merging them? I'm happy to add my ack and
have them go via your tree (or the other way around).

Thanks.

-- 
Catalin


Re: [PATCH v4 2/7] arm64: Decouple page size from level of translation tables

2014-04-29 Thread Catalin Marinas
Jungseok,

On Tue, Apr 29, 2014 at 05:59:20AM +0100, Jungseok Lee wrote:
> +choice
> + prompt "Level of translation tables"
> + default ARM64_3_LEVELS if ARM64_4K_PAGES
> + default ARM64_2_LEVELS if ARM64_64K_PAGES
> + help
> +   Allows level of translation tables.
> +
> +config ARM64_2_LEVELS
> + bool "2 level"
> + depends on ARM64_64K_PAGES
> + help
> +   This feature enables 2 levels of translation tables.
> +
> +config ARM64_3_LEVELS
> + bool "3 level"
> + depends on ARM64_4K_PAGES
> + help
> +   This feature enables 3 levels of translation tables.
> +
> +endchoice

As I mentioned previously
(http://www.spinics.net/linux/lists/arm-kernel/msg319552.html), just
expose options for the VA space bits rather than the number of levels.
You can still keep the number of levels config options but not visible
in menuconfig (though I think you could also hide them in some header
and avoid config altogether). The VA bits config options can be:

VA_BITS_39 if 4K (3 levels)
VA_BITS_42 if 64K (2 levels)
VA_BITS_47 if 16K (3 levels)
VA_BITS_48 if 4K || 16K || 64K (4/4/3 levels depending on page size)

That's more meaningful to people configuring the kernel.
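
As a rough illustration of how the number of levels follows from the VA
bits and the page size in the list above, here is a minimal user-space
sketch. The helper name va_bits_to_levels() is hypothetical and not part
of the patch set; it only assumes 8-byte descriptors with one page per
table.

```c
#include <assert.h>

/*
 * Hypothetical helper (not kernel code): number of translation levels
 * needed to cover va_bits with a given page size.
 *
 * Each table occupies one page of 8-byte descriptors, so every level
 * resolves (page_shift - 3) bits; the low page_shift bits are the
 * offset within the page.
 */
static unsigned int va_bits_to_levels(unsigned int va_bits,
				      unsigned int page_shift)
{
	unsigned int bits_per_level;

	assert(page_shift > 3 && va_bits > page_shift);
	bits_per_level = page_shift - 3;

	/* Round up: a partially used top-level table still counts. */
	return (va_bits - page_shift + bits_per_level - 1) / bits_per_level;
}
```

With page_shift set to 12, 14 and 16 for 4K, 16K and 64K pages, this
gives the level counts listed above, including 4/4/3 for 48 bits.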

-- 
Catalin


Re: [PATCH v4 3/7] arm64: Introduce a kernel configuration option for VA_BITS

2014-04-29 Thread Catalin Marinas
On Tue, Apr 29, 2014 at 05:59:23AM +0100, Jungseok Lee wrote:
> +config ARM64_VA_BITS
> + int "Virtual address space size"
> + range 39 39 if ARM64_4K_PAGES && ARM64_3_LEVELS
> + range 42 42 if ARM64_64K_PAGES && ARM64_2_LEVELS
> + help
> +   This feature is determined by a combination of page size and
> +   level of translation tables.

OK, so you are doing the VA bits selection already. But see my other
email about only exposing this and hiding the number of levels
(though the number of levels can be mentioned in the help).

-- 
Catalin


Re: [PATCH v4 4/7] arm64: Add a description on 48-bit address space with 4KB pages

2014-04-29 Thread Catalin Marinas
On Tue, Apr 29, 2014 at 05:59:27AM +0100, Jungseok Lee wrote:
> --- a/Documentation/arm64/memory.txt
> +++ b/Documentation/arm64/memory.txt
> @@ -8,10 +8,11 @@ This document describes the virtual memory layout used by the AArch64
>  Linux kernel. The architecture allows up to 4 levels of translation
>  tables with a 4KB page size and up to 3 levels with a 64KB page size.
>  
> -AArch64 Linux uses 3 levels of translation tables with the 4KB page
> -configuration, allowing 39-bit (512GB) virtual addresses for both user
> -and kernel. With 64KB pages, only 2 levels of translation tables are
> -used but the memory layout is the same.
> +AArch64 Linux uses 3 levels and 4 levels of translation tables with
> +the 4KB page configuration, allowing 39-bit (512GB) and 48-bit (256TB)
> +virtual addresses, respectively, for both user and kernel. With 64KB
> +pages, only 2 levels of translation tables are used but the memory layout
> +is the same.

Any reason why we couldn't use 48-bit address space with 64K pages
(implying 3 levels)?

> -AArch64 Linux memory layout with 64KB pages:
> +AArch64 Linux memory layout with 4KB pages + 4 levels:
> +
> +StartEnd SizeUse
> +---
> +  256TB  user
> +
> + 7bfe~124TB  vmalloc

BTW, maybe as a separate patch we should change the "end" to be
exclusive. It becomes harder to modify (I've been through this a few
times already ;)) and even follow the changes.

-- 
Catalin


Re: [PATCH v2 00/10] arm64: UEFI support

2014-04-29 Thread Catalin Marinas
On Tue, Apr 29, 2014 at 03:47:13PM +0100, Matt Fleming wrote:
> On Tue, 29 Apr, at 02:47:28PM, Catalin Marinas wrote:
> > Given that Leif's series contains both generic efi and arm64 patches,
> > what's your preference for merging them? I'm happy to add my ack and
> > they go via your tree (or the other way around).
> 
> I'm happy either way, though if I take them through my tree (and
> subsequently through tip) you won't have to worry about the merge window
> rigmarole, which is a plus.
> 
> So, everyone happy for me to take these with Catalin's Acked-by?

Fine by me. Just in case I haven't stated it explicitly for this series:

Acked-by: Catalin Marinas 

Thanks.

-- 
Catalin


Re: [PATCH v4 6/7] arm64: mm: Implement 4 levels of translation tables

2014-04-29 Thread Catalin Marinas
On Tue, Apr 29, 2014 at 05:59:33AM +0100, Jungseok Lee wrote:
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 0fd5650..03ec424 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -37,8 +37,9 @@
> 
>  /*
>   * swapper_pg_dir is the virtual address of the initial page table. We place
> - * the page tables 3 * PAGE_SIZE below KERNEL_RAM_VADDR. The idmap_pg_dir has
> - * 2 pages and is placed below swapper_pg_dir.
> + * the page tables 3 * PAGE_SIZE (2 or 3 levels) or 4 * PAGE_SIZE (4 levels)
> + * below KERNEL_RAM_VADDR. The idmap_pg_dir has 2 pages (2 or 3 levels) or
> + * 3 pages (4 levels) and is placed below swapper_pg_dir.
>   */
>  #define KERNEL_RAM_VADDR   (PAGE_OFFSET + TEXT_OFFSET)
> 
> @@ -46,8 +47,13 @@
>  #error KERNEL_RAM_VADDR must start at 0xXXX8
>  #endif
> 
> +#ifdef CONFIG_ARM64_4_LEVELS
> +#define SWAPPER_DIR_SIZE   (4 * PAGE_SIZE)
> +#define IDMAP_DIR_SIZE (3 * PAGE_SIZE)
> +#else
>  #define SWAPPER_DIR_SIZE   (3 * PAGE_SIZE)
>  #define IDMAP_DIR_SIZE (2 * PAGE_SIZE)
> +#endif

Mark Rutland was doing some clean-up of this code to no longer place
swapper_pg_dir and idmap_pg_dir below the kernel image. I'm not sure
whether the patches ended up on the list yet (not a problem for now,
just a slight change for your patches).

-- 
Catalin


Re: [PATCH v4 4/7] arm64: Add a description on 48-bit address space with 4KB pages

2014-04-30 Thread Catalin Marinas
On Wed, Apr 30, 2014 at 07:41:40AM +0100, Jungseok Lee wrote:
> On Tuesday, April 29, 2014 11:48 PM, Catalin Marinas wrote:
> > On Tue, Apr 29, 2014 at 05:59:27AM +0100, Jungseok Lee wrote:
> > > --- a/Documentation/arm64/memory.txt
> > > +++ b/Documentation/arm64/memory.txt
> > > @@ -8,10 +8,11 @@ This document describes the virtual memory layout
> > > used by the AArch64  Linux kernel. The architecture allows up to 4
> > > levels of translation  tables with a 4KB page size and up to 3 levels 
> > > with a 64KB page size.
> > >
> > > -AArch64 Linux uses 3 levels of translation tables with the 4KB page
> > > -configuration, allowing 39-bit (512GB) virtual addresses for both
> > > user -and kernel. With 64KB pages, only 2 levels of translation tables
> > > are -used but the memory layout is the same.
> > > +AArch64 Linux uses 3 levels and 4 levels of translation tables with
> > > +the 4KB page configuration, allowing 39-bit (512GB) and 48-bit
> > > +(256TB) virtual addresses, respectively, for both user and kernel.
> > > +With 64KB pages, only 2 levels of translation tables are used but the
> > > +memory layout is the same.
> > 
> > Any reason why we couldn't use 48-bit address space with 64K pages 
> > (implying 3 levels)?
> 
> No technical reason.
> Since 64K+3levels is not implemented in this set, I didn't add it.
> 
> Should 64K+3levels be prepared in this patchset?
> 
> > > -AArch64 Linux memory layout with 64KB pages:
> > > +AArch64 Linux memory layout with 4KB pages + 4 levels:
> > > +
> > > +StartEnd SizeUse
> > > +---
> > > +  256TB  user
> > > +
> > > + 7bfe~124TB  vmalloc
> > 
> > BTW, maybe as a separate patch we should change the "end" to be exclusive. 
> > It becomes harder to modify
> > (I've been through this a few times already ;)) and even follow the changes.
> 
> Does "exclusive" mean that  is changed to 0001?
> Or Does it mean that "End" column is dropped?

Not dropped but changed to 0001 (the kernel already prints the
memory layout in a similar way).
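
A minimal sketch of the difference between the two conventions, assuming
hypothetical helper names and an illustrative 39-bit range (neither is
taken from memory.txt):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for comparing the two table conventions. */

/* Inclusive end: 'end' is the last valid address, so add one. */
static uint64_t size_inclusive(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start + 1;
}

/* Exclusive end: 'end' is the first address past the region. */
static uint64_t size_exclusive(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start;
}

/* With exclusive ends, adjacent regions share the boundary value. */
static int regions_adjacent(uint64_t prev_end, uint64_t next_start)
{
	return prev_end == next_start;
}
```

With exclusive ends the size is a plain subtraction and the end of one
row can be compared directly with the start of the next, which is what
makes the table easier to modify and to review.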

-- 
Catalin


Re: [PATCH v2 15/21] of/fdt: move memreserve and dtb memory reservations into core

2014-05-01 Thread Catalin Marinas
On Tue, Apr 22, 2014 at 08:18:15PM -0500, Rob Herring wrote:
> From: Rob Herring 
> 
> Move the /memreserve/ processing and dtb memory reservations into
> early_init_fdt_scan_reserved_mem. This converts arm, arm64, and powerpc
> as they are the only users of early_init_fdt_scan_reserved_mem.
> 
> memblock_reserve is safe to call on the same region twice, so the
> reservation check for the dtb in powerpc 32-bit reservations is safe to
> remove.
> 
> Signed-off-by: Rob Herring 
> Cc: Russell King 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> ---
> v2: No change
> 
>  arch/arm/include/asm/prom.h |  2 --
>  arch/arm/kernel/devtree.c   | 27 ---
>  arch/arm/mm/init.c  |  1 -
>  arch/arm64/mm/init.c| 21 -
>  arch/powerpc/kernel/prom.c  | 22 --
>  drivers/of/fdt.c| 16 
>  6 files changed, 16 insertions(+), 73 deletions(-)

For arm64:

Acked-by: Catalin Marinas 


Re: [BUG] kmemleak on __radix_tree_preload

2014-05-01 Thread Catalin Marinas
On Fri, Apr 25, 2014 at 10:45:40AM +0900, Jaegeuk Kim wrote:
> 2. Bug
>  This is one of the results, but all the results indicate
> __radix_tree_preload.
> 
> unreferenced object 0x88002ae2a238 (size 576):
> comm "fsstress", pid 25019, jiffies 4295651360 (age 2276.104s)
> hex dump (first 32 bytes):
> 01 00 00 00 81 ff ff ff 00 00 00 00 00 00 00 00  
> 40 7d 37 81 ff ff ff ff 50 a2 e2 2a 00 88 ff ff  @}7.P..*
> backtrace:
>  [] kmemleak_alloc+0x26/0x50
>  [] kmem_cache_alloc+0xdc/0x190
>  [] __radix_tree_preload+0x49/0xc0
>  [] radix_tree_maybe_preload+0x21/0x30
>  [] add_to_page_cache_lru+0x3c/0xc0
>  [] grab_cache_page_write_begin+0x98/0xf0
>  [] f2fs_write_begin+0xa1/0x370 [f2fs]
>  [] generic_perform_write+0xc7/0x1e0
>  [] __generic_file_aio_write+0x1d0/0x400
>  [] generic_file_aio_write+0x60/0xe0
>  [] do_sync_write+0x5a/0x90
>  [] vfs_write+0xc5/0x1f0
>  [] SyS_write+0x52/0xb0
>  [] system_call_fastpath+0x16/0x1b
>  [] 0x

Do all the backtraces look like the above (coming from
add_to_page_cache_lru)?

There were some changes in lib/radix-tree.c since 3.14, maybe you could
try reverting them and see if the leaks still appear (cc'ing Johannes).
It could also be a false positive.

An issue with debugging such cases is that the preloading is common for
multiple radix trees, so the actual radix_tree_node_alloc() could be on
a different path. You could give the patch below a try to see what
backtrace you get (it updates backtrace in radix_tree_node_alloc()).


diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt
index a7563ec4ea7b..b772418bf064 100644
--- a/Documentation/kmemleak.txt
+++ b/Documentation/kmemleak.txt
@@ -142,6 +142,7 @@ kmemleak_alloc_percpu - notify of a percpu memory block allocation
 kmemleak_free   - notify of a memory block freeing
 kmemleak_free_part  - notify of a partial memory block freeing
 kmemleak_free_percpu- notify of a percpu memory block freeing
+kmemleak_update_trace   - update object allocation stack trace
 kmemleak_not_leak   - mark an object as not a leak
 kmemleak_ignore - do not scan or report an object as leak
 kmemleak_scan_area  - add scan areas inside a memory block
diff --git a/include/linux/kmemleak.h b/include/linux/kmemleak.h
index 5bb424659c04..057e95971014 100644
--- a/include/linux/kmemleak.h
+++ b/include/linux/kmemleak.h
@@ -30,6 +30,7 @@ extern void kmemleak_alloc_percpu(const void __percpu *ptr, size_t size) __ref;
 extern void kmemleak_free(const void *ptr) __ref;
 extern void kmemleak_free_part(const void *ptr, size_t size) __ref;
 extern void kmemleak_free_percpu(const void __percpu *ptr) __ref;
+extern void kmemleak_update_trace(const void *ptr) __ref;
 extern void kmemleak_not_leak(const void *ptr) __ref;
 extern void kmemleak_ignore(const void *ptr) __ref;
 extern void kmemleak_scan_area(const void *ptr, size_t size, gfp_t gfp) __ref;
@@ -83,6 +84,9 @@ static inline void kmemleak_free_recursive(const void *ptr, unsigned long flags)
 static inline void kmemleak_free_percpu(const void __percpu *ptr)
 {
 }
+static inline void kmemleak_update_trace(const void *ptr)
+{
+}
 static inline void kmemleak_not_leak(const void *ptr)
 {
 }
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 9599aa72d7a0..5297f8e09096 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -200,6 +201,11 @@ radix_tree_node_alloc(struct radix_tree_root *root)
rtp->nodes[rtp->nr - 1] = NULL;
rtp->nr--;
}
+   /*
+* Update the allocation stack trace as this is more useful
+* for debugging.
+*/
+   kmemleak_update_trace(ret);
}
if (ret == NULL)
ret = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 3a36e2b16cba..61a64ed2fbef 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -990,6 +990,40 @@ void __ref kmemleak_free_percpu(const void __percpu *ptr)
 EXPORT_SYMBOL_GPL(kmemleak_free_percpu);
 
 /**
+ * kmemleak_update_trace - update object allocation stack trace
+ * @ptr:   pointer to beginning of the object
+ *
+ * Override the object allocation stack trace for cases where the actual
+ * allocation place is not always useful.
+ */
+void __ref kmemleak_update_trace(const void *ptr)
+{
+   struct kmemleak_object *object;
+   unsigned long flags;
+
+   pr_debug("%s(0x%p)\n", __func__, ptr);
+
+   if (!kmemleak_enabled || IS_ERR_OR_NULL(ptr))
+   return;
+
+   object = find_and_get_object((unsigned long)ptr, 1);
+   if (!object) {
+#ifdef DEBUG
+   kmemleak_warn("Updating stack trace for unknown object at %p\n",
+ ptr);
+#endif
+   return;
+   }
+
+   spin_lock_irqsave(&o

[PATCH 5/6] mm: Call kmemleak directly from memblock_(alloc|free)

2014-05-02 Thread Catalin Marinas
Kmemleak could ignore memory blocks allocated via memblock_alloc()
leading to false positives during scanning. This patch adds the
corresponding callbacks and removes kmemleak_free_* calls in
mm/nobootmem.c to avoid duplication. The kmemleak_alloc() in
mm/nobootmem.c is kept since __alloc_memory_core_early() does not use
memblock_alloc() directly.

Signed-off-by: Catalin Marinas 
Cc: Andrew Morton 
---
 mm/memblock.c  | 9 ++++++++-
 mm/nobootmem.c | 2 --
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index e9d6ca9a01a9..8813a31d7fbd 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -681,6 +681,7 @@ int __init_memblock memblock_free(phys_addr_t base, phys_addr_t size)
 (unsigned long long)base + size - 1,
 (void *)_RET_IP_);
 
+   kmemleak_free_part(__va(base), size);
return __memblock_remove(&memblock.reserved, base, size);
 }
 
@@ -985,8 +986,14 @@ static phys_addr_t __init memblock_alloc_base_nid(phys_addr_t size,
align = SMP_CACHE_BYTES;
 
found = memblock_find_in_range_node(size, align, 0, max_addr, nid);
-   if (found && !memblock_reserve(found, size))
+   if (found && !memblock_reserve(found, size)) {
+   /*
+* The min_count is set to 0 so that memblock allocations are
+* never reported as leaks.
+*/
+   kmemleak_alloc(__va(found), size, 0, 0);
return found;
+   }
 
return 0;
 }
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 04a9d94333a5..7ed58602e71b 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -197,7 +197,6 @@ unsigned long __init free_all_bootmem(void)
 void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
  unsigned long size)
 {
-   kmemleak_free_part(__va(physaddr), size);
memblock_free(physaddr, size);
 }
 
@@ -212,7 +211,6 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
  */
 void __init free_bootmem(unsigned long addr, unsigned long size)
 {
-   kmemleak_free_part(__va(addr), size);
memblock_free(addr, size);
 }
 


[PATCH 3/6] lib: Update the kmemleak stack trace for radix tree allocations

2014-05-02 Thread Catalin Marinas
Since radix_tree_preload() stack trace is not always useful for
debugging an actual radix tree memory leak, this patch updates the
kmemleak allocation stack trace in the radix_tree_node_alloc() function.

Signed-off-by: Catalin Marinas 
Cc: Andrew Morton 
Cc: Johannes Weiner 
---
 lib/radix-tree.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 9599aa72d7a0..5297f8e09096 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -200,6 +201,11 @@ radix_tree_node_alloc(struct radix_tree_root *root)
rtp->nodes[rtp->nr - 1] = NULL;
rtp->nr--;
}
+   /*
+* Update the allocation stack trace as this is more useful
+* for debugging.
+*/
+   kmemleak_update_trace(ret);
}
if (ret == NULL)
ret = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);


[PATCH 2/6] mm: Introduce kmemleak_update_trace()

2014-05-02 Thread Catalin Marinas
The memory allocation stack trace is not always useful for debugging
a memory leak (e.g. radix_tree_preload). This function, when called,
updates the stack trace for an already allocated object.

Signed-off-by: Catalin Marinas 
Cc: Andrew Morton 
Cc: Johannes Weiner 
---
 Documentation/kmemleak.txt |  1 +
 include/linux/kmemleak.h   |  4 ++++
 mm/kmemleak.c              | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/Documentation/kmemleak.txt b/Documentation/kmemleak.txt
index a7563ec4ea7b..b772418bf064 100644
--- a/Documentation/kmemleak.txt
+++ b/Documentation/kmemleak.txt
@@ -142,6 +142,7 @@ kmemleak_alloc_percpu - notify of a percpu memory block allocation
 kmemleak_free   - notify of a memory block freeing
 kmemleak_free_part  - notify of a partial memory block freeing
 kmemleak_free_percpu- notify of a percpu memory block freeing
+kmemleak_update_trace   - update object allocation stack trace
 kmemleak_not_leak   - mark an object as not a leak
 kmemleak_ignore - do not scan or report an object as leak
 kmemleak_scan_area  - add scan areas inside a memory block
diff --git a/include/linux/kmemleak.h b/include/linux/kmemleak.h
index 5bb424659c04..057e95971014 100644
--- a/include/linux/kmemleak.h
+++ b/include/linux/kmemleak.h
@@ -30,6 +30,7 @@ extern void kmemleak_alloc_percpu(const void __percpu *ptr, size_t size) __ref;
 extern void kmemleak_free(const void *ptr) __ref;
 extern void kmemleak_free_part(const void *ptr, size_t size) __ref;
 extern void kmemleak_free_percpu(const void __percpu *ptr) __ref;
+extern void kmemleak_update_trace(const void *ptr) __ref;
 extern void kmemleak_not_leak(const void *ptr) __ref;
 extern void kmemleak_ignore(const void *ptr) __ref;
 extern void kmemleak_scan_area(const void *ptr, size_t size, gfp_t gfp) __ref;
@@ -83,6 +84,9 @@ static inline void kmemleak_free_recursive(const void *ptr, unsigned long flags)
 static inline void kmemleak_free_percpu(const void __percpu *ptr)
 {
 }
+static inline void kmemleak_update_trace(const void *ptr)
+{
+}
 static inline void kmemleak_not_leak(const void *ptr)
 {
 }
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 3a36e2b16cba..61a64ed2fbef 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -990,6 +990,40 @@ void __ref kmemleak_free_percpu(const void __percpu *ptr)
 EXPORT_SYMBOL_GPL(kmemleak_free_percpu);
 
 /**
+ * kmemleak_update_trace - update object allocation stack trace
+ * @ptr:   pointer to beginning of the object
+ *
+ * Override the object allocation stack trace for cases where the actual
+ * allocation place is not always useful.
+ */
+void __ref kmemleak_update_trace(const void *ptr)
+{
+   struct kmemleak_object *object;
+   unsigned long flags;
+
+   pr_debug("%s(0x%p)\n", __func__, ptr);
+
+   if (!kmemleak_enabled || IS_ERR_OR_NULL(ptr))
+   return;
+
+   object = find_and_get_object((unsigned long)ptr, 1);
+   if (!object) {
+#ifdef DEBUG
+   kmemleak_warn("Updating stack trace for unknown object at %p\n",
+ ptr);
+#endif
+   return;
+   }
+
+   spin_lock_irqsave(&object->lock, flags);
+   object->trace_len = __save_stack_trace(object->trace);
+   spin_unlock_irqrestore(&object->lock, flags);
+
+   put_object(object);
+}
+EXPORT_SYMBOL(kmemleak_update_trace);
+
+/**
  * kmemleak_not_leak - mark an allocated object as false positive
  * @ptr:   pointer to beginning of the object
  *


[PATCH 0/6] Kmemleak updates

2014-05-02 Thread Catalin Marinas
Hi,

This series contains a few kmemleak updates:

- Avoid false positives caused by not tracking all memblock allocations
  and disabling the kmemleak early logging slightly earlier
- Debugging improvements for places where pre-allocation happens
  (mempool and radix tree)
- minor printk correction

Catalin Marinas (5):
  mm: Introduce kmemleak_update_trace()
  lib: Update the kmemleak stack trace for radix tree allocations
  mm: Update the kmemleak stack trace for mempool allocations
  mm: Call kmemleak directly from memblock_(alloc|free)
  mm: Postpone the disabling of kmemleak early logging

Jianpeng Ma (1):
  mm/kmemleak.c: Use %u to print ->checksum.

 Documentation/kmemleak.txt |  1 +
 include/linux/kmemleak.h   |  4 
 lib/radix-tree.c   |  6 ++
 mm/kmemleak.c  | 43 ++-
 mm/memblock.c  |  9 -
 mm/mempool.c   |  6 ++
 mm/nobootmem.c |  2 --
 7 files changed, 63 insertions(+), 8 deletions(-)



[PATCH 1/6] mm/kmemleak.c: Use %u to print ->checksum.

2014-05-02 Thread Catalin Marinas
From: Jianpeng Ma 

Signed-off-by: Jianpeng Ma 
Signed-off-by: Catalin Marinas 
Cc: Andrew Morton 
---
 mm/kmemleak.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 91d67eaee050..3a36e2b16cba 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -387,7 +387,7 @@ static void dump_object_info(struct kmemleak_object *object)
pr_notice("  min_count = %d\n", object->min_count);
pr_notice("  count = %d\n", object->count);
pr_notice("  flags = 0x%lx\n", object->flags);
-   pr_notice("  checksum = %d\n", object->checksum);
+   pr_notice("  checksum = %u\n", object->checksum);
pr_notice("  backtrace:\n");
print_stack_trace(&trace, 4);
 }


[PATCH 4/6] mm: Update the kmemleak stack trace for mempool allocations

2014-05-02 Thread Catalin Marinas
When mempool_alloc() returns an existing pool object, kmemleak_alloc()
is no longer called and the stack trace corresponds to the original
object allocation. This patch updates the kmemleak allocation stack
trace for such objects to make it more useful for debugging.

Signed-off-by: Catalin Marinas 
Cc: Andrew Morton 
---
 mm/mempool.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/mempool.c b/mm/mempool.c
index 905434f18c97..e7c4be024f1a 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -220,6 +221,11 @@ repeat_alloc:
spin_unlock_irqrestore(&pool->lock, flags);
/* paired with rmb in mempool_free(), read comment there */
smp_wmb();
+   /*
+* Update the allocation stack trace as this is more useful
+* for debugging.
+*/
+   kmemleak_update_trace(element);
return element;
}
 


[PATCH 6/6] mm: Postpone the disabling of kmemleak early logging

2014-05-02 Thread Catalin Marinas
Currently, kmemleak_early_log is disabled at the beginning of the
kmemleak_init() function, before the full kmemleak tracing is actually
enabled. In this small window, kmem_cache_create() is called by kmemleak,
which triggers additional memory allocations that are not traced. This
patch moves the kmemleak_early_log disabling further down so that it
happens at the same time as the full kmemleak enabling.

Signed-off-by: Catalin Marinas 
Cc: Andrew Morton 
---
 mm/kmemleak.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 61a64ed2fbef..0cd6aabd45a0 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1809,8 +1809,6 @@ void __init kmemleak_init(void)
int i;
unsigned long flags;
 
-   kmemleak_early_log = 0;
-
 #ifdef CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF
if (!kmemleak_skip_disable) {
kmemleak_disable();
@@ -1833,8 +1831,9 @@ void __init kmemleak_init(void)
if (kmemleak_error) {
local_irq_restore(flags);
return;
-   } else
-   kmemleak_enabled = 1;
+   }
+   kmemleak_early_log = 0;
+   kmemleak_enabled = 1;
local_irq_restore(flags);
 
/*


Re: [PATCH] arm64: fix pud_huge() for 2-level pagetables

2014-05-16 Thread Catalin Marinas
On Fri, May 16, 2014 at 04:54:11PM +0100, Mark Salter wrote:
> On Fri, 2014-05-16 at 11:04 +0100, Catalin Marinas wrote:
> > On Thu, May 15, 2014 at 03:19:22PM +0100, Mark Salter wrote:
> > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > > index 5e9aec3..9bed38f 100644
> > > --- a/arch/arm64/mm/hugetlbpage.c
> > > +++ b/arch/arm64/mm/hugetlbpage.c
> > > @@ -51,7 +51,11 @@ int pmd_huge(pmd_t pmd)
> > >  
> > >  int pud_huge(pud_t pud)
> > >  {
> > > +#ifndef __PAGETABLE_PMD_FOLDED
> > >   return !(pud_val(pud) & PUD_TABLE_BIT);
> > > +#else
> > > + return 0;
> > > +#endif
> > >  }
> > >  
> > >  int pmd_huge_support(void)
> > > @@ -64,8 +68,10 @@ static __init int setup_hugepagesz(char *opt)
> > >   unsigned long ps = memparse(opt, &opt);
> > >   if (ps == PMD_SIZE) {
> > >   hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
> > > +#ifndef __PAGETABLE_PMD_FOLDED
> > >   } else if (ps == PUD_SIZE) {
> > >   hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
> > > +#endif
> > 
> > Since PMD_SIZE == PUD_SIZE when __PAGETABLE_PMD_FOLDED, do we need the
> > #ifndef here? Maybe the compiler is smart enough to remove it but it's
> > not on a critical path anyway, so I wouldn't bother.
> 
> Yes, I think it would remove it. In any case, one less ifdef would be
> a good thing.

I merged this patch and dropped the last #ifndef.
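
To illustrate why that #ifndef is redundant, here is a minimal user-space
sketch of the size check in setup_hugepagesz(). The shift values are
illustrative (64KB pages, 2 levels) and hugepagesz_branch() is a
hypothetical stand-in that only reports which branch matched:

```c
#include <assert.h>

/* Illustrative values for 64KB pages with 2 levels; not from the patch. */
#define PAGE_SHIFT	16
#define PMD_SHIFT	29
#define PUD_SHIFT	PMD_SHIFT	/* pmd folded: both shifts are equal */
#define PMD_SIZE	(1UL << PMD_SHIFT)
#define PUD_SIZE	(1UL << PUD_SHIFT)

/* Returns which branch matched: 1 = PMD, 2 = PUD, 0 = unsupported. */
static int hugepagesz_branch(unsigned long ps)
{
	if (ps == PMD_SIZE)
		return 1;
	else if (ps == PUD_SIZE)
		return 2;	/* never reached while PUD_SIZE == PMD_SIZE */
	return 0;
}
```

Since the first comparison already catches the only possible value, the
second branch is dead code in the folded case whether or not it is
compiled out.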

I still have doubts about the kvm code calling put_page more than
necessary, especially since pud == pmd and the loop continues after
pud_huge() returns true, but your patch looks harmless.

Unless Steve has any objection, I'll push it to mainline. I also added
Cc: stable # v3.11+

Thanks.

-- 
Catalin


Re: [PATCH v3] arm64: Support arch_irq_work_raise() via self IPIs

2014-05-16 Thread Catalin Marinas
On Mon, May 12, 2014 at 04:48:51PM +0100, Larry Bassel wrote:
> Support for arch_irq_work_raise() was missing from
> arm64 (a prerequisite for FULL_NOHZ).
> 
> This patch is based on the arm32 patches ARM 7872/1
> and 7887/1.

Applied (with some clean-up of the commit log). Thanks.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT PULL] arm64 fixes for 3.15

2014-05-16 Thread Catalin Marinas
Hi Linus,

Please pull the arm64 fixes below. Thanks.


The following changes since commit 89ca3b881987f5a4be4c5dbaa7f0df12bbdde2fd:

  Linux 3.15-rc4 (2014-05-04 18:14:42 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux tags/arm64-fixes

for you to fetch changes up to 4797ec2dc83a43be35bad56037d1b53db9e2b5d5:

  arm64: fix pud_huge() for 2-level pagetables (2014-05-16 17:34:40 +0100)


- arm64 migrate_irqs() fix following commit ffde1de64012 (irqchip: Gic:
  Support forced affinity setting)
- fix arm64 pud_huge() to return 0 when only 2 levels of page tables are
  used (__PAGETABLE_PMD_FOLDED defined and pmd_huge already covers block
  entries at the first level), otherwise KVM gets confused


Mark Salter (1):
  arm64: fix pud_huge() for 2-level pagetables

Sudeep Holla (1):
  arm64: use cpu_online_mask when using forced irq_set_affinity

 arch/arm64/kernel/irq.c | 10 +++---
 arch/arm64/mm/hugetlbpage.c |  4 
 2 files changed, 11 insertions(+), 3 deletions(-)

-- 
Catalin


Re: [PATCH] ARM: remove ARM710 specific assembler code

2014-05-17 Thread Catalin Marinas
On Fri, May 16, 2014 at 01:55:46PM +0100, Russell King - ARM Linux wrote:
> There was a CPU called the ARM710, it was ARMv3 and it had no Thumb support.
> 
> There is also a CPU called the ARM710T, which is ARMv4 and has Thumb support.
> 
> These are two completely different CPUs, the former was removed along with
> the removal of ARMv3 support.  The latter remains because we still support
> ARMv4.

BTW, while clearly this patch was removing code for the wrong reasons, I
think we should set a longer term timeline for getting rid of some of
the old features. Let's say in 10 years' time we remove everything ARMv4,
another 10 years ARMv5 and so on. We could make these milestones shorter
but it really depends on what people use; we should not force them out
of the kernel if still in use.

We can start with ARM core and SoC code that we suspect people haven't
used in a while (or at least not with mainline). We should not remove
them straight away but give some advanced warning. My proposal is to add
a CONFIG_DEPRECATED option and update it with a 2 year cadence. Code
that we want to remove will depend on DEPRECATED and will explicitly not
be covered by defconfig. This way we can get interested parties sending
patches to remove the DEPRECATED dependency. Something like this:

config DEPRECATED
bool "Enable deprecated kernel features"
help
  Kernel features no longer in use are marked as DEPRECATED for
  two years and removed at the end of this period. This option
  should only be enabled explicitly and must not be included in
  defconfig files. If you think a DEPRECATED kernel feature is
  still needed, please contact the corresponding maintainers to
  remove the DEPRECATED dependency.

  The next scheduled DEPRECATED code removal is planned for 2016.

It would be even better if we made such an option apply across the whole
kernel, especially since we have significant ARM SoC code in drivers.
Otherwise calling it ARM_DEPRECATED would work as well.

-- 
Catalin


Re: [PATCH] ARM: remove ARM710 specific assembler code

2014-05-17 Thread Catalin Marinas
On 17 May 2014, at 10:46, Russell King - ARM Linux wrote:
> On Sat, May 17, 2014 at 10:23:37AM +0100, Catalin Marinas wrote:
>> On Fri, May 16, 2014 at 01:55:46PM +0100, Russell King - ARM Linux wrote:
>>> There was a CPU called the ARM710, it was ARMv3 and it had no Thumb support.
>>> 
>>> There is also a CPU called the ARM710T, which is ARMv4 and has Thumb 
>>> support.
>>> 
>>> These are two completely different CPUs, the former was removed along with
>>> the removal of ARMv3 support.  The latter remains because we still support
>>> ARMv4.
>> 
>> BTW, while clearly this patch was removing code for the wrong reasons, I
>> think we should set a longer term timeline for getting rid of some of
>> old features. Let's say in 10 years time we remove everything ARMv4,
>> another 10 years ARMv5 and so on. We could make these milestones shorter
>> but it really depends on what people use, we should not force them out
>> of the kernel if still in use.
> 
> I still use StrongARM based machines here, and I don't see that changing
> unless some suitably designed ARM boards come my way which (a) offer the
> same features and (b) out perform it.
> 
> The problem is that there's lots of ARM boards which satisfy (b) - the
> iMX6 stuff clearly does - but hardly anything which satisfies (a).
> 
> There's also been some recent effort with SA1100 SoC code, so there's
> also other interest there still.
> 
> So, ARMv4 is still very much in use with modern kernels.

That’s why I said maybe aim for removing it in 10 years. But if code is
still in use by that time, we keep it.

> The difference between what you're proposing and what happened to ARMv3
> is that ARMv3 was broken for quite some time (we read from some of the
> CP15 registers which are read-only in ARMv3) and no one ever raised a
> problem with that.  So, after a sufficient period of time, it got removed
> - and no one batted an eyelid.  That's the correct way to do it - allow
> code to age, and if no one notices it's been broken, then it can be
> removed.

I’m more for pro-actively “breaking” it with a DEPRECATED
dependency. For example, if you suspect that some code like ARM710T is
no longer in use, we mark it and see if anyone complains about this over
a two-year period. If not, it gets removed.

Waiting for code to get broken is another way but it’s less
predictable.

Catalin


Re: [PATCH] ARM: remove ARM710 specific assembler code

2014-05-19 Thread Catalin Marinas
On Sat, May 17, 2014 at 11:26:38AM +0100, Russell King - ARM Linux wrote:
> On Sat, May 17, 2014 at 10:56:02AM +0100, Catalin Marinas wrote:
> > > The difference between what you're proposing and what happened to ARMv3
> > > is that ARMv3 was broken for quite some time (we read from some of the
> > > CP15 registers which are read-only in ARMv3) and no one ever raised a
> > > problem with that.  So, after a sufficient period of time, it got removed
> > > - and no one batted an eyelid.  That's the correct way to do it - allow
> > > code to age, and if no one notices it's been broken, then it can be
> > > removed.
> > 
> > I’m more for pro-actively “breaking” it with a DEPRECATED
> > dependency. For example, if you suspect that some code like ARM710T is
> > no longer in use, we mark it and see if anyone complains about this over
> > a two years period. If not, it gets removed.
> > 
> > Waiting for code to get broken is another way but it’s less
> > predictable.
> 
> When code being used gets broken, it's nice to think that we'll get
> bug reports which will tell us if it's still being used.

But if you don't get any reports, you can't really tell whether it's broken
(just because it compiles doesn't mean it still works). So we end up
keeping code in the kernel for much longer than necessary.

> The problem
> with DEPRECATED is that it will get lost amongst all the thousands
> of other config options and won't be noticed.  Just like EXPERIMENTAL
> or any of the other similar options we've had.

DEPRECATED is meant for documenting the planned removal. If people
complain afterwards, it's their problem for not reporting it earlier.

But we can make it even clearer by adding "depends on n" for DEPRECATED
or just making it not user selectable, so that people would need to
actively change the Kconfig source. They can't complain they haven't
noticed.

-- 
Catalin


Re: [PATCH 1/1] mm/kmemleak-test.c: use pr_fmt for logging

2014-05-20 Thread Catalin Marinas
On Mon, May 19, 2014 at 08:25:13PM +0100, Fabian Frederick wrote:
> 
> Cc: Catalin Marinas 
> Cc: Andrew Morton 
> Signed-off-by: Fabian Frederick 
> ---
>  mm/kmemleak-test.c | 36 +++-
>  1 file changed, 19 insertions(+), 17 deletions(-)

Acked-by: Catalin Marinas 


Re: [PATCH] ARM: remove ARM710 specific assembler code

2014-05-20 Thread Catalin Marinas
On Tue, May 20, 2014 at 04:48:19PM +0100, One Thousand Gnomes wrote:
> > old features. Let's say in 10 years time we remove everything ARMv4,
> > another 10 years ARMv5 and so on. We could make these milestones shorter
> > but it really depends on what people use, we should not force them out
> > of the kernel if still in use.
> 
> Why do you care ?

For now, just some negative diffstats in the arm tree ;). Longer term,
supporting only ARMv6+ could bring further core arm code clean-up but
that's not planned for another ~20 years.

> Worry about it at the point nobody can remember needing
> the support, or when it creates some horrible situation that is painful
> to keep supporting and nobody seems to care. We've only recently dropped
> 80386 support, and we still support MC68000.

It's not necessarily how far back we go but rather the wide variation in
ARM processor implementations, especially for the older architecture
versions, and not knowing whether the code still works since no-one
seems to remember having such hardware.

> Or just wait until 2038 approaches and the 32bit panic stations begins,
> then clean out 8)

This kind of matches my proposal to remove ARMv5 support in 20 years
(though we still keep ARMv6 and ARMv7 with 32-bit).

Anyway, the feedback seems to be that we keep them around until they can
no longer be supported (which probably means compiled).

-- 
Catalin


Re: [GIT PULL] EFI changes for arm64

2014-05-20 Thread Catalin Marinas
On Mon, May 19, 2014 at 11:50:42AM +0100, Matt Fleming wrote:
> On Wed, 30 Apr, at 09:03:32PM, Matt Fleming wrote:
> > I pulled the arm64 EFI changes into the following topic branch. Catalin
> > was happy for this to go through tip, which definitely makes things
> > easier for the next merge window because of the dependency these patches
> > have on tip/x86/efi.
> > 
> > The following changes since commit e33655a386ed3b26ad36fb97a47ebb1c2ca1e928:
> > 
> > >   efivars: Add compatibility code for compat tasks (2014-04-17 13:53:53 +0100)
> > 
> > are available in the git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git arm64-efi
> > 
> > for you to fetch changes up to 345c736edd07b657a8c48190baed2719b85d0938:
> > 
> > >   efi/arm64: ignore dtb= when UEFI SecureBoot is enabled (2014-04-30 19:57:06 +0100)
> 
> Ping?

Alternatively, I can merge the arm64-efi tree and wait for tip x86/efi
to go in before sending my pull request. Peter, Ingo, do you have any
preference?

Either way, I'd like to see this branch in -next to make sure there are
no (significant) conflicts before the merge window.

Thanks.

-- 
Catalin


Re: [PATCH 6/7] acpi, apei, ghes: Make unmapping functionality independent from architecture.

2014-05-14 Thread Catalin Marinas
On Wed, May 14, 2014 at 01:35:42PM +0100, Will Deacon wrote:
> On Wed, May 14, 2014 at 01:32:27PM +0100, Tomasz Nowicki wrote:
> > On 13.05.2014 22:11, Borislav Petkov wrote:
> > > On Wed, Apr 09, 2014 at 05:14:34PM +0200, Tomasz Nowicki wrote:
> > >> Till now __flush_tlb_one was used for unmapping virtual memory which
> > >> is x86 specific function. Replace it with more generic
> > >> flush_tlb_kernel_range.
> > >>
> > >> Signed-off-by: Tomasz Nowicki 
> > >> ---
> > >>   drivers/acpi/apei/ghes.c |4 ++--
> > >>   1 file changed, 2 insertions(+), 2 deletions(-)
> > >>
> > >> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > >> index aaf8db3..624878b 100644
> > >> --- a/drivers/acpi/apei/ghes.c
> > >> +++ b/drivers/acpi/apei/ghes.c
> > >> @@ -185,7 +185,7 @@ static void ghes_iounmap_nmi(void __iomem *vaddr_ptr)
> > >>
> > >>  BUG_ON(vaddr != (unsigned long)GHES_IOREMAP_NMI_PAGE(base));
> > >>  unmap_kernel_range_noflush(vaddr, PAGE_SIZE);
> > >> -__flush_tlb_one(vaddr);
> > >> +flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);
> > >>   }
> > >>
> > >>   static void ghes_iounmap_irq(void __iomem *vaddr_ptr)
> > >> @@ -195,7 +195,7 @@ static void ghes_iounmap_irq(void __iomem *vaddr_ptr)
> > >>
> > >>  BUG_ON(vaddr != (unsigned long)GHES_IOREMAP_IRQ_PAGE(base));
> > >>  unmap_kernel_range_noflush(vaddr, PAGE_SIZE);
> > >> -__flush_tlb_one(vaddr);
> > >> +flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);
> > >
> > > flush_tlb_kernel_range() does send an IPI to every core on x86 which is
> > > much more expensive than what __flush_tlb_one does.
> > >
> > > Fairer it would be if you added a __flush_tlb_one() version for arm
> > > which does flush_tlb_kernel_range for you.
> > >
> > 
> > Thanks for comment. I am not sure if maintainers will allow me to add 
> > sth like __flush_tlb_one() for arm/arm64. Let me ask them directly. 
> > Catalin, Russell what do you think?
> 
> I don't have the background for this, but if you don't need broadcasting
> (if this avoids IPIs on x86, I guess you don't) then why not use
> local_flush_tlb_kernel_range instead?

Is this generic enough (we don't have it on arm64)?

-- 
Catalin


Re: [PATCH 01/27] arm64: Override defaults from generic/tlb.h

2014-05-15 Thread Catalin Marinas
On Wed, May 14, 2014 at 07:59:33PM +0100, Richard Weinberger wrote:
> --- a/arch/arm64/include/asm/tlb.h
> +++ b/arch/arm64/include/asm/tlb.h
> @@ -19,7 +19,14 @@
>  #ifndef __ASM_TLB_H
>  #define __ASM_TLB_H
>  
> +/* These defines are needed to override the defaults from asm-generic/tlb.h */
>  #define  __tlb_remove_pmd_tlb_entry __tlb_remove_pmd_tlb_entry
> +#define tlb_start_vma tlb_start_vma
> +#define tlb_end_vma tlb_end_vma
> +#define __tlb_remove_tlb_entry __tlb_remove_tlb_entry
> +#define tlb_flush tlb_flush
> +#define __pte_free_tlb __pte_free_tlb
> +#define __pmd_free_tlb __pmd_free_tlb
>  
>  #include 

Acked-by: Catalin Marinas 

I hope subsequent series would consolidate some more of the above (for
example tlb_start_vma() is not really arm64 specific).


Re: [RFC][PATCH 2/2] ARM: ioremap: Add IO mapping space reused support.

2014-05-15 Thread Catalin Marinas
On Tue, May 13, 2014 at 09:45:08AM +0800, Richard Lee wrote:
> > On Mon, May 12, 2014 at 3:51 PM, Arnd Bergmann  wrote:
> > On Monday 12 May 2014 10:19:55 Richard Lee wrote:
> >> For the IO mapping, for the same physical address space maybe
> >> mapped more than one time, for example, in some SoCs:
> >> 0x2000 ~ 0x20001000: are global control IO physical map,
> >> and this range space will be used by many drivers.
> >> And then if each driver will do the same ioremap operation, we
> >> will waste to much malloc virtual spaces.
> >>
> >> This patch add IO mapping space reused support.
> >>
> >> Signed-off-by: Richard Lee 
> >
> > What happens if the first driver then unmaps the area?
> 
> If the first driver will unmap the area, it shouldn't do any thing
> except decreasing the 'used' counter.

It's still racy. What if the first driver manages to decrement the used
counter and unmaps the region but doesn't yet free the vm_struct, while
another driver finds the vm_struct, increments the used count and
assumes it can use it?
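
One common way to close that window is to refuse new references once the
counter has reached zero. The following is only a user-space sketch with
C11 atomics, not the patch's code: struct shared_map, map_get_unless_zero()
and map_put() are hypothetical names, and the lookup of the record itself
would still need its own protection (in the kernel, typically a lock or
RCU around the list walk).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a shared mapping record; not the patch's vm_struct. */
struct shared_map {
    atomic_int used;    /* number of active users; 0 means teardown has begun */
};

/* Take a reference only if the mapping is still live. */
static bool map_get_unless_zero(struct shared_map *m)
{
    int old = atomic_load(&m->used);

    while (old > 0) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&m->used, &old, old + 1))
            return true;
    }
    return false;       /* counter already hit zero: create a fresh mapping instead */
}

/* Drop a reference; returns true only for the caller that must unmap and free. */
static bool map_put(struct shared_map *m)
{
    int old = atomic_fetch_sub(&m->used, 1);

    assert(old > 0);    /* an underflow would mean an unbalanced put */
    return old == 1;
}
```

With this shape, a second driver that finds the record after the last put
gets "false" back and cannot assume the mapping is usable.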

BTW, vm_area_is_aready_to_free() name implies a query but it has
side-effects like decrementing the counter.

-- 
Catalin


Re: [PATCH] arm64: fix pud_huge() for 2-level pagetables

2014-05-16 Thread Catalin Marinas
On Thu, May 15, 2014 at 07:39:17PM +0100, Mark Salter wrote:
> On Thu, 2014-05-15 at 18:55 +0100, Steve Capper wrote:
> > On 15 May 2014 17:27, Mark Salter  wrote:
> > > On Thu, 2014-05-15 at 15:44 +0100, Steve Capper wrote:
> > >> On Thu, May 15, 2014 at 10:19:22AM -0400, Mark Salter wrote:
> > >> > In arch/arm/kvm/mmu.c:unmap_range(), we end up doing an extra 
> > >> > put_page()
> > >> > on the stage2 pgd which leads to the BUG in put_page_testzero(). This
> > >> > happens because a pud_huge() test in unmap_range() returns true when it
> > >> > should always be false with 2-level pages tables used by 64k pages.
> > >> > This patch removes support for huge puds if 2-level pagetables are
> > >> > being used.
[...]
> > Yeah I agree for 64K granule it doesn't make sense to have a huge_pud.
> > The patch looks sound now, but checking for a folded pmd may run into
> > problems if/when we get to 3-levels and 64K pages in future.
> > 
> > Perhaps checking for PAGE_SHIFT==12 (or something similar) would be a
> > bit more robust?
> 
> I don't think testing based on granule size is generally correct either.
> Maybe support for 3-level page tables with 64k granule gets added as an
> option. That would break the pagesize based test. With a folded pmd, we
> know there is no pud, so pud_huge() should always be false.

I agree, pud_huge() should be false in the same way we define
pud_present() to be 1 when __PAGETABLE_PMD_FOLDED. The *_huge() macros
aren't covered by the generic headers unfortunately (some clean-up would
be useful at some point but for now this patch is fine).
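
As a rough user-space illustration of the two behaviours (not the kernel
code itself): pud_t, pud_val() and PUD_TABLE_BIT below are local stand-ins,
and the assumption is that bit 1 of an entry distinguishes a table (set)
from a block (clear).

```c
#include <stdint.h>

/* Local stand-ins for the kernel types and macros; values are assumptions. */
typedef struct { uint64_t pud; } pud_t;
#define pud_val(x)      ((x).pud)
#define PUD_TABLE_BIT   (UINT64_C(1) << 1)  /* set: next-level table, clear: block */

/* With a real pud level, a block entry at that level is a huge mapping. */
static int pud_huge_3level(pud_t pud)
{
    return !(pud_val(pud) & PUD_TABLE_BIT);
}

/* With the pmd folded there is no separate pud level, so never report huge. */
static int pud_huge_folded(pud_t pud)
{
    (void)pud;
    return 0;
}
```

The same block-type entry is reported as huge by the first variant and
never by the second, which is the behaviour the patch gives 2-level
configurations.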

-- 
Catalin


Re: [PATCH] arm64: fix pud_huge() for 2-level pagetables

2014-05-16 Thread Catalin Marinas
On Thu, May 15, 2014 at 03:19:22PM +0100, Mark Salter wrote:
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 5e9aec3..9bed38f 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -51,7 +51,11 @@ int pmd_huge(pmd_t pmd)
>  
>  int pud_huge(pud_t pud)
>  {
> +#ifndef __PAGETABLE_PMD_FOLDED
>   return !(pud_val(pud) & PUD_TABLE_BIT);
> +#else
> + return 0;
> +#endif
>  }
>  
>  int pmd_huge_support(void)
> @@ -64,8 +68,10 @@ static __init int setup_hugepagesz(char *opt)
>   unsigned long ps = memparse(opt, &opt);
>   if (ps == PMD_SIZE) {
>   hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
> +#ifndef __PAGETABLE_PMD_FOLDED
>   } else if (ps == PUD_SIZE) {
>   hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
> +#endif

Since PMD_SIZE == PUD_SIZE when __PAGETABLE_PMD_FOLDED, do we need the
#ifndef here? Maybe the compiler is smart enough to remove it but it's
not on a critical path anyway, so I wouldn't bother.
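
To make the point concrete, here is a small user-space sketch. The shift
values are assumptions for a 64K-page, 2-level layout, and
classify_hugepagesz() is a hypothetical stand-in for the branch structure
in setup_hugepagesz():

```c
#include <assert.h>

/* Assumed values for 64K pages with two levels; not taken from the patch. */
#define PAGE_SHIFT   16
#define PGDIR_SHIFT  29
#define PUD_SHIFT    PGDIR_SHIFT    /* pud folded into pgd */
#define PMD_SHIFT    PUD_SHIFT      /* pmd folded into pud */
#define PMD_SIZE     (1UL << PMD_SHIFT)
#define PUD_SIZE     (1UL << PUD_SHIFT)

static_assert(PMD_SIZE == PUD_SIZE, "folded levels share one size");

enum hp_branch { HP_NONE, HP_PMD, HP_PUD };

/* Mirrors the if/else-if chain: which branch would handle a given size? */
static enum hp_branch classify_hugepagesz(unsigned long ps)
{
    if (ps == PMD_SIZE)
        return HP_PMD;
    else if (ps == PUD_SIZE)    /* never reached when the sizes are equal */
        return HP_PUD;
    return HP_NONE;
}
```

Since the first comparison already matches, the PUD branch is dead code in
the folded case whether or not it is wrapped in #ifndef.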

-- 
Catalin


Re: [PATCH 00/28] nios2 Linux kernel port

2014-04-24 Thread Catalin Marinas
On Thu, Apr 24, 2014 at 09:55:25AM +0100, Chung-Lin Tang wrote:
> On 2014/4/24 02:26 PM, Chung-Lin Tang wrote:
> > On 2014/4/24 02:15 AM, Pinski, Andrew wrote:
> >>
>  On Apr 23, 2014, at 10:59 AM, "Chung-Lin Tang"  
>  wrote:
> 
> >> On 2014/4/22 07:20 PM, Ley Foon Tan wrote:
> >> On Tue, Apr 22, 2014 at 6:56 PM, Arnd Bergmann  wrote:
> >> On Tuesday 22 April 2014 18:37:11 Ley Foon Tan wrote:
> >> Hi Arnd and Peter Anvin,
> >>
> >> Other than 64-bit time_t, clock_t and suseconds_t, can you 
> >> confirm
> >> that we don't need to have 64 bit off_t? See detail in link 
> >> below.
> >> I can submit the patches for 64-bit time changes
> >> (include/asm-generic/posix_types.h and other archs) if 
> >> everyone is
> >> agreed on this.
> >>
> >> Yes.
> >> Okay, will doing that.
> 
>  I believe that arm64 ILP32 will also be affected. What is the status of
>  this configuration? Has the glibc/kernel ABI been finalized?
> >> Not yet.  I am still working out the signal handling part. But we
> >> already agreed on 64bit time_t, clock_t, and suseconds_t.  And we
> >> agreed to a 64bit offset_t too. 
> >>
> >> On a related note suseconds in the timespec in posix is defined to
> >> be long. So it would nice if the kernel ignores the upper 32bits so
> >> we (glibc developers) can fix this for new targets including x32
> >> and arm64/ilp32. 
> > 
> > Hmm, but that means for purely 32-bit architectures like nios2, which
> > unlike x86_64 or arm64, never has a 64-bit mode, suseconds_t as a 64-bit
> > type in the kernel is simply wasted.
> 
> The more I think of this, the more I feel that suseconds_t should just
> be 'long', not strictly 64-bitified. An ILP32 sub-mode in a 64-bit
> kernel should be using compat_* code paths, something like a
> COMPAT_USE_32BIT_SUSECONDS case.

ILP32 mode should use LP64 syscalls as much as possible and that's the
aim with arm64 as well (of course, we still have a few that wouldn't be
possible and we route them via compat).

But here if time_t is 64-bit while suseconds_t is 32-bit, the compat
code wouldn't help.
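
A sketch of why, using fixed-width stand-ins rather than the real kernel
or libc types (all three struct names are hypothetical, and little-endian
is assumed): the mixed layout matches neither the native LP64 structure
nor the traditional 32-bit compat one, so the only cheap option is for the
kernel to read the 64-bit slot and ignore its upper half.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts, for illustration only. */
struct ilp32_timeval  { int64_t tv_sec; int32_t tv_usec; };  /* 64-bit time_t, 32-bit suseconds_t */
struct lp64_timeval   { int64_t tv_sec; int64_t tv_usec; };  /* native 64-bit */
struct compat_timeval { int32_t tv_sec; int32_t tv_usec; };  /* classic 32-bit compat */

/* tv_usec starts at the same offset as in the native layout... */
static_assert(offsetof(struct ilp32_timeval, tv_usec) ==
              offsetof(struct lp64_timeval, tv_usec), "same tv_usec offset");
/* ...but the structure as a whole is not the compat one. */
static_assert(sizeof(struct ilp32_timeval) != sizeof(struct compat_timeval),
              "mixed layout differs from compat");

/*
 * Recover tv_usec from the 64-bit slot a native syscall would read, keeping
 * only the low 32 bits (the upper half may hold uninitialised padding).
 */
static int64_t usec_ignore_upper(uint64_t slot)
{
    uint32_t low = (uint32_t)slot;
    int32_t usec;

    memcpy(&usec, &low, sizeof(usec));  /* reinterpret as signed without UB */
    return usec;
}
```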

-- 
Catalin


Re: [PATCH] arm64: add OProfile support

2014-04-26 Thread Catalin Marinas
On 26 Apr 2014, at 09:38, Ding Tianhong  wrote:
> Add OProfile support for arm64,  using the perf backend, and falling back
> to generic timer based sampling if PMU interrupt is not supported.
> 
> I have tested this patch on Cortex-A53 and Cortex-A57 motherboard, the OProfile
> could work well by PMU irq or arch timer irq.

This came up before a few times and we also had an implementation but
decided not to merge it. We should rather get the user space oprofile to
use the perf kernel API.

That’s an old thread, it may have even made it into mainline oprofile
but I haven’t followed the development:

http://marc.info/?l=oprofile-list&m=133002515616302&w=2

Catalin


[GIT PULL] arm64 fixes for 3.15

2014-04-26 Thread Catalin Marinas
Hi Linus,

Please pull the arm64 fixes below for 3.15. Thanks.

The following changes since commit a798c10faf62a505d24e5f6213fbaf904a39623f:

  Linux 3.15-rc2 (2014-04-20 11:08:50 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux tags/arm64-fixes

for you to fetch changes up to bc3ee18a7a57243721ecfd879319e3d2e882f289:

  arm64: init: Move of_clk_init to time_init (2014-04-25 18:15:56 +0100)


- compat renameat2 syscall wiring and __NR_compat_syscalls fix
- TLB fix for transparent huge pages following switch to generic
  mmu_gather
- spinlock initialisation for init_mm's context
- move of_clk_init() earlier
- Kconfig duplicate entry fix


Chanho Min (1):
  arm64: init: Move of_clk_init to time_init

Hanjun Guo (1):
  ARM64: Remove duplicated Kconfig entry for "kernel/power/Kconfig"

Leo Yan (1):
  arm64: initialize spinlock for init_mm's context

Miklos Szeredi (2):
  arm64: __NR_compat_syscalls fix
  arm64: add renameat2 compat syscall

Steve Capper (1):
  arm64: mm: Add THP TLB entries to general mmu_gather

Will Deacon (1):
  arm64: debug: remove noisy, pointless warning

 arch/arm64/Kconfig | 2 --
 arch/arm64/include/asm/mmu.h   | 3 +++
 arch/arm64/include/asm/tlb.h   | 6 ++
 arch/arm64/include/asm/unistd32.h  | 3 ++-
 arch/arm64/kernel/debug-monitors.c | 3 ---
 arch/arm64/kernel/setup.c  | 1 -
 arch/arm64/kernel/time.c   | 2 ++
 7 files changed, 13 insertions(+), 7 deletions(-)

-- 
Catalin


Re: [RESUBMIT RFC PATCH v2 3/3] drivers: mfd: Add support for Exynos PMU driver

2014-04-28 Thread Catalin Marinas
On Mon, Apr 28, 2014 at 01:26:46PM +0100, Lee Jones wrote:
> > This patch moves Exynos PMU driver implementation from
> > "arm/mach-exynos" to "drivers/mfd".
> > This driver is mainly used for setting misc bits of register from PMU IP
> > of Exynos SoC which will be required to configure before Suspend/Resume.
> > Currently all these settings are done in "arch/arm/mach-exynos/pmu.c" but
> > moving ahead for ARM64 based SoC support, there is a need of DT based
> > implementation of PMU driver.
> > This driver uses already existing DT binding information.
> > 
> > CC: Sangbeom Kim 
> > CC: Samuel Ortiz 
> > CC: Lee Jones 
> > Signed-off-by: Pankaj Dubey 
> > ---
> >  arch/arm/mach-exynos/Kconfig   |2 ++
> >  arch/arm/mach-exynos/Makefile  |2 --
> >  drivers/mfd/Kconfig|9 +
> >  drivers/mfd/Makefile   |1 +
> >  arch/arm/mach-exynos/pmu.c => drivers/mfd/exynos-pmu.c |0
> >  5 files changed, 12 insertions(+), 2 deletions(-)
> >  rename arch/arm/mach-exynos/pmu.c => drivers/mfd/exynos-pmu.c (100%)
> 
> So I just took a look at the code as zero changes looks suspicious to
> me. The driver can not simply be copied and pasted into the MFD
> subsystem in its current state.
> 
> The fundamental question is; is this chip actually an MFD? What does
> it do besides Power Management?

I looked at the code briefly as well and I don't think it matches the
mfd idea. Maybe it could be merged together with
arch/arm/mach-exynos/pm.c and moved to drivers/power/ or a more
appropriate directory for platform_suspend_ops.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BUG] kmemleak on __radix_tree_preload

2014-05-08 Thread Catalin Marinas
On Thu, May 08, 2014 at 06:16:51PM +0900, Jaegeuk Kim wrote:
> 2014-05-07 (Wed), 12:39 +0100, Catalin Marinas:
> > On Wed, May 07, 2014 at 03:58:08AM +0100, Jaegeuk Kim wrote:
> > > unreferenced object 0x880004226da0 (size 576):
> > >   comm "fsstress", pid 14590, jiffies 4295191259 (age 706.308s)
> > >   hex dump (first 32 bytes):
> > > 01 00 00 00 81 ff ff ff 00 00 00 00 00 00 00 00  
> > > 50 89 34 81 ff ff ff ff b8 6d 22 04 00 88 ff ff  P.4..m".
> > >   backtrace:
> > > [] kmemleak_update_trace+0x58/0x80
> > > [] radix_tree_node_alloc+0x77/0xa0
> > > [] __radix_tree_create+0x1d8/0x230
> > > [] __add_to_page_cache_locked+0x9c/0x1b0
> > > [] add_to_page_cache_lru+0x28/0x80
> > > [] grab_cache_page_write_begin+0x98/0xf0
> > > [] f2fs_write_begin+0xb4/0x3c0 [f2fs]
> > > [] generic_perform_write+0xc7/0x1c0
> > > [] __generic_file_aio_write+0x1cd/0x3f0
> > > [] generic_file_aio_write+0x5e/0xe0
> > > [] do_sync_write+0x5a/0x90
> > > [] vfs_write+0xc2/0x1d0
> > > [] SyS_write+0x4f/0xb0
> > > [] system_call_fastpath+0x16/0x1b
> > > [] 0x
> > 
> > OK, it shows that the allocation happens via add_to_page_cache_locked()
> > and I guess it's page_cache_tree_insert() which calls
> > __radix_tree_create() (the latter reusing the preloaded node). I'm not
> > familiar enough to this code (radix-tree.c and filemap.c) to tell where
> > the node should have been freed, who keeps track of it.
> > 
> > At a quick look at the hex dump (assuming that the above leak is struct
> > radix_tree_node):
> > 
> > .path = 1
> > > .count = -0x7f (or 0xffffff81 as unsigned int)
> > union {
> > {
> > .parent = NULL
> > .private_data = 0x81348950
> > }
> > {
> > .rcu_head.next = NULL
> > .rcu_head.func = 0x81348950
> > }
> > }
> > 
> > The count is a bit suspicious.
> > 
> > From the union, it looks most likely like rcu_head information. Is
> > radix_tree_node_rcu_free() function at the above rcu_head.func?

Thanks for the config. Could you please confirm that 0x81348950
address corresponds to the radix_tree_node_rcu_free() function in your
System.map (or something else)?
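
For reference, the decoding quoted above can be reproduced mechanically
from the hex dump. This is a stand-alone sketch that assumes a
little-endian 64-bit machine and the field offsets implied by the decode
(path at 0, count at 4, parent/rcu_head.next at 8, private_data/
rcu_head.func at 16); read_le() and the node_*() helpers are hypothetical
names. The full 64-bit value at offset 16 has 0x81348950 as its low 32
bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* First 32 bytes of the leaked object, copied from the kmemleak hex dump. */
static const uint8_t dump[32] = {
    0x01, 0x00, 0x00, 0x00, 0x81, 0xff, 0xff, 0xff,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x50, 0x89, 0x34, 0x81, 0xff, 0xff, 0xff, 0xff,
    0xb8, 0x6d, 0x22, 0x04, 0x00, 0x88, 0xff, 0xff,
};

/* Read an n-byte little-endian unsigned value starting at offset off. */
static uint64_t read_le(const uint8_t *buf, size_t len, size_t off, size_t n)
{
    uint64_t v = 0;

    assert(n <= 8 && off + n <= len);
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)buf[off + i] << (8 * i);
    return v;
}

/* Offset 0: unsigned int path. */
static uint32_t node_path(void)
{
    return (uint32_t)read_le(dump, sizeof(dump), 0, 4);
}

/* Offset 4: count, shown here as a signed 32-bit value. */
static int32_t node_count(void)
{
    uint32_t raw = (uint32_t)read_le(dump, sizeof(dump), 4, 4);
    int32_t count;

    memcpy(&count, &raw, sizeof(count));
    return count;
}

/* Offset 8: parent, or rcu_head.next in the other union member. */
static uint64_t node_parent(void)
{
    return read_le(dump, sizeof(dump), 8, 8);
}

/* Offset 16: private_data, or rcu_head.func in the other union member. */
static uint64_t node_func(void)
{
    return read_le(dump, sizeof(dump), 16, 8);
}
```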

> > Also, if you run echo scan > /sys/kernel/debug/kmemleak a few times, do
> > any of the above leaks disappear (in case the above are some transient
> > rcu freeing reports; normally this shouldn't happen as the objects are
> > still referred but I'll look at the relevant code once I have your
> > .config).
> 
> Once I run the echo, the leaks are still remained.

OK, so they aren't just transient.

Thanks.

-- 
Catalin


Re: [BUG] kmemleak on __radix_tree_preload

2014-05-08 Thread Catalin Marinas
On Thu, May 08, 2014 at 10:37:40AM +0100, Jaegeuk Kim wrote:
> 2014-05-08 (Thu), 10:26 +0100, Catalin Marinas:
> > On Thu, May 08, 2014 at 06:16:51PM +0900, Jaegeuk Kim wrote:
> > > 2014-05-07 (Wed), 12:39 +0100, Catalin Marinas:
> > > > On Wed, May 07, 2014 at 03:58:08AM +0100, Jaegeuk Kim wrote:
> > > > > unreferenced object 0x880004226da0 (size 576):
> > > > >   comm "fsstress", pid 14590, jiffies 4295191259 (age 706.308s)
> > > > >   hex dump (first 32 bytes):
> > > > > 01 00 00 00 81 ff ff ff 00 00 00 00 00 00 00 00  
> > > > > 50 89 34 81 ff ff ff ff b8 6d 22 04 00 88 ff ff  P.4..m".
> > > > >   backtrace:
> > > > > [] kmemleak_update_trace+0x58/0x80
> > > > > [] radix_tree_node_alloc+0x77/0xa0
> > > > > [] __radix_tree_create+0x1d8/0x230
> > > > > [] __add_to_page_cache_locked+0x9c/0x1b0
> > > > > [] add_to_page_cache_lru+0x28/0x80
> > > > > [] grab_cache_page_write_begin+0x98/0xf0
> > > > > [] f2fs_write_begin+0xb4/0x3c0 [f2fs]
> > > > > [] generic_perform_write+0xc7/0x1c0
> > > > > [] __generic_file_aio_write+0x1cd/0x3f0
> > > > > [] generic_file_aio_write+0x5e/0xe0
> > > > > [] do_sync_write+0x5a/0x90
> > > > > [] vfs_write+0xc2/0x1d0
> > > > > [] SyS_write+0x4f/0xb0
> > > > > [] system_call_fastpath+0x16/0x1b
> > > > > [] 0x
> > > >
> > > > OK, it shows that the allocation happens via add_to_page_cache_locked()
> > > > and I guess it's page_cache_tree_insert() which calls
> > > > __radix_tree_create() (the latter reusing the preloaded node). I'm not
> > > > familiar enough to this code (radix-tree.c and filemap.c) to tell where
> > > > the node should have been freed, who keeps track of it.
> > > >
> > > > At a quick look at the hex dump (assuming that the above leak is struct
> > > > radix_tree_node):
> > > >
> > > > .path = 1
> > > > > .count = -0x7f (or 0xffffff81 as unsigned int)
> > > > union {
> > > > {
> > > > .parent = NULL
> > > > .private_data = 0x81348950
> > > > }
> > > > {
> > > > .rcu_head.next = NULL
> > > > .rcu_head.func = 0x81348950
> > > > }
> > > > }
> > > >
> > > > The count is a bit suspicious.
> > > >
> > > > From the union, it looks most likely like rcu_head information. Is
> > > > radix_tree_node_rcu_free() function at the above rcu_head.func?
> >
> > Thanks for the config. Could you please confirm that 0x81348950
> > address corresponds to the radix_tree_node_rcu_free() function in your
> > System.map (or something else)?
> 
> Yap, the address is matched to radix_tree_node_rcu_free().

Cc'ing Paul as well, not that I blame RCU ;), but maybe he could shed
some light on why kmemleak can't track this object.

My summary so far:

- radix_tree_node reported by kmemleak as it cannot find any trace of it
  when scanning the memory
- at allocation time, radix_tree_node is memzero'ed by
  radix_tree_node_ctor(). Given that node->rcu_head.func ==
  radix_tree_node_rcu_free, my guess is that radix_tree_node_free() has
  been called
- some time later, kmemleak still hasn't received any callback for
  kmem_cache_free(node). Possibly radix_tree_node_rcu_free() hasn't been
  called either since node->count is not 0.

For RCU queued objects, kmemleak should still track references to them
via rcu_sched_state and rcu_head members. But even if this went wrong, I
would expect the object to be freed eventually and kmemleak notified (so
just a temporary leak report which doesn't seem to be the case here).

I still cannot explain the node->count value above and how it can get
there (too many node->count--?). Maybe Johannes could shed some light.

Thanks.

-- 
Catalin


Re: [PATCH] arm/xen: Remove definition of virt_to_pfn in asm/xen/page.h

2014-05-08 Thread Catalin Marinas
On Wed, May 07, 2014 at 05:58:21PM +0100, Stefano Stabellini wrote:
> On Wed, 7 May 2014, David Vrabel wrote:
> > On 07/05/14 12:41, Stefano Stabellini wrote:
> > > On Fri, 25 Apr 2014, Stefano Stabellini wrote:
> > >> On Thu, 24 Apr 2014, David Vrabel wrote:
> > >>> On 24/04/14 13:49, Julien Grall wrote:
> > >>>> Hi David,
> > >>>>
> > >>>> On 04/24/2014 01:22 PM, David Vrabel wrote:
> > >>>>> On 18/04/14 16:54, Julien Grall wrote:
> > >>>>>> virt_to_pfn has been defined in asm/memory.h by the commit e26a9e0
> > >>>>>> "ARM: Better virt_to_page() handling"
> > >>>>>>
> > >>>>>> This will result in a compilation warning when CONFIG_XEN is enabled.
> > >>>>>>
> > >>>>>> arch/arm/include/asm/xen/page.h:80:0: warning: "virt_to_pfn" redefined [enabled by default]
> > >>>>>>  #define virt_to_pfn(v)  (PFN_DOWN(__pa(v)))
> > >>>>>>  ^
> > >>>>>> In file included from arch/arm/include/asm/page.h:163:0,
> > >>>>>>  from arch/arm/include/asm/xen/page.h:4,
> > >>>>>>  from include/xen/page.h:4,
> > >>>>>>  from arch/arm/xen/grant-table.c:33:
> > >>>>>>
> > >>>>>> The definition in memory.h is nearly the same (it directly expands
> > >>>>>> PFN_DOWN), so we can safely drop virt_to_pfn in the xen include.
> > >>>>>
> > >>>>>
> > >>>>> This breaks the arm build for me.
> > >>>>>
> > >>>>> /local/davidvr/work/k.org/tip/drivers/block/xen-blkfront.c: In function
> > >>>>> ‘setup_blkring’:
> > >>>>> /local/davidvr/work/k.org/tip/drivers/block/xen-blkfront.c:1236:2:
> > >>>>> error: implicit declaration of function ‘virt_to_pfn’
> > >>>>> [-Werror=implicit-function-declaration]
> > >>>>>   err = xenbus_grant_ring(dev, virt_to_mfn(info->ring.sring));
> > >>>>>   ^
> > >>>>
> > >>>> I don't have any issue building the following branches with this patch
> > >>>> applied:
> > >>>>- v3.15-rc2
> > >>>>- xentip master
> > >>>>- xentip for-linus-3.16
> > >>>
> > >>> Applied to devel/for-linus-3.16.
> > >>>
> > >>> If something else turns up for 3.15 (and I remember) I'll take it for
> > >>> 3.15 instead.
> > >>
> > >> David,
> > >> thank you very much for taking the patch in my absence.
> > >>
> > >> Considering that the problem is affecting everybody enabling CONFIG_XEN
> > >> on ARM on v3.15, I don't think we can wait for the next merge window to
> > >> send this fix upstream.
> > >> Too many warnings for too many people.
> > > 
> > > Unfortunately this commit breaks arm64 compilation, as virt_to_pfn has
> > > not been introduced to arm64/include/asm/memory.h.
> > > Has the patch been sent upstream yet?
> > 
> > No.
> > 
> > > We need this additional change for arm64:
> > > 
> > > > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > > index e94f945..993bce5 100644
> > > --- a/arch/arm64/include/asm/memory.h
> > > +++ b/arch/arm64/include/asm/memory.h
> > > @@ -138,6 +138,7 @@ static inline void *phys_to_virt(phys_addr_t x)
> > >  #define __pa(x)  __virt_to_phys((unsigned long)(x))
> > > >  #define __va(x)  ((void *)__phys_to_virt((phys_addr_t)(x)))
> > > >  #define pfn_to_kaddr(pfn)  __va((pfn) << PAGE_SHIFT)
> > > +#define virt_to_pfn(x)  __phys_to_pfn(__virt_to_phys(x))
> > 
> > This would need an ack from an arm64 maintainer.
> 
> Certainly. Catalin is in CC.

Acked-by: Catalin Marinas 


Re: [BUG] kmemleak on __radix_tree_preload

2014-05-08 Thread Catalin Marinas
On Thu, May 08, 2014 at 04:00:27PM +0100, Paul E. McKenney wrote:
> On Thu, May 08, 2014 at 11:24:36AM +0100, Catalin Marinas wrote:
> > On Thu, May 08, 2014 at 10:37:40AM +0100, Jaegeuk Kim wrote:
> > > 2014-05-08 (목), 10:26 +0100, Catalin Marinas:
> > > > On Thu, May 08, 2014 at 06:16:51PM +0900, Jaegeuk Kim wrote:
> > > > > 2014-05-07 (수), 12:39 +0100, Catalin Marinas:
> > > > > > On Wed, May 07, 2014 at 03:58:08AM +0100, Jaegeuk Kim wrote:
> > > > > > > unreferenced object 0x880004226da0 (size 576):
> > > > > > >   comm "fsstress", pid 14590, jiffies 4295191259 (age 706.308s)
> > > > > > >   hex dump (first 32 bytes):
> > > > > > > 01 00 00 00 81 ff ff ff 00 00 00 00 00 00 00 00  
> > > > > > > 
> > > > > > > 50 89 34 81 ff ff ff ff b8 6d 22 04 00 88 ff ff  
> > > > > > > P.4..m".
> > > > > > >   backtrace:
> > > > > > > [] kmemleak_update_trace+0x58/0x80
> > > > > > > [] radix_tree_node_alloc+0x77/0xa0
> > > > > > > [] __radix_tree_create+0x1d8/0x230
> > > > > > > [] __add_to_page_cache_locked+0x9c/0x1b0
> > > > > > > [] add_to_page_cache_lru+0x28/0x80
> > > > > > > [] grab_cache_page_write_begin+0x98/0xf0
> > > > > > > [] f2fs_write_begin+0xb4/0x3c0 [f2fs]
> > > > > > > [] generic_perform_write+0xc7/0x1c0
> > > > > > > [] __generic_file_aio_write+0x1cd/0x3f0
> > > > > > > [] generic_file_aio_write+0x5e/0xe0
> > > > > > > [] do_sync_write+0x5a/0x90
> > > > > > > [] vfs_write+0xc2/0x1d0
> > > > > > > [] SyS_write+0x4f/0xb0
> > > > > > > [] system_call_fastpath+0x16/0x1b
> > > > > > > [] 0x
> > > > > >
> > > > > > > OK, it shows that the allocation happens via add_to_page_cache_locked()
> > > > > > > and I guess it's page_cache_tree_insert() which calls
> > > > > > > __radix_tree_create() (the latter reusing the preloaded node). I'm not
> > > > > > > familiar enough with this code (radix-tree.c and filemap.c) to tell where
> > > > > > > the node should have been freed or who keeps track of it.
> > > > > > >
> > > > > > > At a quick look at the hex dump (assuming that the above leak is struct
> > > > > > > radix_tree_node):
> > > > > >
> > > > > > > .path = 1
> > > > > > > .count = -0x7f (or 0xffffff81 as unsigned int)
> > > > > > > union {
> > > > > > >         {
> > > > > > >                 .parent = NULL
> > > > > > >                 .private_data = 0xffffffff81348950
> > > > > > >         }
> > > > > > >         {
> > > > > > >                 .rcu_head.next = NULL
> > > > > > >                 .rcu_head.func = 0xffffffff81348950
> > > > > > >         }
> > > > > > > }
> > > > > >
> > > > > > The count is a bit suspicious.
> > > > > >
> > > > > > From the union, it looks most likely like rcu_head information. Is
> > > > > > radix_tree_node_rcu_free() function at the above rcu_head.func?
> > > >
> > > > > Thanks for the config. Could you please confirm that the address
> > > > > 0xffffffff81348950 corresponds to the radix_tree_node_rcu_free() function
> > > > > in your System.map (or something else)?
> > > 
> > > Yap, the address is matched to radix_tree_node_rcu_free().
> > 
> > Cc'ing Paul as well, not that I blame RCU ;), but maybe he could shed
> > some light on why kmemleak can't track this object.
> 
> Do we have any information on how long it has been since that data
> structure was handed to call_rcu()?  If that time is short, then it
> is quite possible that its grace period simply has not yet completed.

kmemleak scans every 10 minutes but Jaegeuk can confirm how long he has
waited.

> It might also be that one of the CPUs is stuck (e.g., spinning with
> interrupts disabled), which would prevent the grace period from

Re: [BUG] kmemleak on __radix_tree_preload

2014-05-08 Thread Catalin Marinas
On Thu, May 08, 2014 at 04:53:30PM +0100, Paul E. McKenney wrote:
> On Thu, May 08, 2014 at 04:29:48PM +0100, Catalin Marinas wrote:
> > BTW, is it safe to have a union overlapping node->parent and
> > node->rcu_head.next? I'm still staring at the radix-tree code but a
> > scenario I have in mind is that call_rcu() has been raised for a few
> > nodes, other CPU may have some reference to one of them and set
> > node->parent to NULL (e.g. concurrent calls to radix_tree_shrink()),
> > breaking the RCU linking. I can't confirm this theory yet ;)
> 
> If this were reproducible, I would suggest retrying with non-overlapping
> node->parent and node->rcu_head.next, but you knew that already.  ;-)

Reading the code, I'm less convinced about this scenario (though it's
worth checking without the union).

> But the usual practice would be to make node removal exclude shrinking.
> And the radix-tree code seems to delegate locking to the caller.
> 
> So, is the correct locking present in the page cache?  The radix-tree
> code seems to assume that all update operations for a given tree are
> protected by a lock global to that tree.

The calling code in mm/filemap.c holds mapping->tree_lock when deleting
radix-tree nodes, so no concurrent calls.

> Another diagnosis approach would be to build with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y, which would complain about double
> call_rcu() invocations.  Rumor has it that it is necessary to turn off
> other kmem debugging for this to tell you anything -- I have seen cases
> where the kmem debugging obscures the debug-objects diagnostics.

Another test Jaegeuk could run (hopefully he has some time to look into
this).

Thanks for the suggestions.

-- 
Catalin


Re: [BUG] kmemleak on __radix_tree_preload

2014-05-08 Thread Catalin Marinas
On 8 May 2014, at 18:52, Johannes Weiner  wrote:
> On Thu, May 08, 2014 at 08:53:30AM -0700, Paul E. McKenney wrote:
>> On Thu, May 08, 2014 at 04:29:48PM +0100, Catalin Marinas wrote:
>>> On Thu, May 08, 2014 at 04:00:27PM +0100, Paul E. McKenney wrote:
>>>> On Thu, May 08, 2014 at 11:24:36AM +0100, Catalin Marinas wrote:
>>>>> My summary so far:
>>>>> 
>>>>> - radix_tree_node reported by kmemleak as it cannot find any trace of it
>>>>>  when scanning the memory
>>>>> - at allocation time, radix_tree_node is memzero'ed by
>>>>>  radix_tree_node_ctor(). Given that node->rcu_head.func ==
>>>>>  radix_tree_node_rcu_free, my guess is that radix_tree_node_free() has
>>>>>  been called
> 
> The constructor is called once when the slab is initially allocated,
> not on every object allocation.  The user is expected to return
> objects in a pristine form or overwrite fields on reallocation, so
> it's possible that the RCU values are left over from the previous
> allocation.

You are right, I missed this one.

>>>>> - some time later, kmemleak still hasn't received any callback for
>>>>>  kmem_cache_free(node). Possibly radix_tree_node_rcu_free() hasn't been
>>>>>  called either since node->count is not NULL.
>>>>> 
>>>>> For RCU queued objects, kmemleak should still track references to them
>>>>> via rcu_sched_state and rcu_head members. But even if this went wrong, I
>>>>> would expect the object to be freed eventually and kmemleak notified (so
>>>>> just a temporary leak report which doesn't seem to be the case here).

[…]

>>>> Of course, if the value of node->count is preventing call_rcu() from
>>>> being invoked in the first place, then the needed grace period won't
>>>> start, much less finish.  ;-)
>>> 
>>> Given the rcu_head.func value, my assumption is that call_rcu() has
>>> already been called.
>> 
>> Fair point -- given that it is a union, you would expect this field to
>> be overwritten upon reuse.
> 
> .parent is overwritten immediately on reuse, but .private_data is
> actually unlikely to be used during the lifetime of the node.
> 
> This could explain why .rcu_head.next is NULL like parent, and
> .private_data/.rcu_head.func is untouched and retains RCU stuff: to me
> it doesn't look like the node is lost in RCU-freeing, rather it was
> previously RCU freed and then lost somewhere after reallocation.

This would be a simpler explanation, and even simpler to test, just
reset rcu_head.func in radix_tree_node_rcu_free() before being returned
to the slab allocator.

Does the negative count give us any clue? This one is reset before
freeing the object.

Thanks,

Catalin


Re: [BUG] kmemleak on __radix_tree_preload

2014-05-09 Thread Catalin Marinas
On Fri, May 09, 2014 at 01:01:31AM +0100, Jaegeuk Kim wrote:
> > > > > > > > On Wed, May 07, 2014 at 03:58:08AM +0100, Jaegeuk Kim wrote:
> > > > > > > > > unreferenced object 0x880004226da0 (size 576):
> > > > > > > > > >   comm "fsstress", pid 14590, jiffies 4295191259 (age 706.308s)
> > > > > > > > > >   hex dump (first 32 bytes):
> > > > > > > > > > 01 00 00 00 81 ff ff ff 00 00 00 00 00 00 00 00
> > > > > > > > > > 50 89 34 81 ff ff ff ff b8 6d 22 04 00 88 ff ff  P.4..m".
> > > > > > > > >   backtrace:
> > > > > > > > > [] kmemleak_update_trace+0x58/0x80
> > > > > > > > > [] radix_tree_node_alloc+0x77/0xa0
> > > > > > > > > [] __radix_tree_create+0x1d8/0x230
> > > > > > > > > [] __add_to_page_cache_locked+0x9c/0x1b0
> > > > > > > > > [] add_to_page_cache_lru+0x28/0x80
> > > > > > > > > [] grab_cache_page_write_begin+0x98/0xf0
> > > > > > > > > [] f2fs_write_begin+0xb4/0x3c0 [f2fs]
> > > > > > > > > [] generic_perform_write+0xc7/0x1c0
> > > > > > > > > [] __generic_file_aio_write+0x1cd/0x3f0
> > > > > > > > > [] generic_file_aio_write+0x5e/0xe0
> > > > > > > > > [] do_sync_write+0x5a/0x90
> > > > > > > > > [] vfs_write+0xc2/0x1d0
> > > > > > > > > [] SyS_write+0x4f/0xb0
> > > > > > > > > [] system_call_fastpath+0x16/0x1b
> > > > > > > > > [] 0x
[...]
> With the existing kmemleak messages present, the fsstress test has been
> running for over 12 hours.
> To be sure, I have now quit the test and unmounted the file system, which
> drops all the page cache used by f2fs.
> I then ran echo scan > $DEBUGFS/kmemleak again, but there is still a
> bunch of leak messages.
> 
> The oldest one is:
> unreferenced object 0x88007b167478 (size 576):
>   comm "fsstress", pid 1636, jiffies 4294945289 (age 164639.728s)
>   hex dump (first 32 bytes):
> 01 00 00 00 81 ff ff ff 00 00 00 00 00 00 00 00  
> 50 89 34 81 ff ff ff ff 90 74 16 7b 00 88 ff ff  P.4..t.{
>   backtrace:
> [snip]

As Johannes pointed out, the simplest explanation would be that the
radix tree node is leaked after allocation. So let's ignore radix-tree.c,
filemap.c or RCU for now.

As I read the code, a radix tree node allocated via the above call path
would be stored in the page_tree of the address_space structure. This
address_space object is inode.i_data and the inode is allocated by the
f2fs code. When the inode is destroyed by the f2fs code, can you add
some checks to make sure there are no nodes left in the radix tree? If
there are, they would just leak and we would have to figure out where they
should have been freed.

You could also revert some of the f2fs changes since 3.14 (assuming 3.14
didn't show leaks) and see if you still get the leaks.

-- 
Catalin


Re: [RFC][PATCH 6/8] sched,idle: Avoid spurious wakeup IPIs

2014-05-09 Thread Catalin Marinas
Hi Peter,

On Fri, May 09, 2014 at 03:15:20PM +0100, Peter Zijlstra wrote:
> On Fri, May 09, 2014 at 02:37:27PM +0100, James Hogan wrote:
> > On 11 April 2014 14:42, Peter Zijlstra  wrote:
> > > +   return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG);
> > 
> > This breaks the build on metag, and I suspect arm64 too:
> 
> Yep, I just got a patch for arm64.

[...]

> Any SMP arch that has a polling idle function of any kind (including the
> default cpu_idle_poll()).
> 
> That said, even if that's true, not having TIF_POLLING_NRFLAG isn't
> fatal, just sub-optimal in that we'll send an unconditional IPI to wake
> the CPU even though it's polling TIF_NEED_RESCHED and doesn't need
> anything other than that write to wake up.
> 
> Most archs have (x86) hlt or (arm) wfi like idle instructions, and if
> that is your only possible idle function, you'll require the interrupt
> to wake up and there's really no point to having the POLLING bit.

I wonder why we still need TIF_POLLING_NRFLAG for arm64. It was on arm
until commit 16a8016372c42c7628eb (sanitize tsk_is_polling()). On arm64
we use wfi for idle or a firmware call but in both cases the assumption
is that we need an interrupt for waking up.

So I think we should remove this macro for arm64.

-- 
Catalin


Re: [RFC][PATCH 6/8] sched,idle: Avoid spurious wakeup IPIs

2014-05-09 Thread Catalin Marinas
On Fri, May 09, 2014 at 03:50:02PM +0100, Peter Zijlstra wrote:
> On Fri, May 09, 2014 at 03:40:34PM +0100, Catalin Marinas wrote:
> 
> > I wonder why we still need TIF_POLLING_NRFLAG for arm64. It was on arm
> > until commit 16a8016372c42c7628eb (sanitize tsk_is_polling()). On arm64
> > we use wfi for idle or a firmware call but in both cases the assumption
> > is that we need an interrupt for waking up.
> > 
> > So I think we should remove this macro for arm64.
> 
> Does ARM64 support idle=poll? If so, you could keep it for that,
> otherwise it does indeed appear to be pointless.

We don't support idle=poll either.

> As to 32bit ARM, are there SMP chips which do not have WFI?

No. WFI is even used for the secondary booting protocol (we need to send
an IPI to get them going).

-- 
Catalin


Re: [PATCH] arm64: Support arch_irq_work_raise() via self IPIs

2014-05-09 Thread Catalin Marinas
On Mon, May 05, 2014 at 09:48:27PM +0100, Larry Bassel wrote:
> Support for arch_irq_work_raise() was missing from
> arm64 (a prerequisite for FULL_NOHZ).
> 
> This patch is based on the arm32 patch ARM 7872/1
> which ports cleanly.
[...]
> +#ifdef CONFIG_IRQ_WORK
> +void arch_irq_work_raise(void)
> +{
> + smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
> +}
> +#endif

There was a subsequent patch adding an is_smp() check here (c682e51dbc98
ARM: 7887/1: Don't smp_cross_call() on UP devices in
arch_irq_work_raise()). Don't we need it?

-- 
Catalin


Re: [RFC][PATCH 6/8] sched,idle: Avoid spurious wakeup IPIs

2014-05-09 Thread Catalin Marinas
On Fri, May 09, 2014 at 06:06:49PM +0100, Peter Zijlstra wrote:
> Subject: arm64: Remove TIF_POLLING_NRFLAG
> From: Peter Zijlstra 
> Date: Fri May  9 19:04:00 CEST 2014
> 
> The only idle method for arm64 is WFI and it therefore
> unconditionally requires the reschedule interrupt when idle.
> 
> Suggested-by: Catalin Marinas 
> Signed-off-by: Peter Zijlstra 

There's a tag with my name already but just in case you need another:

Acked-by: Catalin Marinas 

Thanks.


Re: [Ksummit-discuss] Reminder for kernel summit nominations

2014-05-10 Thread Catalin Marinas
On 9 May 2014, at 18:31, j...@joshtriplett.org wrote:
> On Fri, May 09, 2014 at 11:23:19AM -0400, Theodore Ts'o wrote:
>> Also, if you have time to double check the e-mail addresses that we
>> have, the Google Spreadsheet is here: 
>> 
>>  http://goo.gl/FsVUFX
> 
>> Peter P Waskiewicz Jr
> 
> PJ left Intel and now works for SolidFire, so his @intel.com address
> won't work anymore.  I don't know his preferred new address.  (Hopefully
> he plans to switch to something company-independent now.)

For Debian and StGit work he switched to pjwaskiew...@gmail.com.

Catalin


Re: [PATCH v6 0/4] arm64: prerequisites for audit and ftrace

2014-05-12 Thread Catalin Marinas
On Wed, Apr 30, 2014 at 10:51:28AM +0100, AKASHI Takahiro wrote:
> AKASHI Takahiro (4):
>   arm64: make a single hook to syscall_trace() for all syscall features
>   arm64: split syscall_trace() into separate functions for enter/exit
>   arm64: Add regs_return_value() in syscall.h
>   arm64: is_compat_task is defined both in asm/compat.h and
> linux/compat.h

Patches picked by Will and applied to the arm64 for-next/core branch
(should appear in -next at some point).

Thanks.

-- 
Catalin


Re: linux-next: manual merge of the tip tree with the arm64 tree

2014-05-23 Thread Catalin Marinas
On Fri, May 23, 2014 at 07:28:44AM +0100, Stephen Rothwell wrote:
> diff --cc arch/arm64/include/asm/thread_info.h
> index 9c086c63f911,7b8e3a2a00fb..
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@@ -103,12 -99,7 +102,11 @@@ static inline struct thread_info *curre
>   #define TIF_SIGPENDING  0
>   #define TIF_NEED_RESCHED        1
>   #define TIF_NOTIFY_RESUME   2   /* callback before returning to user */
>  +#define TIF_FOREIGN_FPSTATE 3   /* CPU's FP state is not current's */
>   #define TIF_SYSCALL_TRACE   8
>  +#define TIF_SYSCALL_AUDIT   9
>  +#define TIF_SYSCALL_TRACEPOINT  10
>  +#define TIF_SECCOMP 11
> - #define TIF_POLLING_NRFLAG  16
>   #define TIF_MEMDIE  18  /* is terminating due to OOM killer */
>   #define TIF_FREEZE  19
>   #define TIF_RESTORE_SIGMASK 20

It looks fine, thanks.

-- 
Catalin


Re: linux-next: manual merge of the tip tree with the arm64 tree

2014-05-23 Thread Catalin Marinas
On Fri, May 23, 2014 at 07:44:02AM +0100, Stephen Rothwell wrote:
> Today's linux-next merge of the tip tree got a conflict in
> arch/arm64/mm/mmu.c between commit a501e32430d4 ("arm64: Clean up the
> default pgprot setting") and 206a2a73a62d ("arm64: mm: Create gigabyte
> kernel logical mappings where possible") from the arm64 tree and commit
> d7ecbddf4cae ("arm64: Add function to create identity mappings") from
> the tip tree.
> 
> I fixed it up (maybe - see below - this may not be complete) and can
> carry the fix as necessary (no action is required).

Thanks for fixing it up, it is correct.

(but I now have to go after the arm64 EFI_STUB guys as it breaks non-EFI
booting).

-- 
Catalin


Re: [PATCH v4 1/2] arm64: adjust el0_sync so that a function can be called

2014-05-23 Thread Catalin Marinas
On Thu, May 22, 2014 at 11:35:20PM +0100, Larry Bassel wrote:
> > On 05/22/2014 03:27 PM, Larry Bassel wrote:
> > > To implement the context tracker properly on arm64,
> > > a function call needs to be made after debugging and
> > > interrupts are turned on, but before the lr is changed
> > > to point to ret_to_user(). If the function call
> > > is made after the lr is changed the function will not
> > > return to the correct place.
> > > 
> > > For similar reasons, defer the setting of x0 so that
> > > it doesn't need to be saved around the function call
> > > (save far_el1 in x26 temporarily instead).
> > > 
> > > Signed-off-by: Larry Bassel 
> > > ---
> > >  arch/arm64/kernel/entry.S | 24 +---
> > >  1 file changed, 17 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > > index e8b23a3..20b336e 100644
> > > --- a/arch/arm64/kernel/entry.S
> > > +++ b/arch/arm64/kernel/entry.S
> > > @@ -354,7 +354,6 @@ el0_sync:
> > >   lsr x24, x25, #ESR_EL1_EC_SHIFT // exception class
> > >   cmp x24, #ESR_EL1_EC_SVC64  // SVC in 64-bit state
> > >   b.eq    el0_svc
> > > - adr lr, ret_to_user
> > >   cmp x24, #ESR_EL1_EC_DABT_EL0   // data abort in EL0
> > >   b.eq    el0_da
> > >   cmp x24, #ESR_EL1_EC_IABT_EL0   // instruction abort in EL0
> > > @@ -383,7 +382,6 @@ el0_sync_compat:
> > >   lsr x24, x25, #ESR_EL1_EC_SHIFT // exception class
> > >   cmp x24, #ESR_EL1_EC_SVC32  // SVC in 32-bit state
> > >   b.eq    el0_svc_compat
> > > - adr lr, ret_to_user
> > >   cmp x24, #ESR_EL1_EC_DABT_EL0   // data abort in EL0
> > >   b.eq    el0_da
> > >   cmp x24, #ESR_EL1_EC_IABT_EL0   // instruction abort in EL0
> > > @@ -426,22 +424,26 @@ el0_da:
> > >   /*
> > >* Data abort handling
> > >*/
> > > - mrs x0, far_el1
> > > - bic x0, x0, #(0xff << 56)
> > > + mrs x26, far_el1
> > >   // enable interrupts before calling the main handler
> > >   enable_dbg_and_irq
> > > + mov x0, x26
> > > + bic x0, x0, #(0xff << 56)
> > 
> > > Nit: I believe you can bit clear with x26 as the source register and
> > > omit the move instruction.
> 
> Is that really an improvement (assuming it works)? Are we saving
> any cycles here? If so, does it matter? It is easy to see what
> the move instruction is doing.

Even if it's not noticeable, I would still reduce the number of lines by
one. BIC with immediate is just an alias for AND and it supports
different source and destination.

-- 
Catalin


Re: [PATCH v4 2/2] arm64: enable context tracking

2014-05-23 Thread Catalin Marinas
On Fri, May 23, 2014 at 01:11:38AM +0100, Kevin Hilman wrote:
> Christopher Covington  writes:
> > On 05/22/2014 03:27 PM, Larry Bassel wrote:
> >> Make calls to ct_user_enter when the kernel is exited
> >> and ct_user_exit when the kernel is entered (in el0_da,
> >> el0_ia, el0_svc, el0_irq and all of the "error" paths).
> >> 
> >> These macros expand to function calls which will only work
> >> properly if el0_sync and related code has been rearranged
> >> (in a previous patch of this series).
> >> 
> >> The calls to ct_user_exit are made after hw debugging has been
> >> enabled (enable_dbg_and_irq).
> >> 
> >> The call to ct_user_enter is made at the beginning of the
> >> kernel_exit macro.
> >> 
> >> This patch is based on earlier work by Kevin Hilman.
> >> Save/restore optimizations were also done by Kevin.
> >
> >> --- a/arch/arm64/kernel/entry.S
> >> +++ b/arch/arm64/kernel/entry.S
> >> @@ -30,6 +30,44 @@
> >>  #include 
> >>  
> >>  /*
> >> + * Context tracking subsystem.  Used to instrument transitions
> >> + * between user and kernel mode.
> >> + */
> >> +  .macro ct_user_exit, restore = 0
> >> +#ifdef CONFIG_CONTEXT_TRACKING
> >> +  bl  context_tracking_user_exit
> >> +  .if \restore == 1
> >> +  /*
> >> +   * Save/restore needed during syscalls.  Restore syscall arguments from
> >> +   * the values already saved on stack during kernel_entry.
> >> +   */
> >> +  ldp x0, x1, [sp]
> >> +  ldp x2, x3, [sp, #S_X2]
> >> +  ldp x4, x5, [sp, #S_X4]
> >> +  ldp x6, x7, [sp, #S_X6]
> >> +  .endif
> >> +#endif
> >> +  .endm
> >> +
> >> +  .macro ct_user_enter, save = 0
> >> +#ifdef CONFIG_CONTEXT_TRACKING
> >> +  .if \save == 1
> >> +  /*
> >> +   * Save/restore only needed on syscall fastpath, which uses
> >> +   * x0-x2.
> >> +   */
> > >> +  push    x2, x3
> >
> > Why is x3 saved?
> 
> I'll respond here since I worked with Larry on the context save/restore
> part.
> 
> [insert rather embarrassing disclaimer of ignorance of arm64 assembly]
> 
> Based on my reading of the code, I figured only x0-x2 needed to be
> saved.  However, based on some experiments with intentionally clobbering
> the registers[1] (as suggested by Mark Rutland) in order to make sure
> we're saving/restoring the right things, I discovered x3 was needed too
> (I missed updating the comment to mention x0-x3.)
> 
> Maybe Will/Catalin/Mark R. can shed some light here?

I haven't checked all the code paths but at least for pushing onto the
stack we must keep it 16-byte aligned (architecture requirement).

-- 
Catalin


[GIT PULL] arm64 fixes for 3.15

2014-05-04 Thread Catalin Marinas
Hi Linus,

Please pull the arm64 patches below for 3.15. These are mostly arm64
fixes with an additional arm(64) platform fix for the initialisation of
vexpress clocks (the latter only affecting arm64; the arch/arm64 code
is SoC agnostic and does not rely on early SoC-specific calls). Thanks.

The following changes since commit d1db0eea852497762cab43b905b879dfcd3b8987:

  Linux 3.15-rc3 (2014-04-27 19:29:27 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux tags/arm64-fixes

for you to fetch changes up to e715eb2e73918f4cefbba0b717ff8902e8030b39:

  vexpress: Initialise the sysregs before setting up the clocks (2014-05-04 11:35:29 +0100)


- vexpress platform clocks initialisation moved earlier following the
  arm64 move of of_clk_init() call in a previous commit
- Default DMA ops changed to non-coherent to preserve compatibility with
  32-bit ARM DT files. The "dma-coherent" property can be used to
  explicitly mark a device coherent. The Applied Micro DT file has been
  updated to avoid DMA cache maintenance for the X-Gene SATA controller
  (the only arm64 related driver with such assumption in -rc mainline)
- Fixmap correction for earlyprintk
- kern_addr_valid() fix for huge pages

--------
Catalin Marinas (3):
  arm64: Use bus notifiers to set per-device coherent DMA ops
  arm64: Mark the Applied Micro X-Gene SATA controller as DMA coherent
  vexpress: Initialise the sysregs before setting up the clocks

Dave Anderson (1):
  arm64: Fix for the arm64 kern_addr_valid() function

Marc Zyngier (1):
  arm64: fixmap: fix missing sub-page offset for earlyprintk

Ritesh Harjani (1):
  arm64: Make default dma_ops to be noncoherent

 .../devicetree/bindings/ata/apm-xgene.txt  |  3 ++
 arch/arm64/boot/dts/apm-storm.dtsi |  3 ++
 arch/arm64/kernel/early_printk.c   |  6 ++--
 arch/arm64/kernel/setup.c  |  2 +-
 arch/arm64/mm/dma-mapping.c| 35 --
 arch/arm64/mm/mmu.c|  3 ++
 drivers/clk/versatile/clk-vexpress-osc.c   |  2 ++
 include/asm-generic/fixmap.h   |  3 ++
 8 files changed, 50 insertions(+), 7 deletions(-)

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 6/6] mm: Postpone the disabling of kmemleak early logging

2014-05-06 Thread Catalin Marinas
On Tue, May 06, 2014 at 04:20:27PM +0100, Sasha Levin wrote:
> On 05/02/2014 09:41 AM, Catalin Marinas wrote:
> > Currently, kmemleak_early_log is disabled at the beginning of the
> > kmemleak_init() function, before the full kmemleak tracing is actually
> > enabled. In this small window, kmem_cache_create() is called by kmemleak
> > which triggers additional memory allocation that are not traced. This
> > patch moves the kmemleak_early_log disabling further down and at the
> > same time with full kmemleak enabling.
> > 
> > Signed-off-by: Catalin Marinas 
> > Cc: Andrew Morton 
> 
> This patch makes the kernel die during the boot process:
> 
> [   24.471801] BUG: unable to handle kernel paging request at 922f2b93
> [   24.472496] IP: [] log_early+0x0/0xcd

Thanks for reporting this. I assume you run with
CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF enabled and kmemleak_early_log remains
set even though kmemleak is not in use.

Does the patch below fix it?

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 0cd6aabd45a0..e7f74091c024 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1811,6 +1811,7 @@ void __init kmemleak_init(void)
 
 #ifdef CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF
 	if (!kmemleak_skip_disable) {
+		kmemleak_early_log = 0;
 		kmemleak_disable();
 		return;
 	}

Thanks.

-- 
Catalin


Re: [PATCH 6/6] mm: Postpone the disabling of kmemleak early logging

2014-05-06 Thread Catalin Marinas
On 6 May 2014, at 19:15, Sasha Levin  wrote:
> On 05/06/2014 01:05 PM, Catalin Marinas wrote:
>> On Tue, May 06, 2014 at 04:20:27PM +0100, Sasha Levin wrote:
>>> On 05/02/2014 09:41 AM, Catalin Marinas wrote:
>>>> Currently, kmemleak_early_log is disabled at the beginning of the
>>>> kmemleak_init() function, before the full kmemleak tracing is actually
>>>> enabled. In this small window, kmem_cache_create() is called by kmemleak
>>>> which triggers additional memory allocation that are not traced. This
>>>> patch moves the kmemleak_early_log disabling further down and at the
>>>> same time with full kmemleak enabling.
>>>> 
>>>> Signed-off-by: Catalin Marinas 
>>>> Cc: Andrew Morton 
>>> 
>>> This patch makes the kernel die during the boot process:
>>> 
>>> [   24.471801] BUG: unable to handle kernel paging request at 922f2b93
>>> [   24.472496] IP: [] log_early+0x0/0xcd
>> 
>> Thanks for reporting this. I assume you run with
>> CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF enabled and kmemleak_early_log remains
>> set even though kmemleak is not in use.
>> 
>> Does the patch below fix it?
> 
> Nope, that didn't help as I don't have DEBUG_KMEMLEAK_DEFAULT_OFF enabled.
> 
> For reference:
> 
> $ cat .config | grep KMEMLEAK
> CONFIG_HAVE_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=400
> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set

I assume your dmesg shows some kmemleak error during boot? I’ll send
another patch tomorrow.

The code around kmemleak_init was changed by commit 8910ae896c8c
(kmemleak: change some global variables to int). It looks like it
wasn’t just a simple conversion but slightly changed the
kmemleak_early_log logic which led to false positives for the kmemleak
cache objects and that’s what my patch was trying to solve.

The failure is caused by kmemleak_alloc() still calling log_early() much
later after the __init section has been freed because kmemleak_early_log
hasn’t been set to 0 (the default off is one path, another is the
kmemleak_error path).
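
The failure mode can be reproduced in a few lines of user-space C. This is only a model of the flag lifetime, assuming nothing beyond what is described above: the toy_* names are hypothetical, and "freed init text" is represented by a boolean rather than by unmapped memory.

```c
#include <assert.h>
#include <stdbool.h>

static int early_log = 1;	/* models kmemleak_early_log */
static bool init_text_freed;	/* true once the __init section is gone */
static int bad_calls;		/* calls that landed in freed init code */

/* Models log_early(), which lives in the __init section. */
static void toy_log_early(void)
{
	if (init_text_freed)
		bad_calls++;	/* in the kernel: paging request oops */
}

/* Models kmemleak_alloc(): early logging is used while the flag is set. */
static void toy_alloc_hook(void)
{
	if (early_log)
		toy_log_early();
}

/* Models kmemleak_init() with one early-exit path (error or default off). */
static void toy_init(bool early_exit, bool clear_flag_on_exit)
{
	if (early_exit) {
		if (clear_flag_on_exit)
			early_log = 0;
		return;
	}
	early_log = 0;		/* normal path: full tracing takes over */
}

/* Run one boot and report how many calls hit freed init code. */
static int toy_boot(bool early_exit, bool clear_flag_on_exit)
{
	early_log = 1;
	init_text_freed = false;
	bad_calls = 0;

	toy_alloc_hook();	/* early allocation, logged safely */
	toy_init(early_exit, clear_flag_on_exit);
	init_text_freed = true;
	toy_alloc_hook();	/* allocation long after boot */
	return bad_calls;
}
```

Any return path out of the init function that leaves the flag set ends up in toy_log_early() after the init text has gone, which is why every such path needs to clear it.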

Catalin


Re: [BUG] kmemleak on __radix_tree_preload

2014-05-07 Thread Catalin Marinas
On Wed, May 07, 2014 at 03:58:08AM +0100, Jaegeuk Kim wrote:
> And then when I tested again with Catalin's patch, it still throws the
> following warning.
> Is it false alarm?

BTW, you can try this kmemleak branch:

git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git kmemleak

> unreferenced object 0x880004226da0 (size 576):
>   comm "fsstress", pid 14590, jiffies 4295191259 (age 706.308s)
>   hex dump (first 32 bytes):
> 01 00 00 00 81 ff ff ff 00 00 00 00 00 00 00 00  
> 50 89 34 81 ff ff ff ff b8 6d 22 04 00 88 ff ff  P.4..m".
>   backtrace:
> [] kmemleak_update_trace+0x58/0x80
> [] radix_tree_node_alloc+0x77/0xa0
> [] __radix_tree_create+0x1d8/0x230
> [] __add_to_page_cache_locked+0x9c/0x1b0
> [] add_to_page_cache_lru+0x28/0x80
> [] grab_cache_page_write_begin+0x98/0xf0
> [] f2fs_write_begin+0xb4/0x3c0 [f2fs]
> [] generic_perform_write+0xc7/0x1c0
> [] __generic_file_aio_write+0x1cd/0x3f0
> [] generic_file_aio_write+0x5e/0xe0
> [] do_sync_write+0x5a/0x90
> [] vfs_write+0xc2/0x1d0
> [] SyS_write+0x4f/0xb0
> [] system_call_fastpath+0x16/0x1b
> [] 0x

OK, it shows that the allocation happens via add_to_page_cache_locked()
and I guess it's page_cache_tree_insert() which calls
__radix_tree_create() (the latter reusing the preloaded node). I'm not
familiar enough with this code (radix-tree.c and filemap.c) to tell where
the node should have been freed or who keeps track of it.

At a quick look at the hex dump (assuming that the above leak is struct
radix_tree_node):

	.path = 1
	.count = -0x7f (or 0xffffff81 as unsigned int)
	union {
		{
			.parent = NULL
			.private_data = 0xffffffff81348950
		}
		{
			.rcu_head.next = NULL
			.rcu_head.func = 0xffffffff81348950
		}
	}

The count is a bit suspicious.
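
For reference, the decoding above can be checked mechanically. The sketch below assumes a little-endian 64-bit machine and the field order used in this message (two 32-bit fields followed by two pointer-sized fields); le_read(), decoded_node and decode_node() are hypothetical helpers, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian value of nbytes (at most 8) bytes. */
static uint64_t le_read(const uint8_t *p, size_t nbytes)
{
	uint64_t v = 0;

	for (size_t i = 0; i < nbytes; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* First 24 bytes of the hex dump from the kmemleak report. */
static const uint8_t dump[24] = {
	0x01, 0x00, 0x00, 0x00, 0x81, 0xff, 0xff, 0xff,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x50, 0x89, 0x34, 0x81, 0xff, 0xff, 0xff, 0xff,
};

/* Assumed layout: path, count, then the two words of the union. */
struct decoded_node {
	uint32_t path;
	int32_t count;
	uint64_t parent_or_next;	/* .parent or .rcu_head.next */
	uint64_t private_or_func;	/* .private_data or .rcu_head.func */
};

static struct decoded_node decode_node(const uint8_t *p)
{
	struct decoded_node n;

	n.path = (uint32_t)le_read(p, 4);
	/* reinterpret the 32-bit pattern as signed to show the negative count */
	n.count = (int32_t)(uint32_t)le_read(p + 4, 4);
	n.parent_or_next = le_read(p + 8, 8);
	n.private_or_func = le_read(p + 16, 8);
	return n;
}
```

The third word decodes to a value in the kernel text range, which is what makes the rcu_head.func reading plausible.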

From the union, it looks most likely like rcu_head information. Is the
radix_tree_node_rcu_free() function at the above rcu_head.func?

Could you please send us your .config file?

Also, if you run echo scan > /sys/kernel/debug/kmemleak a few times, do
any of the above leaks disappear? They could be transient rcu freeing
reports; normally this shouldn't happen as the objects are still
referenced, but I'll look at the relevant code once I have your .config.

Thanks.

-- 
Catalin


[PATCH] mm: Postpone the disabling of kmemleak early logging

2014-05-07 Thread Catalin Marinas
Commit 8910ae896c8c (kmemleak: change some global variables to int), in
addition to the atomic -> int conversion, moved the kmemleak_early_log
disabling to the beginning of the kmemleak_init() function, before the
full kmemleak tracing is actually enabled. In this small window,
kmem_cache_create() is called by kmemleak which triggers additional
memory allocations that are not traced. This patch restores the original
logic with kmemleak_early_log disabling when kmemleak is fully
functional.

Fixes: 8910ae896c8c (kmemleak: change some global variables to int)
Signed-off-by: Catalin Marinas 
Cc: Andrew Morton 
Cc: Sasha Levin 
Cc: Li Zefan 
---
 mm/kmemleak.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 61a64ed2fbef..33599ba0cd8d 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1809,10 +1809,9 @@ void __init kmemleak_init(void)
 	int i;
 	unsigned long flags;
 
-	kmemleak_early_log = 0;
-
 #ifdef CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF
 	if (!kmemleak_skip_disable) {
+		kmemleak_early_log = 0;
 		kmemleak_disable();
 		return;
 	}
@@ -1830,6 +1829,7 @@ void __init kmemleak_init(void)
 
 	/* the kernel is still in UP mode, so disabling the IRQs is enough */
 	local_irq_save(flags);
+	kmemleak_early_log = 0;
 	if (kmemleak_error) {
 		local_irq_restore(flags);
 		return;


[GIT PULL] arm64 fixes for 3.8

2013-01-23 Thread Catalin Marinas
Hi Linus,

Please pull the arm64 fixes below. Thanks.

The following changes since commit 7d1f9aeff1ee4a20b1aeb377dd0f579fe9647619:

  Linux 3.8-rc4 (2013-01-17 19:25:45 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git tags/arm64-fixes

for you to fetch changes up to f1b99392caf120d7533da260318fae0eb5053737:

  arm64: makefile: fix uname munging when setting ARCH on native machine (2013-01-22 17:51:00 +)


- ELF coredump fix (more registers dumped than what user space expects)
- SUBARCH name generation (s/aarch64/arm64/)


Will Deacon (2):
  arm64: elf: fix core dumping to match what glibc expects
  arm64: makefile: fix uname munging when setting ARCH on native machine

 Makefile                     | 2 +-
 arch/arm64/include/asm/elf.h | 5 -
 tools/perf/Makefile          | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)


Re: [PATCH v2] arm64: Fix task tracing

2013-04-15 Thread Catalin Marinas
On Tue, Apr 09, 2013 at 01:33:34PM +0100, Christopher Covington wrote:
> For accurate accounting pass contextidr_thread_switch the prev
> task pointer, since cpu_switch_to has at that point changed the
> stack pointer.
> 
> Signed-off-by: Christopher Covington 
> ---
>  arch/arm64/kernel/process.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 0337cdb..a49b25a 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -315,7 +315,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
>   /* the actual thread switch */
>   last = cpu_switch_to(prev, next);
>  
> - contextidr_thread_switch(next);
> + contextidr_thread_switch(prev);

The original code was indeed wrong but using prev isn't any better. For
a newly created thread, prev is probably 0 (if it's in a register,
cpu_context has been zeroed by copy_thread()) or some random stack
value.

So we either use current or move the call before cpu_switch_to() (I
would go for the former).

-- 
Catalin


Re: [PATCH v2] arm64: Fix task tracing

2013-04-15 Thread Catalin Marinas
On Mon, Apr 15, 2013 at 11:45:42AM +0100, Will Deacon wrote:
> On Mon, Apr 15, 2013 at 11:11:59AM +0100, Catalin Marinas wrote:
> > On Tue, Apr 09, 2013 at 01:33:34PM +0100, Christopher Covington wrote:
> > > For accurate accounting pass contextidr_thread_switch the prev
> > > task pointer, since cpu_switch_to has at that point changed the
> > > stack pointer.
> > > 
> > > Signed-off-by: Christopher Covington 
> > > ---
> > >  arch/arm64/kernel/process.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > index 0337cdb..a49b25a 100644
> > > --- a/arch/arm64/kernel/process.c
> > > +++ b/arch/arm64/kernel/process.c
> > > @@ -315,7 +315,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
> > >   /* the actual thread switch */
> > >   last = cpu_switch_to(prev, next);
> > >  
> > > - contextidr_thread_switch(next);
> > > + contextidr_thread_switch(prev);
> > 
> > The original code was indeed wrong but using prev isn't any better. For
> > a newly created thread, prev is probably 0 (if it's in a register,
> > cpu_context has been zeroed by copy_thread()) or some random stack
> > value.
> 
> Really? If prev is NULL in context_switch(...), the scheduler will implode,
> and I can't see where else switch_to is called from.
> 
> Which code path are you thinking of?

copy_thread() zeros cpu_context which is used by cpu_switch_to() to load
the next saved registers. The switch_to() function sets prev to last as
returned by __switch_to(), so this is valid but in __switch_to() we
don't have a valid prev (nor next) after cpu_switch_to() for newly
created threads.

-- 
Catalin


Re: [PATCH v2] arm64: Fix task tracing

2013-04-15 Thread Catalin Marinas
On Mon, Apr 15, 2013 at 11:58:40AM +0100, Catalin Marinas wrote:
> On Mon, Apr 15, 2013 at 11:45:42AM +0100, Will Deacon wrote:
> > On Mon, Apr 15, 2013 at 11:11:59AM +0100, Catalin Marinas wrote:
> > > On Tue, Apr 09, 2013 at 01:33:34PM +0100, Christopher Covington wrote:
> > > > For accurate accounting pass contextidr_thread_switch the prev
> > > > task pointer, since cpu_switch_to has at that point changed the
> > > > stack pointer.
> > > > 
> > > > Signed-off-by: Christopher Covington 
> > > > ---
> > > >  arch/arm64/kernel/process.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > > index 0337cdb..a49b25a 100644
> > > > --- a/arch/arm64/kernel/process.c
> > > > +++ b/arch/arm64/kernel/process.c
> > > > @@ -315,7 +315,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
> > > > /* the actual thread switch */
> > > > last = cpu_switch_to(prev, next);
> > > >  
> > > > -   contextidr_thread_switch(next);
> > > > +   contextidr_thread_switch(prev);
> > > 
> > > The original code was indeed wrong but using prev isn't any better. For
> > > a newly created thread, prev is probably 0 (if it's in a register,
> > > cpu_context has been zeroed by copy_thread()) or some random stack
> > > value.
> > 
> > Really? If prev is NULL in context_switch(...), the scheduler will implode,
> > and I can't see where else switch_to is called from.
> > 
> > Which code path are you thinking of?
> 
> copy_thread() zeros cpu_context which is used by cpu_switch_to() to load
> the next saved registers. The switch_to() function sets prev to last as
> returned by __switch_to(), so this is valid but in __switch_to() we
> don't have a valid prev (nor next) after cpu_switch_to() for newly
> created threads.

Correction - newly created threads return to ret_from_fork rather than
__switch_to(), which means that we miss the first
contextidr_thread_switch() call for a new thread. I would vote for
Christopher's original patch moving the call before cpu_switch_to(). The
alternative is to define finish_arch_switch() and add the call there. If
you are primarily tracing user space, it doesn't really matter whether
the stack was switched or not when we set the contextidr. For kernel
tracking, it could be a problem as we have the next task with the old
stack. But the same could be said about the prev task with the new
stack.
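
To make the ret_from_fork point concrete, here is a toy user-space model. It is a sketch under assumptions: toy_task, toy_switch_to() and the hook_place enum are hypothetical, the "register" is a plain int, and the after-switch variant is modelled as writing the incoming task (i.e. what using current would give).

```c
#include <assert.h>
#include <stdbool.h>

struct toy_task {
	int pid;
	bool has_run;	/* false until the task is first switched in */
};

static int contextidr;	/* models the CONTEXTIDR register */

enum hook_place { HOOK_BEFORE_SWITCH, HOOK_AFTER_SWITCH };

/*
 * Models __switch_to(). Code placed after the stack switch only runs when
 * the incoming task resumes inside __switch_to(). A task that has never
 * run resumes at ret_from_fork instead and skips that code.
 */
static void toy_switch_to(struct toy_task *prev, struct toy_task *next,
			  enum hook_place place)
{
	(void)prev;

	if (place == HOOK_BEFORE_SWITCH)
		contextidr = next->pid;

	bool resumes_in_switch_to = next->has_run;
	next->has_run = true;	/* the stack switch happens here */

	if (place == HOOK_AFTER_SWITCH && resumes_in_switch_to)
		contextidr = next->pid;
}
```

With the hook after the switch, the register still holds the previous task's value during the first run of a new thread; with the hook before the switch, it is already correct when the new thread starts at ret_from_fork.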

-- 
Catalin


Re: [PATCH v2] arm64: Fix task tracing

2013-04-15 Thread Catalin Marinas
On Mon, Apr 15, 2013 at 02:09:20PM +0100, Christopher Covington wrote:
> On 04/15/2013 07:43 AM, Catalin Marinas wrote:
> > On Mon, Apr 15, 2013 at 11:58:40AM +0100, Catalin Marinas wrote:
> >> On Mon, Apr 15, 2013 at 11:45:42AM +0100, Will Deacon wrote:
> >>> On Mon, Apr 15, 2013 at 11:11:59AM +0100, Catalin Marinas wrote:
> >>>> On Tue, Apr 09, 2013 at 01:33:34PM +0100, Christopher Covington wrote:
> >>>>> For accurate accounting pass contextidr_thread_switch the prev
> >>>>> task pointer, since cpu_switch_to has at that point changed the
> >>>>> stack pointer.
> >>>>>
> >>>>> Signed-off-by: Christopher Covington 
> >>>>> ---
> >>>>>  arch/arm64/kernel/process.c | 2 +-
> >>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> >>>>> index 0337cdb..a49b25a 100644
> >>>>> --- a/arch/arm64/kernel/process.c
> >>>>> +++ b/arch/arm64/kernel/process.c
> >>>>> @@ -315,7 +315,7 @@ struct task_struct *__switch_to(struct task_struct *prev,
> >>>>> /* the actual thread switch */
> >>>>> last = cpu_switch_to(prev, next);
> >>>>>  
> >>>>> -   contextidr_thread_switch(next);
> >>>>> +   contextidr_thread_switch(prev);
> >>>>
> >>>> The original code was indeed wrong but using prev isn't any better. For
> >>>> a newly created thread, prev is probably 0 (if it's in a register,
> >>>> cpu_context has been zeroed by copy_thread()) or some random stack
> >>>> value.
> 
> 
> I have to disagree with the statement that using prev isn't _any_ better.
> Even if there are unhandled cases, from my observations, using prev is
> _measurably_ better. On the other hand, I agree that 100% accuracy is 
> essential.
> 

It is indeed better but we still miss the task creation (we only start
tracing once the task is scheduled out and scheduled back in).

> >>> Really? If prev is NULL in context_switch(...), the scheduler will 
> >>> implode,
> >>> and I can't see where else switch_to is called from.
> >>>
> >>> Which code path are you thinking of?
> >>
> >> copy_thread() zeros cpu_context which is used by cpu_switch_to() to load
> >> the next saved registers. The switch_to() function sets prev to last as
> >> returned by __switch_to(), so this is valid but in __switch_to() we
> >> don't have a valid prev (nor next) after cpu_switch_to() for newly
> >> created threads.
> > 
> > Correction - newly created threads return to ret_from_fork rather than
> > __switch_to(), which means that we miss the first
> > contextidr_thread_switch() call for a new thread. I would vote for
> > Christopher's original patch moving the call before cpu_switch_to(). The
> > alternative is to define finish_arch_switch() and add the call there. If
> > you are primarily tracing user space, it doesn't really matter whether
> > the stack was switched or not when we set the contextidr. For kernel
> > tracking, it could be a problem as we have the next task with the old
> > stack. But the same could be said about the prev task with the new
> > stack.
> 
> I'm fine with using either of my previous patches (or are there still cases
> where the second one is suspected to be wrong?) or rolling a new one using
> finish_arch_switch(). Let me know if you all would prefer for me to start on
> the latter.

The second patch is not wrong but insufficient since it doesn't cover
ret_from_fork. Will has a point that debuggers may use the contextidr
event to look into the state of the thread which would still have the old
stack with your first patch. But at least it is consistent with the
arch/arm implementation which uses notifiers.

So I would go with your first patch until we hear otherwise from the
debugger people, in which case we would probably need to fix both
arch/arm and arch/arm64.

-- 
Catalin


Re: [RFC PATCH 1/3] pstore-ram: use write-combine mappings

2013-04-16 Thread Catalin Marinas
On Tue, Apr 16, 2013 at 01:58:27PM +0100, Rob Herring wrote:
> On 04/16/2013 03:44 AM, Will Deacon wrote:
> > On Tue, Apr 16, 2013 at 01:43:09AM +0100, Colin Cross wrote:
> >> On Mon, Apr 15, 2013 at 4:59 PM, Rob Herring  wrote:
> >>> Exclusive accesses still have further restrictions. From section 3.4.5:
> >>>
> >>> • It is IMPLEMENTATION DEFINED whether LDREX and STREX operations can be
> >>> performed to a memory region
> >>>with the Device or Strongly-ordered memory attribute. Unless the
> >>> implementation documentation explicitly
> >>>   states that LDREX and STREX operations to a memory region with the
> >>> Device or Strongly-ordered attribute are
> >>>  permitted, the effect of such operations is UNPREDICTABLE.
> >>>
> >>>
> >>> Given that it is implementation defined, I don't see how Linux can rely
> >>> on that behavior.
> >>
> >> I see, the problem is that while noncached and writecombined appear to
> >> be similar mappings, noncached is mapped in PRRR to strongly-ordered,
> >> while writecombined is mapped to unbufferable normal memory.
> >>
> >> I think adding a wmb() to persistent_ram_write is going to be
> >> expensive on cpus with outer caches like the L2X0, where wmb() will
> >> result in a spinlock.  Is there a real SoC where this doesn't work?
> > 
> > A real SoC where exclusives don't work to memory not mapped as normal? Take
> > your pick...
> 
> This patch doesn't actually fix problems for me. Exclusives to DDR work
> for any memory type for me as the DDR controller has an exclusive
> monitor. It takes write-thru cache mapping to get internal RAM to work.

I can't find any reference in the ARM ARM but I think you would need
cacheable memory for the exclusives to work. A9 for example uses the
cacheline exclusiveness to emulate the global monitor.
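
For context, the kind of update at stake is a compare-and-swap retry loop, which compilers for ARMv7 typically build out of an LDREX/STREX pair. The sketch below uses C11 atomics in user space; toy_ring and toy_ring_advance() are hypothetical names and this is not the pstore code, only an illustration of the access pattern that needs a working exclusive monitor for the mapped memory type.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical ring-buffer header living in the mapped region. */
struct toy_ring {
	atomic_size_t start;	/* next write offset */
	size_t size;		/* capacity in bytes, must be non-zero */
};

/*
 * Atomically advance the write offset by len (modulo size) and return the
 * offset the caller may write at. The loop retries whenever another writer
 * changed start between the load and the store.
 */
static size_t toy_ring_advance(struct toy_ring *r, size_t len)
{
	size_t old = atomic_load(&r->start);
	size_t new_start;

	do {
		new_start = (old + len) % r->size;
	} while (!atomic_compare_exchange_weak(&r->start, &old, new_start));

	return old;
}
```

If the exclusive store can never succeed on a given memory type, a loop like this spins forever, and where the behaviour is UNPREDICTABLE nothing can be relied on at all.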

-- 
Catalin


Re: [PATCH] arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED

2013-03-19 Thread Catalin Marinas
On Tue, Mar 05, 2013 at 08:43:42PM +, Paul Bolle wrote:
> Config option GENERIC_HARDIRQS_NO_DEPRECATED was removed in commit
> 78c89825649a9a5ed526c507603196f467d781a5 ("genirq: Remove the now obsolete
> config options and select statements"), but the select was accidentally
> reintroduced in commit 8c2c3df31e3b87cb5348e48776c366ebd1dc5a7a ("arm64:
> Build infrastructure").
> 
> Signed-off-by: Paul Bolle 

Applied, thanks (it will appear in mainline at some point).

-- 
Catalin

