On Sat, Oct 23, 2021 at 06:04:57PM +0200, Arnd Bergmann wrote:
> On Sat, Oct 23, 2021 at 3:37 AM Waiman Long wrote:
> >> On 10/22/21 7:59 AM, Arnd Bergmann wrote:
> > > From: Arnd Bergmann
> > >
> > > As this is all dead code, just remove it and the helper functions built
> > > around it. For arc
On Mon, Oct 18, 2021 at 02:46:18PM +1100, Michael Ellerman wrote:
> Peter Zijlstra writes:
> > On Wed, Oct 06, 2021 at 07:36:50PM +0530, Kajol Jain wrote:
> >
> >> Kajol Jain (4):
> >> perf: Add comment about current state of PERF_MEM_LVL_* namespace
On Fri, Oct 08, 2021 at 04:22:27PM +0100, Valentin Schneider wrote:
> So x86 has it default yes, and a lot of others (e.g. arm64) have it default
> no.
>
> IMO you don't gain much by disabling them. SCHED_MC and SCHED_CLUSTER only
> control the presence of a sched_domain_topology_level - if it's
On Fri, Oct 15, 2021 at 02:20:33PM -0400, Steven Rostedt wrote:
> On Fri, 15 Oct 2021 20:04:29 +0200
> Peter Zijlstra wrote:
>
> > On Fri, Oct 15, 2021 at 01:58:06PM -0400, Steven Rostedt wrote:
> > > Something like this:
> >
> > I think having one cop
On Fri, Oct 15, 2021 at 01:58:06PM -0400, Steven Rostedt wrote:
> Something like this:
I think having one copy of that in a header is better than having 3
copies. But yes, something along those lines.
On Fri, Oct 15, 2021 at 11:00:35AM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)"
>
> While writing an email explaining the "bit = 0" logic for a discussion on
> bit = trace_get_context_bit() + start;
While there, you were also going to update that function to match/use
ge
On Tue, Oct 12, 2021 at 01:40:31PM +0800, 王贇 wrote:
> diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
> index 6aed10e..33c2f76 100644
> --- a/kernel/trace/trace_event_perf.c
> +++ b/kernel/trace/trace_event_perf.c
> @@ -441,12 +441,19 @@ void perf_trace_buf_update(vo
--
> tools/include/uapi/linux/perf_event.h | 19 ---
> tools/perf/util/mem-events.c | 20 ++--
> 5 files changed, 73 insertions(+), 13 deletions(-)
Acked-by: Peter Zijlstra (Intel)
How do we want this routed? Shall I take it, or does Michael want it in
the Power tree?
On Tue, Oct 05, 2021 at 02:48:35PM +0530, Kajol Jain wrote:
> Going forward, future generation systems can have more hierarchy
> within the chip/package level, but currently perf has no data source
> encoding field that can be used to represent this level of data.
>
> Add a new fiel
Neri
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Joel Fernandes (Google)
Reviewed-by: Len Brown
Link: https://lkml.kernel.org/r/20210911011819.12184-7-ricardo.neri-calde...@linux.intel.com
---
kernel/sched/fair.c | 92
1 file chang
On Tue, Sep 14, 2021 at 08:40:38PM +1000, Michael Ellerman wrote:
> Peter Zijlstra writes:
> > I'm thinking we ought to keep hops as steps along the NUMA fabric, with
> > 0 hops being the local node. That only gets us:
> >
> > L2, remote=0, hops=HOPS_0 -- our L
On Thu, Sep 09, 2021 at 10:45:54PM +1000, Michael Ellerman wrote:
> > The 'new' composite doesn't have a hops field because the hardware that
> > necessitated that change doesn't report it, but we could easily add a
> > field there.
> >
> > Suppose we add, mem_hops:3 (would 6 hops be too small?) an
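For context, the encoding being discussed is a small bitfield in the perf_mem_data_src union. A minimal sketch of the shape of such an extension, with field widths and neighbouring fields chosen purely for illustration (not the final perf_event.h layout):

#include <linux/types.h>

/*
 * Illustrative only: a mem_hops-style field next to a couple of the
 * existing data-source fields.  Widths and positions are assumptions
 * made for this sketch, not the real uAPI encoding.
 */
union sketch_perf_mem_data_src {
	__u64 val;
	struct {
		__u64 mem_op:5,		/* type of opcode */
		      mem_lvl_num:4,	/* memory hierarchy level number */
		      mem_remote:1,	/* remote */
		      mem_hops:3,	/* hops from the CPU to the data */
		      mem_rsvd:51;
	};
};

With a 3-bit field, hop values 0 through 7 can be encoded; anything beyond that would need a wider field or a reserved "more than N" value.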
On Wed, Sep 08, 2021 at 05:17:53PM +1000, Michael Ellerman wrote:
> Kajol Jain writes:
> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > index f92880a15645..030b3e990ac3 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_ev
you'd tried PREEMPT_DYNAMIC, since that should really
stress the thing, but I see that also requires GENERIC_ENTRY and you
don't have that. Alas.
Acked-by: Peter Zijlstra (Intel)
On Tue, Aug 31, 2021 at 01:12:26PM +, Christophe Leroy wrote:
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 36b72d972568..a0fe69d8ec83 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -247,6 +247,7 @@ config PPC
> select HAVE_SOFTIRQ_ON_OWN_STACK
On Tue, Aug 31, 2021 at 08:05:21AM +, Christophe Leroy wrote:
> +#define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \
> + asm(".pushsection .text, \"ax\" \n" \
> + ".align 4 \n" \
> +
On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > +/**
> > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > + * @dst_cpu: Destination CPU of the load balancing
> > + * @sds: Load-balancing data with statistics of the local group
>
On Fri, Aug 27, 2021 at 09:45:37AM +, Christophe Leroy wrote:
> This RFC is to validate the concept of static_call on powerpc.
>
> Heavily copied from x86.
>
> It replaces ppc_md.get_irq(), which is called on every IRQ, with
> a static call.
The code looks saner, but does it actually improve per
On Mon, Aug 23, 2021 at 03:04:37PM +0530, Srikar Dronamraju wrote:
> * Peter Zijlstra [2021-08-23 10:33:30]:
>
> > On Sat, Aug 21, 2021 at 03:55:32PM +0530, Srikar Dronamraju wrote:
> > > Scheduler expects unique number of node distances to be available
> > > at
On Sat, Aug 21, 2021 at 03:55:32PM +0530, Srikar Dronamraju wrote:
> The scheduler expects the number of unique node distances to be available
> at boot. It uses the node distance table to calculate these unique node
> distances. On Power servers, node distances for offline nodes are not
> available. However, Power Se
On Wed, Aug 04, 2021 at 04:39:44PM +1000, Nicholas Piggin wrote:
> For that matter, I wonder if we shouldn't do something like this
> (untested) so the low level batch flush has visibility to the high
> level flush range.
>
> x86 could use this too AFAIKS, just needs to pass the range a bit
> fur
On Fri, Jun 25, 2021 at 02:23:16PM +0530, Bharata B Rao wrote:
> On Fri, Jun 25, 2021 at 09:28:09AM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 25, 2021 at 11:16:08AM +0530, Srikar Dronamraju wrote:
> > > * Bharata B Rao [2021-06-24 21:25:09]:
> > >
> >
On Fri, Jun 25, 2021 at 11:16:08AM +0530, Srikar Dronamraju wrote:
> * Bharata B Rao [2021-06-24 21:25:09]:
>
> > A PowerPC KVM guest gets the following BUG message when booting
> > linux-next-20210623:
> >
> > smp: Bringing up secondary CPUs ...
> > BUG: scheduling while atomic: swapper/1/0/0x0
On Wed, Jun 23, 2021 at 01:40:38PM +0530, kajoljain wrote:
>
>
> On 6/22/21 6:44 PM, Peter Zijlstra wrote:
> > On Thu, Jun 17, 2021 at 06:56:13PM +0530, Kajol Jain wrote:
> >> ---
> >> Kajol Jain (4):
> >> drivers/nvdimm: Add nvdimm pmu structure
>
ent papr_scm sysfs event format entries
Don't see anything obviously wrong with this one.
Acked-by: Peter Zijlstra (Intel)
On Tue, Jun 08, 2021 at 05:26:58PM +0530, Kajol Jain wrote:
> +static int nvdimm_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
> +{
> + struct nvdimm_pmu *nd_pmu;
> + u32 target;
> + int nodeid;
> + const struct cpumask *cpumask;
> +
> + nd_pmu = hlist_entry_safe(no
On Mon, May 24, 2021 at 09:48:29PM +0530, Srikar Dronamraju wrote:
> * Valentin Schneider [2021-05-24 15:16:09]:
> > I suppose one way to avoid the hook would be to write some "fake" distance
> > values into your distance_lookup_table[] for offline nodes using your
> > distance_ref_point_depth th
On Wed, May 26, 2021 at 12:56:58PM +0530, kajoljain wrote:
> On 5/25/21 7:46 PM, Peter Zijlstra wrote:
> > On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
> >> It adds cpumask to designate a cpu to make HCALL to
> >> collect the counter data for the nvdimm
On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
> Patch here adds cpu hotplug functions to nvdimm pmu.
I'm thinking "Patch here" qualifies for "This patch", see
Documentation/process/submitting-patches.rst .
> It adds cpumask to designate a cpu to make HCALL to
> collect the counter d
On Fri, May 21, 2021 at 08:08:02AM +0530, Srikar Dronamraju wrote:
> * Peter Zijlstra [2021-05-20 20:56:31]:
>
> > On Thu, May 20, 2021 at 09:14:25PM +0530, Srikar Dronamraju wrote:
> > > Currently scheduler populates the distance map by looking at distance
> > >
On Thu, May 20, 2021 at 09:14:25PM +0530, Srikar Dronamraju wrote:
> Currently the scheduler populates the distance map by looking at the distance
> of each node from all other nodes. This should work for most
> architectures and platforms.
>
> However there are some architectures like POWER that may not
On Tue, May 18, 2021 at 12:07:40PM -0700, Ricardo Neri wrote:
> On Fri, May 14, 2021 at 07:14:15PM -0700, Ricardo Neri wrote:
> > On Fri, May 14, 2021 at 11:47:45AM +0200, Peter Zijlstra wrote:
> > > So I'm thinking that this is a property of having ASYM_PACKING at a core
>
On Thu, May 13, 2021 at 05:56:14PM +0530, kajoljain wrote:
> But yes the current read/add/del functions are not adding value. We
> could add an arch/platform specific function which could handle the
> capturing of the counter data and do the rest of the operation here.
> Is this approach better?
On Wed, May 12, 2021 at 10:08:21PM +0530, Kajol Jain wrote:
> +static void nvdimm_pmu_read(struct perf_event *event)
> +{
> + struct nvdimm_pmu *nd_pmu = to_nvdimm_pmu(event->pmu);
> +
> + /* jump to arch/platform specific callbacks if any */
> + if (nd_pmu && nd_pmu->read)
> +
On Fri, May 07, 2021 at 03:03:51PM -0500, Christopher M. Riedl wrote:
> On Thu May 6, 2021 at 5:51 AM CDT, Peter Zijlstra wrote:
> > On Wed, May 05, 2021 at 11:34:51PM -0500, Christopher M. Riedl wrote:
> > > Powerpc allows for multiple CPUs to patch concurrently. When p
On Wed, May 05, 2021 at 11:34:51PM -0500, Christopher M. Riedl wrote:
> Powerpc allows for multiple CPUs to patch concurrently. When patching
> with STRICT_KERNEL_RWX a single patching_mm is allocated for use by all
> CPUs for the few times that patching occurs. Use a spinlock to protect
> the patc
On Wed, Apr 07, 2021 at 08:52:08AM +0900, Stafford Horne wrote:
> Why doesn't RISC-V add the xchg16 emulation code similar to OpenRISC? For
> OpenRISC we added xchg16 and xchg8 emulation code to enable qspinlocks. So
> one thought is with CONFIG_ARCH_USE_QUEUED_SPINLOCKS_XCHG32=y, can we remove
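For reference, the kind of 16-bit xchg emulation being referred to is a cmpxchg loop on the aligned 32-bit word containing the halfword. A sketch under the assumption of little-endian byte ordering; illustrative only, not the actual OpenRISC implementation:

/*
 * Emulate a 16-bit xchg with a 32-bit cmpxchg loop on the containing
 * aligned word.  Little-endian layout assumed; illustrative sketch only.
 */
static inline u16 sketch_xchg16(volatile u16 *ptr, u16 new)
{
	u32 *word = (u32 *)((unsigned long)ptr & ~0x3UL);
	int shift = ((unsigned long)ptr & 0x2) * BITS_PER_BYTE;
	u32 mask = 0xffffU << shift;
	u32 old, tmp;

	do {
		old = READ_ONCE(*word);
		tmp = (old & ~mask) | ((u32)new << shift);
	} while (cmpxchg(word, old, tmp) != old);

	return (old & mask) >> shift;
}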
On Thu, Mar 25, 2021 at 10:01:35AM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Mar 24, 2021 at 10:05:23AM +0530, Madhavan Srinivasan wrote:
> >
> > On 3/22/21 8:27 PM, Athira Rajeev wrote:
> > > Performance Monitoring Unit (PMU) registers in powerpc provide
> > > information on cycles ela
r) != 0)
> + is_kernel_addr(addr) && event->attr.exclude_kernel)
> continue;
>
> /* Branches are read most recent first (ie. mfbhrb 0 is
Acked-by: Peter Zijlstra (Intel)
On Tue, Feb 23, 2021 at 01:31:49AM -0500, Athira Rajeev wrote:
> Running "perf mem record" in powerpc platforms with selinux enabled
> resulted in soft lockup's. Below call-trace was seen in the logs:
>
> CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2
> NIP: c0dff3d4 LR: c000
On Tue, Feb 02, 2021 at 09:54:36AM +, Nadav Amit wrote:
> > On Feb 2, 2021, at 1:31 AM, Peter Zijlstra wrote:
> >
> > On Tue, Feb 02, 2021 at 07:20:55AM +, Nadav Amit wrote:
> >> Arm does not define tlb_end_vma, and consequently it flushes the TLB after
> &g
On Tue, Feb 02, 2021 at 02:41:13PM +0530, Aneesh Kumar K.V wrote:
> pmd/pud_populate is the right interface to be used to set the respective
> page table entries. Some architectures do assume that set_pmd/pud_at
> can only be used to set a hugepage PTE. Since we are not setting up a hugepage
> PTE
On Tue, Feb 02, 2021 at 07:20:55AM +, Nadav Amit wrote:
> Arm does not define tlb_end_vma, and consequently it flushes the TLB after
> each VMA. I suspect it is not intentional.
ARM is one of those that look at the VM_EXEC bit to explicitly flush
ITLB IIRC, so it has to.
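To illustrate the point, a sketch (not the actual arch/arm code) of a per-VMA teardown hook that keys the extra I-side maintenance off VM_EXEC; an architecture that needs this cannot simply drop its per-VMA flush:

/* Hypothetical stand-in for the architecture's I-TLB/I-cache flush. */
static inline void sketch_flush_itlb_range(unsigned long start, unsigned long end)
{
}

static inline void sketch_tlb_end_vma(struct mmu_gather *tlb,
				      struct vm_area_struct *vma)
{
	if (tlb->fullmm)
		return;					/* full-mm flush covers it */

	flush_tlb_range(vma, vma->vm_start, vma->vm_end);
	if (vma->vm_flags & VM_EXEC)			/* executable mapping */
		sketch_flush_itlb_range(vma->vm_start, vma->vm_end);
}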
On Sun, Jan 31, 2021 at 07:57:01AM +, Nadav Amit wrote:
> > On Jan 30, 2021, at 7:30 PM, Nicholas Piggin wrote:
> > I'll go through the patches a bit more closely when they all come
> > through. Sparc and powerpc of course need the arch lazy mode to get
> > per-page/pte information for oper
On Sat, Jan 30, 2021 at 04:11:23PM -0800, Nadav Amit wrote:
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index 427bfcc6cdec..b97136b7010b 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -334,8 +334,8 @@ static inline void __tlb_reset_range(
On Thu, Jan 28, 2021 at 02:10:37PM +0100, Dietmar Eggemann wrote:
> Dietmar Eggemann (3):
> sched: Remove MAX_USER_RT_PRIO
> sched: Remove USER_PRIO, TASK_USER_PRIO and MAX_USER_PRIO
> sched/core: Update task_prio() function header
Thanks!
On Tue, Jan 05, 2021 at 08:20:51AM -0800, Andy Lutomirski wrote:
> > Interestingly, the architecture recently added a control bit to remove
> > this synchronisation from exception return, so if we set that then we'd
> > have a problem with SYNC_CORE and adding an ISB would be necessary
On Mon, Dec 07, 2020 at 04:54:21PM -0800, Dan Williams wrote:
> [ add perf maintainers ]
>
> On Sun, Nov 8, 2020 at 1:16 PM Vaibhav Jain wrote:
> >
> > Implement support for exposing generic nvdimm statistics via newly
> > introduced dimm-command ND_CMD_GET_STAT that can be handled by nvdimm
> >
On Thu, Dec 03, 2020 at 03:54:42PM +0100, Heiko Carstens wrote:
> On Thu, Dec 03, 2020 at 08:28:21AM -0500, Sasha Levin wrote:
> > From: Peter Zijlstra
> >
> > [ Upstream commit 58c644ba512cfbc2e39b758dd979edd1d6d00e27 ]
> >
> > We call arch_cpu_idle(
On Wed, Dec 02, 2020 at 09:25:51PM -0800, Andy Lutomirski wrote:
> power: same as ARM, except that the loop may be rather larger since
> the systems are bigger. But I imagine it's still faster than Nick's
> approach -- a cmpxchg to a remote cacheline should still be faster than
> an IPI shootdown
On Wed, Dec 02, 2020 at 06:38:12AM -0800, Andy Lutomirski wrote:
>
> > On Dec 2, 2020, at 6:20 AM, Peter Zijlstra wrote:
> >
> > On Sun, Nov 29, 2020 at 02:01:39AM +1000, Nicholas Piggin wrote:
> >> + * - A delayed freeing and RCU-like quiescing sequence
On Sun, Nov 29, 2020 at 02:01:39AM +1000, Nicholas Piggin wrote:
> + * - A delayed freeing and RCU-like quiescing sequence based on
> + * mm switching to avoid IPIs completely.
That one's interesting too. So basically you want to count switch_mm()
invocations on each CPU
On Wed, Dec 02, 2020 at 12:17:31PM +0100, Peter Zijlstra wrote:
> So the obvious 'improvement' here would be something like:
>
> for_each_online_cpu(cpu) {
> p = rcu_dereference(cpu_rq(cpu)->curr);
> if (p->active_mm != mm)
>
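A hedged, fleshed-out sketch of the idea in the truncated loop above: only CPUs whose current task is still (lazily) using the mm need to be targeted. Names such as collect_lazy_mm_cpus and tmpmask are illustrative, and cpu_rq() is kernel/sched-internal:

static void collect_lazy_mm_cpus(struct mm_struct *mm, struct cpumask *tmpmask)
{
	int cpu;

	cpumask_clear(tmpmask);

	rcu_read_lock();
	for_each_online_cpu(cpu) {
		struct task_struct *p = rcu_dereference(cpu_rq(cpu)->curr);

		if (p->active_mm == mm)
			__cpumask_set_cpu(cpu, tmpmask);	/* still needs an IPI */
	}
	rcu_read_unlock();
}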
On Sun, Nov 29, 2020 at 02:01:39AM +1000, Nicholas Piggin wrote:
> +static void shoot_lazy_tlbs(struct mm_struct *mm)
> +{
> + if (IS_ENABLED(CONFIG_MMU_LAZY_TLB_SHOOTDOWN)) {
> + /*
> + * IPI overheads have not been found to be expensive, but they could
> + * b
On Mon, Nov 30, 2020 at 10:30:00AM +0100, Peter Zijlstra wrote:
> On Sat, Nov 28, 2020 at 07:54:57PM -0800, Andy Lutomirski wrote:
> > This means that mm_cpumask operations won't need to be full barriers
> > forever, and we might not want to take the implied full barriers
On Sat, Nov 28, 2020 at 07:54:57PM -0800, Andy Lutomirski wrote:
> This means that mm_cpumask operations won't need to be full barriers
> forever, and we might not want to take the implied full barriers in
> set_bit() and clear_bit() for granted.
There is no implied full barrier for those ops.
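As a concrete illustration (a generic sketch, not tied to any architecture): the atomic bitops are unordered read-modify-write operations, so any required ordering has to be spelled out with the explicit barrier helpers.

static inline void cpumask_clear_cpu_ordered(int cpu, struct cpumask *mask)
{
	smp_mb__before_atomic();	/* order prior accesses before the clear */
	cpumask_clear_cpu(cpu, mask);	/* clear_bit() under the hood */
	smp_mb__after_atomic();		/* order the clear before later accesses */
}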
On Sat, Nov 28, 2020 at 07:54:57PM -0800, Andy Lutomirski wrote:
> Version (b) seems fairly straightforward to implement -- add RCU
> protection and an atomic_t special_ref_cleared (initially 0) to struct
> mm_struct itself. After anyone clears a bit in mm_cpumask (which is
> already a barrier),
N
On Sun, Nov 29, 2020 at 12:16:26PM -0800, Andy Lutomirski wrote:
> On Sat, Nov 28, 2020 at 7:54 PM Andy Lutomirski wrote:
> >
> > On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin wrote:
> > >
> > > On big systems, the mm refcount can become highly contented when doing
> > > a lot of context switch
On Thu, Nov 26, 2020 at 12:56:06PM +, Matthew Wilcox wrote:
> On Thu, Nov 26, 2020 at 01:42:07PM +0100, Peter Zijlstra wrote:
> > + pgdp = pgd_offset(mm, addr);
> > + pgd = READ_ONCE(*pgdp);
>
> I forget how x86-32-PAE maps to Linux's PGD/P4D/PUD/PMD scheme, b
On Thu, Nov 26, 2020 at 12:43:00PM +, Matthew Wilcox wrote:
> On Thu, Nov 26, 2020 at 01:01:15PM +0100, Peter Zijlstra wrote:
> > +#ifdef CONFIG_GUP_GET_PTE_LOW_HIGH
> > +/*
> > + * WARNING: only to be used in the get_user_pages_fast() implementation.
> > + * Wit
Now with pmd_cont() defined...
---
Subject: arm64/mm: Implement pXX_leaf_size() support
From: Peter Zijlstra
Date: Fri Nov 13 11:46:06 CET 2020
ARM64 has non-pagetable-aligned large page support with PTE_CONT: when
this bit is set, the page is part of a super-page. Match the hugetlb
code and
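Concretely, the change is along these lines; a sketch assuming the pte_cont()/pmd_cont() helpers and the CONT_{PTE,PMD}_SIZE constants, not necessarily the exact hunk:

/*
 * With PTE_CONT set the entry is part of a contiguous range, so the
 * effective leaf size is the CONT_* size rather than the base size.
 */
#define pmd_leaf_size(pmd)	(pmd_cont(pmd) ? CONT_PMD_SIZE : PMD_SIZE)
#define pte_leaf_size(pte)	(pte_cont(pte) ? CONT_PTE_SIZE : PAGE_SIZE)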
On Thu, Nov 26, 2020 at 12:34:58PM +, Matthew Wilcox wrote:
> On Thu, Nov 26, 2020 at 01:01:17PM +0100, Peter Zijlstra wrote:
> > The (new) page-table walker in arch_perf_get_page_size() is broken in
> > various ways. Specifically while it is used in a lockless manner, it
> &
In order to write another lockless page-table walker, we need
gup_get_pte() exposed. While doing that, rename it to
ptep_get_lockless() to match the existing ptep_get() naming.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/pgtable.h | 55
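For architectures that can read a PTE with a single access, the exposed helper can simply be the plain read; a minimal sketch of that fallback, assuming the existing ptep_get() (the PAE-style split-load case needs the more careful sequence gup_get_pte() already provided):

static inline pte_t ptep_get_lockless(pte_t *ptep)
{
	return ptep_get(ptep);
}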
Hi,
These patches provide generic infrastructure to determine TLB page size from
page table entries alone. Perf will use this (for either data or code address)
to aid in profiling TLB issues.
While most architectures only have page table aligned large pages, some
(notably ARM64, Sparc64 and Power
-by: Peter Zijlstra (Intel)
---
arch/arm64/include/asm/pgtable.h |3 +++
1 file changed, 3 insertions(+)
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -503,6 +503,9 @@ extern pgprot_t phys_mem_access_prot(str
PMD_TYPE_SECT
"perf,mm: Handle non-page-table-aligned hugetlbfs")
Fixes: 8d97e71811aa ("perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE")
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Kan Liang
---
arch/arm64/include/asm/pgtable.h|3 +
arch/sparc/include/asm/pgtable_64.h | 13
ar
Sparc64 has non-pagetable aligned large page support; wire up the
pXX_leaf_size() functions to report the correct pagetable page size.
This enables PERF_SAMPLE_{DATA,CODE}_PAGE_SIZE to report accurate
pagetable leaf sizes.
Signed-off-by: Peter Zijlstra (Intel)
---
arch/sparc/include/asm
8M entry.
>
> In the PTE, we have two bits: _PAGE_SPS and _PAGE_HUGE
>
> _PAGE_HUGE means it is a 512k page
> _PAGE_SPS means it is not a 4k page
>
> The kernel can be built either with 4k pages as the standard page size, or
> 16k pages. It doesn't change the page table l
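Based only on the bit semantics described above, and assuming a 4k base page size, a leaf-size helper for such a layout could look roughly like this; purely illustrative, not the actual 8xx code:

/*
 * _PAGE_HUGE -> 512k page; _PAGE_SPS -> "not a 4k page", i.e. 16k here.
 */
static inline unsigned long sketch_pte_leaf_size(pte_t pte)
{
	unsigned long val = pte_val(pte);

	if (val & _PAGE_HUGE)
		return SZ_512K;
	if (val & _PAGE_SPS)
		return SZ_16K;
	return SZ_4K;
}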
A number of architectures have non-pagetable aligned huge/large pages.
For such architectures a leaf can actually be part of a larger entry.
Provide generic helpers to determine the size of a page-table leaf.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/pgtable.h | 16
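A sketch of what such generic helpers can look like: overridable per-architecture defaults that fall back to the page-table level sizes (the exact levels and defaults in the posted patch may differ):

#ifndef pgd_leaf_size
#define pgd_leaf_size(x)	(1ULL << PGDIR_SHIFT)
#endif
#ifndef p4d_leaf_size
#define p4d_leaf_size(x)	P4D_SIZE
#endif
#ifndef pud_leaf_size
#define pud_leaf_size(x)	PUD_SIZE
#endif
#ifndef pmd_leaf_size
#define pmd_leaf_size(x)	PMD_SIZE
#endif
#ifndef pte_leaf_size
#define pte_leaf_size(x)	PAGE_SIZE
#endif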
On Fri, Nov 20, 2020 at 01:20:04PM +0100, Peter Zijlstra wrote:
> > > I can help with powerpc 8xx. It is a 32 bits powerpc. The PGD has 1024
> > > entries, that means each entry maps 4M.
> > >
> > > Page sizes are 4k, 16k, 512k and 8M.
> > >
>
On Fri, Nov 20, 2020 at 12:18:22PM +0100, Christophe Leroy wrote:
> Hi Peter,
>
> On 13/11/2020 at 14:44, Christophe Leroy wrote:
> > Hi
> >
> > > On 13/11/2020 at 12:19, Peter Zijlstra wrote:
> > > Hi,
> > >
> > > These patches pro
r.
>
> Add an arch override allowing powerpc to use clear_tasks_mm_cpumask().
>
> Signed-off-by: Nicholas Piggin
Seems reasonable enough..
Acked-by: Peter Zijlstra (Intel)
> ---
> kernel/cpu.c | 6 +-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> di
On Mon, Nov 16, 2020 at 08:36:36AM -0800, Dave Hansen wrote:
> On 11/16/20 8:32 AM, Matthew Wilcox wrote:
> >>
> >> That's really the best we can do from software without digging into
> >> microarchitecture-specific events.
> > I mean this is perf. Digging into microarch specific events is what it
On Mon, Nov 16, 2020 at 08:28:23AM -0800, Dave Hansen wrote:
> On 11/16/20 7:54 AM, Matthew Wilcox wrote:
> > It gets even more complicated with CPUs with multiple levels of TLB
> > which support different TLB entry sizes. My CPU reports:
> >
> > TLB info
> > Instruction TLB: 2M/4M pages, fully
On Fri, Nov 13, 2020 at 12:19:03PM +0100, Peter Zijlstra wrote:
> A number of architectures have non-pagetable aligned huge/large pages.
> For such architectures a leaf can actually be part of a larger TLB
> entry.
>
> Provide generic helpers to determine the TLB size of a pa
Sparc64 has non-pagetable aligned large page support; wire up the
pXX_leaf_size() functions to report the correct TLB page size.
This enables PERF_SAMPLE_{DATA,CODE}_PAGE_SIZE to report accurate TLB
page sizes.
Signed-off-by: Peter Zijlstra (Intel)
---
arch/sparc/include/asm/pgtable_64.h
Hi,
These patches provide generic infrastructure to determine TLB page size from
page table entries alone. Perf will use this (for either data or code address)
to aid in profiling TLB issues.
While most architectures only have page table aligned large pages, some
(notably ARM64, Sparc64 and Power
Handle non-page-table-aligned hugetlbfs")
Fixes: 8d97e71811aa ("perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE")
Signed-off-by: Peter Zijlstra (Intel)
---
arch/arm64/include/asm/pgtable.h|3 +
arch/sparc/include/asm/pgtable_64.h | 13
arch/sparc/mm/hugetlbpage.c |
In order to write another lockless page-table walker, we need
gup_get_pte() exposed. While doing that, rename it to
ptep_get_lockless() to match the existing ptep_get() naming.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/pgtable.h | 55
: Peter Zijlstra (Intel)
---
arch/arm64/include/asm/pgtable.h |3 +++
1 file changed, 3 insertions(+)
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -503,6 +503,9 @@ extern pgprot_t phys_mem_access_prot(str
PMD_TYPE_SECT
A number of architectures have non-pagetable aligned huge/large pages.
For such architectures a leaf can actually be part of a larger TLB
entry.
Provide generic helpers to determine the TLB size of a page-table
leaf.
Signed-off-by: Peter Zijlstra (Intel)
---
include/linux/pgtable.h | 16
On Wed, Nov 11, 2020 at 02:39:01PM +0100, Christophe Leroy wrote:
> Hello,
>
> > On 11/11/2020 at 12:07, Nicholas Piggin wrote:
> > This passes atomic64 selftest on ppc32 on qemu (uniprocessor only)
> > both before and after powerpc is converted to use ARCH_ATOMIC.
>
> Can you explain what this c
On Fri, Oct 16, 2020 at 07:56:16AM +0100, Christoph Hellwig wrote:
> On Thu, Oct 15, 2020 at 10:01:54AM -0500, Christopher M. Riedl wrote:
> > Functions called between user_*_access_begin() and user_*_access_end()
> > should be either inlined or marked 'notrace' to prevent leaving
> > userspace acc
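A small sketch of the rule being quoted, using the standard user_access_begin()/unsafe_put_user()/user_access_end() API (the helper name is illustrative): nothing traceable may run while the user-access window is open, so the helper is either inlined into its caller or marked notrace.

static notrace long sketch_put_user_word(unsigned long __user *uaddr,
					 unsigned long val)
{
	if (!user_access_begin(uaddr, sizeof(*uaddr)))
		return -EFAULT;

	unsafe_put_user(val, uaddr, efault);
	user_access_end();
	return 0;

efault:
	user_access_end();
	return -EFAULT;
}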
On Thu, Sep 24, 2020 at 09:51:38AM -0400, Steven Rostedt wrote:
> > It turns out that getting selected for pull-balance is exactly that
> > condition, and clearly a migrate_disable() task cannot be pulled, but we
> > can use that signal to try and pull away the running task that's in the
> > way.
On Thu, Sep 24, 2020 at 08:32:41AM -0400, Steven Rostedt wrote:
> Anyway, instead of blocking. What about having a counter of number of
> migrate disabled tasks per cpu, and when taking a migrate_disable(), and
> there's
> already another task with migrate_disabled() set, and the current task has
On Thu, Jul 23, 2020 at 08:56:14PM +1000, Nicholas Piggin wrote:
> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
> index 3a0db7b0b46e..35060be09073 100644
> --- a/arch/powerpc/include/asm/hw_irq.h
> +++ b/arch/powerpc/include/asm/hw_irq.h
> @@ -200,17 +200,14
On Fri, Jul 24, 2020 at 03:10:59PM -0400, Waiman Long wrote:
> On 7/24/20 4:16 AM, Will Deacon wrote:
> > On Thu, Jul 23, 2020 at 08:47:59PM +0200, pet...@infradead.org wrote:
> > > On Thu, Jul 23, 2020 at 02:32:36PM -0400, Waiman Long wrote:
> > > > BTW, do you have any comment on my v2 lock holde
On Thu, Jul 23, 2020 at 11:11:03PM +1000, Nicholas Piggin wrote:
> Excerpts from Peter Zijlstra's message of July 23, 2020 9:40 pm:
> > On Thu, Jul 23, 2020 at 08:56:14PM +1000, Nicholas Piggin wrote:
> >
> >> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
On Thu, Jul 09, 2020 at 12:06:13PM -0400, Waiman Long wrote:
> We don't really need to do a pv_spinlocks_init() if pv_kick() isn't
> supported.
Waiman, if you cannot explain how not having kick is a sane thing, what
are you saying here?
On Thu, Jul 23, 2020 at 08:56:14PM +1000, Nicholas Piggin wrote:
> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
> index 3a0db7b0b46e..35060be09073 100644
> --- a/arch/powerpc/include/asm/hw_irq.h
> +++ b/arch/powerpc/include/asm/hw_irq.h
> @@ -200,17 +200,1
On Wed, Jul 22, 2020 at 01:48:22PM +0530, Srikar Dronamraju wrote:
> * pet...@infradead.org [2020-07-22 09:46:24]:
>
> > On Tue, Jul 21, 2020 at 05:08:10PM +0530, Srikar Dronamraju wrote:
> > > Currently "CACHE" domain happens to be the 2nd sched domain as per
> > > powerpc_topology. This domain
On Tue, Jul 21, 2020 at 11:15:13AM -0400, Mathieu Desnoyers wrote:
> - On Jul 21, 2020, at 11:06 AM, Peter Zijlstra pet...@infradead.org wrote:
>
> > On Tue, Jul 21, 2020 at 08:04:27PM +1000, Nicholas Piggin wrote:
> >
> >> That being said, the x86 sync core g
On Wed, Jul 15, 2020 at 10:18:20PM -0700, Andy Lutomirski wrote:
> > On Jul 15, 2020, at 9:15 PM, Nicholas Piggin wrote:
> > CPU0 CPU1
> > 1. user stuff
> > a. membarrier() 2. enter kernel
> > b. read rq->curr 3. rq->curr switched to kt
On Tue, Jul 14, 2020 at 07:31:03PM +0300, Jarkko Sakkinen wrote:
> On Tue, Jul 14, 2020 at 03:01:09PM +0200, Peter Zijlstra wrote:
> > to help with text_alloc() usage in generic code, but I think
> > fundamentally, there's only these two options.
>
> There is one
On Tue, Jul 14, 2020 at 03:19:24PM +0300, Ard Biesheuvel wrote:
> So perhaps the answer is to have text_alloc() not with a 'where'
> argument but with a 'why' argument. Or more simply, just have separate
> alloc/free APIs for each case, with generic versions that can be
> overridden by the architec
On Tue, Jul 14, 2020 at 05:46:05AM -0700, Andy Lutomirski wrote:
> x86 has this exact problem. At least no more than 64*8 CPUs share the cache
> line :)
I've seen patches for a 'sparse' bitmap to solve related problems.
It's basically the same code, except it multiplies everything (size,
bit-nr)
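A rough sketch of that 'sparse' bitmap idea; all names and the spacing factor are hypothetical. Both the bit number and the allocation size are multiplied by the same factor so that each CPU's bit lands in its own cache line:

#define SPARSE_BIT_SHIFT	9	/* 512 bits = one 64-byte cache line apart */

static inline void sparse_set_cpu(int cpu, unsigned long *map)
{
	set_bit((unsigned long)cpu << SPARSE_BIT_SHIFT, map);
}

static inline bool sparse_test_cpu(int cpu, const unsigned long *map)
{
	return test_bit((unsigned long)cpu << SPARSE_BIT_SHIFT, map);
}

/* the allocation must be scaled by the same factor: */
#define sparse_cpumask_longs()	BITS_TO_LONGS((unsigned long)NR_CPUS << SPARSE_BIT_SHIFT)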
On Tue, Jul 14, 2020 at 11:33:33AM +0100, Russell King - ARM Linux admin wrote:
> For 32-bit ARM, our bpf code uses "blx/bx" (or equivalent code
> sequences) rather than encoding a "bl" or "b", so BPF there doesn't
> care where the executable memory is mapped, and doesn't need any
> PLTs. Given th
On Tue, Jul 14, 2020 at 11:28:27AM +0100, Will Deacon wrote:
> As Ard says, module_alloc() _is_ special, in the sense that the virtual
> memory it allocates wants to be close to the kernel text, whereas the
> concept of allocating executable memory is broader and doesn't have these
> restrictions.
On Fri, Jul 10, 2020 at 11:56:44AM +1000, Nicholas Piggin wrote:
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 73199470c265..ad95812d2a3f 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1253,7 +1253,7 @@ void start_secondary(void *unu
On Fri, Jul 10, 2020 at 11:56:43AM +1000, Nicholas Piggin wrote:
> And get rid of the generic sync_core_before_usermode facility.
>
> This helper is the wrong way around I think. The idea that membarrier
> state requires a core sync before returning to user is the easy one
> that does not need hid