> We fall back to the rmap when it's obviously not smart to do so. There
> is still a lot of room for improvement in this function though, i.e.,
> it should be per VMA and NUMA aware.
Okay, so it's more a question of tuning the crossover heuristic. That
sounds much easier than replacing everything.
> The CNA code, if enabled, will be in vmlinux, not in a kernel module. As a
> result, I think a module parameter will be no different from a kernel
> command line parameter in this regard.
You can still change it in /sys at runtime, even if it's in the vmlinux.
-Andi
> Now imagine we have an 8 node system, and memory
> pressure in the DMA32 zone of node 0.
The question is how much we still care about DMA32.
If there are problems they can probably just turn on the IOMMU for
these IO mappings.
-Andi
>2) It will not scan PTE tables under non-leaf PMD entries that do not
> have the accessed bit set, when
> CONFIG_HAVE_ARCH_PARENT_PMD_YOUNG=y.
This assumes that workloads have reasonable locality. Could there
be a worst case where only one or two pages in each PTE table are used,
so
> > ms granularity seems very coarse grained for this. Surely
> > at some point of spinning you can afford a ktime_get? But ok.
> We are reading time when we are at the head of the (main) queue, but
> don’t have the lock yet. Not sure about the latency of ktime_get(), but
> anything reasonably
Andi Kleen writes:
> Alex Kogan writes:
>>
>> +numa_spinlock_threshold=[NUMA, PV_OPS]
>> +Set the time threshold in milliseconds for the
>> +number of intra-node lock hand-offs before the
>> +
Alex Kogan writes:
>
> + numa_spinlock_threshold=[NUMA, PV_OPS]
> + Set the time threshold in milliseconds for the
> + number of intra-node lock hand-offs before the
> + NUMA-aware spinlock is forced to be passed to
> +
> The reason why soft lockup happens may be the unmapped EPT pages. So, do we
> have a way to map all gpa
> before we use pebs on Skylake?
You could configure a VT-d device; that will implicitly pin all pages for the
IOMMU. I *think* that should be enough for testing.
-Andi
> I have actually seen real user programs poke MSR_SYSCALL_MASK.
Hmm, what was the use case?
-Andi
Borislav Petkov writes:
> From: Borislav Petkov
> Date: Sat, 10 Apr 2021 14:08:13 +0200
>
> There are a bunch of MSRs which luserspace has no business poking at,
> whatsoever. Add a ban list and put the TSC-related MSRs in there. Issue
> a big juicy splat to catch offenders.
Have you ever seen
Except where I commented, for the series
Acked-by: Andi Kleen
-Andi
> +--threads=::
> +Write collected trace data into several data files using parallel threads.
> + value can be user defined list of masks. Masks separated by colon
> +define cpus to be monitored by a thread and affinity mask of that thread
> +is separated by slash. For example user specification
> + err = write(thread->pipes.ack[1], &msg, sizeof(msg));
> + if (err == -1)
> + pr_err("threads[%d]: failed to notify on start. Error %m",
> thread->tid);
It might be safer to not use %m. I'm not sure all the non-glibc
libcs that people use support it.
> + } else {
> + thread_data[t].tid = syscall(SYS_gettid);
That always fills in the tid of the setup thread instead of the target
threads?
> Hmm, I forgot about that quirk. I would expect the TDX Module to inject a #GP
> for that case. I can't find anything in the spec that confirms or denies
> that,
> but injecting #VE would be weird and pointless.
>
> Andi/Sathya, the TDX Module spec should be updated to state that XSETBV will
David Hildenbrand writes:
> I have no idea how expensive would be bouncing writes (and reads?)
> through the kernel. Did you ever experiment with that/evaluate that?
I would expect it to be quite expensive, as in virtio IO performance
tanking.
-Andi
On Wed, Apr 07, 2021 at 11:05:20AM +0800, Liuxiangdong (Aven, Cloud
Infrastructure Service Product Dept.) wrote:
>
>
> On 2021/4/6 20:47, Andi Kleen wrote:
> > > AFAIK, Icelake supports adaptive PEBS and extended PEBS which Skylake
> > > doesn't.
> > > But w
Christophe de Dinechin writes:
> Is there even a theoretical way to restore an encrypted page e.g. from (host)
> swap without breaking the integrity check? Or will that only be possible with
> assistance from within the encrypted enclave?
Only the latter.
You would need ballooning. It's in
> Now the hottest block is reported at the top of output.
>
> Fixes: b65a7d372b1a ("perf hist: Support block formats with
> compare/sort/display")
> Signed-off-by: Jin Yao
Reviewed-by: Andi Kleen
-Andi
> AFAIK, Icelake supports adaptive PEBS and extended PEBS which Skylake
> doesn't.
> But we can still use IA32_PEBS_ENABLE MSR to indicate general-purpose
> counter in Skylake.
> Is there anything else that only Icelake supports in this patches set?
Only Icelake server has the support for
On Sat, Apr 03, 2021 at 09:26:24AM -0700, Dave Hansen wrote:
> On 4/2/21 2:32 PM, Andi Kleen wrote:
> >> If we go this route, what are the rules and restrictions? Do we have to
> >> say "no MMIO in #VE"?
> >
> > All we have to say is "No M
> If we go this route, what are the rules and restrictions? Do we have to
> say "no MMIO in #VE"?
All we have to say is "No MMIO in #VE before getting the TDVEINFO arguments"
After that it can nest without problems.
If you nest before that the TDX will cause a triple fault.
The code that
On Sat, Dec 19, 2020 at 09:08:05AM -0800, Randy Dunlap wrote:
> Give the user a clue about the problem along with the 35 lines of
> usage/help text.
Reviewed-by: Andi Kleen
-Andi
> I've heard things like "we need to harden the drivers" or "we need to do
> audits" and that drivers might be "whitelisted".
The basic driver allow listing patches are already in the repository,
but not currently posted or complete:
https://github.com/intel/tdx/commits/guest
>
> What are we
On Wed, Mar 31, 2021 at 08:46:18PM -0700, Dave Hansen wrote:
> On 3/31/21 8:28 PM, Andi Kleen wrote:
> >> The hardware (and VMMs and SEAM) have ways of telling the guest kernel
> >> what is supported: CPUID. If it screws up, and the guest gets an
> >> unexpected #VE,
> The hardware (and VMMs and SEAM) have ways of telling the guest kernel
> what is supported: CPUID. If it screws up, and the guest gets an
> unexpected #VE, so be it.
The main reason for disabling stuff is actually that we don't need
to harden it. All these things are potential attack paths.
>
> > Signed-off-by: Kuppuswamy Sathyanarayanan
> >
> > Reviewed-by: Andi Kleen
> > ---
> >
> > Changes since previous series:
> > * Suppressed MWAIT feature as per Andi's comment.
> > * Added warning debug log for MWAIT #VE exception.
> >
> &
> > No, if these instructions take a #VE then they were executed at CPL=0.
> > MONITOR
> > and MWAIT will #UD without VM-Exit->#VE. Same for WBINVD, s/#UD/#GP.
>
> Dare I ask about XSETBV?
XGETBV does not cause a #VE, it just works normally. The guest has full
AVX capabilities.
-Andi
> > But is this a real problem?
>
> perhaps not, Andi, any idea about this?
It's not a problem for my tools which don't use the unit,
but I could imagine one for other parsers. I would recommend
to not change it for CSV, which is expected to be parsed
by tools.
-Andi
Jim Cromie writes:
> CONFIG_DYNAMIC_DEBUG creates a struct _ddebug (56 bytes) for each
> callsite, which includes 3 pointers to: module, filename, function.
> These are repetitive, and compressible, this patch series goes about
> doing that, it:
So how much memory does it actually save?
-Andi
> While we're discussing, do we really want to use the "core" and "atom"
> terms here? I thought cpu/cycles/ would be ok for the main (Big) CPU and
Yes absolutely.
> that we should come up with some short name for the "litle" CPUs.
There actually isn't a main CPU.
There's nothing "better"
> If you care about not breaking existing scripts, then the output they
> get with what they use as command line options must continue to produce
> the same output.
It's not clear there are any useful ones (except for tools that handle
both). It's really hard to parse the previous mess. It's
> Is it serious or just a joke? :)
I would prefer to not be compatible (at least not until someone complains),
but if compatibility is required then yes opting in to the broken
format would be better. Perhaps not with that name.
And the option could be hidden in the perf config file instead
of
On Tue, Mar 16, 2021 at 04:05:13PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 16, 2021 at 09:34:21AM -0700, Andi Kleen escreveu:
> > > looks ok, but maybe make the option more related to CSV, like:
> > >
> > > --x-summary, --csv-summary ...?
> >
> looks ok, but maybe make the option more related to CSV, like:
>
> --x-summary, --csv-summary ...?
Actually I don't think it should be a new option. I doubt
anyone could parse the previous mess. So just make it default
with -x
-Andi
Michal Hocko writes:
> bikeshedding) because it hasn't really resonated with the underlying
> concept. Maybe just me as a non native speaker... page_head would have
> been so much more straightforward but not something I really care
> about.
Yes. page_head explains exactly what it is.
But
> I still think that there is value in taking those measurements after we
> enable the counters, as, for instance, for interval mode we want
> measurements with the counters enabled, whatever happens before the
> counters are enabled is just startup time, etc. Jiri, Andi?
Yes I agree. Only the
Yu Zhao writes:
> +
> +#ifdef CONFIG_MEMCG
> + if (memcg && atomic_read(&memcg->moving_account))
> + goto contended;
> +#endif
> + if (!mmap_read_trylock(mm))
> + goto contended;
These are essentially spinloops. Surely you need a
> A more concise syntax would be -e +event
>
> The + would apply to the whole event list if there are multiple.
>
> and maybe -event too to remove something from the default list.
Sorry that was an old email. Please ignore.
-Andi
On Tue, Dec 22, 2020 at 09:11:31AM +0800, Jin Yao wrote:
> The event default list includes the most common events which are widely
> used by users. But with -e option, the current perf only counts the events
> assigned by -e option. Users may want to collect some extra events with
> the default
stable though)
Looks all good to me. The VmPeak assumption might be slightly
fragile, but I guess there's nothing better currently.
Reviewed-by: Andi Kleen
-Andi
> If you want a uarch name, that might be the name of the core
> ( something-cove ... I can't keep track of the names of all the
GoldenCove for Sapphire Rapids/AlderLake.
But keep in mind that they're still different. Kind of like a different
distro with different patches and configuration
On Thu, Mar 11, 2021 at 08:58:32PM +0100, Peter Zijlstra wrote:
> On Thu, Mar 11, 2021 at 11:53:35AM -0500, Liang, Kan wrote:
>
> > > > The "cpu_core" PMU is similar to the Sapphire Rapids PMU, but without
> > > > PMEM.
> > > > The "cpu_atom" PMU is similar to Tremont, but with different
> > > >
> When a brute force attack is detected through the fork or execve system call,
> all the tasks involved in the attack will be killed with the exception of the
> init task (task with pid equal to zero). Now, and only if the init task is
> involved in the attack, block the fork system call from the
Thanks.
Okay but that means that the brute force attack can just continue
because the attacked daemon will be respawned?
You need some way to stop the respawning, otherwise the
mitigation doesn't work for daemons.
-Andi
> AFAICT we could register them all here. That instantly fixes that
> CPU_STARTING / CPU_DEAD fail elsewhere in this patch.
This would mean a system that only has Atoms or only has big cores
would still show the other CPU's PMU. We expect those to exist.
-Andi
Presumably most people are already
running with newer microcodes, so the functional impact is small.
But it should speed up the newer systems by the 2-4% claimed in
the original patch.
Cc: jmatt...@google.com
Fixes: 9b545c04abd4 ("perf/x86/kvm: Avoid unnecessary work ...")
Signed-off-by:
> The disk encryption is just one example and there might be others which
> we might not be aware of yet and we are not suspecting there is something
> wrong with the crypto code that needs to be fixed.
Then you don't have any leaks relating to branch tracing.
> restrict an external (in the sense
queue
> buffers correctly anyway, so the check is not needed.
>
> In addition, the check gets erroneously hit when using sample mode
> to trace multiple threads.
>
> Consequently, fix that case by removing the check.
Thanks!
Reviewed-by: Andi Kleen
-Andi
On Sun, Mar 07, 2021 at 07:05:41PM +0100, John Wood wrote:
> On Sun, Mar 07, 2021 at 09:25:40AM -0800, Andi Kleen wrote:
> > > processes created from it will be killed. If the systemd restart the
> > > network
> > > daemon and it will crash again, then the s
> processes created from it will be killed. If the systemd restart the network
> daemon and it will crash again, then the systemd will be killed. I think this
> way the attack is fully mitigated.
Wouldn't that panic the system? Killing init is usually a panic.
> > Or if it's an interactive login
Sorry for the late answer. I somehow missed your email earlier.
> As a mitigation method, all the offending tasks involved in the attack are
> killed. Or in other words, all the tasks that share the same statistics
> (statistics showing a fast crash rate) are killed.
So systemd will just restart
> Single event:
>
> # ./perf stat -e cpu/r031234/ -a -- sleep 1
> WARNING: event config '31234' not valid (bits 16 17 not supported by
> kernel)!
Just noticed that again. Can you please print the original event as
string in the message? While it's obvious with rXXX which one it is,
it may
Andi Kleen writes:
>
> Normally disk encryption is in specialized work queues. It's total
> overkill to restrict all of the kernel if you just want to restrict
> those work queues.
>
> I would suggest some more analysis where secrets are actually stored
> and handled first.
Sai Prakash Ranjan writes:
>
> "Consider a system where disk contents are encrypted and the encryption
> key is set up by the user when mounting the file system. From that point
> on the encryption key resides in the kernel. It seems reasonable to
> expect that the disk encryption key be
John Wood writes:
> +
> +To detect a brute force attack it is necessary that the statistics shared by
> all
> +the fork hierarchy processes be updated in every fatal crash and the most
> +important data to update is the application crash period.
So I haven't really followed the discussion and
> I assumed that it was this file:
> https://download.01.org/perfmon/mapfile.csv
Yep.
>
> which already has a different format - extra columns - so figured that some
> handcrafting was already going on. Indeed 'type' is not always 'core' there.
Yes. Could possibly add the extra columns from
On Fri, Feb 26, 2021 at 09:08:17PM +0800, John Garry wrote:
> Included for each CPU entry in the per-arch mapfile.csv file is a 'type'
> field.
I don't like it because this will make the mapfile incompatible with
what download.01.org uses. We originally derived it from that.
It would be ok if it
On Thu, Feb 18, 2021 at 11:57:50AM +0200, Adrian Hunter wrote:
> Hi
>
> Currently, only kernel tracing is supported and only with "timeless" decoding
> i.e. no TSC timestamps
Patches look good to me. That will be quite useful.
Acked-by: Andi Kleen
Thanks,
-Andi
On Fri, Feb 12, 2021 at 01:48:36PM -0800, Dave Hansen wrote:
> On 2/12/21 1:47 PM, Andy Lutomirski wrote:
> >> What about adding a property to the TD, e.g. via a flag set during TD
> >> creation,
> >> that controls whether unaccepted accesses cause #VE or are, for all
> >> intents and
> >>
> > > So what happens if NMI happens here, and triggers a nested #VE ?
> >
> > Yes that's a gap. We should probably bail out and reexecute the original
> > instruction. The VE handler would need to set a flag for that.
> >
> > Or alternatively the NMI always gets the VE information and puts
> >
> Which is supposedly then set up to avoid #VE during the syscall gap,
> yes? Which then results in #VE not having to be IST.
Yes that is currently true because all memory is pre-accepted.
If we ever do lazy accept we would need to make sure the memory accessed in
the syscall gap is already
all microcode versions.
>
> Add the Cascade Lake Xeon steppings (5, 6, and 7) to the
> isolation_ucodes[] table so that these parts benefit from Andi's
> optimization in commit 9b545c04abd4f ("perf/x86/kvm: Avoid unnecessary
> work in guest filtering").
Reviewed-by: Andi Kleen
-Andi
On Fri, Feb 05, 2021 at 07:53:46PM +0200, Adrian Hunter wrote:
> Hi
>
> Here are 3 fixes and 1 minor new feature, for Intel PT.
For the series:
Reviewed-by: Andi Kleen
-Andi
> > I don't think I object to having an even in the stream, but your LOST
> > event is unfortunate in that it itself can get lost when there's no
> > space in the buffer (which arguably is unlikely, but still).
> >
> > So from that point of view, I think overloading LOST is not so very nice
> >
On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote:
> On Fri, Jan 15, 2021, Andi Kleen wrote:
> > > I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS
> > > PMI"
> > > guaranteed to be atomic?
> >
> &
> I'm asking about ucode/hardware. Is the "guest pebs buffer write -> PEBS PMI"
> guaranteed to be atomic?
Of course not.
>
> In practice, under what scenarios will guest counters get cross-mapped? And,
> how does this support affect guest accuracy? I.e. how bad do things get for
> the
>
On Sat, Jan 09, 2021 at 03:01:18AM -0500, Miaohe Lin wrote:
> Since commit 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged
> high MMIO PROT_NONE mappings"), when the first pfn modify is not allowed,
> we would break the loop with pte unchanged. Then the wrong pte - 1 would
> be passed
> Or, is the concern more about trying to time-slice the results in a
> fairly granular way and expecting accurate results then?
Usually the latter. It's especially important for divisions. You want
both divisor and dividend to be in the same time slice, otherwise
the result usually doesn't make
> My first thought was: Why not have a 'perf iiostat' subcommand?
Same would apply to a lot of options in perf stat.
I guess you could add some aliases to "perf" that give shortcuts
for common perf stat command lines.
-Andi
On Wed, Dec 02, 2020 at 11:47:25AM -0800, Stephane Eranian wrote:
> On Wed, Dec 2, 2020 at 11:28 AM Andi Kleen wrote:
> >
> > > + prev_cgrp = task_css_check(prev, perf_event_cgrp_id, 1)->cgroup;
> > > + next_cgrp = task_css_check(next,
> + prev_cgrp = task_css_check(prev, perf_event_cgrp_id, 1)->cgroup;
> + next_cgrp = task_css_check(next, perf_event_cgrp_id, 1)->cgroup;
> +
> + if (prev_cgrp != next_cgrp)
> + perf_sw_event_sched(PERF_COUNT_SW_CGROUP_SWITCHES, 1, 0);
Seems to be the perf cgroup only, not
> Can't this be all in a macro? It's still a lot of duplication.
>
> I'm still not a fan, but I think with a macro it wouldn't be too bad.
>
>Ugh, not a fan of macros. They expand to a single line of code and make
>debugging hard. I'll do a v5 with a macro :-/
I suppose you
> +| expr '-' expr
> +{
> + if (!compute_ids || (is_const($1.val) && is_const($3.val))) {
> + assert($1.ids == NULL);
> + assert($3.ids == NULL);
> + $$.val = $1.val - $3.val;
> + $$.ids = NULL;
> + } else {
> + /* LHS and/or RHS
On Thu, Nov 26, 2020 at 01:30:26AM -0500, Paul Gortmaker wrote:
> But you currently can't make use of clear_warn_once unless you've got
> debugfs enabled and mounted - which may not be desired by some people
> in some deployment situations.
Seems awfully special purpose. The problem with debugfs
> +| expr '|' expr
> +{
> + if (!compute_ids || (isfinite($1.val) && isfinite($3.val))) {
> + assert($1.ids == NULL);
> + assert($3.ids == NULL);
> + $$.val = (long)$1.val | (long)$3.val;
> + $$.ids = NULL;
> + } else {
> + /*
> +
> You mean change event-converter-for-linux-perf to add this as JSON
> comments at the start of the generated files?
JSON doesn't support comments unfortunately
(unless they're somehow part of the data schema)
-Andi
>I'd prefer if we could make a copy of these scripts into the kernel tree
>along with the data files from:
>https://download.01.org/perfmon/
FWIW I originally tried this to include the raw JSON files, but Ingo objected
and wanted the files to be split up[1] and avoiding redundant
> Can I get ACK for this patch?
>
> It's mainly for fixing the perf-test issue and MEM_Parallel_Reads issue.
Acked-by: Andi Kleen
The JSON updates should be imho just merged automatically. Not much
point in doing code review on them. If there's a problem it has
to be fixed JSON
Actually, thinking about it more, you should probably pass around ctx/cgroup
in a single abstract argument. Otherwise you have to change all the metric
functions for the next filter too.
> @@ -57,6 +59,9 @@ static int saved_value_cmp(struct rb_node *rb_node, const
> void *entry)
> if (a->ctx != b->ctx)
> return a->ctx - b->ctx;
>
> + if (a->cgrp != b->cgrp)
> + return (char *)a->cgrp < (char *)b->cgrp ? -1 : +1;
This means the sort order
The patch does a lot of stuff and is hard to review.
The earlier patches all look good to me.
>
> static int
> __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
> - int start, int runtime)
> + bool compute_ids, int runtime)
> {
> struct
0.76 insn per cycle
> CPU2 44,763,978 instructions #0.45 insn per cycle
> CPU3 69,049,079 instructions #0.56 insn per cycle
>
>1.001910444 seconds time elapsed
Yes makes a lot of sense. Thanks for tracking that down.
Reviewed-by: Andi Kleen
-Andi
On Wed, Nov 11, 2020 at 08:09:49PM -0800, Ian Rogers wrote:
> > to the optimization manual the group Topdown_Group_TopDownL1
> provides the
> >
> metrics Topdown_Metric_Frontend_Bound, Topdown_Metric_Backend_Bound,
> Topdown_Metric_Bad_Speculation
> > and
Hi Ian,
On Wed, Nov 11, 2020 at 03:49:10PM -0800, Ian Rogers wrote:
> FWIW I did something similar in python (that's how the current metrics
> json files were generated from the spreadsheet) and it was a lot
> simpler and shorter in a higher level language.
>
>The use of C++
On Tue, Nov 10, 2020 at 02:03:34AM -0800, Ian Rogers wrote:
> This RFC is for a new tool that reads TMA_Metrics.csv as found on
> download.01.org/perfmon and generates metrics and metric groups from
> it. To show the functionality the TMA_Metrics.csv is downloaded, but
> an accepted change would
On Mon, Nov 09, 2020 at 10:54:15PM +0100, Jiri Olsa wrote:
> Adding --buildid-mmap option to enable build id in mmap2 events.
> It will only work if there's kernel support for that and it disables
> build id cache (implies --no-buildid).
What's the point of the option? Why not enable it by
t' as well.
Hi,
What's the status of this patchkit? It would be quite useful to me (and
various other people) to use LBRs in guest. I reviewed it earlier and the
patches all looked good to me. But I don't see it in any -next tree.
Reviewed-by: Andi Kleen
Could it please be merged?
Thanks,
-Andi
On Mon, Nov 02, 2020 at 03:16:25PM +0100, Peter Zijlstra wrote:
> On Sun, Nov 01, 2020 at 07:52:38PM -0800, Andi Kleen wrote:
> > The main motivation is actually that the "multiple groups" algorithm
> > in perf doesn't work all that great: it has quite a few cases wh
> hm, it's too late for me to check ;-) but should I be able to do
> this with exclusive event.. running both command at the same time:
Yes. The exclusive part only applies during a given context,
but the two commands are different contexts.
You would only see a difference when in the same
per cycle
1.000838672,809865653,,branches,4003760616,100.00,202.287,M/sec
1.000838672,12439843,,branch-misses,4003780785,100.00,1.54,of all branches
...
Signed-off-by: Andi Kleen
---
v2: Fix default output (Jiri). Also handle \t as special value like the
original parser (Andi)
v3: Use
per cycle
1.000838672,809865653,,branches,4003760616,100.00,202.287,M/sec
1.000838672,12439843,,branch-misses,4003780785,100.00,1.54,of all branches
...
Signed-off-by: Andi Kleen
---
v2: Fix default output (Jiri). Also handle \t as special value like the
original parser (Andi
per cycle
1.000838672,809865653,,branches,4003760616,100.00,202.287,M/sec
1.000838672,12439843,,branch-misses,4003780785,100.00,1.54,of all branches
...
Signed-off-by: Andi Kleen
---
tools/perf/builtin-stat.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/tools/perf/builtin
it cuts the user time by 20%. On systems with more cores
the savings are higher.
Signed-off-by: Andi Kleen
---
tools/perf/Documentation/perf-stat.txt | 4
tools/perf/builtin-stat.c | 6 +-
tools/perf/util/stat.h | 1 +
3 files changed, 10 insertions(+)
On Mon, Oct 26, 2020 at 11:19:37PM +0900, Namhyung Kim wrote:
> This patch just added a warning before running it. I'd really want to
> fix the kernel if possible but don't have a good idea. Thoughts?
The easiest fix would be some multithreading in perf stat opening, then the
extra latencies
On Sat, Oct 24, 2020 at 11:59:17AM +0900, Namhyung Kim wrote:
> I found that the UNHALTED_CORE_CYCLES event is only available in the
> Intel machines and it makes other vendors/archs fail on the test. As
> libpfm4 can parse the generic events like cycles, let's use them.
>
> Fixes: 40b74c30ffb9
On Fri, Oct 23, 2020 at 04:42:34PM +0900, Namhyung Kim wrote:
> To make the command line even more compact with cgroups, support regex
> pattern matching in cgroup names.
>
> $ perf stat -a -e cpu-clock,cycles --for-each-cgroup '^.$' sleep 1
The example doesn't exactly show the benefit. So ^.$
en though the filter
> does not exist. This patch corrects the problem by relying on the CPUID 0xa
> leaf
> function to determine if anythread is supported or not as described in the
> Intel SDM Vol3b 18.2.5.1 AnyThread Deprecation section.
Reviewed-by: Andi Kleen
-Andi
> > After brief evaluation it still doesn't look easy. The simplest thing
> > I could imagine is to probably combine file_path and file_offset at a
> > struct object on stack and then pass the object by the reference along
> > function calls. I expect it will roughly cause the same amount of
On Mon, Oct 19, 2020 at 08:01:58AM -0700, kan.li...@linux.intel.com wrote:
> Add a line for the CYCLE_ACTIVITY.STALLS_MEM_ANY event in the ICL
> constraint table.
> Correct the comments for the CYCLE_ACTIVITY.CYCLES_MEM_ANY event.
Thanks Kan.
>
> Reported-by: Andi Kleen
>