On Thu, 4 Apr 2019, Cyrill Gorcunov wrote:

> On Thu, Apr 04, 2019 at 12:37:18PM -0400, Vince Weaver wrote:
> 
> Oh, Vince, I suspect such kind of bisection might consume a lot of your
> time :( Maybe we could update perf fuzzer so that it would send events
> to some net-storage first then write them to the counters, iow to automatize
> this all stuff somehow?

I do have a lot of this automated already from tracking down past bugs, 
but it turns out that most of the fuzzer-found bugs aren't deterministic 
so it doesn't always work.

For example this bug, while I can easily reproduce it, doesn't trigger at
the same point on each run.  I suspect something corrupts state early on,
but the crash doesn't happen until a context switch.

For what it's worth I've put code in p4_pmu_enable_all() to see what's 
going on when the NULL dereference happens, and sure enough the printk is 
triggered where I'd expect.

[  138.132889] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.171380] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.212588] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.263761] VMW: p4_pmu_enable_all: idx 4 is NULL
[  138.279944] VMW: p4_pmu_enable_all: idx 4 is NULL

static void p4_pmu_enable_all(int added)
{
        struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
        int idx;

        for (idx = 0; idx < x86_pmu.num_counters; idx++) {
                struct perf_event *event = cpuc->events[idx];
                if (!test_bit(idx, cpuc->active_mask))
                        continue;
                if (event == NULL) {
                        printk("VMW: p4_pmu_enable_all: idx %d is NULL\n", idx);
                } else {
                        p4_pmu_enable_event(event);
                }
        }
}


The machine still crashes with this check in place, just not right away.
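
To make the symptom concrete: the printk only fires when a bit is set in
active_mask while the matching events[] slot is NULL.  Below is a minimal
userspace sketch of that consistency check.  The model_* names and the
counter count of 8 are made up for illustration; they are not the kernel
definitions, and this says nothing about what causes the stale bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
#define MODEL_NUM_COUNTERS 8    /* arbitrary, not the real P4 counter count */

struct model_event {
        int id;
};

struct model_cpuc {
        unsigned long active_mask;      /* bit idx set => counter idx in use */
        struct model_event *events[MODEL_NUM_COUNTERS];
};

static int model_test_bit(int idx, unsigned long mask)
{
        assert(idx >= 0 && idx < MODEL_NUM_COUNTERS);
        return (int)((mask >> idx) & 1UL);
}

/*
 * Walk the counters the same way the instrumented loop does and count
 * the slots that are marked active but have no event attached.
 */
static int model_count_stale(const struct model_cpuc *cpuc)
{
        int idx, stale = 0;

        for (idx = 0; idx < MODEL_NUM_COUNTERS; idx++) {
                if (!model_test_bit(idx, cpuc->active_mask))
                        continue;
                if (cpuc->events[idx] == NULL) {
                        printf("model: idx %d is active but NULL\n", idx);
                        stale++;
                }
        }
        return stale;
}
```

With bit 4 set in active_mask and events[4] left NULL, model_count_stale()
reports one stale slot, which is the state the log above shows.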

Vince
