Daniel P Berrange writes: > On Fri, Sep 09, 2016 at 01:03:51PM +0200, Lluís Vilanova wrote: >> Daniel P Berrange writes: >> >> > On Thu, Sep 08, 2016 at 03:23:26PM +0200, Lluís Vilanova wrote: >> >> Daniel P Berrange writes: >> >> >> >> > I previously split the global trace-events file up into one file >> >> > per-subdirectory to avoid merge conflict hell. >> >> [...] >> >> >> >> Sorry, I could not find the message where the infrastructure is modified >> >> to >> >> provide this. But I think there's a more efficient way to provide modular >> >> auto-generated tracing code without the hierarchical indexing you >> >> proposed. >> >> > [snip] >> >> >> struct TraceEvent ___trace_events[] = { >> >> { >> >> .name = "eventname", >> >> .sstate = 1, >> >> .dstate = ___trace_eventname_dstate; >> >> } >> >> } >> >> >> >> TraceEvent *TRACE_EVENTNAME = &___trace_events[...]; >> >> > Life would be simpler if we had the 'bool dstate' as part of the >> > TraceEvent struct, but doing so would essentially be reverting this >> > previous change: >> >> > commit 585ec7273e6fdab902b2128bc6c2a8136aafef04 >> > Author: Paolo Bonzini <pbonz...@redhat.com> >> > Date: Wed Oct 28 07:06:27 2015 +0100 >> >> > trace: track enabled events in a separate array >> >> > This is more cache friendly on the fast path, where we already have >> > the event id available. >> >> > I asked Paolo about this previously and he indicated it was a notable >> > performance improvement, so we can't put dstate back into the TraceEvent >> > struct :-( >> >> Sorry, there was a typo in my example code that led to this misunderstanding. >> >> I meant this for the global auto-generated .c: >> >> struct TraceEvent { >> /* ... */ >> bool *dstate; >> }; >> >> bool ___TRACE_EVENTNAME_DSTATE; >> >> struct TraceEvent ___trace_events[] = { >> { >> .name = "eventname", >> .sstate = 1, >> .dstate = &___TRACE_EVENTNAME_DSTATE; >> } >> } >> >> TraceEvent *TRACE_EVENTNAME = &___trace_events[...]; >> >> >> If you look at the modified macro I pasted for "trace/control-internal.h": >> >> >> #define trace_event_get_state_dynamic_by_id(id) \ >> (unlikely(trace_events_enabled_count) && \ >> (___ ## id ## _DSTATE)) >> >> We're retaining the fast-path performance of the seggregated boolean dstate >> array, while the event descriptor contains a pointer to the per-event dstate >> global boolean variable in case you want to get/set the dstate based on an >> event >> pointer.
> The various _DSTATE variables are still arbitrarily scattered in > memory, as opposed to in a contiguous cache friendly array, which > is one of the goals of Paolo's original change. > That said, I'm unclear on how much of the performance win from > Paolo's change came from eliminating the struct field de-reference, > vs having the contiguous array. I'm guessing most of the win is > from the former, the latter only being important if we hit multiple > related tracepoints on close succession. The latter can also be achieved by packing them all together in a single section, but I don't know if that's acceptable within portable QEMU code. For example: bool ___TRACE_EVENTNAME_DSTATE __attribute__((section("dstate_array"))) Cheers, Lluis