Paolo Bonzini writes: > On 14/06/2016 10:39, Stefan Hajnoczi wrote: >> On Mon, Jun 13, 2016 at 06:39:46PM +0200, Lluís Vilanova wrote: >>> Paolo Bonzini writes: >>> >>>> On 13/06/2016 14:15, Lluís Vilanova wrote: >>>>>> That said, I am skeptical about the benefit of the interfaces you are >>>>>> adding. They add a lot of complication and overhead (especially >>>>>> regarding the memory/cache overhead of the dstate array) without a clear >>>>>> use case, in my opinion; all the processing you do at run-time is just >>>>>> as well suited for later filtering. >>>>> >>>>> This should make tracing faster on the future with multi-threaded TCG, as >>>>> well >>>>> as trace files much smaller if you're tracing something like memory >>>>> accesses. Also, bear in mind this series was split from a much larger one >>>>> for >>>>> simplicity. The follow-up one provides much larger performance benefits by >>>>> avoiding the generation of TCG code to call the tracing backend when a >>>>> vCPU is >>>>> not traced. >>> >>>> This still assumes that tracing only some VCPUs is a common use case. >>>> Is it?... >>> >>> I use it for code profiling by sampling across vCPUs, or only on vCPUs I >>> know >>> run processes of my interest. The profiles can then be used for analyzing >>> the >>> application/system behaviour. >>> >>> Also, with the fast-path checks already in place >>> ('trace_events_enabled_count'), >>> performance when not tracing should never be worse with this series. >>> >>> If this feature does not look useful to overall QEMU I will fold it into my >>> other out-of-tree patches. >> >> I think the per-vcpu tracing feature is reasonable for qemu.git as long >> as it does not introduce performance regressions for existing users.
> I'm okay with it if the dstate array is changed to uint16_t. That should be fine. I don't think QEMU is gonna run more than 2**16 vCPUs anytime soon :) Cheers, Lluis