Hi,

As I was debugging my hrtimer patch, I ran a few tests with CPU hotplug. In other words, I offline a CPU while there is an active monitoring session which causes multiplexing.
When the CPU goes down, all is well. But when it comes back, things go wrong: no kernel crash, but the results are wrong and multiplexing does not work anymore.

I investigated this some more and found an issue on re-activation. During shutdown, system-wide events are scheduled out AND removed from the event lists. Consequently, ctx->nr_events and ctx->nr_active go to zero. When the CPU is brought back online and tools do start/stop on the events, they can be scheduled back in, which increments ctx->nr_active. Because list_add_event() is not called again, you may end up with ctx->nr_events < ctx->nr_active, which is wrong. The events may not be on any list, and therefore they cannot get multiplexed again.

It is not clear to me why we need to remove the events from any list (list_del_event()) when the CPU goes down. Why isn't calling event_sched_out() enough? If events are kept on the lists, why not try to schedule them back in when the CPU is brought back online?
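To make the counter problem concrete, here is a minimal user-space sketch of the sequence described above. It is not kernel code: the toy_* structures and helpers are hypothetical stand-ins that track only nr_events and nr_active, and the sketch assumes that scheduling an event in does not check whether the event is still on a list, which is what the observed behavior suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel structures: only the state discussed above. */
struct toy_event {
    bool on_list;   /* would be linked into a context event list */
    bool active;    /* currently scheduled in */
};

struct toy_ctx {
    int nr_events;  /* events on the context lists */
    int nr_active;  /* events currently scheduled in */
};

static void toy_list_add_event(struct toy_ctx *ctx, struct toy_event *ev)
{
    if (!ev->on_list) {
        ev->on_list = true;
        ctx->nr_events++;
    }
}

static void toy_list_del_event(struct toy_ctx *ctx, struct toy_event *ev)
{
    if (ev->on_list) {
        ev->on_list = false;
        ctx->nr_events--;
    }
}

/* Assumption: scheduling in does not verify list membership. */
static void toy_event_sched_in(struct toy_ctx *ctx, struct toy_event *ev)
{
    if (!ev->active) {
        ev->active = true;
        ctx->nr_active++;
    }
}

static void toy_event_sched_out(struct toy_ctx *ctx, struct toy_event *ev)
{
    if (ev->active) {
        ev->active = false;
        ctx->nr_active--;
    }
}

/* The invariant the report expects to hold at all times. */
static bool toy_ctx_consistent(const struct toy_ctx *ctx)
{
    return ctx->nr_active <= ctx->nr_events;
}

/* Replays: create and schedule, CPU offline, CPU online plus tool start. */
static struct toy_ctx toy_replay_hotplug(void)
{
    struct toy_ctx ctx = { 0, 0 };
    struct toy_event ev = { false, false };

    toy_list_add_event(&ctx, &ev);
    toy_event_sched_in(&ctx, &ev);
    assert(toy_ctx_consistent(&ctx));   /* 1 active, 1 on list */

    /* CPU goes down: scheduled out AND removed from the list. */
    toy_event_sched_out(&ctx, &ev);
    toy_list_del_event(&ctx, &ev);

    /* CPU comes back: the tool restarts the event, nothing re-adds it. */
    toy_event_sched_in(&ctx, &ev);

    return ctx;
}
```

In this model, the context returned by toy_replay_hotplug() has one active event and none on a list. Dropping the toy_list_del_event() call on the offline path keeps the counters consistent, which is the alternative the questions above are asking about.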

