Qemu

Anthony Harivel Tue, 12 Mar 2024 04:21:54 -0700


Hi Daniel, Paolo,


Here my last questions before wrapping up and send v4, or maybe call off
my attempt to add RAPL interface in QEMU.


Daniel P. Berrangé, Jan 30, 2024 at 10:39:
> > +    rcu_register_thread();
> > +
> > +    /* Get QEMU PID*/
> > +    pid = getpid();
> > +
> > +    /* Nb of CPUS per packages */
> > +    maxcpus = vmsr_get_maxcpus(0);
> > +
> > +    /* Nb of Physical Packages on the system */
> > +    maxpkgs = vmsr_get_max_physical_package(maxcpus);
>
> This function can fail so this needs to be checked & reported.
>
> > +
> > +    /* Those MSR values should not change as well */
> > +    vmsr->msr_unit  = vmsr_read_msr(MSR_RAPL_POWER_UNIT, 0, pid,
> > +                                    s->msr_energy.socket_path);
> > +    vmsr->msr_limit = vmsr_read_msr(MSR_PKG_POWER_LIMIT, 0, pid,
> > +                                    s->msr_energy.socket_path);
> > +    vmsr->msr_info  = vmsr_read_msr(MSR_PKG_POWER_INFO, 0, pid,
> > +                                    s->msr_energy.socket_path);
>
> This function can fail for a variety of reasons, most especially if someone
> gave an incorrect socket path, or if the daemon is not running. This is not
> getting diagnosed, and even if we try to report it here, we're in a background
> thread at this point.
>
> I think we need to connect and report errors before even starting this
> thread, so that QEMU startup gets aborted upon configuration error.
>

Fair enough. Would it be ok to do the sanity check before 
rcu_register_thread() and "return NULL;" in case of error or would you 
prefer me to check all of this before even calling the 
qemu_thread_create() ? 

> > +        /* Populate all the thread stats */
> > +        for (int i = 0; i < num_threads; i++) {
> > +            thd_stat[i].utime = g_new0(unsigned long long, 2);
> > +            thd_stat[i].stime = g_new0(unsigned long long, 2);
> > +            thd_stat[i].thread_id = thread_ids[i];
> > +            vmsr_read_thread_stat(&thd_stat[i], pid, 0);
>
> It is non-obvious that the 3rd parameter here is an index into
> the utime & stime array. This function would be saner to review
> if called as:
>
>             vmsr_read_thread_stat(pid,
>                                 thd_stat[i].thread_id,
>                                 &thd_stat[i].utime[0],
>                                 &thd_stat[i].stime[0],
>                                 &thd_stat[i].cpu_id);
>
> so we see what are input parameters and what are output parameters.
>
> Also this method can fail, eg if the thread has exited already,
> so we need to take that into account and stop trying to get info
> for that thread in later code. eg by setting 'thread_id' to 0
> and then skipping any thread_id == 0 later.
>
>

Good point. I'll rework the function and return "thread_id" to 0 in 
case of failure in order to test it later on. 

> > +            thd_stat[i].numa_node_id = 
> > numa_node_of_cpu(thd_stat[i].cpu_id);
> > +        }
> > +
> > +        /* Retrieve all packages power plane energy counter */
> > +        for (int i = 0; i <= maxpkgs; i++) {
> > +            for (int j = 0; j < num_threads; j++) {
> > +                /*
> > +                 * Use the first thread we found that ran on the CPU
> > +                 * of the package to read the packages energy counter
> > +                 */
> > +                if (thd_stat[j].numa_node_id == i) {
>
> 'i' is a CPU ID value, while 'numa_node_id' is a NUMA node ID value.
> I don't think it is semantically valid to compare them for equality.
>
> I'm not sure the NUMA node is even relevant, since IIUC from the docs
> earlier, the power values are scoped per package, which would mean per
> CPU socket.
>

'i' here is the package number on the host. 
I'm using functions of libnuma to populate the maxpkgs of the host. 
I tested this on different Intel CPU with multiple packages and this 
has always returned the good number of packages. A false positive ?

So here I'm checking if the thread has run on the package number 'i'. 
I populate 'numa_node_id' with numa_node_of_cpu().

I did not wanted to reinvent the wheel and the only lib that was talking 
about "node" was libnuma.

Maybe I'm wrong assuming that a "node" (defined as an area where all 
memory has the same speed as seen from a particular CPU) could lead me 
to the packages number ?

And this is what I see you wrote below: 
"A numa node isn't a package AFAICT."


Regards,
Anthony

Re: [PATCH v3 3/3] Add support for RAPL MSRs in KVM/Qemu

Reply via email to