Currently, the CPU topology for the converted domain is determined as follows:
(1) main() [main.c] (2) set_config_defaults() [main.c] vcpus <-- sysconf (_SC_NPROCESSORS_ONLN) get_cpu_config() [cpuid.c] get_topology() [cpuid.c] sockets <-- lscpu (physical) cores <-- lscpu (physical) threads <-- lscpu (physical) (3) update_config_from_kernel_cmdline() [kernel-config.c] vcpus <-- kernel cmdline sockets <-- kernel cmdline cores <-- kernel cmdline threads <-- kernel cmdline (4) opt#1: kernel_conversion() [kernel.c] no change to topology info opt#2: gui_conversion() [gui.c] vcpus <-- GUI (5) ... start_conversion() [conversion.c] generate_physical_xml() [physical-xml.c] format vcpus, sockets, cores, threads In (2), we initialize the topology information from the physical machine. In (3), we override each property in isolation from the kernel command line (this is done for both kernel conversion and GUI conversion; in the former case, it is the only way for the user to influence each element of the topology). In (4), in case we picked GUI conversion, the VCPU number can be overridden yet another time on the GUI. In (5), we format the topology information into the domain XML. The problem is that it's easy to create inconsistent topologies that libvirt complains about. One example is that in step (4) on the GUI, if the flat VCPU count is reduced by the user, it can result in sockets partially populated with cores or threads, or in cores partially populated with threads. Another example (even if the user does not touch the VCPU count) is a partially onlined CPU topology on the physical machine: _SC_NPROCESSORS_ONLN does not reflect any topology information, and if the flat number it returns is less than the (sockets * cores * threads) product from "lscpu", then the online logical processors will be "compressed to the left" by libvirt just the same as after the manual VCPU count reduction. An over-complicated way to fix this would be the following: - parse the output of "lscpu -p" (which is available since RHEL6 -- "lscpu" is altogether missing in RHEL5), retrieving online-ness combined with full topology information - expose complete topology info on the GUI - format the complete topology information for libvirt. A simpler way (which is what this patch series chooses) is to remove some flexibility from virt-p2v's configuration interface. Namely, only allow the user to choose betwen (a) copying the physical CPU topology, *fully populated*, (b) specifying the core count N for a "1 socket * N cores/socket * 1 thread/core" topology. Option (a) eliminates the "partially onlined physical CPU topology" scenario; it produces a topology that exists in the physical world, and does not require p2v to maintain topology information more expressively than it currently does. Option (b) eliminates any conflicts between the physical (or any user-configured) sockets:cores:threads hierarchy and a manually set logical processor count. Option (b) produces the kind of flat CPU topology that QEMU has defaulted to since version 6.2, with a simple "-smp" option. Option (b) preserves the simple p2v user interface for entering only a logical processor count. In this patch: - delete the (sockets, cores, threads) triplet from the p2v config structure, moving it out to a stand-alone structure, - introduce a new boolean config element (taking priority above the existent flat VCPU count) for copying -- densely populated -- the physical CPU topology, - always render the full topology into the domain XML, handling both options (a) and (b) -- for option (a), invoke "lscpu" right when the physical topology is needed, - make option (b) the default (sticking with current practice), - do not expose option (a) on the GUI yet. (An interesting question is whether post-patch, in GUI mode, where the conversion runs in a separate thread, it is safe to fork a child process for "lscpu" from said thread, with popen(): start_conversion_thread() -> start_conversion() -> generate_physical_xml() -> get_cpu_topology() -> get_lscpu() -> popen(). This concern arises because in the child process, only the one thread returning from fork() exists, thus if any locks are held by other threads in the parent process, those will never be released in the child process. But such a fork() should be safe: existent code already starts e.g. nbdkit on the same call path, namely start_conversion_thread() -> start_conversion() -> start_nbd_server() -> start_nbdkit() -> fork().) Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1590721 Signed-off-by: Laszlo Ersek <ler...@redhat.com> Reviewed-by: Richard W.M. Jones <rjo...@redhat.com> --- Notes: v2: - pick up Rich's R-b - rename dense_topo to phys_topo [Rich] - take Rich's suggested wording for the manual - rewrap the additions to the manual generate-p2v-config.pl | 45 ++++++++-------- p2v.h | 6 +++ cpuid.c | 39 ++++++++------ gui.c | 4 +- main.c | 8 +-- physical-xml.c | 54 ++++++++++---------- test-virt-p2v-cmdline.sh | 5 +- 7 files changed, 88 insertions(+), 73 deletions(-) diff --git a/generate-p2v-config.pl b/generate-p2v-config.pl index 06a2e2c8b0e4..47487f7b0f35 100755 --- a/generate-p2v-config.pl +++ b/generate-p2v-config.pl @@ -110,16 +110,19 @@ my @fields = [ ], ), ConfigString->new(name => 'guestname'), - ConfigInt->new(name => 'vcpus', value => 0), + ConfigSection->new( + name => 'vcpu', + elements => [ + ConfigBool->new(name => 'phys_topo'), + ConfigInt->new(name => 'cores', value => 0), + ], + ), ConfigUInt64->new(name => 'memory'), ConfigSection->new( name => 'cpu', elements => [ ConfigString->new(name => 'vendor'), ConfigString->new(name => 'model'), - ConfigUnsigned->new(name => 'sockets'), - ConfigUnsigned->new(name => 'cores'), - ConfigUnsigned->new(name => 'threads'), ConfigBool->new(name => 'acpi'), ConfigBool->new(name => 'apic'), ConfigBool->new(name => 'pae'), @@ -231,11 +234,21 @@ The name of the guest that is created. The default is to try to derive a name from the physical machine’s hostname (if possible) else use a randomly generated name.", ), - "p2v.vcpus" => manual_entry->new( + "p2v.vcpu.phys_topo" => manual_entry->new( + shortopt => "", # ignored for booleans + description => " +Copy the physical machine's complete CPU topology (sockets, cores and +threads) to the guest. Disabled by default. If disabled, the +C<p2v.vcpu.cores> setting takes effect.", + ), + "p2v.vcpu.cores" => manual_entry->new( shortopt => "N", description => " -The number of virtual CPUs to give to the guest. The default is to -use the same as the number of physical CPUs.", +This setting is ignored if C<p2v.vcpu.phys_topo> is enabled. Otherwise, +it specifies the flat number of vCPU cores to give to the guest (placing +all of those cores into a single socket, and exposing one thread per +core). The default value is the number of online logical processors on +the physical machine.", ), "p2v.memory" => manual_entry->new( shortopt => "n(M|G)", @@ -258,24 +271,6 @@ the same CPU vendor as the physical machine.", description => " The vCPU model, eg. \"IvyBridge\". The default is to use the same CPU model as the physical machine.", - ), - "p2v.cpu.sockets" => manual_entry->new( - shortopt => "N", - description => " -Number of vCPU sockets to use. The default is to use the same as the -physical machine.", - ), - "p2v.cpu.cores" => manual_entry->new( - shortopt => "N", - description => " -Number of vCPU cores to use. The default is to use the same as the -physical machine.", - ), - "p2v.cpu.threads" => manual_entry->new( - shortopt => "N", - description => " -Number of vCPU hyperthreads to use. The default is to use the same -as the physical machine.", ), "p2v.cpu.acpi" => manual_entry->new( shortopt => "", # ignored for booleans diff --git a/p2v.h b/p2v.h index 20c4ac7e506a..32dc55d94de5 100644 --- a/p2v.h +++ b/p2v.h @@ -60,6 +60,12 @@ extern int feature_colours_option; extern int force_colour; /* cpuid.c */ +struct cpu_topo { + unsigned sockets; + unsigned cores; + unsigned threads; +}; +extern void get_cpu_topology (struct cpu_topo *topo); extern void get_cpu_config (struct cpu_config *); /* rtc.c */ diff --git a/cpuid.c b/cpuid.c index 84603bbb3374..31a1720dc23c 100644 --- a/cpuid.c +++ b/cpuid.c @@ -154,22 +154,32 @@ get_vendor (char **lscpu, struct cpu_config *cpu) } /** - * Read the CPU topology from lscpu output. + * Read the CPU topology from a separate lscpu invocation. */ -static void -get_topology (char **lscpu, struct cpu_config *cpu) +void +get_cpu_topology (struct cpu_topo *topo) { - const char *v; - - v = get_field (lscpu, "Socket(s)"); - if (v) - ignore_value (sscanf (v, "%u", &cpu->sockets)); - v = get_field (lscpu, "Core(s) per socket"); - if (v) - ignore_value (sscanf (v, "%u", &cpu->cores)); - v = get_field (lscpu, "Thread(s) per core"); - if (v) - ignore_value (sscanf (v, "%u", &cpu->threads)); + + CLEANUP_FREE_STRING_LIST char **lscpu = NULL; + + lscpu = get_lscpu (); + + if (lscpu != NULL) { + const char *sockets, *cores, *threads; + + sockets = get_field (lscpu, "Socket(s)"); + cores = get_field (lscpu, "Core(s) per socket"); + threads = get_field (lscpu, "Thread(s) per core"); + if (sockets && cores && threads) { + ignore_value (sscanf (sockets, "%u", &topo->sockets)); + ignore_value (sscanf (cores, "%u", &topo->cores)); + ignore_value (sscanf (threads, "%u", &topo->threads)); + return; + } + } + topo->sockets = 1; + topo->cores = 1; + topo->threads = 1; } /** @@ -211,7 +221,6 @@ get_cpu_config (struct cpu_config *cpu) lscpu = get_lscpu (); if (lscpu != NULL) { get_vendor (lscpu, cpu); - get_topology (lscpu, cpu); get_flags (lscpu, cpu); } diff --git a/gui.c b/gui.c index 5c4f1343095a..b4a9fed18410 100644 --- a/gui.c +++ b/gui.c @@ -767,7 +767,7 @@ create_conversion_dialog (struct config *config) set_alignment (vcpus_label, 1., 0.5); vcpus_entry = gtk_entry_new (); gtk_label_set_mnemonic_widget (GTK_LABEL (vcpus_label), vcpus_entry); - snprintf (vcpus_str, sizeof vcpus_str, "%d", config->vcpus); + snprintf (vcpus_str, sizeof vcpus_str, "%d", config->vcpu.cores); gtk_entry_set_text (GTK_ENTRY (vcpus_entry), vcpus_str); table_attach (target_tbl, vcpus_entry, 1, 2, 1, 2, GTK_FILL, GTK_FILL, 1, 1); @@ -2010,7 +2010,7 @@ start_conversion_clicked (GtkWidget *w, gpointer data) return; } - config->vcpus = get_vcpus_from_conv_dlg (); + config->vcpu.cores = get_vcpus_from_conv_dlg (); config->memory = get_memory_from_conv_dlg (); /* Get the list of disks to be converted. */ diff --git a/main.c b/main.c index 0ebb7291c7ce..6c44183b65b3 100644 --- a/main.c +++ b/main.c @@ -291,16 +291,18 @@ set_config_defaults (struct config *config) } config->guestname = strdup (hostname); + config->vcpu.phys_topo = false; + /* Defaults for #vcpus and memory are taken from the physical machine. */ i = sysconf (_SC_NPROCESSORS_ONLN); if (i == -1) { perror ("sysconf: _SC_NPROCESSORS_ONLN"); - config->vcpus = 1; + config->vcpu.cores = 1; } else if (i == 0) - config->vcpus = 1; + config->vcpu.cores = 1; else - config->vcpus = i; + config->vcpu.cores = i; i = sysconf (_SC_PHYS_PAGES); if (i == -1) { diff --git a/physical-xml.c b/physical-xml.c index 4e830ea7ee0c..983798f8a365 100644 --- a/physical-xml.c +++ b/physical-xml.c @@ -62,6 +62,7 @@ generate_physical_xml (struct config *config, struct data_conn *data_conns, uint64_t memkb; CLEANUP_XMLFREETEXTWRITER xmlTextWriterPtr xo = NULL; size_t i; + struct cpu_topo topo; xo = xmlNewTextWriterFilename (filename, 0); if (xo == NULL) @@ -104,34 +105,35 @@ generate_physical_xml (struct config *config, struct data_conn *data_conns, string_format ("%" PRIu64, memkb); } end_element (); - single_element_format ("vcpu", "%d", config->vcpus); - - if (config->cpu.vendor || config->cpu.model || - config->cpu.sockets || config->cpu.cores || config->cpu.threads) { - /* https://libvirt.org/formatdomain.html#elementsCPU */ - start_element ("cpu") { - attribute ("match", "minimum"); - if (config->cpu.vendor) - single_element ("vendor", config->cpu.vendor); - if (config->cpu.model) { - start_element ("model") { - attribute ("fallback", "allow"); - string (config->cpu.model); - } end_element (); - } - if (config->cpu.sockets || config->cpu.cores || config->cpu.threads) { - start_element ("topology") { - if (config->cpu.sockets) - attribute_format ("sockets", "%u", config->cpu.sockets); - if (config->cpu.cores) - attribute_format ("cores", "%u", config->cpu.cores); - if (config->cpu.threads) - attribute_format ("threads", "%u", config->cpu.threads); - } end_element (); - } - } end_element (); + if (config->vcpu.phys_topo) + get_cpu_topology (&topo); + else { + topo.sockets = 1; + topo.cores = config->vcpu.cores; + topo.threads = 1; } + single_element_format ("vcpu", "%u", + topo.sockets * topo.cores * topo.threads); + + /* https://libvirt.org/formatdomain.html#elementsCPU */ + start_element ("cpu") { + attribute ("match", "minimum"); + if (config->cpu.vendor) + single_element ("vendor", config->cpu.vendor); + if (config->cpu.model) { + start_element ("model") { + attribute ("fallback", "allow"); + string (config->cpu.model); + } end_element (); + } + start_element ("topology") { + attribute_format ("sockets", "%u", topo.sockets); + attribute_format ("cores", "%u", topo.cores); + attribute_format ("threads", "%u", topo.threads); + } end_element (); + } end_element (); + switch (config->rtc.basis) { case BASIS_UNKNOWN: /* Don't emit any <clock> element. */ diff --git a/test-virt-p2v-cmdline.sh b/test-virt-p2v-cmdline.sh index cd76044195a3..79c3061b1e7c 100755 --- a/test-virt-p2v-cmdline.sh +++ b/test-virt-p2v-cmdline.sh @@ -26,7 +26,7 @@ out=test-virt-p2v-cmdline.out rm -f $out # The Linux kernel command line. -$VG virt-p2v --cmdline='p2v.server=localhost p2v.port=123 p2v.username=user p2v.password=secret p2v.skip_test_connection p2v.name=test p2v.vcpus=4 p2v.memory=1G p2v.disks=sda,sdb,sdc p2v.removable=sdd p2v.interfaces=eth0,eth1 p2v.o=local p2v.oa=sparse p2v.oc=qemu:///session p2v.of=raw p2v.os=/var/tmp p2v.network=em1:wired,other p2v.dump_config_and_exit' > $out +$VG virt-p2v --cmdline='p2v.server=localhost p2v.port=123 p2v.username=user p2v.password=secret p2v.skip_test_connection p2v.name=test p2v.vcpu.cores=4 p2v.memory=1G p2v.disks=sda,sdb,sdc p2v.removable=sdd p2v.interfaces=eth0,eth1 p2v.o=local p2v.oa=sparse p2v.oc=qemu:///session p2v.of=raw p2v.os=/var/tmp p2v.network=em1:wired,other p2v.dump_config_and_exit' > $out # For debugging purposes. cat $out @@ -37,7 +37,8 @@ grep "^remote\.port.*123" $out grep "^auth\.username.*user" $out grep "^auth\.sudo.*false" $out grep "^guestname.*test" $out -grep "^vcpus.*4" $out +grep "^vcpu.phys_topo.*false" $out +grep "^vcpu.cores.*4" $out grep "^memory.*"$((1024*1024*1024)) $out grep "^disks.*sda sdb sdc" $out grep "^removable.*sdd" $out _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs