Re: [PATCH] target/arm: fix MPIDR value for ARM CPUs with SMT
On 4/19/24 11:31, Dorjoy Chowdhury wrote:

> +    /*
> +     * Instantiate a temporary CPU object to build mp_affinity
> +     * of the possible CPUs.
> +     */
> +    cpuobj = object_new(ms->cpu_type);
> +    armcpu = ARM_CPU(cpuobj);
> +
>      for (n = 0; n < ms->possible_cpus->len; n++) {
>          ms->possible_cpus->cpus[n].type = ms->cpu_type;
>          ms->possible_cpus->cpus[n].arch_id =
> -            sbsa_ref_cpu_mp_affinity(sms, n);
> +            sbsa_ref_cpu_mp_affinity(armcpu, n);
>          ms->possible_cpus->cpus[n].props.has_thread_id = true;
>          ms->possible_cpus->cpus[n].props.thread_id = n;
>      }
> +
> +    object_unref(cpuobj);

Why is this instantiation necessary?

> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -1314,8 +1314,18 @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags)
>      }
>  }
>
> -uint64_t arm_build_mp_affinity(int idx, uint8_t clustersz)
> +uint64_t arm_build_mp_affinity(ARMCPU *cpu, int idx, uint8_t clustersz)
>  {
> +    if (cpu->has_smt) {
> +        /*
> +         * Right now, the ARM CPUs with SMT supported by QEMU only have
> +         * one thread per core. So Aff0 is always 0.
> +         */

Well, this isn't true.

    -smp [[cpus=]n][,maxcpus=maxcpus][,drawers=drawers][,books=books][,sockets=sockets]
         [,dies=dies][,clusters=clusters][,cores=cores][,threads=threads]

I would expect all of Aff[0-3] to be settable with the proper topology
parameters.

r~
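For context, the kind of mapping being discussed could be sketched as below. This is a stand-alone illustration, not QEMU's actual arm_build_mp_affinity(); the field shifts follow the MPIDR_EL1 layout (Aff0 bits [7:0], Aff1 bits [15:8], Aff2 bits [23:16]), and the `threads`/`cores_per_cluster` parameters are hypothetical stand-ins for the -smp topology values, so that Aff0 carries the thread ID instead of being hard-coded to 0:

```c
#include <stdint.h>

/*
 * Illustrative sketch only -- not QEMU's arm_build_mp_affinity().
 * Packs a linear CPU index into MPIDR affinity fields so that
 * -smp threads=N lands in Aff0 rather than forcing Aff0 = 0.
 */
static uint64_t mp_affinity(int idx, int threads, int cores_per_cluster)
{
    uint64_t aff0 = idx % threads;            /* thread within core */
    int core = idx / threads;
    uint64_t aff1 = core % cores_per_cluster; /* core within cluster */
    uint64_t aff2 = core / cores_per_cluster; /* cluster */

    return (aff2 << 16) | (aff1 << 8) | aff0;
}
```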
Re: [PATCH 04/24] exec: Restrict TCG specific declarations of 'cputlb.h'
On 4/18/24 12:25, Philippe Mathieu-Daudé wrote:
> Avoid TCG specific declarations being used from non-TCG accelerators.
>
> Signed-off-by: Philippe Mathieu-Daudé
> ---
>  include/exec/cputlb.h | 5 +
>  1 file changed, 5 insertions(+)

Reviewed-by: Richard Henderson

r~
Re: [PATCH 03/24] hw/core: Avoid including the full 'hw/core/cpu.h' in 'tcg-cpu-ops.h'
On 4/18/24 12:25, Philippe Mathieu-Daudé wrote:
> Only include what is required, avoiding the full CPUState API from the
> huge "hw/core/cpu.h" header.
>
> Signed-off-by: Philippe Mathieu-Daudé
> ---
>  include/hw/core/tcg-cpu-ops.h | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson

r~
Re: [PATCH 02/24] exec: Declare CPUBreakpoint/CPUWatchpoint type in 'breakpoint.h' header
On 4/18/24 12:25, Philippe Mathieu-Daudé wrote:
> The CPUBreakpoint and CPUWatchpoint structures are declared in
> "hw/core/cpu.h", which contains declarations related to CPUState and
> CPUClass. Some source files only require the BP/WP definitions and
> don't need to pull in all CPU* API. In order to simplify, create a new
> "exec/breakpoint.h" header.
>
> Signed-off-by: Philippe Mathieu-Daudé
> ---
>  include/exec/breakpoint.h | 23 +++
>  include/hw/core/cpu.h     | 16 +---
>  target/arm/internals.h    |  1 +
>  target/ppc/internal.h     |  1 +
>  target/riscv/debug.h      |  2 ++
>  5 files changed, 28 insertions(+), 15 deletions(-)
>  create mode 100644 include/exec/breakpoint.h

Reviewed-by: Richard Henderson

r~
Re: [PATCH 01/24] exec: Declare MMUAccessType type in 'mmu-access-type.h' header
On 4/18/24 12:25, Philippe Mathieu-Daudé wrote:
> The MMUAccessType enum is declared in "hw/core/cpu.h".
> "hw/core/cpu.h" contains declarations related to CPUState and
> CPUClass. Some source files only require MMUAccessType and don't need
> to pull in all CPU* declarations. In order to simplify, create a new
> "exec/mmu-access-type.h" header.
>
> Signed-off-by: Philippe Mathieu-Daudé
> ---
>  include/exec/cpu_ldst.h        |  1 +
>  include/exec/exec-all.h        |  1 +
>  include/exec/mmu-access-type.h | 18 ++
>  include/hw/core/cpu.h          |  8 +---
>  4 files changed, 21 insertions(+), 7 deletions(-)
>  create mode 100644 include/exec/mmu-access-type.h

Reviewed-by: Richard Henderson

r~
Re: [PATCH v7 00/12] Enabling DCD emulation support in Qemu
On Fri, Apr 19, 2024 at 11:43:14AM -0700, fan wrote:
> On Fri, Apr 19, 2024 at 02:24:36PM -0400, Gregory Price wrote:
> >
> > added review to all patches, will hopefully be able to add a
> > Tested-by tag early next week, along with a v1 RFC for MHD
> > bit-tracking.
> >
> > We've been testing v5/v6 for a bit, so I expect as soon as we get
> > the MHD code ported over to v7 i'll ship a tested-by tag pretty
> > quick.
> >
> > The super-set release will complicate a few things but this doesn't
> > look like a blocker on our end, just a change to how we track bits
> > in a shared bit/bytemap.
> >
>
> Hi Gregory,
> Thanks for reviewing the patches so quickly.
>
> No pressure, but looking forward to your MHD work. :)
>
> Fan

We're starting to get into versioning hell a bit, since the Niagara
work was based off of Jonathan's branch and the mhd-dcd work needs some
of the extensions from that branch - while this branch is based on
master.

We'll probably need to wait for a new dated cxl branch to suss out the
pain points before we push an RFC. I would not want to have conflicting
commits for something like this, for example:

https://lore.kernel.org/qemu-devel/20230901012914.226527-2-gregory.pr...@memverge.com/

We get merge conflicts here because this branch is behind that patch,
so pushing up an RFC in this state would be mostly useless to everyone.

~Gregory
Re: cross-i686-tci CI job is flaky again (timeouts): can somebody who cares about TCI investigate?
Am 16.04.24 um 14:17 schrieb Stefan Weil:
> Am 16.04.24 um 14:10 schrieb Peter Maydell:
>> The cross-i686-tci job is flaky again, with persistent intermittent
>> failures due to jobs timing out. [...]
>> Some of these timeouts are very high -- no test should be taking 10
>> minutes, even given TCI and a slowish CI runner -- which suggests to
>> me that there's some kind of intermittent deadlock going on. Can
>> somebody who cares about TCI investigate, please, and track down
>> whatever this is?
>
> I'll have a look.

Short summary:

The "persistent intermittent failures due to jobs timing out" are not
related to TCI: they also occur if the same tests are run with the
normal TCG. I suggest that the CI tests should run single threaded.

But let's have a look at the details of my results.

I ran `time make test` using different scenarios on the rather old and
not so performant VM which I typically use for QEMU builds. I did not
restrict the tests to selected architectures like it is done in the
QEMU CI tests. Therefore I had more tests, which all ended
successfully:

    Ok:              848
    Expected Fail:   0
    Fail:            0
    Unexpected Pass: 0
    Skipped:         68
    Timeout:         0

--- 1st test with normal TCG

`nohup time ../configure --enable-modules --disable-werror &&
 nohup time make -j4 && nohup time make test`

[...]
Full log written to /home/stefan/src/gitlab/qemu-project/qemu/bin/ndebug/x86_64-linux-gnu/meson-logs/testlog.txt
2296.08user 1525.02system 21:49.78elapsed 291%CPU
(0avgtext+0avgdata 633476maxresident)k
1730448inputs+14140528outputs (11668major+56827263minor)pagefaults 0swaps

--- 2nd test with TCI

`nohup time ../configure --enable-modules --disable-werror
 --enable-tcg-interpreter && nohup time make -j4 && nohup time make test`

[...]
Full log written to /home/stefan/src/gitlab/qemu-project/qemu/bin/ndebug/x86_64-linux-gnu/meson-logs/testlog.txt
3766.74user 1521.38system 26:50.51elapsed 328%CPU
(0avgtext+0avgdata 633012maxresident)k
32768inputs+14145080outputs (3033major+56121586minor)pagefaults 0swaps

---

So the total test time with TCI was 26:50.51 minutes, while for the
normal test it was 21:49.78 minutes.

These 10 single tests had the longest duration:

1st test with normal TCG

 94/916 qtest-arm / qtest-arm/qom-test                    373.41s
 99/916 qtest-aarch64 / qtest-aarch64/qom-test            398.43s
100/916 qtest-i386 / qtest-i386/bios-tables-test          188.06s
103/916 qtest-x86_64 / qtest-x86_64/bios-tables-test      228.33s
106/916 qtest-aarch64 / qtest-aarch64/migration-test      201.15s
119/916 qtest-i386 / qtest-i386/migration-test            253.58s
126/916 qtest-x86_64 / qtest-x86_64/migration-test        266.66s
143/916 qtest-arm / qtest-arm/test-hmp                    101.72s
144/916 qtest-aarch64 / qtest-aarch64/test-hmp            113.10s
163/916 qtest-arm / qtest-arm/aspeed_smc-test             256.92s

2nd test with TCI

 68/916 qtest-arm / qtest-arm/qom-test                    375.35s
 82/916 qtest-aarch64 / qtest-aarch64/qom-test            403.50s
 93/916 qtest-i386 / qtest-i386/bios-tables-test          192.22s
 99/916 qtest-aarch64 / qtest-aarch64/bios-tables-test    379.92s
100/916 qtest-x86_64 / qtest-x86_64/bios-tables-test      240.19s
103/916 qtest-aarch64 / qtest-aarch64/migration-test      223.49s
106/916 qtest-ppc64 / qtest-ppc64/pxe-test                418.42s
113/916 qtest-i386 / qtest-i386/migration-test            284.96s
118/916 qtest-arm / qtest-arm/aspeed_smc-test             271.10s
119/916 qtest-x86_64 / qtest-x86_64/migration-test        287.36s

---

Summary:

TCI is not much slower than the normal TCG. Surprisingly it was even
faster for tests 99 and 103. For other tests like test 106 TCI was
about half as fast as normal TCG, but in summary it is not "factors"
slower. A total test time of 26:50 minutes is also not so bad compared
with the 21:49 minutes of the normal TCG.

No single test (including subtests) with TCI exceeded 10 minutes; the
longest one was well below that margin at 418 seconds.

---

The tests above were running with x86_64, and I could not reproduce
timeouts. The GitLab CI tests were running with i686 and used different
configure options. Therefore I made additional tests with 32 bit builds
in a chroot environment (Debian GNU/Linux bullseye i386) with the
original configure options. As expected that reduced the number of
tests to 250. All tests passed with `make test`:

3rd test with normal TCG

    Ok:              250
    Expected Fail:   0
    Fail:            0
    Unexpected Pass: 0
    Skipped:         8
    Timeout:         0

Full log written to /root/qemu/bin/ndebug/i586-linux-gnu/meson-logs/testlog.txt
855.30user 450.53system 6:45.57elapsed 321%CPU
(0avgtext+0avgdata 609180maxresident)k
28232inputs+4772968outputs (64944major+8328814minor)pagefaults 0swaps

4th test with TCI

    Ok:              250
    Expected Fail:   0
    Fail:            0
    Unexpected Pass: 0
    Skipped:         8
    Timeout:         0

Full log written to
xlnx-versal-virt machine in qemu
Hi,

I have booted the xlnx-versal-virt machine using qemu-system-aarch64
and want to work with the CAN devices modelled in this machine. I am
using the xilinx_can.c driver for them and can see the two CAN
controllers. The problem is that I am not able to see any interrupts in
/proc/interrupts for either CAN device, although I have set both of
them up and running. I have also connected the CAN bus device to the
host to transmit and receive CAN packets, and I can see qemu_set_irq()
getting called. Am I missing something?

Thanks,
Abhijeet, India
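In case it helps others reproduce the setup being described, a typical invocation looks roughly like the following. This is a sketch based on the machine's canbus0/canbus1 properties and the SocketCAN host backend; the host interface name `can0` and the elided boot options are assumptions about this particular setup:

```shell
# Create two CAN buses, attach them to the Versal CAN controllers, and
# bridge the first bus to the host's SocketCAN interface "can0"
# (assumes the host interface is already up, e.g. via
# "ip link set can0 up type can bitrate 1000000").
qemu-system-aarch64 -M xlnx-versal-virt,canbus0=canbus0,canbus1=canbus1 \
    -object can-bus,id=canbus0 \
    -object can-bus,id=canbus1 \
    -object can-host-socketcan,id=canhost0,if=can0,canbus=canbus0 \
    ...   # plus the usual -m/-kernel/-initrd/-serial options
```

With a setup like this, frames sent on the guest's first controller should appear on the host's can0 interface (visible with candump) and vice versa.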
Re: [PATCH 3/3] target/arm: Default to 1GHz cntfrq for 'max' and new CPUs
On 4/19/24 11:46, Peter Maydell wrote:
> In previous versions of the Arm architecture, the frequency of the
> generic timers as reported in CNTFRQ_EL0 could be any IMPDEF value,
> and for QEMU we picked 62.5MHz, giving a timer tick period of 16ns.
> In Armv8.6, the architecture standardized this frequency to 1GHz.
>
> Because there is no ID register feature field that indicates whether a
> CPU is v8.6 or that it ought to have this counter frequency, we
> implement this by changing our default CNTFRQ value for all CPUs, with
> exceptions for backwards compatibility:
>
>  * CPU types which we already implement will retain the old default
>    value. None of these are v8.6 CPUs, so this is architecturally OK.
>  * CPUs used in versioned machine types with a version of 9.0 or
>    earlier will retain the old default value.
>
> The upshot is that the only CPU type that changes is 'max'; but any
> new type we add in future (whether v8.6 or not) will also get the new
> 1GHz default.
>
> It remains the case that the machine model can override the default
> value via the 'cntfrq' QOM property (regardless of the CPU type).
>
> Signed-off-by: Peter Maydell
> ---
>  target/arm/cpu.h       | 11 +++
>  target/arm/internals.h | 12 ++--
>  hw/core/machine.c      |  4 +++-
>  target/arm/cpu.c       | 28 ++--
>  target/arm/cpu64.c     |  2 ++
>  target/arm/tcg/cpu32.c |  4
>  target/arm/tcg/cpu64.c | 18 ++
>  7 files changed, 70 insertions(+), 9 deletions(-)

Reviewed-by: Richard Henderson

r~
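The frequency/period arithmetic quoted in the commit message (62.5MHz gives a 16ns tick, 1GHz gives a 1ns tick) is just period_ns = 10^9 / cntfrq. As a quick sanity check, an illustrative helper (not QEMU's actual code, though it mirrors what a gt_cntfrq_period_ns()-style helper computes):

```c
#include <stdint.h>

/*
 * Illustrative: generic timer tick period in nanoseconds for a given
 * CNTFRQ_EL0 value. 62.5MHz -> 16ns, 1GHz (Armv8.6) -> 1ns.
 */
static uint64_t gt_period_ns(uint64_t cntfrq_hz)
{
    return 1000000000ull / cntfrq_hz;
}
```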
Re: [PATCH 2/3] target/arm: Refactor default generic timer frequency handling
On 4/19/24 11:46, Peter Maydell wrote:
> The generic timer frequency is settable by board code via a QOM
> property "cntfrq", but otherwise defaults to 62.5MHz. The way this is
> done includes some complication resulting from how this was originally
> a fixed value with no QOM property. Clean it up:
>
>  * always set cpu->gt_cntfrq_hz to some sensible value, whether the
>    CPU has the generic timer or not, and whether it's system or
>    user-only emulation
>  * this means we can always use gt_cntfrq_hz, and never need the old
>    GTIMER_SCALE define
>  * set the default value in exactly one place, in the realize fn
>
> The aim here is to pave the way for handling the ARMv8.6 requirement
> that the generic timer frequency is always 1GHz. We're going to do
> that by having old CPU types keep their legacy-in-QEMU behaviour and
> having the default for any new CPU types be a 1GHz rather than 62.5MHz
> cntfrq, so we want the point where the default is decided to be in one
> place, and in code, not in a DEFINE_PROP_UINT64() initializer.
>
> This commit should have no behavioural changes.
>
> Signed-off-by: Peter Maydell
> ---
>  target/arm/internals.h |  7 ---
>  target/arm/cpu.c       | 31 +--
>  target/arm/helper.c    | 16
>  3 files changed, 29 insertions(+), 25 deletions(-)

Reviewed-by: Richard Henderson

r~
Re: [PATCH 0/3] Remove useless architecture prefix from the CPU list
20.04.2024 08:46, Thomas Huth:
> Printing an architecture prefix in front of each CPU name is not
> helpful at all: It is confusing for the users since they don't know
> whether they have to specify these letters for the "-cpu" parameter,
> too, and it also takes some precious space in the dense output of the
> CPU entries. Let's simply remove those now.
>
> Thomas Huth (3):
>   target/i386/cpu: Remove "x86" prefix from the CPU list
>   target/s390x/cpu_models: Rework the output of "-cpu help"
>   target/ppc/cpu_init: Remove "PowerPC" prefix from the CPU list

Reviewed-by: Michael Tokarev

I'll pick it up for trivial-patches after 9.0 is out.

This also reminded me about
https://gitlab.com/qemu-project/qemu/-/issues/2141

/mjt
Re: [PATCH] tests/unit: Remove debug statements in test-nested-aio-poll.c
On 4/19/24 01:58, Philippe Mathieu-Daudé wrote:
> We have been running this test for almost a year; it is safe to remove
> its debug statements, which clutter CI job output:
>
>   ▶  88/100 /nested-aio-poll  OK
>   io_read 0x16bb26158
>   io_poll_true 0x16bb26158
>   > io_poll_ready
>   io_read 0x16bb26164
>   < io_poll_ready
>   io_poll_true 0x16bb26158
>   io_poll_false 0x16bb26164
>   > io_poll_ready
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_poll_false 0x16bb26164
>   io_read 0x16bb26164
>   < io_poll_ready
>   88/100 qemu:unit / test-nested-aio-poll  OK
>
> Signed-off-by: Philippe Mathieu-Daudé
> ---
>  tests/unit/test-nested-aio-poll.c | 7 ---
>  1 file changed, 7 deletions(-)

Reviewed-by: Richard Henderson

r~
Re: [PATCH v2 0/4] Sparc CPU naming and help text improvements
On 4/19/24 01:48, Thomas Huth wrote:
> Thomas Huth (4):
>   target/sparc/cpu: Rename the CPU models with a "+" in their names
>   target/sparc/cpu: Avoid spaces by default in the CPU names
>   docs/system/target-sparc: Improve the Sparc documentation
>   docs/about: Deprecate the old "UltraSparc" CPU names that contain a "+"

Reviewed-by: Richard Henderson

r~
Re: [PATCH 0/3] Remove useless architecture prefix from the CPU list
On 4/19/24 22:46, Thomas Huth wrote:
> Thomas Huth (3):
>   target/i386/cpu: Remove "x86" prefix from the CPU list
>   target/s390x/cpu_models: Rework the output of "-cpu help"
>   target/ppc/cpu_init: Remove "PowerPC" prefix from the CPU list

Reviewed-by: Richard Henderson

r~
Re: [PATCH] target/riscv: Use get_address() to get address with Zicbom extensions
On 4/19/24 04:05, Philippe Mathieu-Daudé wrote:
> We need to use get_address() to get an address from cpu_gpr[], since
> $zero is "special" (NULL).
>
> Fixes: e05da09b7c ("target/riscv: implement Zicbom extension")
> Reported-by: Zhiwei Jiang (姜智伟)
> Signed-off-by: Philippe Mathieu-Daudé
> ---
>  target/riscv/insn_trans/trans_rvzicbo.c.inc | 8
>  1 file changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson

r~
Re: [PATCH 5/8] target/ppc: Move multiply fixed-point insns (64-bit operands) to decodetree.
On 4/19/24 02:25, Chinmay Rath wrote:
> Hi Richard,
>
> On 4/17/24 00:06, Richard Henderson wrote:
>> On 4/15/24 23:39, Chinmay Rath wrote:
>>> +static bool trans_MADDHDU(DisasContext *ctx, arg_MADDHDU *a)
>> ...
>>> +    tcg_gen_movi_i64(t1, 0);
>>
>> Drop the movi.
>>
>>> +    tcg_gen_add2_i64(t1, cpu_gpr[a->vrt], lo, hi, cpu_gpr[a->rc], t1);
>>
>> Use tcg_constant_i64(0).
>
> Looks like tcg_gen_add2_i64 internally modifies the passed arguments,
> hence a constant is not expected. However, I tried using
> tcg_constant_i64(0) as suggested, but this leads to an assert failure:
>
>   qemu-system-ppc64: ../tcg/tcg.c:5071: tcg_reg_alloc_op:
>   Assertion `!temp_readonly(ts)' failed.

You misunderstood my suggestion.

    TCGv_i64 t1 = tcg_temp_new_i64();
    tcg_gen_add2_i64(t1, cpu_gpr[a->vrt], lo, hi, cpu_gpr[a->rc],
                     tcg_constant_i64(0));

r~
Re: [PATCH 0/3] Remove useless architecture prefix from the CPU list
On 20.04.24 07:46, Thomas Huth wrote:
> Printing an architecture prefix in front of each CPU name is not
> helpful at all: It is confusing for the users since they don't know
> whether they have to specify these letters for the "-cpu" parameter,
> too, and it also takes some precious space in the dense output of the
> CPU entries. Let's simply remove those now.

Yes, I also never really understood the purpose.

--
Cheers,

David / dhildenb
Re: [PATCH v9 13/20] virtio-net: Return an error when vhost cannot enable RSS
On Tue, Apr 16, 2024 at 9:54 AM Akihiko Odaki wrote:
>
> On 2024/04/16 13:00, Jason Wang wrote:
> > On Mon, Apr 15, 2024 at 10:05 PM Yuri Benditovich wrote:
> >>
> >> On Wed, Apr 3, 2024 at 2:11 PM Akihiko Odaki wrote:
> >>>
> >>> vhost requires eBPF for RSS. When eBPF is not available, virtio-net
> >>> implicitly disables RSS even if the user explicitly requests it.
> >>> Return an error instead of implicitly disabling RSS if RSS is
> >>> requested but not available.
> >>>
> >>> Signed-off-by: Akihiko Odaki
> >>> ---
> >>>  hw/net/virtio-net.c | 97 ++---
> >>>  1 file changed, 48 insertions(+), 49 deletions(-)
> >>>
> >>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >>> index 61b49e335dea..3d53eba88cfc 100644
> >>> --- a/hw/net/virtio-net.c
> >>> +++ b/hw/net/virtio-net.c
> >>> @@ -793,9 +793,6 @@ static uint64_t virtio_net_get_features(VirtIODevice *vdev, uint64_t features,
> >>>          return features;
> >>>      }
> >>>
> >>> -    if (!ebpf_rss_is_loaded(&n->ebpf_rss)) {
> >>> -        virtio_clear_feature(&features, VIRTIO_NET_F_RSS);
> >>> -    }
> >>>      features = vhost_net_get_features(get_vhost_net(nc->peer), features);
> >>>      vdev->backend_features = features;
> >>>
> >>> @@ -3591,6 +3588,50 @@ static bool failover_hide_primary_device(DeviceListener *listener,
> >>>      return qatomic_read(&n->failover_primary_hidden);
> >>>  }
> >>>
> >>> +static void virtio_net_device_unrealize(DeviceState *dev)
> >>> +{
> >>> +    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >>> +    VirtIONet *n = VIRTIO_NET(dev);
> >>> +    int i, max_queue_pairs;
> >>> +
> >>> +    if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
> >>> +        virtio_net_unload_ebpf(n);
> >>> +    }
> >>> +
> >>> +    /* This will stop vhost backend if appropriate. */
> >>> +    virtio_net_set_status(vdev, 0);
> >>> +
> >>> +    g_free(n->netclient_name);
> >>> +    n->netclient_name = NULL;
> >>> +    g_free(n->netclient_type);
> >>> +    n->netclient_type = NULL;
> >>> +
> >>> +    g_free(n->mac_table.macs);
> >>> +    g_free(n->vlans);
> >>> +
> >>> +    if (n->failover) {
> >>> +        qobject_unref(n->primary_opts);
> >>> +        device_listener_unregister(&n->primary_listener);
> >>> +        migration_remove_notifier(&n->migration_state);
> >>> +    } else {
> >>> +        assert(n->primary_opts == NULL);
> >>> +    }
> >>> +
> >>> +    max_queue_pairs = n->multiqueue ? n->max_queue_pairs : 1;
> >>> +    for (i = 0; i < max_queue_pairs; i++) {
> >>> +        virtio_net_del_queue(n, i);
> >>> +    }
> >>> +    /* delete also control vq */
> >>> +    virtio_del_queue(vdev, max_queue_pairs * 2);
> >>> +    qemu_announce_timer_del(&n->announce_timer, false);
> >>> +    g_free(n->vqs);
> >>> +    qemu_del_nic(n->nic);
> >>> +    virtio_net_rsc_cleanup(n);
> >>> +    g_free(n->rss_data.indirections_table);
> >>> +    net_rx_pkt_uninit(n->rx_pkt);
> >>> +    virtio_cleanup(vdev);
> >>> +}
> >>> +
> >>>  static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> >>>  {
> >>>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >>> @@ -3760,53 +3801,11 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
> >>>
> >>>      net_rx_pkt_init(&n->rx_pkt);
> >>>
> >>> -    if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
> >>> -        virtio_net_load_ebpf(n);
> >>> -    }
> >>> -}
> >>> -
> >>> -static void virtio_net_device_unrealize(DeviceState *dev)
> >>> -{
> >>> -    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >>> -    VirtIONet *n = VIRTIO_NET(dev);
> >>> -    int i, max_queue_pairs;
> >>> -
> >>> -    if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS)) {
> >>> -        virtio_net_unload_ebpf(n);
> >>> +    if (virtio_has_feature(n->host_features, VIRTIO_NET_F_RSS) &&
> >>> +        !virtio_net_load_ebpf(n) && get_vhost_net(nc->peer)) {
> >>> +        virtio_net_device_unrealize(dev);
> >>> +        error_setg(errp, "Can't load eBPF RSS for vhost");
> >>>      }
> >>
> >> As I already mentioned, I think it is an extremely bad idea to fail
> >> to run qemu due to such a reason as the absence of one feature.
> >> What I suggest is:
> >> 1. Redefine rss as tri-state (off|auto|on)
> >> 2. Fail to run only if rss is on and not available via ebpf
> >> 3. On auto - silently drop it
> >
> > "Auto" might be problematic for migration compatibility, which is
> > hard to be used by management layers like libvirt. The reason is
> > that there's no way for libvirt to know if it is supported by the
> > device or not.
>
> Certainly auto is not good for migration, but it is useful in other
> situations. You can still set "on" or "off" if you care about
> migration. I'll add "auto" support in the next version.

It would be very nice if you took this patch into a separate series;
all the others would pass without questions, I think.

Thanks,
Yuri Benditovich

> >> 4. The sam