Re: xenstored file descriptor leak
On Wed, Feb 03, 2021 at 07:18:51AM +0100, Jürgen Groß wrote: > On 02.02.21 19:37, Manuel Bouyer wrote: > > Hello, > > on NetBSD I'm tracking down an issue where xenstored never closes its > > file descriptor to /var/run/xenstored/socket and instead loops at 100% > > CPU on these descriptors. > > > > xenstored loops because poll(2) returns a POLLIN event for these > > descriptors but they are marked is_ignored = true. > > > > I'm seeing this with xen 4.15, 4.13 and has also been reported with 4.11 > > with latest security patches. > > It seems to have started with patches for XSA-115 (A user reported this > > for 4.11) > > > > I've tracked it down to a difference in poll(2) implementation; it seems > > that linux will return something that is not (POLLIN|POLLOUT) when a > > socket if closed; while NetBSD returns POLLIN (this matches the NetBSD's > > man page). > > Yeah, Linux seems to return POLLHUP additionally. Overall, it seems that the poll(2) behavior with non-regular files is highly OS-dependant when it comes to border cases (like a remote close of a socket) > > > > > First I think there may be a security issue here, as, even on linux it > > should > > be possible for a client to cause a socket to go to the is_ignored state, > > while still sending data and cause xenstored to go to a 100% CPU loop. > > No security issue, as sockets are restricted to dom0 and user root. > > In case you are suspecting a security issue, please don't send such > mails to xen-devel in future, but to secur...@lists.xenproject.org. Yes, sorry. Will do next time. > > > I think this is needed anyway: > > > > --- xenstored_core.c.orig 2021-02-02 18:06:33.389316841 +0100 > > +++ xenstored_core.c2021-02-02 19:27:43.761877371 +0100 > > @@ -397,9 +397,12 @@ > > !list_empty(&conn->out_list))) > > *ptimeout = 0; > > } else { > > - short events = POLLIN|POLLPRI; > > - if (!list_empty(&conn->out_list)) > > - events |= POLLOUT; > > + short events = 0; > > + if (!conn->is_ignored) { > > + events |= POLLIN|POLLPRI; > > + if (!list_empty(&conn->out_list)) > > + events |= POLLOUT; > > + } > > conn->pollfd_idx = set_fd(conn->fd, events); > > } > > } > > Yes, I think this is a good idea. Well, after some sleep I don't think it is. We should always keep at last POLLIN or we will never notice a socket close otherwise. > > > > > Now I wonder if, on NetBSD at last, a read error or short read shouldn't > > cause the socket to be closed, as with: > > > > @@ -1561,6 +1565,8 @@ > > bad_client: > > ignore_connection(conn); > > + /* we don't want to keep this connection alive */ > > + talloc_free(conn); > > } > > This is wrong for non-socket connections, as we want to keep the domain > in question to be known to xenstored. > > For socket connections this should be okay, though. What are "non-socket connections" BTW ? I don't think I've seen one in my test. Is there a way to know if a connection is socket or non-socket ? -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
RE: [PATCH] x86/HVM: support emulated UMIP
> From: Jan Beulich > Sent: Friday, January 29, 2021 7:45 PM > > There are three noteworthy drawbacks: > 1) The intercepts we need to enable here are CPL-independent, i.e. we >now have to emulate certain instructions for ring 0. > 2) On VMX there's no intercept for SMSW, so the emulation isn't really >complete there. > 3) The CR4 write intercept on SVM is lower priority than all exception >checks, so we need to intercept #GP. > Therefore this emulation doesn't get offered to guests by default. > > Signed-off-by: Jan Beulich Reviewed-by: Kevin Tian > --- > v3: Don't offer emulation by default. Re-base. > v2: Split off the x86 insn emulator part. Re-base. Use hvm_featureset > in hvm_cr4_guest_reserved_bits(). > > --- a/xen/arch/x86/cpuid.c > +++ b/xen/arch/x86/cpuid.c > @@ -453,6 +453,13 @@ static void __init calculate_hvm_max_pol > __set_bit(X86_FEATURE_X2APIC, hvm_featureset); > > /* > + * Xen can often provide UMIP emulation to HVM guests even if the host > + * doesn't have such functionality. > + */ > +if ( hvm_funcs.set_descriptor_access_exiting ) > +__set_bit(X86_FEATURE_UMIP, hvm_featureset); > + > +/* > * On AMD, PV guests are entirely unable to use SYSENTER as Xen runs in > * long mode (and init_amd() has cleared it out of host capabilities), > but > * HVM guests are able if running in protected mode. > @@ -504,6 +511,10 @@ static void __init calculate_hvm_def_pol > for ( i = 0; i < ARRAY_SIZE(hvm_featureset); ++i ) > hvm_featureset[i] &= hvm_featuremask[i]; > > +/* Don't offer UMIP emulation by default. */ > +if ( !cpu_has_umip ) > +__clear_bit(X86_FEATURE_UMIP, hvm_featureset); > + > guest_common_feature_adjustments(hvm_featureset); > guest_common_default_feature_adjustments(hvm_featureset); > > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -991,7 +991,8 @@ unsigned long hvm_cr4_guest_valid_bits(c > X86_CR4_PCE| > (p->basic.fxsr? X86_CR4_OSFXSR: 0) | > (p->basic.sse ? X86_CR4_OSXMMEXCPT: 0) | > -(p->feat.umip ? X86_CR4_UMIP : 0) | > +((p == &host_cpuid_policy ? &hvm_max_cpuid_policy : p)->feat.umip > + ? X86_CR4_UMIP : 0) | > (vmxe ? X86_CR4_VMXE : 0) | > (p->feat.fsgsbase ? X86_CR4_FSGSBASE : 0) | > (p->basic.pcid? X86_CR4_PCIDE : 0) | > @@ -3731,6 +3732,13 @@ int hvm_descriptor_access_intercept(uint > struct vcpu *curr = current; > struct domain *currd = curr->domain; > > +if ( (is_write || curr->arch.hvm.guest_cr[4] & X86_CR4_UMIP) && > + hvm_get_cpl(curr) ) > +{ > +hvm_inject_hw_exception(TRAP_gp_fault, 0); > +return X86EMUL_OKAY; > +} > + > if ( currd->arch.monitor.descriptor_access_enabled ) > { > ASSERT(curr->arch.vm_event); > --- a/xen/arch/x86/hvm/svm/svm.c > +++ b/xen/arch/x86/hvm/svm/svm.c > @@ -547,6 +547,28 @@ void svm_update_guest_cr(struct vcpu *v, > value &= ~(X86_CR4_SMEP | X86_CR4_SMAP); > } > > +if ( v->domain->arch.cpuid->feat.umip && !cpu_has_umip ) > +{ > +u32 general1_intercepts = vmcb_get_general1_intercepts(vmcb); > + > +if ( v->arch.hvm.guest_cr[4] & X86_CR4_UMIP ) > +{ > +value &= ~X86_CR4_UMIP; > +ASSERT(vmcb_get_cr_intercepts(vmcb) & > CR_INTERCEPT_CR0_READ); > +general1_intercepts |= GENERAL1_INTERCEPT_IDTR_READ | > + GENERAL1_INTERCEPT_GDTR_READ | > + GENERAL1_INTERCEPT_LDTR_READ | > + GENERAL1_INTERCEPT_TR_READ; > +} > +else if ( !v->domain->arch.monitor.descriptor_access_enabled ) > +general1_intercepts &= ~(GENERAL1_INTERCEPT_IDTR_READ | > + GENERAL1_INTERCEPT_GDTR_READ | > + GENERAL1_INTERCEPT_LDTR_READ | > + GENERAL1_INTERCEPT_TR_READ); > + > +vmcb_set_general1_intercepts(vmcb, general1_intercepts); > +} > + > vmcb_set_cr4(vmcb, value); > break; > default: > @@ -883,7 +905,14 @@ static void svm_set_descriptor_access_ex > if ( enable ) > general1_intercepts |= mask; > else > +{ > general1_intercepts &= ~mask; > +if ( (v->arch.hvm.guest_cr[4] & X86_CR4_UMIP) && !cpu_has_umip ) > +general1_intercepts |= GENERAL1_INTERCEPT_IDTR_READ | > + GENERAL1_INTERCEPT_GDTR_READ | > + GENERAL1_INTERCEPT_LDTR_READ | > +
Re: Question about xen and Rasp 4B
On 3.2.2021 2.18, Stefano Stabellini wrote: On Tue, 2 Feb 2021, Jukka Kaartinen wrote: Hi Roman, Good catch. GPU works now and I can start X! Thanks! I was also able to create domU that runs Raspian OS. This is very interesting that you were able to achieve that - congrats! Now, sorry to be a bit dense -- but since this thread went into all sorts of interesting directions all at once -- I just have a very particular question: what is exact combination of versions of Xen, Linux kernel and a set of patches that went on top that allowed you to do that? I'd love to obviously see it productized in Xen upstream, but for now -- I'd love to make available to Project EVE/Xen community since there seems to be a few folks interested in EVE/Xen combo being able to do that. I have tried Xen Release 4.14.0, 4.14.1 and master (from week 4, 2021). Kernel rpi-5.9.y and rpi-5.10.y branches from https://github.com/raspberrypi/linux and U-boot (master). For the GPU to work it was enough to disable swiotlb from the kernel(s) as suggested in this thread. How are you configuring and installing the kernel? make bcm2711_defconfig make Image.gz make modules_install ? The device tree is the one from the rpi-5.9.y build? How are you loading the kernel and device tree with uboot? Do you have any interesting changes to config.txt? I am asking because I cannot get to the point of reproducing what you are seeing: I can boot my rpi-5.9.y kernel on recent Xen but I cannot get any graphics output on my screen. (The serial works.) I am using the default Ubuntu Desktop rpi-install target as rootfs and uboot master. This is what I do: make bcm2711_defconfig cat "xen_additions" >> .config make Image modules dtbs make INSTALL_MOD_PATH=rootfs modules_install depmod -a cp arch/arm64/boot/dts/broadcom/bcm2711-rpi-4-b.dtb boot/ cp arch/arm64/boot/dts/overlays/*.dtbo boot/overlays/ config.txt: [pi4] max_framebuffers=2 enable_uart=1 arm_freq=1500 force_turbo=1 [all] arm_64bit=1 kernel=u-boot.bin start_file=start4.elf fixup_file=fixup4.dat # Enable the audio output, I2C and SPI interfaces on the GPIO header dtparam=audio=on dtparam=i2c_arm=on dtparam=spi=on # Enable the FKMS ("Fake" KMS) graphics overlay, enable the camera firmware # and allocate 128Mb to the GPU memory dtoverlay=vc4-fkms-v3d,cma-64 gpu_mem=128 # Comment out the following line if the edges of the desktop appear outside # the edges of your display disable_overscan=1 boot.source: setenv serverip 10.42.0.1 setenv ipaddr 10.42.0.231 tftpb 0xC0 boot2.scr source 0xC0 boot2.source: tftpb 0xE0 xen tftpb 0x100 Image setenv lin_size $filesize fdt addr ${fdt_addr} fdt resize 1024 fdt set /chosen xen,xen-bootargs "console=dtuart dtuart=serial0 sync_console dom0_mem=1024M dom0_max_vcpus=1 bootscrub=0 vwfi=native sched=credit2" fdt mknod /chosen dom0 # These will break the default framebuffer@3e2fe000 that # is the same chosen -node. #fdt set /chosen/dom0 \#address-cells <0x2> #fdt set /chosen/dom0 \#size-cells <0x2> fdt set /chosen/dom0 compatible "xen,linux-zimage" "xen,multiboot-module" fdt set /chosen/dom0 reg <0x100 0x${lin_size}> fdt set /chosen xen,dom0-bootargs "dwc_otg.lpm_enable=0 console=hvc0 earlycon=xen earlyprintk=xen root=/dev/sda4 elevator=deadline rootwait fixrtc quiet splash" setenv fdt_high 0x fdt print /chosen #xen booti 0xE0 - ${fdt_addr}
Re: Null scheduler and vwfi native problem
Hi again, On Tue, 2021-02-02 at 15:23 +, Julien Grall wrote: > (Adding Andrew, Jan, Juergen for visibility) > Thanks! :-) > On 02/02/2021 15:03, Dario Faggioli wrote: > > On Tue, 2021-02-02 at 07:59 +, Julien Grall wrote: > > > The placement in enter_hypervisor_from_guest() doesn't matter too > > > much, > > > although I would consider to call it as a late as possible. > > > > > Mmmm... Can I ask why? In fact, I would have said "as soon as > > possible". > > Because those functions only access data for the current vCPU/domain. > This is already protected by the fact that the domain is running. > Mmm.. ok, yes, I think it makes sense. > By leaving the "quiesce" mode later, you give an opportunity to the > RCU > to release memory earlier. > Yeah. What I wanted to be sure is that we put the CPU "back in the race" :-) before any current or future use of RCUs. > In reality, it is probably still too early as a pCPU can be > considered > quiesced until a call to rcu_lock*() (such rcu_lock_domain()). > Well, yes, in theory, we could track down which is the first RCU read side crit. section on this path, and put the call right before that (if I understood what you mean). To me, however, this looks indeed too complex and difficult to maintain, not only for 4.15 but in general. E.g., suppose we find such a use of RCUs in function foo() called by bar() called by hypervisor_enter_from_guest(). If someone at some points wants to use RCUs in bar(), how does she know that she should also move the call to rcu_quiet_enter() from foo() to there? So, yes, I'll move it a little down, but still within hypervisor_enter_from_guest(). In the meanwhile, I had a quick chat with Jourgen about x86. In fact, I had a look and was not finding a place where to put the rcu_quiet_{exit,enter}() calls as convenient as you have here on ARM. I.e., two nice C functions that we traverse for all kind of guests, for HVM and SVM, etc. Actually, I was quite skeptical about it but, you know, one can hope! Juergen confirmed that there's no such things, so I'll look at the various entry.S files for the proper spots. Regards -- Dario Faggioli, Ph.D http://about.me/dario.faggioli Virtualization Software Engineer SUSE Labs, SUSE https://www.suse.com/ --- <> (Raistlin Majere) signature.asc Description: This is a digitally signed message part
[libvirt test] 158973: regressions - FAIL
flight 158973 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/158973/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-amd64-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777 build-arm64-libvirt 6 libvirt-buildfail REGR. vs. 151777 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a version targeted for testing: libvirt d115019b6a52bfd18cd5ee87cfe8d0811a06d725 baseline version: libvirt 2c846fa6bcc11929c9fb857a22430fb9945654ad Last test of basis 151777 2020-07-10 04:19:19 Z 208 days Failing since151818 2020-07-11 04:18:52 Z 207 days 202 attempts Testing same since 158973 2021-02-03 04:18:52 Z0 days1 attempts People who touched revisions under test: Adolfo Jayme Barrientos Aleksandr Alekseev Andika Triwidada Andrea Bolognani Balázs Meskó Barrett Schonefeld Bastien Orivel Bihong Yu Binfeng Wu Boris Fiuczynski Brian Turek Christian Ehrhardt Christian Schoenebeck Cole Robinson Collin Walling Cornelia Huck Cédric Bosdonnat Côme Borsoi Daniel Henrique Barboza Daniel Letai Daniel P. Berrange Daniel P. Berrangé Dmytro Linkin Eiichi Tsukata Erik Skultety Fabian Affolter Fabian Freyer Fangge Jin Farhan Ali Fedora Weblate Translation Guoyi Tu Göran Uddeborg Halil Pasic Han Han Hao Wang Helmut Grohne Ian Wienand Jamie Strandboge Jamie Strandboge Jan Kuparinen Jean-Baptiste Holcroft Jianan Gao Jim Fehlig Jin Yan Jiri Denemark John Ferlan Jonathan Watt Jonathon Jongsma Julio Faracco Ján Tomko Kashyap Chamarthy Kevin Locke Laine Stump Laszlo Ersek Liao Pingfang Lin Ma Lin Ma Lin Ma Marc Hartmayer Marc-André Lureau Marek Marczykowski-Górecki Markus Schade Martin Kletzander Masayoshi Mizuma Matt Coleman Matt Coleman Mauro Matteo Cascella Meina Li Michal Privoznik Michał Smyk Milo Casagrande Moshe Levi Muha Aliss Neal Gompa Nick Shyrokovskiy Nickys Music Group Nico Pache Nikolay Shirokovskiy Olaf Hering Olesya Gerasimenko Orion Poplawski Patrick Magauran Paulo de Rezende Pinatti Pavel Hrdina Peter Krempa Pino Toscano Pino Toscano Piotr Drąg Prathamesh Chavan Ricky Tigg Roman Bogorodskiy Roman Bolshakov Ryan Gahagan Ryan Schmidt Sam Hartman Scott Shambarger Sebastian Mitterle Shalini Chellathurai Saroja Shaojun Yang Shi Lei Simon Gaiser Stefan Bader Stefan Berger Szymon Scholz Thomas Huth Tim Wiederhake Tomáš Golembiovský Tomáš Janoušek Tuguoyi Wang Xin Weblate Yang Hang Yanqiu Zhang Yi Li Yi Wang Yuri Chornoivan Zheng Chuan zhenwei pi Zhenyu Zheng jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt fail build-arm64-libvirt fail build-armhf-libvirt fail build-i386-libvirt fail build-amd64-pvops
[ovmf test] 158959: all pass - PUSHED
flight 158959 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/158959/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 3f90ac3ec03512e2374cd2968c047a7e856a8965 baseline version: ovmf 3b468095cd3dfcd1aa4ae63bc623f534bc2390d2 Last test of basis 158932 2021-02-02 03:10:23 Z1 days Testing same since 158959 2021-02-02 16:40:59 Z0 days1 attempts People who touched revisions under test: Aiden Park jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 3b468095cd..3f90ac3ec0 3f90ac3ec03512e2374cd2968c047a7e856a8965 -> xen-tested-master
Re: xenstored file descriptor leak
On 02.02.21 19:37, Manuel Bouyer wrote: Hello, on NetBSD I'm tracking down an issue where xenstored never closes its file descriptor to /var/run/xenstored/socket and instead loops at 100% CPU on these descriptors. xenstored loops because poll(2) returns a POLLIN event for these descriptors but they are marked is_ignored = true. I'm seeing this with xen 4.15, 4.13 and has also been reported with 4.11 with latest security patches. It seems to have started with patches for XSA-115 (A user reported this for 4.11) I've tracked it down to a difference in poll(2) implementation; it seems that linux will return something that is not (POLLIN|POLLOUT) when a socket if closed; while NetBSD returns POLLIN (this matches the NetBSD's man page). Yeah, Linux seems to return POLLHUP additionally. First I think there may be a security issue here, as, even on linux it should be possible for a client to cause a socket to go to the is_ignored state, while still sending data and cause xenstored to go to a 100% CPU loop. No security issue, as sockets are restricted to dom0 and user root. In case you are suspecting a security issue, please don't send such mails to xen-devel in future, but to secur...@lists.xenproject.org. I think this is needed anyway: --- xenstored_core.c.orig 2021-02-02 18:06:33.389316841 +0100 +++ xenstored_core.c2021-02-02 19:27:43.761877371 +0100 @@ -397,9 +397,12 @@ !list_empty(&conn->out_list))) *ptimeout = 0; } else { - short events = POLLIN|POLLPRI; - if (!list_empty(&conn->out_list)) - events |= POLLOUT; + short events = 0; + if (!conn->is_ignored) { + events |= POLLIN|POLLPRI; + if (!list_empty(&conn->out_list)) + events |= POLLOUT; + } conn->pollfd_idx = set_fd(conn->fd, events); } } Yes, I think this is a good idea. Now I wonder if, on NetBSD at last, a read error or short read shouldn't cause the socket to be closed, as with: @@ -1561,6 +1565,8 @@ bad_client: ignore_connection(conn); + /* we don't want to keep this connection alive */ + talloc_free(conn); } This is wrong for non-socket connections, as we want to keep the domain in question to be known to xenstored. For socket connections this should be okay, though. Juergen OpenPGP_0xB0DE9DD628BF132F.asc Description: application/pgp-keys OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v1] mm/memory_hotplug: MEMHP_MERGE_RESOURCE -> MHP_MERGE_RESOURCE
> Let's make "MEMHP_MERGE_RESOURCE" consistent with "MHP_NONE", "mhp_t" and > "mhp_flags". As discussed recently [1], "mhp" is our internal > acronym for memory hotplug now. > > [1] > https://lore.kernel.org/linux-mm/c37de2d0-28a1-4f7d-f944-cfd7d81c3...@redhat.com/ > > Cc: Andrew Morton > Cc: "K. Y. Srinivasan" > Cc: Haiyang Zhang > Cc: Stephen Hemminger > Cc: Wei Liu > Cc: "Michael S. Tsirkin" > Cc: Jason Wang > Cc: Boris Ostrovsky > Cc: Juergen Gross > Cc: Stefano Stabellini > Cc: Pankaj Gupta > Cc: Michal Hocko > Cc: Oscar Salvador > Cc: Anshuman Khandual > Cc: Wei Yang > Cc: linux-hyp...@vger.kernel.org > Cc: virtualizat...@lists.linux-foundation.org > Cc: xen-devel@lists.xenproject.org > Signed-off-by: David Hildenbrand > --- > drivers/hv/hv_balloon.c| 2 +- > drivers/virtio/virtio_mem.c| 2 +- > drivers/xen/balloon.c | 2 +- > include/linux/memory_hotplug.h | 2 +- > mm/memory_hotplug.c| 2 +- > 5 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c > index 8c471823a5af..2f776d78e3c1 100644 > --- a/drivers/hv/hv_balloon.c > +++ b/drivers/hv/hv_balloon.c > @@ -726,7 +726,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned > long size, > > nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); > ret = add_memory(nid, PFN_PHYS((start_pfn)), > - (HA_CHUNK << PAGE_SHIFT), > MEMHP_MERGE_RESOURCE); > + (HA_CHUNK << PAGE_SHIFT), MHP_MERGE_RESOURCE); > > if (ret) { > pr_err("hot_add memory failed error is %d\n", ret); > diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c > index 85a272c9978e..148bea39b09a 100644 > --- a/drivers/virtio/virtio_mem.c > +++ b/drivers/virtio/virtio_mem.c > @@ -623,7 +623,7 @@ static int virtio_mem_add_memory(struct virtio_mem *vm, > uint64_t addr, > /* Memory might get onlined immediately. */ > atomic64_add(size, &vm->offline_size); > rc = add_memory_driver_managed(vm->nid, addr, size, vm->resource_name, > - MEMHP_MERGE_RESOURCE); > + MHP_MERGE_RESOURCE); > if (rc) { > atomic64_sub(size, &vm->offline_size); > dev_warn(&vm->vdev->dev, "adding memory failed: %d\n", rc); > diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c > index b57b2067ecbf..671c71245a7b 100644 > --- a/drivers/xen/balloon.c > +++ b/drivers/xen/balloon.c > @@ -331,7 +331,7 @@ static enum bp_state reserve_additional_memory(void) > mutex_unlock(&balloon_mutex); > /* add_memory_resource() requires the device_hotplug lock */ > lock_device_hotplug(); > - rc = add_memory_resource(nid, resource, MEMHP_MERGE_RESOURCE); > + rc = add_memory_resource(nid, resource, MHP_MERGE_RESOURCE); > unlock_device_hotplug(); > mutex_lock(&balloon_mutex); > > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h > index 3d99de0db2dd..4b834f5d032e 100644 > --- a/include/linux/memory_hotplug.h > +++ b/include/linux/memory_hotplug.h > @@ -53,7 +53,7 @@ typedef int __bitwise mhp_t; > * with this flag set, the resource pointer must no longer be used as it > * might be stale, or the resource might have changed. > */ > -#define MEMHP_MERGE_RESOURCE ((__force mhp_t)BIT(0)) > +#define MHP_MERGE_RESOURCE ((__force mhp_t)BIT(0)) > > /* > * Extended parameters for memory hotplug: > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 710e469fb3a1..ae497e3ff77c 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1153,7 +1153,7 @@ int __ref add_memory_resource(int nid, struct resource > *res, mhp_t mhp_flags) > * In case we're allowed to merge the resource, flag it and trigger > * merging now that adding succeeded. > */ > - if (mhp_flags & MEMHP_MERGE_RESOURCE) > + if (mhp_flags & MHP_MERGE_RESOURCE) > merge_system_ram_resource(res); > > /* online pages if requested */ Reviewed-by: Pankaj Gupta
Re: [PATCH] tools/libxl: only set viridian flags on new domains
On 03/02/2021 04:01, Igor Druzhinin wrote: > Domains migrating or restoring should have viridian HVM param key in > the migration stream already and setting that twice results in Xen > returing -EEXIST on the second attempt later (during migration stream parsing) > in case the values don't match. That causes migration/restore operation > to fail at destination side. > > That issue is now resurfaced by the latest commits (983524671 and 7e5cffcd1e) > extending default viridian feature set making the values from the previous > migration streams and those set at domain construction different. > > Signed-off-by: Igor Druzhinin Suggested-by: Andrew Cooper Igor
[PATCH] tools/libxl: only set viridian flags on new domains
Domains migrating or restoring should have viridian HVM param key in the migration stream already and setting that twice results in Xen returing -EEXIST on the second attempt later (during migration stream parsing) in case the values don't match. That causes migration/restore operation to fail at destination side. That issue is now resurfaced by the latest commits (983524671 and 7e5cffcd1e) extending default viridian feature set making the values from the previous migration streams and those set at domain construction different. Signed-off-by: Igor Druzhinin --- tools/libs/light/libxl_arch.h | 6 -- tools/libs/light/libxl_arm.c | 4 +++- tools/libs/light/libxl_dom.c | 2 +- tools/libs/light/libxl_x86.c | 11 --- 4 files changed, 16 insertions(+), 7 deletions(-) diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h index 6a91775..c305d70 100644 --- a/tools/libs/light/libxl_arch.h +++ b/tools/libs/light/libxl_arch.h @@ -30,8 +30,10 @@ int libxl__arch_domain_save_config(libxl__gc *gc, /* arch specific internal domain creation function */ _hidden -int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config, - uint32_t domid); +int libxl__arch_domain_create(libxl__gc *gc, + libxl_domain_config *d_config, + libxl__domain_build_state *state, + uint32_t domid); /* setup arch specific hardware description, i.e. DTB on ARM */ _hidden diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c index 66e8a06..8c4eda3 100644 --- a/tools/libs/light/libxl_arm.c +++ b/tools/libs/light/libxl_arm.c @@ -126,7 +126,9 @@ int libxl__arch_domain_save_config(libxl__gc *gc, return 0; } -int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config, +int libxl__arch_domain_create(libxl__gc *gc, + libxl_domain_config *d_config, + ibxl__domain_build_state *state, uint32_t domid) { return 0; diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c index 1916857..842a51c 100644 --- a/tools/libs/light/libxl_dom.c +++ b/tools/libs/light/libxl_dom.c @@ -378,7 +378,7 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid, state->store_port = xc_evtchn_alloc_unbound(ctx->xch, domid, state->store_domid); state->console_port = xc_evtchn_alloc_unbound(ctx->xch, domid, state->console_domid); -rc = libxl__arch_domain_create(gc, d_config, domid); +rc = libxl__arch_domain_create(gc, d_config, state, domid); /* Construct a CPUID policy, but only for brand new domains. Domains * being migrated-in/restored have CPUID handled during the diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c index 91a9fc7..58187ed 100644 --- a/tools/libs/light/libxl_x86.c +++ b/tools/libs/light/libxl_x86.c @@ -453,8 +453,10 @@ static int hvm_set_conf_params(libxl__gc *gc, uint32_t domid, return ret; } -int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config, -uint32_t domid) +int libxl__arch_domain_create(libxl__gc *gc, + libxl_domain_config *d_config, + libxl__domain_build_state *state, + uint32_t domid) { const libxl_domain_build_info *info = &d_config->b_info; int ret = 0; @@ -466,7 +468,10 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config, (ret = hvm_set_conf_params(gc, domid, info)) != 0) goto out; -if (info->type == LIBXL_DOMAIN_TYPE_HVM && +/* Viridian flags are already a part of the migration stream so set + * them here only for brand new domains. */ +if (!state->restore && +info->type == LIBXL_DOMAIN_TYPE_HVM && (ret = hvm_set_viridian_features(gc, domid, info)) != 0) goto out; -- 2.7.4
Re: Question about xen and Rasp 4B
On Tue, Feb 02, 2021 at 04:18:44PM -0800, Stefano Stabellini wrote: > How are you configuring and installing the kernel? > > make bcm2711_defconfig > make Image.gz > make modules_install > > ? > > The device tree is the one from the rpi-5.9.y build? How are you loading > the kernel and device tree with uboot? Do you have any interesting > changes to config.txt? > > I am asking because I cannot get to the point of reproducing what you > are seeing: I can boot my rpi-5.9.y kernel on recent Xen but I cannot > get any graphics output on my screen. (The serial works.) I am using the > default Ubuntu Desktop rpi-install target as rootfs and uboot master. I've been trying with various pieces from various sources trying to get things to work. Since my goal has been a Debian-variant I use Debian-packaged versions of things if at all possible. Sticking to packaged versions is more maintainable over the longer run. My starting point was SuSE Raspberry PI 4B installation medium. I'm still using pieces from SuSE's installation. Notably SuSE's overlays have worked rather better than RPF or kernel versions of device-tree overlays. Debian's u-boot-rpi:arm64 package is functional. As such that provides u-boot.bin which is loaded via config.txt as the kernel. Debian's grub-efi-arm64 package is also functional. Installing that is a bit funky as U-Boot's EFI environment is incomplete. Nonetheless it is simply an issue of having that installed in EFI/BOOT as the primary boot entry, rather than EFI/Debian where it would normally install. The base device-tree files from the RPF kernel function reasonably well (unlike the overlays). I'm actually doing `make O= bcm2711_defconfig menuconfig bindeb-pkg` and then installing the resultant package. This places bcm2711-rpi-4-b.dtb in /usr/lib/linux-image-/broadcom/bcm2711-rpi-4-b.dtb, I'm presently copying that into the Raspberry PI boot area. If you're unable to get graphics output, note the instruction that HDMI MUST be plugged in *during* *boot*. Broadcom's chips have the graphics core is control of rather more than one might expect (Qualcomm follows this pattern by wanting their modems in control). In fact I've observed I need my monitor displaying the input from the RP4 in order for it to complete the handshake and have the RP4 do graphics. -- (\___(\___(\__ --=> 8-) EHM <=-- __/)___/)___/) \BS (| ehem+sig...@m5p.com PGP 87145445 |) / \_CS\ | _ -O #include O- _ | / _/ 8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
[linux-linus test] 158949: regressions - FAIL
flight 158949 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/158949/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-xl-seattle broken in 158915 test-arm64-arm64-xl broken in 158915 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-xsm7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332 test-amd64-i386-examine 6 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-coresched-i386-xl 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-installfail REGR. vs. 152332 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-installfail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-freebsd10-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-raw7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-pvshim 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-shadow 7 xen-install fail REGR. vs. 152332 test-amd64-i386-freebsd10-i386 7 xen-installfail REGR. vs. 152332 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-win7-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332 test-arm64-arm64-libvirt-xsm 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-xl-seattle 12 debian-install fail REGR. vs. 152332 test-arm64-arm64-examine 13 examine-iommufail REGR. vs. 152332 test-amd64-amd64-amd64-pvgrub 20 guest-stop fail REGR. vs. 152332 test-arm64-arm64-xl-thunderx 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl-multivcpu 14 guest-start fail REGR. vs. 152332 test-amd64-amd64-i386-pvgrub 20 guest-stop fail REGR. vs. 152332 test-armhf-armhf-xl-credit1 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl-arndale 14 guest-start fail REGR. vs. 152332 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-install fail REGR. vs. 152332 test-armhf-armhf-libvirt 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl-cubietruck 14 guest-startfail REGR. vs. 152332 test-armhf-armhf-xl-credit2 14 guest-start fail REGR. vs. 152332 test-arm64-arm64-xl 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-xl-xsm 8 xen-boot fail REGR. vs. 152332 test-armhf-armhf-xl-vhd 13 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-libvirt-raw 13 guest-start fail REGR. vs. 152332 test-arm64-arm64-xl-credit1 10 host-ping-check-xen fail in 158915 REGR. vs. 152332 Tests which are failing intermittently (not blocking): test-arm64-arm64-xl-credit2 10 host-ping-check-xen fail in 158915 pass in 158949 test-arm64-arm64-xl-credit1 8 xen-boot fail pass in 158915 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 14 guest-start fail REGR. vs. 152332 Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-seattle 5 host-install(5) broken in 158915 blocked in 152332 test-arm64-arm64-xl 5 host-install(5) broken in 158915 blocked in 152332 test-arm64-arm64-xl-credit2 11 leak-check/ba
[qemu-mainline test] 158940: regressions - FAIL
flight 158940 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/158940/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-xl-credit1 broken in 158901 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631 test-amd64-amd64-xl-qcow2 21 guest-start/debian.repeat fail REGR. vs. 152631 test-armhf-armhf-xl-vhd 17 guest-start/debian.repeat fail in 158879 REGR. vs. 152631 Tests which are failing intermittently (not blocking): test-armhf-armhf-libvirt-raw 13 guest-startfail pass in 158879 test-armhf-armhf-xl-vhd 13 guest-startfail pass in 158879 test-amd64-amd64-xl-xsm 17 guest-saverestore fail pass in 158901 Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-credit1 5 host-install(5) broken in 158901 blocked in 152631 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail in 158879 like 152631 test-armhf-armhf-xl-vhd 14 migrate-support-check fail in 158879 never pass test-armhf-armhf-xl-vhd 15 saverestore-support-check fail in 158879 never pass test-armhf-armhf-libvirt-raw 14 migrate-support-check fail in 158879 never pass test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat fail in 158901 like 152631 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152631 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 152631 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 152631 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152631 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 152631 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631 test-amd64-i386-xl-pvshim14 guest-start fail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass version targeted for testing: qemuu74208cd252c5da9d867270a178799abd802b9338 baseline version: qemuu1d806cef0e38b5db8347a8e12f214d543204a31
Re: [PATCH] xen/arm: domain_build: Ignore device nodes with invalid addresses
On Tue, 2 Feb 2021, Julien Grall wrote: > On 02/02/2021 18:12, Julien Grall wrote: > > On 02/02/2021 17:47, Elliott Mitchell wrote: > > > The handle_device() function has been returning failure upon > > > encountering a device address which was invalid. A device tree which > > > had such an entry has now been seen in the wild. As it causes no > > > failures to simply ignore the entries, ignore them. > > > > Signed-off-by: Elliott Mitchell > > > > > > --- > > > I'm starting to suspect there are an awful lot of places in the various > > > domain_build.c files which should simply ignore errors. This is now the > > > second place I've encountered in 2 months where ignoring errors was the > > > correct action. > > > > Right, as a counterpoint, we run Xen on Arm HW for several years now and > > this is the first time I heard about issue parsing the DT. So while I > > appreciate that you are eager to run Xen on the RPI... > > > > > I know failing in case of error is an engineer's > > > favorite approach, but there seem an awful lot of harmless failures > > > causing panics. > > > > > > This started as the thread "[RFC PATCH] xen/arm: domain_build: Ignore > > > empty memory bank". Now it seems clear the correct approach is to simply > > > ignore these entries. > > > > ... we first need to fully understand the issues. Here a few questions: > > 1) Can you provide more information why you believe the address is > > invalid? > > 2) How does Linux use the node? > > 3) Is it happening with all the RPI DT? If not, what are the > > differences? > > So I had another look at the device-tree you provided earlier on. The node is > the following (copied directly from the DTS): > > &pcie0 { > pci@1,0 { > #address-cells = <3>; > #size-cells = <2>; > ranges; > > reg = <0 0 0 0 0>; > > usb@1,0 { > reg = <0x1 0 0 0 0>; > resets = <&reset RASPBERRYPI_FIRMWARE_RESET_ID_USB>; > }; > }; > }; > > pcie0: pcie@7d50 { >compatible = "brcm,bcm2711-pcie"; >reg = <0x0 0x7d50 0x0 0x9310>; >device_type = "pci"; >#address-cells = <3>; >#interrupt-cells = <1>; >#size-cells = <2>; >interrupts = , > ; >interrupt-names = "pcie", "msi"; >interrupt-map-mask = <0x0 0x0 0x0 0x7>; >interrupt-map = <0 0 0 1 &gicv2 GIC_SPI 143 > IRQ_TYPE_LEVEL_HIGH>; >msi-controller; >msi-parent = <&pcie0>; > >ranges = <0x0200 0x0 0xc000 0x6 0x > 0x0 0x4000>; >/* > * The wrapper around the PCIe block has a bug > * preventing it from accessing beyond the first 3GB of > * memory. > */ >dma-ranges = <0x0200 0x0 0x 0x0 0x > 0x0 0xc000>; >brcm,enable-ssc; > }; > > The interpretation of "reg" depends on the context. In this case, we are > trying to interpret as a memory address from the CPU PoV when it has a > different meaning (I am not exactly sure what). > > In fact, you are lucky that Xen doesn't manage to interpret it. Xen should > really stop trying to look region to map when it discover a PCI bus. I wrote a > quick hack patch that should ignore it: Yes, I think you are right. There are a few instances where "reg" is not a address ready to be remapped. It is not just PCI, although that's the most common. Maybe we need a list, like skip_matches in handle_node. > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c > index 374bf655ee34..937fd1e387b7 100644 > --- a/xen/arch/arm/domain_build.c > +++ b/xen/arch/arm/domain_build.c > @@ -1426,7 +1426,7 @@ static int __init handle_device(struct domain *d, struct > dt_device_node *dev, > > static int __init handle_node(struct domain *d, struct kernel_info *kinfo, >struct dt_device_node *node, > - p2m_type_t p2mt) > + p2m_type_t p2mt, bool pci_bus) > { > static const struct dt_device_match skip_matches[] __initconst = > { > @@ -1532,9 +1532,14 @@ static int __init handle_node(struct domain *d, struct > kernel_info *kinfo, > "WARNING: Path %s is reserved, skip the node as we may re-use > the path.\n", > path); > > -res = handle_device(d, node, p2mt); > -if ( res) > -return res; > +if ( !pci_bus ) > +{ > +res = handle_device(d, node, p2mt); > +if ( res) > + return res; > + > +pci_bus = dt_device_type_is_equal(node, "pci"); > +} > > /* > * The property "name" is used to have a different name on older FDT > @@ -1554,7 +1559,7 @@ static int __init handle_node(struct domain *d, struct > kernel_info *kinfo, > > for ( child = node->child; child != NULL; child = child->sibling ) > { >
Re: Question about xen and Rasp 4B
On Tue, 2 Feb 2021, Jukka Kaartinen wrote: > Hi Roman, > > > > > > > > Good catch. > > > GPU works now and I can start X! Thanks! I was also able to create domU > > > that runs Raspian OS. > > > > This is very interesting that you were able to achieve that - congrats! > > > > Now, sorry to be a bit dense -- but since this thread went into all > > sorts of interesting > > directions all at once -- I just have a very particular question: what is > > exact > > combination of versions of Xen, Linux kernel and a set of patches that went > > on top that allowed you to do that? I'd love to obviously see it > > productized in Xen > > upstream, but for now -- I'd love to make available to Project EVE/Xen > > community > > since there seems to be a few folks interested in EVE/Xen combo being able > > to > > do that. > > I have tried Xen Release 4.14.0, 4.14.1 and master (from week 4, 2021). > > Kernel rpi-5.9.y and rpi-5.10.y branches from > https://github.com/raspberrypi/linux > > and > > U-boot (master). > > For the GPU to work it was enough to disable swiotlb from the kernel(s) as > suggested in this thread. How are you configuring and installing the kernel? make bcm2711_defconfig make Image.gz make modules_install ? The device tree is the one from the rpi-5.9.y build? How are you loading the kernel and device tree with uboot? Do you have any interesting changes to config.txt? I am asking because I cannot get to the point of reproducing what you are seeing: I can boot my rpi-5.9.y kernel on recent Xen but I cannot get any graphics output on my screen. (The serial works.) I am using the default Ubuntu Desktop rpi-install target as rootfs and uboot master.
[RFC PATCH v2] docs/design: boot domain device tree design
This is a Request For Comments on the adoption of Device Tree as the format for the Launch Control Module as described in the previously posted DomB RFC. For RFC purposes, a rendered of this file can be found here: ihttps://drive.google.com/file/d/1l3fo4FylvZCQs1V00DcwifyLjl5Jw3W8/view?usp=sharing Details on the DomB boot domain can be found on Xen wiki: https://wiki.xenproject.org/wiki/DomB_mode_of_dom0less Signed-off-by: Daniel P. Smith Signed-off-by: Christopher Clark Version 2 - - cleaned up wording - updated example to reflect a real configuration - add explanation of mb2 modules that would be loaded - add the config node --- docs/designs/boot-domain-device-tree.rst | 235 +++ 1 file changed, 235 insertions(+) create mode 100644 docs/designs/boot-domain-device-tree.rst diff --git a/docs/designs/boot-domain-device-tree.rst b/docs/designs/boot-domain-device-tree.rst new file mode 100644 index 00..558d75a796 --- /dev/null +++ b/docs/designs/boot-domain-device-tree.rst @@ -0,0 +1,235 @@ + +Xen Boot Domain Device Tree Bindings + + +The Xen Boot Domain device tree adopts the dom0less device tree structure and +extends it to meet the requirements for the Boot Domain capability. The primary +difference is the introduction of the ``xen`` node that is under the ``/chosen`` +node. The move to a dedicated node was driven by: + +1. Reduces the need to walk over nodes that are not of interest, e.g. only +nodes of interest should be in ``/chosen/xen`` + +2. Enables the use of the ``#address-cells`` and ``#size-cells`` fields on the +xen node. + +3. Allows for the domain construction information to easily be sanitized by +simple removing the ``/chosen/xen`` node. + +Below is an example device tree definition for a xen node followed by an +explanation of each section and field: +:: +xen { +#address-cells = <1>; +#size-cells = <0>; + +// Configuration container +config@0 { +#address-cells = <1>; +#size-cells = <0>; +compatible = "xen,config"; + +// reg is required but ignored for "xen,config" node +reg = <0>; + +module@1 { +compatible = "multiboot,microcode", "multiboot,module"; +reg = <1>; +}; + +module@2 { +compatible = "multiboot,xsm-policy", "multiboot,module"; +reg = <2>; +}; +}; + +// Boot Domain definition +domain@0x7FF5 { +#address-cells = <1>; +#size-cells = <0>; +compatible = "xen,domain"; + +reg = <0x7FF5>; +memory = <0x0 0x2>; +cpus = <1>; +module@3 { +compatible = "multiboot,kernel", "multiboot,module"; +reg = <3>; +}; + +module@4 { +compatible = "multiboot,ramdisk", "multiboot,module"; +reg = <4>; +}; +module@5 { +compatible = "multiboot,config", "multiboot,module"; +reg = <5>; +}; + +// Classic Dom0 definition +domain@0 { +#address-cells = <1>; +#size-cells = <0>; +compatible = "xen,domain"; + +reg = <0>; + +// PERMISSION_NONE (0) +// PERMISSION_CONTROL (1 << 0) +// PERMISSION_HARDWARE (1 << 1) +permissions = <3>; + +// FUNCTION_NONE(0) +// FUNCTION_BOOT(1 << 1) +// FUNCTION_CRASH (1 << 2) +// FUNCTION_CONSOLE (1 << 3) +// FUNCTION_XENSTORE(1 << 30) +// FUNCTION_LEGACY_DOM0 (1 << 31) +functions = <0x>; + +// MODE_PARAVIRTUALIZED (1 << 0) /* PV | PVH/HVM */ +// MODE_ENABLE_DEVICE_MODEL (1 << 1) /* HVM | PVH */ +// MODE_LONG(1 << 2) /* 64 BIT | 32 BIT */ +mode = <5>; /* 64 BIT, PV */ + +// UUID +domain-handle = [B3 FB 98 FB 8F 9F 67 A3]; + +cpus = <1>; +memory = <0x0 0x2>; +security-id = <0>; + +module@6 { +compatible = "multiboot,kernel", "multiboot,module"; +reg = <6>; +bootargs = "console=hvc0"; +}; +module@7 { +compatible = "multiboot,ramdisk", "multiboot,module"; +reg = <7>; +}; +}; + +The multiboot modules that would be supplied when using the above config would +be, in order: + - (the above config, compiled) + - CPU microcode + - XSM policy + - kernel for boot domain + - ramdisk for boot domain + - boot domain configuration file + - kernel for the classic dom0 domain + -
Re: [PATCH v9 00/11] acquire_resource size and external IPT monitoring
On 02/02/2021 12:20, Ian Jackson wrote: > Andrew Cooper writes ("[PATCH v9 00/11] acquire_resource size and external > IPT monitoring"): > ... >> Therefore, I'd like to request a release exception. > Thanks for this writeup. > > There is discussion here of the upside of granting an exception, which > certainly seems substantial enough to give this serious consideration. > >> It [is] fairly isolated in terms of interactions with the rest of >> Xen, so the changes of a showstopper affecting other features is >> very slim. > This is encouraging (optimistic, even) but very general. I would like > to see a frank and detailed assessment of the downside risks, ideally > based on analysis of the individual patches. > > When I say a "frank and detailed assessment" I'm hoping to have a list > of the specific design and code changes that pose a risk to non-IPT > configurations, in decreasing order of risk. > > For each one there should be brief discussion of the measures that > have exist to control that risk (eg, additional review, additional > testing), and a characterisation of the resulting risk (both in terms > of likelihood and severity of the resulting bug). > > All risks that would come to a diligent reviewer's mind should be > mentioned and explicitly delath with, even if it is immediately clear > that they are not a real risk. > > Do you think that would be feasible ? We would want to make a > decision ASAP so it would have to be done quickly too - in the next > few days and certainly by the end of the week. Honestly, I think this is an unreasonably large paperwork expectation, particularly for changes this-clearly isolated in terms of functionality. I'm going to explicitly disregard build/compile issues because we're not even at code freeze or -rc1 yet, with multiple weeks yet before any potential release, and loads of tooling. Patch 2 adds a new domain creation parameter, which is an internal tools/xen interface. Default is off, and it needs explicit opting in to (patch 3), so it will get all the testing it needs in an OSSTest smoke run. This patch does introduce one new use of a preexisting refcounting pattern currently under discussion. It many leak memory in theoretical circumstances, not practical ones. Work to figure out how to unbreak this pattern is in progress, and orthogonal, and needs applying uniformly Patch 4 adds a new resource type, which is an API/ABI with userspace. It is a new type/index so has no current users. Patch 5 adds enumeration for the IPT feature in hardware, as well as context switching logic. All context switching changes are behind an opt-in flag, so a smoke run will be sufficient to prove no adverse interaction in !vmtrace case. Patch 6 adds a new domctl and subops for controlling vmtrace. All brand new functionality with no users, and bounded by the opt-ins from patch 2 and 5. Patch 7 adds libxc library functions wrapping the domctl of patch 6. No users. Patch 8 is example code demonstrating how to use all of the new functionality. It is built, but not installed. Patch 9 extends the existing VMFork feature to cope with VMs configured with this new functionality. It is a no-op for regular VMs. Patch 10 extends vm_event requests with additional optional metadata about the tail pointer of data in the vmtrace buffer. Doesn't alter the behaviour for regular VMs. Patch 11 extends vm_event responses with an optional request to reset the vmtrace buffer position. No users, and a no-op for regular VMs. All of this new functionality is off-by-default and needs an explicit opt in, for any behavioural changes to occur. While there is no guarantee that the implementation of the new functionality is perfect, the development of it has found and fixed a whole slew bugs elsewhere in Xen, and the new functionality does have extensive testing itself. ~Andrew
Re: [PATCH] xen/arm: domain_build: Ignore device nodes with invalid addresses
On 02/02/2021 18:12, Julien Grall wrote: Hi, On 02/02/2021 17:47, Elliott Mitchell wrote: The handle_device() function has been returning failure upon encountering a device address which was invalid. A device tree which had such an entry has now been seen in the wild. As it causes no failures to simply ignore the entries, ignore them. > Signed-off-by: Elliott Mitchell --- I'm starting to suspect there are an awful lot of places in the various domain_build.c files which should simply ignore errors. This is now the second place I've encountered in 2 months where ignoring errors was the correct action. Right, as a counterpoint, we run Xen on Arm HW for several years now and this is the first time I heard about issue parsing the DT. So while I appreciate that you are eager to run Xen on the RPI... I know failing in case of error is an engineer's favorite approach, but there seem an awful lot of harmless failures causing panics. This started as the thread "[RFC PATCH] xen/arm: domain_build: Ignore empty memory bank". Now it seems clear the correct approach is to simply ignore these entries. ... we first need to fully understand the issues. Here a few questions: 1) Can you provide more information why you believe the address is invalid? 2) How does Linux use the node? 3) Is it happening with all the RPI DT? If not, what are the differences? So I had another look at the device-tree you provided earlier on. The node is the following (copied directly from the DTS): &pcie0 { pci@1,0 { #address-cells = <3>; #size-cells = <2>; ranges; reg = <0 0 0 0 0>; usb@1,0 { reg = <0x1 0 0 0 0>; resets = <&reset RASPBERRYPI_FIRMWARE_RESET_ID_USB>; }; }; }; pcie0: pcie@7d50 { compatible = "brcm,bcm2711-pcie"; reg = <0x0 0x7d50 0x0 0x9310>; device_type = "pci"; #address-cells = <3>; #interrupt-cells = <1>; #size-cells = <2>; interrupts = , ; interrupt-names = "pcie", "msi"; interrupt-map-mask = <0x0 0x0 0x0 0x7>; interrupt-map = <0 0 0 1 &gicv2 GIC_SPI 143 IRQ_TYPE_LEVEL_HIGH>; msi-controller; msi-parent = <&pcie0>; ranges = <0x0200 0x0 0xc000 0x6 0x 0x0 0x4000>; /* * The wrapper around the PCIe block has a bug * preventing it from accessing beyond the first 3GB of * memory. */ dma-ranges = <0x0200 0x0 0x 0x0 0x 0x0 0xc000>; brcm,enable-ssc; }; The interpretation of "reg" depends on the context. In this case, we are trying to interpret as a memory address from the CPU PoV when it has a different meaning (I am not exactly sure what). In fact, you are lucky that Xen doesn't manage to interpret it. Xen should really stop trying to look region to map when it discover a PCI bus. I wrote a quick hack patch that should ignore it: diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 374bf655ee34..937fd1e387b7 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1426,7 +1426,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev, static int __init handle_node(struct domain *d, struct kernel_info *kinfo, struct dt_device_node *node, - p2m_type_t p2mt) + p2m_type_t p2mt, bool pci_bus) { static const struct dt_device_match skip_matches[] __initconst = { @@ -1532,9 +1532,14 @@ static int __init handle_node(struct domain *d, struct kernel_info *kinfo, "WARNING: Path %s is reserved, skip the node as we may re-use the path.\n", path); -res = handle_device(d, node, p2mt); -if ( res) -return res; +if ( !pci_bus ) +{ +res = handle_device(d, node, p2mt); +if ( res) + return res; + +pci_bus = dt_device_type_is_equal(node, "pci"); +} /* * The property "name" is used to have a different name on older FDT @@ -1554,7 +1559,7 @@ static int __init handle_node(struct domain *d, struct kernel_info *kinfo, for ( child = node->child; child != NULL; child = child->sibling ) { -res = handle_node(d, kinfo, child, p2mt); +res = handle_node(d, kinfo, child, p2mt, pci_bus); if ( res ) return res; } @@ -2192,7 +2197,7 @@ static int __init prepare_dtb_hwdom(struct domain *d, struct kernel_info *kinfo) fdt_finish_reservemap(kinfo->fdt); -ret = handle_node(d, kinfo, dt_host, default_p2mt); +ret = handle_node(d, kinfo, dt_host, default_p2mt, false); if ( ret ) goto err; A less hackish possibility would be to modify dt_number_of_address() and return 0 when the device is
[PATCH for-4.15] tools/tests: Introduce a test for acquire_resource
For now, simply try to map 40 frames of grant table. This catches most of the basic errors with resource sizes found and fixed through the 4.15 dev window. Signed-off-by: Andrew Cooper --- CC: Ian Jackson CC: Wei Liu CC: Jan Beulich CC: Roger Pau Monné CC: Wei Liu CC: Stefano Stabellini CC: Julien Grall CC: Volodymyr Babchuk CC: Oleksandr Fails against current staging: XENMEM_acquire_resource tests Test x86 PV d7: grant table Fail: Map 7 - Argument list too long Test x86 PVH d8: grant table Fail: Map 7 - Argument list too long The fix has already been posted: [PATCH v9 01/11] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource and the fixed run is: XENMEM_acquire_resource tests Test x86 PV d7: grant table Test x86 PVH d8: grant table ARM folk: would you mind testing this? I'm pretty sure the create parameters are suitable, but I don't have any way to test this. I've got more plans for this, but insufficient time right now. --- tools/tests/Makefile | 1 + tools/tests/resource/.gitignore | 1 + tools/tests/resource/Makefile| 40 ++ tools/tests/resource/test-resource.c | 138 +++ 4 files changed, 180 insertions(+) create mode 100644 tools/tests/resource/.gitignore create mode 100644 tools/tests/resource/Makefile create mode 100644 tools/tests/resource/test-resource.c diff --git a/tools/tests/Makefile b/tools/tests/Makefile index fc9b715951..c45b5fbc1d 100644 --- a/tools/tests/Makefile +++ b/tools/tests/Makefile @@ -2,6 +2,7 @@ XEN_ROOT = $(CURDIR)/../.. include $(XEN_ROOT)/tools/Rules.mk SUBDIRS-y := +SUBDIRS-y := resource SUBDIRS-$(CONFIG_X86) += cpu-policy SUBDIRS-$(CONFIG_X86) += mce-test ifneq ($(clang),y) diff --git a/tools/tests/resource/.gitignore b/tools/tests/resource/.gitignore new file mode 100644 index 00..4872e97d4b --- /dev/null +++ b/tools/tests/resource/.gitignore @@ -0,0 +1 @@ +test-resource diff --git a/tools/tests/resource/Makefile b/tools/tests/resource/Makefile new file mode 100644 index 00..8a3373e786 --- /dev/null +++ b/tools/tests/resource/Makefile @@ -0,0 +1,40 @@ +XEN_ROOT = $(CURDIR)/../../.. +include $(XEN_ROOT)/tools/Rules.mk + +TARGET := test-resource + +.PHONY: all +all: $(TARGET) + +.PHONY: run +run: $(TARGET) + ./$(TARGET) + +.PHONY: clean +clean: + $(RM) -f -- *.o .*.d .*.d2 $(TARGET) + +.PHONY: distclean +distclean: clean + $(RM) -f -- *~ + +.PHONY: install +install: all + +.PHONY: uninstall +uninstall: + +CFLAGS += -Werror -D__XEN_TOOLS__ +CFLAGS += $(CFLAGS_xeninclude) +CFLAGS += $(CFLAGS_libxenctrl) +CFLAGS += $(CFLAGS_libxenforeginmemory) +CFLAGS += $(APPEND_CFLAGS) + +LDFLAGS += $(LDLIBS_libxenctrl) +LDFLAGS += $(LDLIBS_libxenforeignmemory) +LDFLAGS += $(APPEND_LDFLAGS) + +test-resource: test-resource.o + $(CC) $(LDFLAGS) -o $@ $< + +-include $(DEPS_INCLUDE) diff --git a/tools/tests/resource/test-resource.c b/tools/tests/resource/test-resource.c new file mode 100644 index 00..81a2a5cd12 --- /dev/null +++ b/tools/tests/resource/test-resource.c @@ -0,0 +1,138 @@ +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +static unsigned int nr_failures; +#define fail(fmt, ...) \ +({ \ +nr_failures++; \ +(void)printf(fmt, ##__VA_ARGS__); \ +}) + +static xc_interface *xch; +static xenforeignmemory_handle *fh; +static xendevicemodel_handle *dh; + +static void test_gnttab(uint32_t domid, unsigned int nr_frames) +{ +xenforeignmemory_resource_handle *res; +void *addr = NULL; +size_t size; +int rc; + +printf(" d%u: grant table\n", domid); + +rc = xenforeignmemory_resource_size( +fh, domid, XENMEM_resource_grant_table, +XENMEM_resource_grant_table_id_shared, &size); +if ( rc ) +return fail("Fail: Get size: %d - %s\n", errno, strerror(errno)); + +if ( (size >> XC_PAGE_SHIFT) != nr_frames ) +return fail("Fail: Get size: expected %u frames, got %zu\n", +nr_frames, size >> XC_PAGE_SHIFT); + +res = xenforeignmemory_map_resource( +fh, domid, XENMEM_resource_grant_table, +XENMEM_resource_grant_table_id_shared, 0, size >> XC_PAGE_SHIFT, +&addr, PROT_READ | PROT_WRITE, 0); +if ( !res ) +return fail("Fail: Map %d - %s\n", errno, strerror(errno)); + +rc = xenforeignmemory_unmap_resource(fh, res); +if ( rc ) +return fail("Fail: Unmap %d - %s\n", errno, strerror(errno)); +} + +static void test_domain_configurations(void) +{ +static struct test { +const char *name; +struct xen_domctl_createdomain create; +} tests[] = { +#if defined(__x86_64__) || defined(__i386__) +{ +.name = "x86 PV", +.create = { +
Re: [PATCH v3 0/3] Generic SMMU Bindings
Hi Stefano, On 02/02/2021 18:27, Stefano Stabellini wrote: On Tue, 2 Feb 2021, Julien Grall wrote: On 02/02/2021 17:44, Stefano Stabellini wrote: On Tue, 2 Feb 2021, Rahul Singh wrote: Hello Stefano, On 26 Jan 2021, at 10:58 pm, Stefano Stabellini wrote: Hi all, This series introduces support for the generic SMMU bindings to xen/drivers/passthrough/arm/smmu.c. The last version of the series was https://marc.info/?l=xen-devel&m=159539053406643 I realize that it is late for 4.15 -- I think it is OK if this series goes in afterwards. I tested the series on the Juno board it is woking fine. I found one issue in SMMU driver while testing this series that is not related to this series but already existing issue in SMMU driver. If there are more than one device behind SMMU and they share the same Stream-Id, SMMU driver is creating the new SMR entry without checking the already configured SMR entry if SMR entry correspond to stream-id is already configured. Because of this I observed the stream match conflicts on Juno board. (XEN) smmu: /iommu@7fb3: Unexpected global fault, this could be serious (XEN) smmu: /iommu@7fb3:GFSR 0x0004, GFSYNR0 0x0006, GFSYNR1 0x, GFSYNR2 0x Below two patches is required to be ported to Xen to fix the issue from Linux driver. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=1f3d5ca43019bff1105838712d55be087d93c0da https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=21174240e4f4439bb8ed6c116cdbdc03eba2126e Good catch and thanks for the pointers! Do you have any interest in backporting these two patches or should I put them on my TODO list? Unrelated to who does the job, we should discuss if it makes sense to try to fix the bug for 4.15. The patches don't seem trivial so I am tempted to say that it might be best to leave the bug unfixed for 4.15 and fix it later. SMMU support on Juno is not that interesting because IIRC the stream-ID is the same for all the devices. So it is all or nothing passthrough. For other HW, this may be a useful feature. Yet we would need a way to group the devices for passthrough. In this context, I would consider it more a feature than a bug because the SMMU driver never remotely work on such HW. I see. To be honest I wasn't thinking of Juno (I wasn't aware of its limitations) but of potential genuine situations where stream-ids are the same for 2 devices. I know it can happen with PCI devices for instance, although I am aware we don't have PCI passthrough yet. I don't know if it is possible for it to happen with non-PCI devices but I wouldn't be surprised if it can. I merely pointed out Juno because this is where the discussion started. Although, my conclusion wasn't solely based on this system nor PCI devices. It was based on the fact that this could never have worked with the current SMMU driver. So this is not a regression and more an improvement of the driver to support passthrough for devices using the same stream-ID. At this stage of the release, I would only consider trivial improvement to be merged. Cheers, -- Julien Grall
xenstored file descriptor leak
Hello, on NetBSD I'm tracking down an issue where xenstored never closes its file descriptor to /var/run/xenstored/socket and instead loops at 100% CPU on these descriptors. xenstored loops because poll(2) returns a POLLIN event for these descriptors but they are marked is_ignored = true. I'm seeing this with xen 4.15, 4.13 and has also been reported with 4.11 with latest security patches. It seems to have started with patches for XSA-115 (A user reported this for 4.11) I've tracked it down to a difference in poll(2) implementation; it seems that linux will return something that is not (POLLIN|POLLOUT) when a socket if closed; while NetBSD returns POLLIN (this matches the NetBSD's man page). First I think there may be a security issue here, as, even on linux it should be possible for a client to cause a socket to go to the is_ignored state, while still sending data and cause xenstored to go to a 100% CPU loop. I think this is needed anyway: --- xenstored_core.c.orig 2021-02-02 18:06:33.389316841 +0100 +++ xenstored_core.c2021-02-02 19:27:43.761877371 +0100 @@ -397,9 +397,12 @@ !list_empty(&conn->out_list))) *ptimeout = 0; } else { - short events = POLLIN|POLLPRI; - if (!list_empty(&conn->out_list)) - events |= POLLOUT; + short events = 0; + if (!conn->is_ignored) { + events |= POLLIN|POLLPRI; + if (!list_empty(&conn->out_list)) + events |= POLLOUT; + } conn->pollfd_idx = set_fd(conn->fd, events); } } Now I wonder if, on NetBSD at last, a read error or short read shouldn't cause the socket to be closed, as with: @@ -1561,6 +1565,8 @@ bad_client: ignore_connection(conn); + /* we don't want to keep this connection alive */ + talloc_free(conn); } static void handle_output(struct connection *conn) what do you think ? -- Manuel Bouyer NetBSD: 26 ans d'experience feront toujours la difference --
Re: [PATCH v3 0/3] Generic SMMU Bindings
On Tue, 2 Feb 2021, Julien Grall wrote: > On 02/02/2021 17:44, Stefano Stabellini wrote: > > On Tue, 2 Feb 2021, Rahul Singh wrote: > > > Hello Stefano, > > > > > > > On 26 Jan 2021, at 10:58 pm, Stefano Stabellini > > > > wrote: > > > > > > > > Hi all, > > > > > > > > This series introduces support for the generic SMMU bindings to > > > > xen/drivers/passthrough/arm/smmu.c. > > > > > > > > The last version of the series was > > > > https://marc.info/?l=xen-devel&m=159539053406643 > > > > > > > > I realize that it is late for 4.15 -- I think it is OK if this series > > > > goes in afterwards. > > > > > > I tested the series on the Juno board it is woking fine. > > > I found one issue in SMMU driver while testing this series that is not > > > related to this series but already existing issue in SMMU driver. > > > > > > If there are more than one device behind SMMU and they share the same > > > Stream-Id, SMMU driver is creating the new SMR entry without checking the > > > already configured SMR entry if SMR entry correspond to stream-id is > > > already configured. Because of this I observed the stream match conflicts > > > on Juno board. > > > > > > (XEN) smmu: /iommu@7fb3: Unexpected global fault, this could be > > > serious > > > (XEN) smmu: /iommu@7fb3: GFSR 0x0004, GFSYNR0 0x0006, > > > GFSYNR1 0x, GFSYNR2 0x > > > > > > > > > Below two patches is required to be ported to Xen to fix the issue from > > > Linux driver. > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=1f3d5ca43019bff1105838712d55be087d93c0da > > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=21174240e4f4439bb8ed6c116cdbdc03eba2126e > > > > > > Good catch and thanks for the pointers! Do you have any interest in > > backporting these two patches or should I put them on my TODO list? > > > > Unrelated to who does the job, we should discuss if it makes sense to > > try to fix the bug for 4.15. The patches don't seem trivial so I am > > tempted to say that it might be best to leave the bug unfixed for 4.15 > > and fix it later. > > SMMU support on Juno is not that interesting because IIRC the stream-ID is the > same for all the devices. So it is all or nothing passthrough. > > For other HW, this may be a useful feature. Yet we would need a way to group > the devices for passthrough. > > In this context, I would consider it more a feature than a bug because the > SMMU driver never remotely work on such HW. I see. To be honest I wasn't thinking of Juno (I wasn't aware of its limitations) but of potential genuine situations where stream-ids are the same for 2 devices. I know it can happen with PCI devices for instance, although I am aware we don't have PCI passthrough yet. I don't know if it is possible for it to happen with non-PCI devices but I wouldn't be surprised if it can.
Re: [PATCH] xen/arm: domain_build: Ignore device nodes with invalid addresses
Hi, On 02/02/2021 17:47, Elliott Mitchell wrote: The handle_device() function has been returning failure upon encountering a device address which was invalid. A device tree which had such an entry has now been seen in the wild. As it causes no failures to simply ignore the entries, ignore them. > Signed-off-by: Elliott Mitchell --- I'm starting to suspect there are an awful lot of places in the various domain_build.c files which should simply ignore errors. This is now the second place I've encountered in 2 months where ignoring errors was the correct action. Right, as a counterpoint, we run Xen on Arm HW for several years now and this is the first time I heard about issue parsing the DT. So while I appreciate that you are eager to run Xen on the RPI... I know failing in case of error is an engineer's favorite approach, but there seem an awful lot of harmless failures causing panics. This started as the thread "[RFC PATCH] xen/arm: domain_build: Ignore empty memory bank". Now it seems clear the correct approach is to simply ignore these entries. ... we first need to fully understand the issues. Here a few questions: 1) Can you provide more information why you believe the address is invalid? 2) How does Linux use the node? 3) Is it happening with all the RPI DT? If not, what are the differences? This seems a good candidate for backport to 4.14 and certainly should be in 4.15. --- xen/arch/arm/domain_build.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 374bf655ee..c0568b7579 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1407,9 +1407,9 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev, res = dt_device_get_address(dev, i, &addr, &size); if ( res ) { -printk(XENLOG_ERR "Unable to retrieve address %u for %s\n", - i, dt_node_full_name(dev)); -return res; +printk(XENLOG_ERR "Unable to retrieve address of %s, index %u\n", + dt_node_full_name(dev), i); +continue; } res = map_range_to_domain(dev, addr, size, &mr_data); Cheers, -- Julien Grall
[PATCH v3] xen-blkback: fix compatibility bug with single page rings
From: Paul Durrant Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the behaviour of xen-blkback when connecting to a frontend was: - read 'ring-page-order' - if not present then expect a single page ring specified by 'ring-ref' - else expect a ring specified by 'ring-refX' where X is between 0 and 1 << ring-page-order This was correct behaviour, but was broken by the afforementioned commit to become: - read 'ring-page-order' - if not present then expect a single page ring (i.e. ring-page-order = 0) - expect a ring specified by 'ring-refX' where X is between 0 and 1 << ring-page-order - if that didn't work then see if there's a single page ring specified by 'ring-ref' This incorrect behaviour works most of the time but fails when a frontend that sets 'ring-page-order' is unloaded and replaced by one that does not because, instead of reading 'ring-ref', xen-blkback will read the stale 'ring-ref0' left around by the previous frontend will try to map the wrong grant reference. This patch restores the original behaviour. Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront") Signed-off-by: Paul Durrant Reviewed-by: Dongli Zhang Reviewed-by: "Roger Pau Monné" --- Cc: Konrad Rzeszutek Wilk Cc: Jens Axboe v3: - Whitespace fix v2: - Remove now-spurious error path special-case when nr_grefs == 1 --- drivers/block/xen-blkback/common.h | 1 + drivers/block/xen-blkback/xenbus.c | 38 +- 2 files changed, 17 insertions(+), 22 deletions(-) diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index b0c71d3a81a0..bda5c815e441 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -313,6 +313,7 @@ struct xen_blkif { struct work_struct free_work; unsigned intnr_ring_pages; + boolmulti_ref; /* All rings for this device. */ struct xen_blkif_ring *rings; unsigned intnr_rings; diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 9860d4842f36..6c5e9373e91c 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -998,14 +998,17 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir) for (i = 0; i < nr_grefs; i++) { char ring_ref_name[RINGREF_NAME_LEN]; - snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i); + if (blkif->multi_ref) + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref%u", i); + else { + WARN_ON(i != 0); + snprintf(ring_ref_name, RINGREF_NAME_LEN, "ring-ref"); + } + err = xenbus_scanf(XBT_NIL, dir, ring_ref_name, "%u", &ring_ref[i]); if (err != 1) { - if (nr_grefs == 1) - break; - err = -EINVAL; xenbus_dev_fatal(dev, err, "reading %s/%s", dir, ring_ref_name); @@ -1013,18 +1016,6 @@ static int read_per_ring_refs(struct xen_blkif_ring *ring, const char *dir) } } - if (err != 1) { - WARN_ON(nr_grefs != 1); - - err = xenbus_scanf(XBT_NIL, dir, "ring-ref", "%u", - &ring_ref[0]); - if (err != 1) { - err = -EINVAL; - xenbus_dev_fatal(dev, err, "reading %s/ring-ref", dir); - return err; - } - } - err = -ENOMEM; for (i = 0; i < nr_grefs * XEN_BLKIF_REQS_PER_PAGE; i++) { req = kzalloc(sizeof(*req), GFP_KERNEL); @@ -1129,10 +1120,15 @@ static int connect_ring(struct backend_info *be) blkif->nr_rings, blkif->blk_protocol, protocol, blkif->vbd.feature_gnt_persistent ? "persistent grants" : ""); - ring_page_order = xenbus_read_unsigned(dev->otherend, - "ring-page-order", 0); - - if (ring_page_order > xen_blkif_max_ring_order) { + err = xenbus_scanf(XBT_NIL, dev->otherend, "ring-page-order", "%u", + &ring_page_order); + if (err != 1) { + blkif->nr_ring_pages = 1; + blkif->multi_ref = false; + } else if (ring_page_order <= xen_blkif_max_ring_order) { + blkif->nr_ring_pages = 1 << ring_page_order; + blkif->multi_ref = true; + } else { err = -EINVAL; xenbus_dev_fatal(dev, err, "requested ring page order %d ex
Re: [PATCH v3 0/3] Generic SMMU Bindings
Hi, On 02/02/2021 17:44, Stefano Stabellini wrote: On Tue, 2 Feb 2021, Rahul Singh wrote: Hello Stefano, On 26 Jan 2021, at 10:58 pm, Stefano Stabellini wrote: Hi all, This series introduces support for the generic SMMU bindings to xen/drivers/passthrough/arm/smmu.c. The last version of the series was https://marc.info/?l=xen-devel&m=159539053406643 I realize that it is late for 4.15 -- I think it is OK if this series goes in afterwards. I tested the series on the Juno board it is woking fine. I found one issue in SMMU driver while testing this series that is not related to this series but already existing issue in SMMU driver. If there are more than one device behind SMMU and they share the same Stream-Id, SMMU driver is creating the new SMR entry without checking the already configured SMR entry if SMR entry correspond to stream-id is already configured. Because of this I observed the stream match conflicts on Juno board. (XEN) smmu: /iommu@7fb3: Unexpected global fault, this could be serious (XEN) smmu: /iommu@7fb3:GFSR 0x0004, GFSYNR0 0x0006, GFSYNR1 0x, GFSYNR2 0x Below two patches is required to be ported to Xen to fix the issue from Linux driver. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=1f3d5ca43019bff1105838712d55be087d93c0da https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=21174240e4f4439bb8ed6c116cdbdc03eba2126e Good catch and thanks for the pointers! Do you have any interest in backporting these two patches or should I put them on my TODO list? Unrelated to who does the job, we should discuss if it makes sense to try to fix the bug for 4.15. The patches don't seem trivial so I am tempted to say that it might be best to leave the bug unfixed for 4.15 and fix it later. SMMU support on Juno is not that interesting because IIRC the stream-ID is the same for all the devices. So it is all or nothing passthrough. For other HW, this may be a useful feature. Yet we would need a way to group the devices for passthrough. In this context, I would consider it more a feature than a bug because the SMMU driver never remotely work on such HW. Cheers, -- Julien Grall
[PATCH] xen/arm: domain_build: Ignore device nodes with invalid addresses
The handle_device() function has been returning failure upon encountering a device address which was invalid. A device tree which had such an entry has now been seen in the wild. As it causes no failures to simply ignore the entries, ignore them. Signed-off-by: Elliott Mitchell --- I'm starting to suspect there are an awful lot of places in the various domain_build.c files which should simply ignore errors. This is now the second place I've encountered in 2 months where ignoring errors was the correct action. I know failing in case of error is an engineer's favorite approach, but there seem an awful lot of harmless failures causing panics. This started as the thread "[RFC PATCH] xen/arm: domain_build: Ignore empty memory bank". Now it seems clear the correct approach is to simply ignore these entries. This seems a good candidate for backport to 4.14 and certainly should be in 4.15. --- xen/arch/arm/domain_build.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 374bf655ee..c0568b7579 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1407,9 +1407,9 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev, res = dt_device_get_address(dev, i, &addr, &size); if ( res ) { -printk(XENLOG_ERR "Unable to retrieve address %u for %s\n", - i, dt_node_full_name(dev)); -return res; +printk(XENLOG_ERR "Unable to retrieve address of %s, index %u\n", + dt_node_full_name(dev), i); +continue; } res = map_range_to_domain(dev, addr, size, &mr_data); -- 2.20.1 -- (\___(\___(\__ --=> 8-) EHM <=-- __/)___/)___/) \BS (| ehem+sig...@m5p.com PGP 87145445 |) / \_CS\ | _ -O #include O- _ | / _/ 8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
Re: [PATCH v3 0/3] Generic SMMU Bindings
On Tue, 2 Feb 2021, Rahul Singh wrote: > Hello Stefano, > > > On 26 Jan 2021, at 10:58 pm, Stefano Stabellini > > wrote: > > > > Hi all, > > > > This series introduces support for the generic SMMU bindings to > > xen/drivers/passthrough/arm/smmu.c. > > > > The last version of the series was > > https://marc.info/?l=xen-devel&m=159539053406643 > > > > I realize that it is late for 4.15 -- I think it is OK if this series > > goes in afterwards. > > I tested the series on the Juno board it is woking fine. > I found one issue in SMMU driver while testing this series that is not > related to this series but already existing issue in SMMU driver. > > If there are more than one device behind SMMU and they share the same > Stream-Id, SMMU driver is creating the new SMR entry without checking the > already configured SMR entry if SMR entry correspond to stream-id is already > configured. Because of this I observed the stream match conflicts on Juno > board. > > (XEN) smmu: /iommu@7fb3: Unexpected global fault, this could be serious > (XEN) smmu: /iommu@7fb3: GFSR 0x0004, GFSYNR0 0x0006, GFSYNR1 > 0x, GFSYNR2 0x > > > Below two patches is required to be ported to Xen to fix the issue from Linux > driver. > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=1f3d5ca43019bff1105838712d55be087d93c0da > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=21174240e4f4439bb8ed6c116cdbdc03eba2126e Good catch and thanks for the pointers! Do you have any interest in backporting these two patches or should I put them on my TODO list? Unrelated to who does the job, we should discuss if it makes sense to try to fix the bug for 4.15. The patches don't seem trivial so I am tempted to say that it might be best to leave the bug unfixed for 4.15 and fix it later.
[linux-5.4 test] 158929: regressions - FAIL
flight 158929 linux-5.4 real [real] http://logs.test-lab.xenproject.org/osstest/logs/158929/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-xl broken in 158898 test-arm64-arm64-xl-credit2 broken in 158898 test-arm64-arm64-xl-xsm broken in 158898 test-amd64-amd64-dom0pvh-xl-intel 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-dom0pvh-xl-amd 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-xl-multivcpu 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-xl-pvhv2-amd 14 guest-start fail REGR. vs. 158387 test-amd64-coresched-amd64-xl 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-qemuu-freebsd12-amd64 13 guest-startfail REGR. vs. 158387 test-amd64-coresched-i386-xl 14 guest-start fail REGR. vs. 158387 test-arm64-arm64-xl-seattle 14 guest-start fail REGR. vs. 158387 test-arm64-arm64-xl 14 guest-start fail REGR. vs. 158387 test-amd64-i386-qemut-rhel6hvm-amd 12 redhat-install fail REGR. vs. 158387 test-amd64-i386-qemuu-rhel6hvm-amd 12 redhat-install fail REGR. vs. 158387 test-amd64-i386-freebsd10-amd64 13 guest-start fail REGR. vs. 158387 test-amd64-amd64-xl-pvshim 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-xl-pvhv2-intel 14 guest-start fail REGR. vs. 158387 test-amd64-i386-freebsd10-i386 13 guest-startfail REGR. vs. 158387 test-amd64-amd64-xl-xsm 14 guest-start fail REGR. vs. 158387 test-amd64-i386-libvirt 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-libvirt-xsm 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-libvirt 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-xl-credit1 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-pair25 guest-start/debian fail REGR. vs. 158387 test-amd64-amd64-xl 14 guest-start fail REGR. vs. 158387 test-amd64-i386-xl 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-xl-shadow 14 guest-start fail REGR. vs. 158387 test-amd64-i386-xl-xsm 14 guest-start fail REGR. vs. 158387 test-amd64-amd64-xl-credit2 14 guest-start fail REGR. vs. 158387 test-amd64-i386-pair 25 guest-start/debian fail REGR. vs. 158387 test-amd64-i386-libvirt-xsm 14 guest-start fail REGR. vs. 158387 test-amd64-i386-libvirt-pair 25 guest-start/debian fail REGR. vs. 158387 test-amd64-amd64-libvirt-pair 25 guest-start/debian fail REGR. vs. 158387 test-amd64-i386-qemut-rhel6hvm-intel 12 redhat-install fail REGR. vs. 158387 test-amd64-amd64-qemuu-nested-amd 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-i386-qemuu-rhel6hvm-intel 12 redhat-install fail REGR. vs. 158387 test-amd64-amd64-xl-qemuu-win7-amd64 12 windows-install fail REGR. vs. 158387 test-amd64-i386-xl-qemut-win7-amd64 12 windows-install fail REGR. vs. 158387 test-amd64-amd64-amd64-pvgrub 12 debian-di-install fail REGR. vs. 158387 test-amd64-amd64-pygrub 12 debian-di-installfail REGR. vs. 158387 test-amd64-amd64-qemuu-nested-intel 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-amd64-xl-qemut-win7-amd64 12 windows-install fail REGR. vs. 158387 test-amd64-amd64-i386-pvgrub 12 debian-di-installfail REGR. vs. 158387 test-amd64-amd64-xl-qemuu-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-i386-xl-qemuu-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 158387 test-arm64-arm64-xl-xsm 14 guest-start fail REGR. vs. 158387 test-arm64-arm64-xl-credit1 14 guest-start fail REGR. vs. 158387 test-amd64-i386-xl-qemuu-ovmf-amd64 12 debian-hvm-install fail REGR. vs. 158387 test-arm64-arm64-xl-thunderx 14 guest-start fail REGR. vs. 158387 test-arm64-arm64-xl-credit2 14 guest-start fail REGR. vs. 158387 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-amd64-xl-qemut-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-amd64-xl-qcow212 debian-di-installfail REGR. vs. 158387 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-i386-xl-qemut-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 158387 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 debian-hvm-install fail REGR. vs
Re: [PATCH v2 2/4] x86: Introduce MSR_UNHANDLED
On 1/22/21 6:51 AM, Jan Beulich wrote: > Also, Andrew, (I think I did say so before) - I definitely > would want your general consent with this model, as what gets > altered here is almost all relatively recent contributions > by you. Nor would I exclude the approach being controversial. > Andrew, ping? -boris
RE: [PATCH v2] xen-blkback: fix compatibility bug with single page rings
> -Original Message- > From: Roger Pau Monné > Sent: 02 February 2021 16:29 > To: Paul Durrant > Cc: xen-devel@lists.xenproject.org; linux-bl...@vger.kernel.org; > linux-ker...@vger.kernel.org; Paul > Durrant ; Konrad Rzeszutek Wilk > ; Jens Axboe > ; Dongli Zhang > Subject: Re: [PATCH v2] xen-blkback: fix compatibility bug with single page > rings > > On Thu, Jan 28, 2021 at 01:04:41PM +, Paul Durrant wrote: > > From: Paul Durrant > > > > Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid > > inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the > > behaviour of xen-blkback when connecting to a frontend was: > > > > - read 'ring-page-order' > > - if not present then expect a single page ring specified by 'ring-ref' > > - else expect a ring specified by 'ring-refX' where X is between 0 and > > 1 << ring-page-order > > > > This was correct behaviour, but was broken by the afforementioned commit to > > become: > > > > - read 'ring-page-order' > > - if not present then expect a single page ring (i.e. ring-page-order = 0) > > - expect a ring specified by 'ring-refX' where X is between 0 and > > 1 << ring-page-order > > - if that didn't work then see if there's a single page ring specified by > > 'ring-ref' > > > > This incorrect behaviour works most of the time but fails when a frontend > > that sets 'ring-page-order' is unloaded and replaced by one that does not > > because, instead of reading 'ring-ref', xen-blkback will read the stale > > 'ring-ref0' left around by the previous frontend will try to map the wrong > > grant reference. > > > > This patch restores the original behaviour. > > > > Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid > > inconsistent xenstore 'ring-page- > order' set by malicious blkfront") > > Signed-off-by: Paul Durrant > > --- > > Cc: Konrad Rzeszutek Wilk > > Cc: "Roger Pau Monné" > > Cc: Jens Axboe > > Cc: Dongli Zhang > > > > v2: > > - Remove now-spurious error path special-case when nr_grefs == 1 > > --- > > drivers/block/xen-blkback/common.h | 1 + > > drivers/block/xen-blkback/xenbus.c | 38 +- > > 2 files changed, 17 insertions(+), 22 deletions(-) > > > > diff --git a/drivers/block/xen-blkback/common.h > > b/drivers/block/xen-blkback/common.h > > index b0c71d3a81a0..524a79f10de6 100644 > > --- a/drivers/block/xen-blkback/common.h > > +++ b/drivers/block/xen-blkback/common.h > > @@ -313,6 +313,7 @@ struct xen_blkif { > > > > struct work_struct free_work; > > unsigned intnr_ring_pages; > > + boolmulti_ref; > > You seem to have used spaces between the type and the variable name > here, while neighbors also use hard tabs. > Oops. Xen vs. Linux coding style :-( I'll send a v3 with the whitespace fixed. > The rest LGTM: > > Reviewed-by: Roger Pau Monné > > We should have forbidden the usage of ring-page-order = 0 and we could > have avoided having to add the multi_ref variable, but that's too late > now. Thanks. Yes, that cat is out of the bag and has been for a while unfortunately. Paul > > Thanks, Roger.
Re: [PATCH] hw/i386/xen: Remove dead code
On 02/02/21 16:56, Philippe Mathieu-Daudé wrote: 'drivers_blacklisted' is never accessed, remove it. Signed-off-by: Philippe Mathieu-Daudé --- hw/i386/xen/xen_platform.c | 13 ++--- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index 7c4db35debb..01ae1fb1618 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -60,7 +60,6 @@ struct PCIXenPlatformState { MemoryRegion bar; MemoryRegion mmio_bar; uint8_t flags; /* used only for version_id == 2 */ -int drivers_blacklisted; uint16_t driver_product_version; /* Log from guest drivers */ @@ -245,18 +244,10 @@ static void platform_fixed_ioport_writeb(void *opaque, uint32_t addr, uint32_t v static uint32_t platform_fixed_ioport_readw(void *opaque, uint32_t addr) { -PCIXenPlatformState *s = opaque; - switch (addr) { case 0: -if (s->drivers_blacklisted) { -/* The drivers will recognise this magic number and refuse - * to do anything. */ -return 0xd249; -} else { -/* Magic value so that you can identify the interface. */ -return 0x49d2; -} +/* Magic value so that you can identify the interface. */ +return 0x49d2; default: return 0x; } Cc: qemu-triv...@nongnu.org
[ovmf test] 158932: all pass - PUSHED
flight 158932 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/158932/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 3b468095cd3dfcd1aa4ae63bc623f534bc2390d2 baseline version: ovmf ea56ebf67dd55483105aa9f9996a48213e78337e Last test of basis 158874 2021-02-01 01:54:43 Z1 days Testing same since 158932 2021-02-02 03:10:23 Z0 days1 attempts People who touched revisions under test: Kun Qin jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git ea56ebf67d..3b468095cd 3b468095cd3dfcd1aa4ae63bc623f534bc2390d2 -> xen-tested-master
Re: [PATCH v2] xen-blkback: fix compatibility bug with single page rings
On Thu, Jan 28, 2021 at 01:04:41PM +, Paul Durrant wrote: > From: Paul Durrant > > Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid > inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the > behaviour of xen-blkback when connecting to a frontend was: > > - read 'ring-page-order' > - if not present then expect a single page ring specified by 'ring-ref' > - else expect a ring specified by 'ring-refX' where X is between 0 and > 1 << ring-page-order > > This was correct behaviour, but was broken by the afforementioned commit to > become: > > - read 'ring-page-order' > - if not present then expect a single page ring (i.e. ring-page-order = 0) > - expect a ring specified by 'ring-refX' where X is between 0 and > 1 << ring-page-order > - if that didn't work then see if there's a single page ring specified by > 'ring-ref' > > This incorrect behaviour works most of the time but fails when a frontend > that sets 'ring-page-order' is unloaded and replaced by one that does not > because, instead of reading 'ring-ref', xen-blkback will read the stale > 'ring-ref0' left around by the previous frontend will try to map the wrong > grant reference. > > This patch restores the original behaviour. > > Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid > inconsistent xenstore 'ring-page-order' set by malicious blkfront") > Signed-off-by: Paul Durrant > --- > Cc: Konrad Rzeszutek Wilk > Cc: "Roger Pau Monné" > Cc: Jens Axboe > Cc: Dongli Zhang > > v2: > - Remove now-spurious error path special-case when nr_grefs == 1 > --- > drivers/block/xen-blkback/common.h | 1 + > drivers/block/xen-blkback/xenbus.c | 38 +- > 2 files changed, 17 insertions(+), 22 deletions(-) > > diff --git a/drivers/block/xen-blkback/common.h > b/drivers/block/xen-blkback/common.h > index b0c71d3a81a0..524a79f10de6 100644 > --- a/drivers/block/xen-blkback/common.h > +++ b/drivers/block/xen-blkback/common.h > @@ -313,6 +313,7 @@ struct xen_blkif { > > struct work_struct free_work; > unsigned intnr_ring_pages; > + boolmulti_ref; You seem to have used spaces between the type and the variable name here, while neighbors also use hard tabs. The rest LGTM: Reviewed-by: Roger Pau Monné We should have forbidden the usage of ring-page-order = 0 and we could have avoided having to add the multi_ref variable, but that's too late now. Thanks, Roger.
Re: [PATCH] xen/netback: avoid race in xenvif_rx_ring_slots_available()
On Tue, Feb 02, 2021 at 08:09:38AM +0100, Juergen Gross wrote: > Since commit 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding") > xenvif_rx_ring_slots_available() is no longer called only from the rx > queue kernel thread, so it needs to access the rx queue with the > associated queue held. > > Reported-by: Igor Druzhinin > Fixes: 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding") > Cc: sta...@vger.kernel.org > Signed-off-by: Juergen Gross Acked-by: Wei Liu
Re: [PATCH] xen/netback: avoid race in xenvif_rx_ring_slots_available()
On 02.02.21 16:26, Igor Druzhinin wrote: On 02/02/2021 07:09, Juergen Gross wrote: Since commit 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding") xenvif_rx_ring_slots_available() is no longer called only from the rx queue kernel thread, so it needs to access the rx queue with the associated queue held. Reported-by: Igor Druzhinin Fixes: 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding") Cc: sta...@vger.kernel.org Signed-off-by: Juergen Gross Appreciate a quick fix! Is this the only place that sort of race could happen now? I checked and didn't find any other similar problem. Juergen OpenPGP_0xB0DE9DD628BF132F.asc Description: application/pgp-keys OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v3 2/3] arm,smmu: restructure code in preparation to new bindings support
Hello Stefano, > On 26 Jan 2021, at 10:58 pm, Stefano Stabellini > wrote: > > From: Brian Woods > > Restructure some of the code and add supporting functions for adding > generic device tree (DT) binding support. This will allow for using > current Linux device trees with just modifying the chosen field to > enable Xen. > > Signed-off-by: Brian Woods > Signed-off-by: Stefano Stabellini Reviewed-by: Rahul Singh Tested-by: Rahul Singh Regards, Rahul > --- > Changes in v3: > - split patch > --- > xen/drivers/passthrough/arm/smmu.c | 60 +- > 1 file changed, 35 insertions(+), 25 deletions(-) > > diff --git a/xen/drivers/passthrough/arm/smmu.c > b/xen/drivers/passthrough/arm/smmu.c > index 3898d1d737..9687762283 100644 > --- a/xen/drivers/passthrough/arm/smmu.c > +++ b/xen/drivers/passthrough/arm/smmu.c > @@ -782,50 +782,36 @@ static int insert_smmu_master(struct arm_smmu_device > *smmu, > return 0; > } > > -static int register_smmu_master(struct arm_smmu_device *smmu, > - struct device *dev, > - struct of_phandle_args *masterspec) > +static int arm_smmu_dt_add_device_legacy(struct arm_smmu_device *smmu, > + struct device *dev, > + struct iommu_fwspec *fwspec) > { > - int i, ret = 0; > + int i; > struct arm_smmu_master *master; > - struct iommu_fwspec *fwspec; > + struct device_node *dev_node = dev_get_dev_node(dev); > > - master = find_smmu_master(smmu, masterspec->np); > + master = find_smmu_master(smmu, dev_node); > if (master) { > dev_err(dev, > "rejecting multiple registrations for master device > %s\n", > - masterspec->np->name); > + dev_node->name); > return -EBUSY; > } > > master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL); > if (!master) > return -ENOMEM; > - master->of_node = masterspec->np; > - > - ret = iommu_fwspec_init(&master->of_node->dev, smmu->dev); > - if (ret) { > - kfree(master); > - return ret; > - } > - fwspec = dev_iommu_fwspec_get(dev); > - > - /* adding the ids here */ > - ret = iommu_fwspec_add_ids(&masterspec->np->dev, > -masterspec->args, > -masterspec->args_count); > - if (ret) > - return ret; > + master->of_node = dev_node; > > /* Xen: Let Xen know that the device is protected by an SMMU */ > - dt_device_set_protected(masterspec->np); > + dt_device_set_protected(dev_node); > > if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH)) { > for (i = 0; i < fwspec->num_ids; ++i) { > - if (masterspec->args[i] >= smmu->num_mapping_groups) { > + if (fwspec->ids[i] >= smmu->num_mapping_groups) { > dev_err(dev, > "stream ID for master device %s greater > than maximum allowed (%d)\n", > - masterspec->np->name, > smmu->num_mapping_groups); > + dev_node->name, > smmu->num_mapping_groups); > return -ERANGE; > } > } > @@ -833,6 +819,30 @@ static int register_smmu_master(struct arm_smmu_device > *smmu, > return insert_smmu_master(smmu, master); > } > > +static int register_smmu_master(struct arm_smmu_device *smmu, > + struct device *dev, > + struct of_phandle_args *masterspec) > +{ > + int ret = 0; > + struct iommu_fwspec *fwspec; > + > + ret = iommu_fwspec_init(&masterspec->np->dev, smmu->dev); > + if (ret) > + return ret; > + > + fwspec = dev_iommu_fwspec_get(&masterspec->np->dev); > + > + ret = iommu_fwspec_add_ids(&masterspec->np->dev, > +masterspec->args, > +masterspec->args_count); > + if (ret) > + return ret; > + > + return arm_smmu_dt_add_device_legacy(smmu, > + &masterspec->np->dev, > + fwspec); > +} > + > static struct arm_smmu_device *find_smmu_for_device(struct device *dev) > { > struct arm_smmu_device *smmu; > -- > 2.17.1 > >
Re: [PATCH v3 3/3] arm,smmu: add support for generic DT bindings. Implement add_device and dt_xlate.
Hello Stefano, > On 26 Jan 2021, at 10:58 pm, Stefano Stabellini > wrote: > > From: Brian Woods > > Now that all arm iommu drivers support generic bindings we can remove > the workaround from iommu_add_dt_device(). > > Note that if both legacy bindings and generic bindings are present in > device tree, the legacy bindings are the ones that are used. > > Signed-off-by: Brian Woods > Signed-off-by: Stefano Stabellini Reviewed-by: Rahul Singh Tested-by: Rahul Singh Regards, Rahul > --- > Changes in v3: > - split patch > - make find_smmu return non-const so that we can use it in > arm_smmu_dt_add_device_generic > - use dt_phandle_args > - update commit message > --- > xen/drivers/passthrough/arm/smmu.c| 40 ++- > xen/drivers/passthrough/device_tree.c | 17 +--- > 2 files changed, 40 insertions(+), 17 deletions(-) > > diff --git a/xen/drivers/passthrough/arm/smmu.c > b/xen/drivers/passthrough/arm/smmu.c > index 9687762283..620ba5a4b5 100644 > --- a/xen/drivers/passthrough/arm/smmu.c > +++ b/xen/drivers/passthrough/arm/smmu.c > @@ -254,6 +254,8 @@ struct iommu_group > atomic_t ref; > }; > > +static struct arm_smmu_device *find_smmu(const struct device *dev); > + > static struct iommu_group *iommu_group_alloc(void) > { > struct iommu_group *group = xzalloc(struct iommu_group); > @@ -843,6 +845,40 @@ static int register_smmu_master(struct arm_smmu_device > *smmu, >fwspec); > } > > +static int arm_smmu_dt_add_device_generic(u8 devfn, struct device *dev) > +{ > + struct arm_smmu_device *smmu; > + struct iommu_fwspec *fwspec; > + > + fwspec = dev_iommu_fwspec_get(dev); > + if (fwspec == NULL) > + return -ENXIO; > + > + smmu = find_smmu(fwspec->iommu_dev); > + if (smmu == NULL) > + return -ENXIO; > + > + return arm_smmu_dt_add_device_legacy(smmu, dev, fwspec); > +} > + > +static int arm_smmu_dt_xlate_generic(struct device *dev, > + const struct dt_phandle_args *spec) > +{ > + uint32_t mask, fwid = 0; > + > + if (spec->args_count > 0) > + fwid |= (SMR_ID_MASK & spec->args[0]) << SMR_ID_SHIFT; > + > + if (spec->args_count > 1) > + fwid |= (SMR_MASK_MASK & spec->args[1]) << SMR_MASK_SHIFT; > + else if (!of_property_read_u32(spec->np, "stream-match-mask", &mask)) > + fwid |= (SMR_MASK_MASK & mask) << SMR_MASK_SHIFT; > + > + return iommu_fwspec_add_ids(dev, > + &fwid, > + 1); > +} > + > static struct arm_smmu_device *find_smmu_for_device(struct device *dev) > { > struct arm_smmu_device *smmu; > @@ -2766,6 +2802,7 @@ static void arm_smmu_iommu_domain_teardown(struct > domain *d) > static const struct iommu_ops arm_smmu_iommu_ops = { > .init = arm_smmu_iommu_domain_init, > .hwdom_init = arm_smmu_iommu_hwdom_init, > +.add_device = arm_smmu_dt_add_device_generic, > .teardown = arm_smmu_iommu_domain_teardown, > .iotlb_flush = arm_smmu_iotlb_flush, > .iotlb_flush_all = arm_smmu_iotlb_flush_all, > @@ -2773,9 +2810,10 @@ static const struct iommu_ops arm_smmu_iommu_ops = { > .reassign_device = arm_smmu_reassign_dev, > .map_page = arm_iommu_map_page, > .unmap_page = arm_iommu_unmap_page, > +.dt_xlate = arm_smmu_dt_xlate_generic, > }; > > -static __init const struct arm_smmu_device *find_smmu(const struct device > *dev) > +static struct arm_smmu_device *find_smmu(const struct device *dev) > { > struct arm_smmu_device *smmu; > bool found = false; > diff --git a/xen/drivers/passthrough/device_tree.c > b/xen/drivers/passthrough/device_tree.c > index a51ae3c9c3..ae07f272e1 100644 > --- a/xen/drivers/passthrough/device_tree.c > +++ b/xen/drivers/passthrough/device_tree.c > @@ -162,22 +162,7 @@ int iommu_add_dt_device(struct dt_device_node *np) > * these callback implemented. > */ > if ( !ops->add_device || !ops->dt_xlate ) > -{ > -/* > - * Some Device Trees may expose both legacy SMMU and generic > - * IOMMU bindings together. However, the SMMU driver is only > - * supporting the former and will protect them during the > - * initialization. So we need to skip them and not return > - * error here. > - * > - * XXX: This can be dropped when the SMMU is able to deal > - * with generic bindings. > - */ > -if ( dt_device_is_protected(np) ) > -return 0; > -else > -return -EINVAL; > -} > +return -EINVAL; > > if ( !dt_device_is_available(iommu_spec.np) ) > break; > -- > 2.17.1 >
Re: [PATCH v3 0/3] Generic SMMU Bindings
Hello Stefano, > On 26 Jan 2021, at 10:58 pm, Stefano Stabellini > wrote: > > Hi all, > > This series introduces support for the generic SMMU bindings to > xen/drivers/passthrough/arm/smmu.c. > > The last version of the series was > https://marc.info/?l=xen-devel&m=159539053406643 > > I realize that it is late for 4.15 -- I think it is OK if this series > goes in afterwards. I tested the series on the Juno board it is woking fine. I found one issue in SMMU driver while testing this series that is not related to this series but already existing issue in SMMU driver. If there are more than one device behind SMMU and they share the same Stream-Id, SMMU driver is creating the new SMR entry without checking the already configured SMR entry if SMR entry correspond to stream-id is already configured. Because of this I observed the stream match conflicts on Juno board. (XEN) smmu: /iommu@7fb3: Unexpected global fault, this could be serious (XEN) smmu: /iommu@7fb3:GFSR 0x0004, GFSYNR0 0x0006, GFSYNR1 0x, GFSYNR2 0x Below two patches is required to be ported to Xen to fix the issue from Linux driver. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=1f3d5ca43019bff1105838712d55be087d93c0da https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/iommu/arm-smmu.c?h=linux-5.8.y&id=21174240e4f4439bb8ed6c116cdbdc03eba2126e Regards, Rahul > > Cheers, > > Stefano > > > Brian Woods (3): > arm,smmu: switch to using iommu_fwspec functions > arm,smmu: restructure code in preparation to new bindings support > arm,smmu: add support for generic DT bindings. Implement add_device and > dt_xlate. > > xen/drivers/passthrough/arm/smmu.c| 162 -- > xen/drivers/passthrough/device_tree.c | 24 ++--- > 2 files changed, 123 insertions(+), 63 deletions(-)
Re: [PATCH v3 1/3] arm,smmu: switch to using iommu_fwspec functions
Hello Stefano, > On 26 Jan 2021, at 10:58 pm, Stefano Stabellini > wrote: > > From: Brian Woods > > Modify the smmu driver so that it uses the iommu_fwspec helper > functions. This means both ARM IOMMU drivers will both use the > iommu_fwspec helper functions, making enabling generic device tree > bindings in the SMMU driver much cleaner. > > Signed-off-by: Brian Woods > Signed-off-by: Stefano Stabellini Reviewed-by: Rahul Singh Tested-by: Rahul Singh Regards, Rahul > --- > Changes in v3: > - add a comment in iommu_add_dt_device > - don't allocate fwspec twice in arm_smmu_add_device > - reuse existing fwspec pointer, don't add a second one > - add comment about supporting fwspec at the top of the file > --- > xen/drivers/passthrough/arm/smmu.c| 98 --- > xen/drivers/passthrough/device_tree.c | 7 ++ > 2 files changed, 66 insertions(+), 39 deletions(-) > > diff --git a/xen/drivers/passthrough/arm/smmu.c > b/xen/drivers/passthrough/arm/smmu.c > index 3e8aa37866..3898d1d737 100644 > --- a/xen/drivers/passthrough/arm/smmu.c > +++ b/xen/drivers/passthrough/arm/smmu.c > @@ -32,6 +32,9 @@ > *- 4k and 64k pages, with contiguous pte hints. > *- Up to 48-bit addressing (dependent on VA_BITS) > *- Context fault reporting > + * > + * Changes compared to Linux driver: > + * - support for fwspec > */ > > > @@ -49,6 +52,7 @@ > #include > #include > #include > +#include > #include > > /* Xen: The below defines are redefined within the file. Undef it */ > @@ -302,9 +306,6 @@ static struct iommu_group *iommu_group_get(struct device > *dev) > > /* Start of Linux SMMU code */ > > -/* Maximum number of stream IDs assigned to a single device */ > -#define MAX_MASTER_STREAMIDS MAX_PHANDLE_ARGS > - > /* Maximum number of context banks per SMMU */ > #define ARM_SMMU_MAX_CBS 128 > > @@ -597,8 +598,6 @@ struct arm_smmu_smr { > }; > > struct arm_smmu_master_cfg { > - int num_streamids; > - u16 streamids[MAX_MASTER_STREAMIDS]; > struct arm_smmu_smr *smrs; > }; > > @@ -686,6 +685,14 @@ static struct arm_smmu_option_prop arm_smmu_options[] = { > { 0, NULL}, > }; > > +static inline struct iommu_fwspec * > +arm_smmu_get_fwspec(struct arm_smmu_master_cfg *cfg) > +{ > + struct arm_smmu_master *master = container_of(cfg, > + struct > arm_smmu_master, cfg); > + return dev_iommu_fwspec_get(&master->of_node->dev); > +} > + > static void parse_driver_options(struct arm_smmu_device *smmu) > { > int i = 0; > @@ -779,8 +786,9 @@ static int register_smmu_master(struct arm_smmu_device > *smmu, > struct device *dev, > struct of_phandle_args *masterspec) > { > - int i; > + int i, ret = 0; > struct arm_smmu_master *master; > + struct iommu_fwspec *fwspec; > > master = find_smmu_master(smmu, masterspec->np); > if (master) { > @@ -790,34 +798,37 @@ static int register_smmu_master(struct arm_smmu_device > *smmu, > return -EBUSY; > } > > - if (masterspec->args_count > MAX_MASTER_STREAMIDS) { > - dev_err(dev, > - "reached maximum number (%d) of stream IDs for master > device %s\n", > - MAX_MASTER_STREAMIDS, masterspec->np->name); > - return -ENOSPC; > - } > - > master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL); > if (!master) > return -ENOMEM; > + master->of_node = masterspec->np; > > - master->of_node = masterspec->np; > - master->cfg.num_streamids = masterspec->args_count; > + ret = iommu_fwspec_init(&master->of_node->dev, smmu->dev); > + if (ret) { > + kfree(master); > + return ret; > + } > + fwspec = dev_iommu_fwspec_get(dev); > + > + /* adding the ids here */ > + ret = iommu_fwspec_add_ids(&masterspec->np->dev, > +masterspec->args, > +masterspec->args_count); > + if (ret) > + return ret; > > /* Xen: Let Xen know that the device is protected by an SMMU */ > dt_device_set_protected(masterspec->np); > > - for (i = 0; i < master->cfg.num_streamids; ++i) { > - u16 streamid = masterspec->args[i]; > - > - if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH) && > - (streamid >= smmu->num_mapping_groups)) { > - dev_err(dev, > - "stream ID for master device %s greater than > maximum allowed (%d)\n", > - masterspec->np->name, smmu->num_mapping_groups); > - return -ERANGE; > + if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH)) { > + for (i = 0; i < fwspec->num_i
RE: [PATCH] hw/i386/xen: Remove dead code
> -Original Message- > From: Philippe Mathieu-Daudé > Sent: 02 February 2021 15:57 > To: qemu-de...@nongnu.org > Cc: Richard Henderson ; Paolo Bonzini > ; Eduardo > Habkost ; qemu-triv...@nongnu.org; Michael S. Tsirkin > ; Marcel > Apfelbaum ; xen-devel@lists.xenproject.org; Paul > Durrant ; > Anthony Perard ; Stefano Stabellini > ; Philippe > Mathieu-Daudé > Subject: [PATCH] hw/i386/xen: Remove dead code > > 'drivers_blacklisted' is never accessed, remove it. > > Signed-off-by: Philippe Mathieu-Daudé FTR this is a vestige of an ancient mechanism that's not used any more (see https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/misc/hvm-emulated-unplug.pandoc step 5). Reviewed-by: Paul Durrant > --- > hw/i386/xen/xen_platform.c | 13 ++--- > 1 file changed, 2 insertions(+), 11 deletions(-) > > diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c > index 7c4db35debb..01ae1fb1618 100644 > --- a/hw/i386/xen/xen_platform.c > +++ b/hw/i386/xen/xen_platform.c > @@ -60,7 +60,6 @@ struct PCIXenPlatformState { > MemoryRegion bar; > MemoryRegion mmio_bar; > uint8_t flags; /* used only for version_id == 2 */ > -int drivers_blacklisted; > uint16_t driver_product_version; > > /* Log from guest drivers */ > @@ -245,18 +244,10 @@ static void platform_fixed_ioport_writeb(void *opaque, > uint32_t addr, uint32_t v > > static uint32_t platform_fixed_ioport_readw(void *opaque, uint32_t addr) > { > -PCIXenPlatformState *s = opaque; > - > switch (addr) { > case 0: > -if (s->drivers_blacklisted) { > -/* The drivers will recognise this magic number and refuse > - * to do anything. */ > -return 0xd249; > -} else { > -/* Magic value so that you can identify the interface. */ > -return 0x49d2; > -} > +/* Magic value so that you can identify the interface. */ > +return 0x49d2; > default: > return 0x; > } > -- > 2.26.2
[PATCH] hw/i386/xen: Remove dead code
'drivers_blacklisted' is never accessed, remove it. Signed-off-by: Philippe Mathieu-Daudé --- hw/i386/xen/xen_platform.c | 13 ++--- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c index 7c4db35debb..01ae1fb1618 100644 --- a/hw/i386/xen/xen_platform.c +++ b/hw/i386/xen/xen_platform.c @@ -60,7 +60,6 @@ struct PCIXenPlatformState { MemoryRegion bar; MemoryRegion mmio_bar; uint8_t flags; /* used only for version_id == 2 */ -int drivers_blacklisted; uint16_t driver_product_version; /* Log from guest drivers */ @@ -245,18 +244,10 @@ static void platform_fixed_ioport_writeb(void *opaque, uint32_t addr, uint32_t v static uint32_t platform_fixed_ioport_readw(void *opaque, uint32_t addr) { -PCIXenPlatformState *s = opaque; - switch (addr) { case 0: -if (s->drivers_blacklisted) { -/* The drivers will recognise this magic number and refuse - * to do anything. */ -return 0xd249; -} else { -/* Magic value so that you can identify the interface. */ -return 0x49d2; -} +/* Magic value so that you can identify the interface. */ +return 0x49d2; default: return 0x; } -- 2.26.2
Re: [PATCH] xen/netback: avoid race in xenvif_rx_ring_slots_available()
On 02/02/2021 07:09, Juergen Gross wrote: > Since commit 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding") > xenvif_rx_ring_slots_available() is no longer called only from the rx > queue kernel thread, so it needs to access the rx queue with the > associated queue held. > > Reported-by: Igor Druzhinin > Fixes: 23025393dbeb3b8b3 ("xen/netback: use lateeoi irq binding") > Cc: sta...@vger.kernel.org > Signed-off-by: Juergen Gross Appreciate a quick fix! Is this the only place that sort of race could happen now? Igor
Re: Null scheduler and vwfi native problem
(Adding Andrew, Jan, Juergen for visibility) Hi Dario, On 02/02/2021 15:03, Dario Faggioli wrote: On Tue, 2021-02-02 at 07:59 +, Julien Grall wrote: Hi Dario, I have had a quick look at your place. The RCU call in leave_hypervisor_to_guest() needs to be placed just after the last call to check_for_pcpu_work(). Otherwise, you may be preempted and keep the RCU quiet. Ok, makes sense. I'll move it. The placement in enter_hypervisor_from_guest() doesn't matter too much, although I would consider to call it as a late as possible. Mmmm... Can I ask why? In fact, I would have said "as soon as possible". Because those functions only access data for the current vCPU/domain. This is already protected by the fact that the domain is running. By leaving the "quiesce" mode later, you give an opportunity to the RCU to release memory earlier. In reality, it is probably still too early as a pCPU can be considered quiesced until a call to rcu_lock*() (such rcu_lock_domain()). But this would require some investigation to check if we effectively protect all the region with the RCU helpers. This is likely too complicated for 4.15. Cheers, -- Julien Grall
[PATCH v2 2/2] IOREQ: refine when to send mapcache invalidation request
XENMEM_decrease_reservation isn't the only means by which pages can get removed from a guest, yet all removals ought to be signaled to qemu. Put setting of the flag into the central p2m_remove_page() underlying all respective hypercalls as well as a few similar places, mainly in PoD code. Additionally there's no point sending the request for the local domain when the domain acted upon is a different one. The latter domain's ioreq server mapcaches need invalidating. We assume that domain to be paused at the point the operation takes place, so sending the request in this case happens from the hvm_do_resume() path, which as one of its first steps calls handle_hvm_io_completion(). Even without the remote operation aspect a single domain-wide flag doesn't do: Guests may e.g. decrease-reservation on multiple vCPU-s in parallel. Each of them needs to issue an invalidation request in due course, in particular because exiting to guest context should not happen before the request was actually seen by (all) the emulator(s). Signed-off-by: Jan Beulich --- v2: Preemption related adjustment split off. Make flag per-vCPU. More places to set the flag. Also handle acting on a remote domain. Re-base. --- I'm still unconvinced of moving the flag setting into p2m_set_entry(). Besides likely causing false positives, we'd also need to make the function retrieve the prior entry's type in order to do this for displaced RAM entries only. Instead I've identified more places where the flag should be set. --- a/xen/arch/arm/p2m.c +++ b/xen/arch/arm/p2m.c @@ -759,10 +759,9 @@ static void p2m_free_entry(struct p2m_do * has failed (error case). * So, at worst, the spurious mapcache invalidation might be sent. */ -if ( (p2m->domain == current->domain) && - domain_has_ioreq_server(p2m->domain) && - p2m_is_ram(entry.p2m.type) ) -p2m->domain->mapcache_invalidate = true; +if ( p2m_is_ram(entry.p2m.type) && + domain_has_ioreq_server(p2m->domain) ) +ioreq_request_mapcache_invalidate(p2m->domain); #endif p2m->stats.mappings[level]--; --- a/xen/arch/arm/traps.c +++ b/xen/arch/arm/traps.c @@ -1509,8 +1509,8 @@ static void do_trap_hypercall(struct cpu * Note that sending the invalidation request causes the vCPU to block * until all the IOREQ servers have acknowledged the invalidation. */ -if ( unlikely(curr->domain->mapcache_invalidate) && - test_and_clear_bool(curr->domain->mapcache_invalidate) ) +if ( unlikely(curr->mapcache_invalidate) && + test_and_clear_bool(curr->mapcache_invalidate) ) ioreq_signal_mapcache_invalidate(); #endif } --- a/xen/arch/x86/hvm/hypercall.c +++ b/xen/arch/x86/hvm/hypercall.c @@ -32,7 +32,6 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) { -const struct vcpu *curr = current; long rc; switch ( cmd & MEMOP_CMD_MASK ) @@ -42,14 +41,11 @@ static long hvm_memory_op(int cmd, XEN_G return -ENOSYS; } -if ( !curr->hcall_compat ) +if ( !current->hcall_compat ) rc = do_memory_op(cmd, arg); else rc = compat_memory_op(cmd, arg); -if ( (cmd & MEMOP_CMD_MASK) == XENMEM_decrease_reservation ) -curr->domain->mapcache_invalidate = true; - return rc; } @@ -327,9 +323,11 @@ int hvm_hypercall(struct cpu_user_regs * HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%lu -> %lx", eax, regs->rax); -if ( unlikely(currd->mapcache_invalidate) && - test_and_clear_bool(currd->mapcache_invalidate) ) +if ( unlikely(curr->mapcache_invalidate) ) +{ +curr->mapcache_invalidate = false; ioreq_signal_mapcache_invalidate(); +} return curr->hcall_preempted ? HVM_HCALL_preempted : HVM_HCALL_completed; } --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include #include @@ -815,6 +816,8 @@ p2m_remove_page(struct p2m_domain *p2m, } } +ioreq_request_mapcache_invalidate(p2m->domain); + return p2m_set_entry(p2m, gfn, INVALID_MFN, page_order, p2m_invalid, p2m->default_access); } @@ -1301,6 +1304,8 @@ static int set_typed_p2m_entry(struct do ASSERT(mfn_valid(mfn_add(omfn, i))); set_gpfn_from_mfn(mfn_x(omfn) + i, INVALID_M2P_ENTRY); } + +ioreq_request_mapcache_invalidate(d); } P2M_DEBUG("set %d %lx %lx\n", gfn_p2mt, gfn_l, mfn_x(mfn)); --- a/xen/arch/x86/mm/p2m-pod.c +++ b/xen/arch/x86/mm/p2m-pod.c @@ -20,6 +20,7 @@ */ #include +#include #include #include #include @@ -647,6 +648,8 @@ p2m_pod_decrease_reservation(struct doma set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY); p2m_pod_cache_add(p2m, page, cur_order); +ioreq_request_mapcache_invalidate(d); + steal_for_cache = ( p
[PATCH v2 1/2] IOREQ: fix waiting for broadcast completion
Checking just a single server is not enough - all of them must have signaled that they're done processing the request. Signed-off-by: Jan Beulich --- v2: New. --- a/xen/common/ioreq.c +++ b/xen/common/ioreq.c @@ -213,9 +213,9 @@ bool vcpu_ioreq_handle_completion(struct return false; } -sv = get_pending_vcpu(v, &s); -if ( sv && !wait_for_io(sv, get_ioreq(s, v)) ) -return false; +while ( (sv = get_pending_vcpu(v, &s)) != NULL ) +if ( !wait_for_io(sv, get_ioreq(s, v)) ) +return false; vio->req.state = ioreq_needs_completion(&vio->req) ? STATE_IORESP_READY : STATE_IOREQ_NONE;
[PATCH v2 0/2] IOREQ: mapcache invalidation request sending corrections
I'm sorry it took so long to prepare v2. I had some trouble figuring a reasonable way to address the main earlier review requests in what is now patch 2; see there for what changed. Patch 1 is new. 1: fix waiting for broadcast completion 2: refine when to send mapcache invalidation request Jan
Re: Null scheduler and vwfi native problem
On Tue, 2021-02-02 at 07:59 +, Julien Grall wrote: > Hi Dario, > Hi! > I have had a quick look at your place. The RCU call in > leave_hypervisor_to_guest() needs to be placed just after the last > call > to check_for_pcpu_work(). > > Otherwise, you may be preempted and keep the RCU quiet. > Ok, makes sense. I'll move it. > The placement in enter_hypervisor_from_guest() doesn't matter too > much, > although I would consider to call it as a late as possible. > Mmmm... Can I ask why? In fact, I would have said "as soon as possible". Thanks and Regards -- Dario Faggioli, Ph.D http://about.me/dario.faggioli Virtualization Software Engineer SUSE Labs, SUSE https://www.suse.com/ --- <> (Raistlin Majere) signature.asc Description: This is a digitally signed message part
[xen-unstable test] 158922: tolerable FAIL
flight 158922 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/158922/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-amd64-amd64-libvirt-vhd 16 guest-saverestore fail pass in 158873 test-armhf-armhf-libvirt-raw 13 guest-startfail pass in 158873 test-armhf-armhf-xl-vhd 13 guest-startfail pass in 158873 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail in 158873 like 158835 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail in 158873 never pass test-armhf-armhf-xl-vhd 14 migrate-support-check fail in 158873 never pass test-armhf-armhf-xl-vhd 15 saverestore-support-check fail in 158873 never pass test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 158873 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 158873 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 158873 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 158873 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 158873 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 158873 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 158873 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 158873 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 158873 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 158873 test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass version targeted for testing: xen 9dc687f155a57216b83b17f9cde55dd43e06b0cd baseline version: xen 9dc687f155a57216b83b17f9cde55dd43e06b0cd Last test of basis 158922 2021-02-02 01:51:30 Z0 days Testing same since (not found) 0 attempts jobs: build-amd64-xsm pass build-arm64-xsm
[linux-5.4 bisection] complete test-amd64-coresched-i386-xl
branch xen-unstable xenbranch xen-unstable job test-amd64-coresched-i386-xl testid guest-start Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: seabios git://xenbits.xen.org/osstest/seabios.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git Bug introduced: a09d4e7acdbf276b2096661ee82454ae3dd24d2b Bug not present: acc402fa5bf502d471d50e3d495379f093a7f9e4 Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/158955/ commit a09d4e7acdbf276b2096661ee82454ae3dd24d2b Author: David Woodhouse Date: Wed Jan 13 13:26:02 2021 + xen: Fix event channel callback via INTX/GSI [ Upstream commit 3499ba8198cad47b731792e5e56b9ec2a78a83a2 ] For a while, event channel notification via the PCI platform device has been broken, because we attempt to communicate with xenstore before we even have notifications working, with the xs_reset_watches() call in xs_init(). We tend to get away with this on Xen versions below 4.0 because we avoid calling xs_reset_watches() anyway, because xenstore might not cope with reading a non-existent key. And newer Xen *does* have the vector callback support, so we rarely fall back to INTX/GSI delivery. To fix it, clean up a bit of the mess of xs_init() and xenbus_probe() startup. Call xs_init() directly from xenbus_init() only in the !XS_HVM case, deferring it to be called from xenbus_probe() in the XS_HVM case instead. Then fix up the invocation of xenbus_probe() to happen either from its device_initcall if the callback is available early enough, or when the callback is finally set up. This means that the hack of calling xenbus_probe() from a workqueue after the first interrupt, or directly from the PCI platform device setup, is no longer needed. Signed-off-by: David Woodhouse Reviewed-by: Boris Ostrovsky Link: https://lore.kernel.org/r/20210113132606.422794-2-dw...@infradead.org Signed-off-by: Juergen Gross Signed-off-by: Sasha Levin For bisection revision-tuple graph see: http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-5.4/test-amd64-coresched-i386-xl.guest-start.html Revision IDs in each graph node refer, respectively, to the Trees above. Running cs-bisection-step --graph-out=/home/logs/results/bisect/linux-5.4/test-amd64-coresched-i386-xl.guest-start --summary-out=tmp/158955.bisection-summary --basis-template=158387 --blessings=real,real-bisect,real-retry linux-5.4 test-amd64-coresched-i386-xl guest-start Searching for failure / basis pass: 158881 fail [host=pinot0] / 158681 [host=rimava1] 158624 [host=pinot1] 158616 ok. Failure / basis pass flights: 158881 / 158616 (tree with no url: minios) Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git Tree: qemuu git://xenbits.xen.org/qemu-xen.git Tree: seabios git://xenbits.xen.org/osstest/seabios.git Tree: xen git://xenbits.xen.org/xen.git Latest 0fbca6ce4174724f28be5268c5d210f51ed96e31 c530a75c1e6a472b0eb9558310b518f0dfcd8860 c6be6dab9c4bdf135bc02b61ecc304d5511c3588 3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 7ea428895af2840d85c524f0bd11a38aac308308 ef88eeaf052c8a7d28c5f85e790c5e45bcffa45e 9dc687f155a57216b83b17f9cde55dd43e06b0cd Basis pass 09f983f0c7fc0db79a5f6c883ec3510d424c369c c530a75c1e6a472b0eb9558310b518f0dfcd8860 96a9acfc527964dc5ab7298862a0cd8aa5fffc6a 3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 7ea428895af2840d85c524f0bd11a38aac308308 ef88eeaf052c8a7d28c5f85e790c5e45bcffa45e 452ddbe3592b141b05a7e0676f09c8ae07f98fdd Generating revisions with ./adhoc-revtuple-generator git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git#09f983f0c7fc0db79a5f6c883ec3510d424c369c-0fbca6ce4174724f28be5268c5d210f51ed96e31 git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/osstest/ovmf.git#96a9acfc527964dc5ab7298862a0cd8aa5fffc6a-c6be6dab9c4bdf135bc02b61ecc304d5511c3588 git://xenbits.xen.org/qemu-xen-traditional\ .git#3d273dd05e51e5a1ffba3d98c7437ee84e8f8764-3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 git://xenbits.xen.org/qemu-xen.git#7ea428895af2840d85c524f0bd11a38aac308308-7ea428895af2840d85c524f0bd11a38aac308308 git://xenbits.xen.org/os
[xen-unstable-smoke test] 158950: tolerable all pass - PUSHED
flight 158950 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/158950/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 5e7aa904405fa2f268c3af213516bae271de3265 baseline version: xen 9dc687f155a57216b83b17f9cde55dd43e06b0cd Last test of basis 158804 2021-01-30 04:00:24 Z3 days Failing since158892 2021-02-01 16:00:25 Z0 days 12 attempts Testing same since 158950 2021-02-02 11:01:24 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Ian Jackson Jan Beulich Julien Grall Manuel Bouyer Roger Pau Monne Roger Pau Monné jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 9dc687f155..5e7aa90440 5e7aa904405fa2f268c3af213516bae271de3265 -> smoke
Re: [PATCH v9 00/11] acquire_resource size and external IPT monitoring
On 02/02/2021 12:20, Ian Jackson wrote: > Since you mentioned patch 1 and asserted it didn't need a release-ack, > I looked at it in a little more detail. It seems to contain a > moderate amount of (fairly localised) restructuring. IDK whether > XENMEM_acquire_resource is used by non-IPT configurations but I didn't > see an assertion anywhere that it isn't. Acquire resource is used by Qemu/demu/varstored/etc (for IO emulation) and the the domain builder (seeding the grant table with xenstore/console details). None of these usecases used a size calculation, and made blind mapping calls of 1 page in size. IPT is the first usecase to want to map more than a single page in one go. > I appreciate that whether something is "straightforward" on the one > hand, vs involving "substantial refactoring" on the ohter, or this is > a matter of judgement, which I have left up to the commiters during > this part of the freeze. But for the record my view is that this > patch is not a "straightforward bugfix" and needs a release ack. I have extensive testing, demonstrating the bug already present in staging (unable to map the guests whole grant table in default configurations), and demonstrating the correctness of the fix. Some of this testing (specifically, the toos/tests/* binary) is something I plan to fix up over the ARM IOERQ and other series, and submit later this week. It will demonstrate the current bug in staging, and show it fixed with patch 1 committed. (This is something I want to become an autotest in due course.) Other parts of this testing cannot be submitted. To get the compat layer correct, I needed an XTF test and modified Xen which had a known pattern, to check the marshalling logic didn't lose anything when a continuation hit an interesting. I can talk you through these tests and assert that I have run them, but its not testing logic we can commit into Xen, and its not anything which gets tested by OSSTest because we don't test 32bit PVH dom0's in anger. ~Andrew
Re: [PATCH v9 00/11] acquire_resource size and external IPT monitoring
Andrew Cooper writes ("[PATCH v9 00/11] acquire_resource size and external IPT monitoring"): ... > Therefore, I'd like to request a release exception. Thanks for this writeup. There is discussion here of the upside of granting an exception, which certainly seems substantial enough to give this serious consideration. > It [is] fairly isolated in terms of interactions with the rest of > Xen, so the changes of a showstopper affecting other features is > very slim. This is encouraging (optimistic, even) but very general. I would like to see a frank and detailed assessment of the downside risks, ideally based on analysis of the individual patches. When I say a "frank and detailed assessment" I'm hoping to have a list of the specific design and code changes that pose a risk to non-IPT configurations, in decreasing order of risk. For each one there should be brief discussion of the measures that have exist to control that risk (eg, additional review, additional testing), and a characterisation of the resulting risk (both in terms of likelihood and severity of the resulting bug). All risks that would come to a diligent reviewer's mind should be mentioned and explicitly delath with, even if it is immediately clear that they are not a real risk. Do you think that would be feasible ? We would want to make a decision ASAP so it would have to be done quickly too - in the next few days and certainly by the end of the week. Since you mentioned patch 1 and asserted it didn't need a release-ack, I looked at it in a little more detail. It seems to contain a moderate amount of (fairly localised) restructuring. IDK whether XENMEM_acquire_resource is used by non-IPT configurations but I didn't see an assertion anywhere that it isn't. I appreciate that whether something is "straightforward" on the one hand, vs involving "substantial refactoring" on the ohter, or this is a matter of judgement, which I have left up to the commiters during this part of the freeze. But for the record my view is that this patch is not a "straightforward bugfix" and needs a release ack. To give you an idea of what kind of thing I am looking for in a risk assessment, I have written one up for [PATCH v9 03/11] tools/[lib]xl: Add vmtrace_buf_size parameter Ideally I would like to go through a similar process for the other patches. I appreciate that this is rather a more throrough process than we have adopted in the past. Thanks, Ian.
Re: [PATCH v9 03/11] tools/[lib]xl: Add vmtrace_buf_size parameter
Ian Jackson writes ("Re: [PATCH v9 03/11] tools/[lib]xl: Add vmtrace_buf_size parameter"): > Andrew Cooper writes ("[PATCH v9 03/11] tools/[lib]xl: Add vmtrace_buf_size > parameter"): > > From: Michał Leszczyński > > > > Allow to specify the size of per-vCPU trace buffer upon > > domain creation. This is zero by default (meaning: not enabled). > ... > > Wearing my maintainer/reviewer hat: > > Release risk assessment for this patch: > > * This contains golang changes which might break the build or need >updates to golang generated files. This ought to be detected by >our tests so we can fix it. At this stage of the release that is >probably OK. The risk of actually shipping a broken build is low. > > * The patch introduces a new libxl config parameter. That has API >and UI implications. But it is a very small change and the >semantics are fairly obvious. The name likewise is fine. So I am >very comfortable with recommending this late addition to these >APIs. > > * The patch contains buffer size handling code. In the general case >that might produce a risk of buffer overruns. But at least here in >this patch this is actually just the configured size of a buffer, >and actual length/use checks are done elsewhere, so this is is not >a real risk. Consequently, wearing my RM hat, this patch: Release-Acked-by: Ian Jackson
Re: [PATCH v9 03/11] tools/[lib]xl: Add vmtrace_buf_size parameter
Andrew Cooper writes ("[PATCH v9 03/11] tools/[lib]xl: Add vmtrace_buf_size parameter"): > From: Michał Leszczyński > > Allow to specify the size of per-vCPU trace buffer upon > domain creation. This is zero by default (meaning: not enabled). ... Wearing my maintainer/reviewer hat: Release risk assessment for this patch: * This contains golang changes which might break the build or need updates to golang generated files. This ought to be detected by our tests so we can fix it. At this stage of the release that is probably OK. The risk of actually shipping a broken build is low. * The patch introduces a new libxl config parameter. That has API and UI implications. But it is a very small change and the semantics are fairly obvious. The name likewise is fine. So I am very comfortable with recommending this late addition to these APIs. * The patch contains buffer size handling code. In the general case that might produce a risk of buffer overruns. But at least here in this patch this is actually just the configured size of a buffer, and actual length/use checks are done elsewhere, so this is is not a real risk. Ian.
Re: [PATCH v8 00/16] acquire_resource size and external IPT monitoring
On 01.02.21 16:00, Andrew Cooper wrote: Hi Andrew On 01/02/2021 13:47, Oleksandr wrote: On 01.02.21 15:07, Andrew Cooper wrote: Hi Andrew On 01/02/2021 12:34, Oleksandr wrote: On 30.01.21 04:58, Andrew Cooper wrote: One query I did leave on IRC, and hasn't had an answer. What is the maximum number of vcpus in an ARM guest? public/arch-arm.h says that current supported guest VCPUs max number is 128. You moved an x86-ism "max 128 vcpus" into common code. Ooh, I am not sure I understand where exactly. Could you please clarify in which patch? ioreq_server_get_frame() hardcodes "there is exactly one non-bufioreq frame", which in practice means there is 128 vcpu's work of struct ioreqs contained within the mapping. I've coded ioreq_server_max_frames() to perform the calculation correctly, but ioreq_server_get_frame() will need fixing by whomever supports more than 128 vcpus with ioreq servers first. Thank you for the explanation. Now it is clear what you meant. -- Regards, Oleksandr Tyshchenko
Re: staging: unable to restore HVM with Viridian param set
On 02/02/2021 11:51, Ian Jackson wrote: > Igor Druzhinin writes ("Re: staging: unable to restore HVM with Viridian > param set"): >> On 02/02/2021 08:35, Paul Durrant wrote: >>> Surely it should be addressed properly in libxl by not messing with the >>> viridian settings on migrate-in/resume, as Andrew says? TBH I thought this >>> was already the case. There should be no problem with adding to the default >>> set as this is just an xl/libxl concept; the param flags in Xen are always >>> define the *exact* set of enabled enlightenments. >> If maintainers agree with this approach I can make a patch. > If Andy is in favour of this approach then certainly it is fine by me. Yeah - in the case that we're restoring from a suspend image or migrate, just skip setting the viridian flags in build. They are in the the stream, and will be restored later. ~Andrew P.S. This is going to get even more complicated when it all shifts into CPUID settings, but it needs to happen at some point.
Re: staging: unable to restore HVM with Viridian param set
Igor Druzhinin writes ("Re: staging: unable to restore HVM with Viridian param set"): > On 02/02/2021 08:35, Paul Durrant wrote: > > Surely it should be addressed properly in libxl by not messing with the > > viridian settings on migrate-in/resume, as Andrew says? TBH I thought this > > was already the case. There should be no problem with adding to the default > > set as this is just an xl/libxl concept; the param flags in Xen are always > > define the *exact* set of enabled enlightenments. > > If maintainers agree with this approach I can make a patch. If Andy is in favour of this approach then certainly it is fine by me. FTR, preemptively, Release-Acked-by: Ian Jackson although as this is a bugfix it probably doesn't need one. Ian.
Re: staging: unable to restore HVM with Viridian param set
On 02/02/2021 08:35, Paul Durrant wrote: >> -Original Message- >> From: Igor Druzhinin >> Sent: 02 February 2021 00:20 >> To: Andrew Cooper ; Tamas K Lengyel >> ; Xen-devel >> ; Wei Liu ; Ian Jackson >> ; Anthony >> PERARD ; Paul Durrant >> Subject: Re: staging: unable to restore HVM with Viridian param set >> >> n 01/02/2021 22:57, Andrew Cooper wrote: >>> On 01/02/2021 22:51, Tamas K Lengyel wrote: Hi all, trying to restore a Windows VM saved on Xen 4.14 with Xen staging results in: # xl restore -p /shared/cfg/windows10.save Loading new save file /shared/cfg/windows10.save (new xl fmt info 0x3/0x0/1475) Savefile contains xl domain config in JSON format Parsing config from xc: info: Found x86 HVM domain from Xen 4.14 xc: info: Restoring domain xc: error: set HVM param 9 = 0x0065 (17 = File exists): Internal error xc: error: Restore failed (17 = File exists): Internal error libxl: error: libxl_stream_read.c:850:libxl__xc_domain_restore_done: restoring domain: File exists libxl: error: libxl_create.c:1581:domcreate_rebuild_done: Domain 8:cannot (re-)build domain: -3 libxl: error: libxl_domain.c:1182:libxl__destroy_domid: Domain 8:Non-existant domain libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain 8:Unable to destroy guest libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain 8:Destruction of domain failed Running on staging 419cd07895891c6642f29085aee07be72413f08c >>> >>> CC'ing maintainers and those who've edited the code recently. >>> >>> What is happening is xl/libxl is selecting some viridian settings, >>> applying them to the domain, and then the migrations stream has a >>> different set of viridian settings. >>> >>> For a migrating-in VM, nothing should be set during domain build. >>> Viridian state has been part of the migrate stream since before mig-v2, >>> so can be considered to be everywhere relevant now. >> >> The fallout is likely from my changes that modified default set of viridian >> settings. The relevant commits: >> 983524671031fcfdb24a6c0da988203ebb47aebe >> 7e5cffcd1e9300cab46a1816b5eb676caaeed2c1 >> >> The same config from migrated domains now implies different set of viridian >> extensions then those set at the source side. That creates inconsistency in >> libxl. I don't really know how to address it properly in libxl other than >> don't extend the default set ever. >> > > Surely it should be addressed properly in libxl by not messing with the > viridian settings on migrate-in/resume, as Andrew says? TBH I thought this > was already the case. There should be no problem with adding to the default > set as this is just an xl/libxl concept; the param flags in Xen are always > define the *exact* set of enabled enlightenments. If maintainers agree with this approach I can make a patch. Igor
[PATCH] xenstore: Fix all builds
Andrew Cooper writes ("[PATCH] xenstore: Fix all builds"): > This diff is easier viewed through `cat -A` ... > A non-breaking space isn't a valid C preprocessor token. Yikes. Thanks! Ian.
[PATCH] drivers: net: xen-netfront: Simplify the calculation of variables
Fix the following coccicheck warnings: ./drivers/net/xen-netfront.c:1816:52-54: WARNING !A || A && B is equivalent to !A || B. Reported-by: Abaci Robot Signed-off-by: Jiapeng Chong --- drivers/net/xen-netfront.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index b01848e..5158841 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -1813,7 +1813,7 @@ static int setup_netfront(struct xenbus_device *dev, * a) feature-split-event-channels == 0 * b) feature-split-event-channels == 1 but failed to setup */ - if (!feature_split_evtchn || (feature_split_evtchn && err)) + if (!feature_split_evtchn || err) err = setup_netfront_single(queue); if (err) -- 1.8.3.1
[xen-unstable-smoke test] 158946: regressions - trouble: blocked/fail
flight 158946 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/158946/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 6 xen-buildfail REGR. vs. 158804 build-arm64-xsm 6 xen-buildfail REGR. vs. 158804 build-armhf 6 xen-buildfail REGR. vs. 158804 Tests which did not succeed, but are not blocking: build-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a version targeted for testing: xen ffbb8aa282de262403275f2395d8540818cf576e baseline version: xen 9dc687f155a57216b83b17f9cde55dd43e06b0cd Last test of basis 158804 2021-01-30 04:00:24 Z3 days Failing since158892 2021-02-01 16:00:25 Z0 days 11 attempts Testing same since 158897 2021-02-01 19:02:25 Z0 days 10 attempts People who touched revisions under test: Ian Jackson Manuel Bouyer Roger Pau Monne Roger Pau Monné jobs: build-arm64-xsm fail build-amd64 fail build-armhf fail build-amd64-libvirt blocked test-armhf-armhf-xl blocked test-arm64-arm64-xl-xsm blocked test-amd64-amd64-xl-qemuu-debianhvm-amd64blocked test-amd64-amd64-libvirt blocked sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit ffbb8aa282de262403275f2395d8540818cf576e Author: Roger Pau Monne Date: Mon Feb 1 16:53:17 2021 +0100 xenstore: fix build on {Net/Free}BSD The endian.h header is in sys/ on NetBSD and FreeBSD. Signed-off-by: Roger Pau Monné Acked-by: Ian Jackson Release-Acked-by: Ian Jackson commit 419cd07895891c6642f29085aee07be72413f08c Author: Ian Jackson Date: Mon Feb 1 15:18:36 2021 + xenpmd.c: Remove hard tab bbed98e7cedc "xenpmd.c: use dynamic allocation" had a hard tab. I thought we had fixed that and I thought I had checked. Remove it now. Signed-off-by: Ian Jackson commit bbed98e7cedcd5072671c21605330075740382d3 Author: Manuel Bouyer Date: Sat Jan 30 19:27:10 2021 +0100 xenpmd.c: use dynamic allocation On NetBSD, d_name is larger than 256, so file_name[284] may not be large enough (and gcc emits a format-truncation error). Use asprintf() instead of snprintf() on a static on-stack buffer. Signed-off-by: Manuel Bouyer Reviewed-by: Ian Jackson Reviewed-by: Roger Pau Monné Release-Acked-by: Ian Jackson Plus define GNU_SOURCE for asprintf() Harmless on NetBSD. Signed-off-by: Manuel Bouyer Reviewed-by: Ian Jackson Release-Acked-by: Ian Jackson (qemu changes not included)
[linux-linus test] 158915: regressions - trouble: broken/fail/pass
flight 158915 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/158915/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-xl broken test-arm64-arm64-xl-seattle broken test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-xsm7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332 test-amd64-i386-examine 6 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332 test-amd64-coresched-i386-xl 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-installfail REGR. vs. 152332 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-installfail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-freebsd10-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-raw7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-pvshim 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-shadow 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-freebsd10-i386 7 xen-installfail REGR. vs. 152332 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-win7-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332 test-arm64-arm64-xl-credit1 10 host-ping-check-xen fail REGR. vs. 152332 test-arm64-arm64-libvirt-xsm 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-examine 13 examine-iommufail REGR. vs. 152332 test-amd64-amd64-amd64-pvgrub 20 guest-stop fail REGR. vs. 152332 test-arm64-arm64-xl-thunderx 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl-multivcpu 14 guest-start fail REGR. vs. 152332 test-amd64-amd64-i386-pvgrub 20 guest-stop fail REGR. vs. 152332 test-armhf-armhf-xl-credit1 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl-arndale 14 guest-start fail REGR. vs. 152332 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-install fail REGR. vs. 152332 test-armhf-armhf-libvirt 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl-cubietruck 14 guest-startfail REGR. vs. 152332 test-armhf-armhf-xl-credit2 14 guest-start fail REGR. vs. 152332 test-arm64-arm64-xl-xsm 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-xl-credit2 10 host-ping-check-xen fail REGR. vs. 152332 test-armhf-armhf-xl-vhd 13 guest-start fail REGR. vs. 152332 test-armhf-armhf-xl 14 guest-start fail REGR. vs. 152332 test-armhf-armhf-libvirt-raw 13 guest-start fail REGR. vs. 152332 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 14 guest-start fail REGR. vs. 152332 Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-seattle 5 host-install(5) broken blocked in 152332 test-arm64-arm64-xl 5 host-install(5) broken blocked in 152332 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-instal
[xen-unstable-smoke test] 158944: regressions - trouble: blocked/fail
flight 158944 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/158944/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 6 xen-buildfail REGR. vs. 158804 build-arm64-xsm 6 xen-buildfail REGR. vs. 158804 build-armhf 6 xen-buildfail REGR. vs. 158804 Tests which did not succeed, but are not blocking: build-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a version targeted for testing: xen ffbb8aa282de262403275f2395d8540818cf576e baseline version: xen 9dc687f155a57216b83b17f9cde55dd43e06b0cd Last test of basis 158804 2021-01-30 04:00:24 Z3 days Failing since158892 2021-02-01 16:00:25 Z0 days 10 attempts Testing same since 158897 2021-02-01 19:02:25 Z0 days9 attempts People who touched revisions under test: Ian Jackson Manuel Bouyer Roger Pau Monne Roger Pau Monné jobs: build-arm64-xsm fail build-amd64 fail build-armhf fail build-amd64-libvirt blocked test-armhf-armhf-xl blocked test-arm64-arm64-xl-xsm blocked test-amd64-amd64-xl-qemuu-debianhvm-amd64blocked test-amd64-amd64-libvirt blocked sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Not pushing. commit ffbb8aa282de262403275f2395d8540818cf576e Author: Roger Pau Monne Date: Mon Feb 1 16:53:17 2021 +0100 xenstore: fix build on {Net/Free}BSD The endian.h header is in sys/ on NetBSD and FreeBSD. Signed-off-by: Roger Pau Monné Acked-by: Ian Jackson Release-Acked-by: Ian Jackson commit 419cd07895891c6642f29085aee07be72413f08c Author: Ian Jackson Date: Mon Feb 1 15:18:36 2021 + xenpmd.c: Remove hard tab bbed98e7cedc "xenpmd.c: use dynamic allocation" had a hard tab. I thought we had fixed that and I thought I had checked. Remove it now. Signed-off-by: Ian Jackson commit bbed98e7cedcd5072671c21605330075740382d3 Author: Manuel Bouyer Date: Sat Jan 30 19:27:10 2021 +0100 xenpmd.c: use dynamic allocation On NetBSD, d_name is larger than 256, so file_name[284] may not be large enough (and gcc emits a format-truncation error). Use asprintf() instead of snprintf() on a static on-stack buffer. Signed-off-by: Manuel Bouyer Reviewed-by: Ian Jackson Reviewed-by: Roger Pau Monné Release-Acked-by: Ian Jackson Plus define GNU_SOURCE for asprintf() Harmless on NetBSD. Signed-off-by: Manuel Bouyer Reviewed-by: Ian Jackson Release-Acked-by: Ian Jackson (qemu changes not included)
RE: [PATCH v2] xen-blkback: fix compatibility bug with single page rings
> -Original Message- > From: Xen-devel On Behalf Of Dongli > Zhang > Sent: 30 January 2021 05:09 > To: p...@xen.org; 'Jürgen Groß' ; > xen-devel@lists.xenproject.org; linux- > bl...@vger.kernel.org; linux-ker...@vger.kernel.org > Cc: 'Paul Durrant' ; 'Konrad Rzeszutek Wilk' > ; 'Roger Pau > Monné' ; 'Jens Axboe' > Subject: Re: [PATCH v2] xen-blkback: fix compatibility bug with single page > rings > > > > On 1/29/21 12:13 AM, Paul Durrant wrote: > >> -Original Message- > >> From: Jürgen Groß > >> Sent: 29 January 2021 07:35 > >> To: Dongli Zhang ; Paul Durrant ; > >> xen- > >> de...@lists.xenproject.org; linux-bl...@vger.kernel.org; > >> linux-ker...@vger.kernel.org > >> Cc: Paul Durrant ; Konrad Rzeszutek Wilk > >> ; Roger Pau > >> Monné ; Jens Axboe > >> Subject: Re: [PATCH v2] xen-blkback: fix compatibility bug with single > >> page rings > >> > >> On 29.01.21 07:20, Dongli Zhang wrote: > >>> > >>> > >>> On 1/28/21 5:04 AM, Paul Durrant wrote: > From: Paul Durrant > > Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to > avoid > inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the > behaviour of xen-blkback when connecting to a frontend was: > > - read 'ring-page-order' > - if not present then expect a single page ring specified by 'ring-ref' > - else expect a ring specified by 'ring-refX' where X is between 0 and > 1 << ring-page-order > > This was correct behaviour, but was broken by the afforementioned commit > to > become: > > - read 'ring-page-order' > - if not present then expect a single page ring (i.e. ring-page-order = > 0) > - expect a ring specified by 'ring-refX' where X is between 0 and > 1 << ring-page-order > - if that didn't work then see if there's a single page ring specified by > 'ring-ref' > > This incorrect behaviour works most of the time but fails when a frontend > that sets 'ring-page-order' is unloaded and replaced by one that does not > because, instead of reading 'ring-ref', xen-blkback will read the stale > 'ring-ref0' left around by the previous frontend will try to map the > wrong > grant reference. > > This patch restores the original behaviour. > > Fixes: 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid > inconsistent xenstore 'ring- > page- > >> order' set by malicious blkfront") > Signed-off-by: Paul Durrant > --- > Cc: Konrad Rzeszutek Wilk > Cc: "Roger Pau Monné" > Cc: Jens Axboe > Cc: Dongli Zhang > > v2: > - Remove now-spurious error path special-case when nr_grefs == 1 > --- > drivers/block/xen-blkback/common.h | 1 + > drivers/block/xen-blkback/xenbus.c | 38 +- > 2 files changed, 17 insertions(+), 22 deletions(-) > > diff --git a/drivers/block/xen-blkback/common.h > b/drivers/block/xen-blkback/common.h > index b0c71d3a81a0..524a79f10de6 100644 > --- a/drivers/block/xen-blkback/common.h > +++ b/drivers/block/xen-blkback/common.h > @@ -313,6 +313,7 @@ struct xen_blkif { > > struct work_struct free_work; > unsigned intnr_ring_pages; > +boolmulti_ref; > >>> > >>> Is it really necessary to introduce 'multi_ref' here or we may just re-use > >>> 'nr_ring_pages'? > >>> > >>> According to blkfront code, 'ring-page-order' is set only when it is not > >>> zero, > >>> that is, only when (info->nr_ring_pages > 1). > >> > > > > That's how it is *supposed* to be. Windows certainly behaves that way too. > > > >> Did you look into all other OS's (Windows, OpenBSD, FreebSD, NetBSD, > >> Solaris, Netware, other proprietary systems) implementations to verify > >> that claim? > >> > >> I don't think so. So better safe than sorry. > >> > > > > Indeed. It was unfortunate that the commit to blkif.h documenting > > multi-page (829f2a9c6dfae) was not > crystal clear and (possibly as a consequence) blkback was implemented to read > ring-ref0 rather than > ring-ref if ring-page-order was present and 0. Hence the only safe thing to > do is to restore that > behaviour. > > > > Thank you very much for the explanation! > > Reviewed-by: Dongli Zhang > Thanks. Roger, Konrad, can I get a maintainer ack or otherwise, please? Paul
Re: [PATCH] xenstore: Fix all builds
On Mon, Feb 01, 2021 at 11:35:13PM +, Andrew Cooper wrote: > This diff is easier viewed through `cat -A` > > diff --git a/tools/xenstore/include/xenstore_state.h > b/tools/xenstore/include/xenstore_state.h$ > index 1bd443f61a..f7e4da2b2c 100644$ > --- a/tools/xenstore/include/xenstore_state.h$ > +++ b/tools/xenstore/include/xenstore_state.h$ > @@ -21,7 +21,7 @@$ >#ifndef XENSTORE_STATE_H$ >#define XENSTORE_STATE_H$ >$ > -#if defined(__FreeBSD__) ||M-BM- defined(__NetBSD__)$ > +#if defined(__FreeBSD__) || defined(__NetBSD__)$ >#include $ >#else$ >#include $ > > A non-breaking space isn't a valid C preprocessor token. > > Fixes: ffbb8aa282de ("xenstore: fix build on {Net/Free}BSD") Sorry. I fixed this locally but forgot to refresh the patch and ended up sending the broken version. I need to figure a way to make format-patch fail if there are uncommitted changes in the repository. Thanks, Roger.
Re: [PATCH v9 02/11] xen/domain: Add vmtrace_size domain creation parameter
On 02.02.2021 00:26, Andrew Cooper wrote: > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -132,6 +132,56 @@ static void vcpu_info_reset(struct vcpu *v) > v->vcpu_info_mfn = INVALID_MFN; > } > > +static void vmtrace_free_buffer(struct vcpu *v) > +{ > +const struct domain *d = v->domain; > +struct page_info *pg = v->vmtrace.pg; > +unsigned int i; > + > +if ( !pg ) > +return; > + > +v->vmtrace.pg = NULL; > + > +for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ ) > +{ > +put_page_alloc_ref(&pg[i]); > +put_page_and_type(&pg[i]); > +} > +} > + > +static int vmtrace_alloc_buffer(struct vcpu *v) > +{ > +struct domain *d = v->domain; > +struct page_info *pg; > +unsigned int i; > + > +if ( !d->vmtrace_size ) > +return 0; > + > +pg = alloc_domheap_pages(d, get_order_from_bytes(d->vmtrace_size), > + MEMF_no_refcount); > +if ( !pg ) > +return -ENOMEM; > + > +for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ ) > +if ( unlikely(!get_page_and_type(&pg[i], d, PGT_writable_page)) ) > +goto refcnt_err; > + > +/* > + * We must only let vmtrace_free_buffer() take any action in the success > + * case when we've taken all the refs it intends to drop. > + */ > +v->vmtrace.pg = pg; > +return 0; > + > + refcnt_err: > +while ( i-- ) > +put_page_and_type(&pg[i]); > + > +return -ENODATA; Would you mind at least logging how many pages may be leaked here? I also don't understand why you don't call put_page_alloc_ref() in the loop - that's fine to do prior to the put_page_and_type(), and will at least limit the leak. The buffer size here typically isn't insignificant, and it may be helpful to not unnecessarily defer the freeing to relinquish_resources() (assuming we will make that one also traverse the list of "extra" pages, but I understand that's not going to happen for 4.15 anymore anyway). Additionally, while I understand you're not in favor of the comments we have on all three similar code paths, I think replicating their comments here would help easily spotting (by grep-ing e.g. for "fishy") all instances that will need adjusting once we have figured a better overall solution. Jan
Re: [PATCH] xenstore: Fix all builds
On 02.02.2021 00:35, Andrew Cooper wrote: > This diff is easier viewed through `cat -A` > > diff --git a/tools/xenstore/include/xenstore_state.h > b/tools/xenstore/include/xenstore_state.h$ > index 1bd443f61a..f7e4da2b2c 100644$ > --- a/tools/xenstore/include/xenstore_state.h$ > +++ b/tools/xenstore/include/xenstore_state.h$ > @@ -21,7 +21,7 @@$ >#ifndef XENSTORE_STATE_H$ >#define XENSTORE_STATE_H$ >$ > -#if defined(__FreeBSD__) ||M-BM- defined(__NetBSD__)$ > +#if defined(__FreeBSD__) || defined(__NetBSD__)$ >#include $ >#else$ >#include $ > > A non-breaking space isn't a valid C preprocessor token. > > Fixes: ffbb8aa282de ("xenstore: fix build on {Net/Free}BSD") > Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich I wonder why you didn't throw this in right away, without waiting for any acks. Jan
RE: staging: unable to restore HVM with Viridian param set
> -Original Message- > From: Igor Druzhinin > Sent: 02 February 2021 00:20 > To: Andrew Cooper ; Tamas K Lengyel > ; Xen-devel > ; Wei Liu ; Ian Jackson > ; Anthony > PERARD ; Paul Durrant > Subject: Re: staging: unable to restore HVM with Viridian param set > > n 01/02/2021 22:57, Andrew Cooper wrote: > > On 01/02/2021 22:51, Tamas K Lengyel wrote: > >> Hi all, > >> trying to restore a Windows VM saved on Xen 4.14 with Xen staging results > >> in: > >> > >> # xl restore -p /shared/cfg/windows10.save > >> Loading new save file /shared/cfg/windows10.save (new xl fmt info > >> 0x3/0x0/1475) > >> Savefile contains xl domain config in JSON format > >> Parsing config from > >> xc: info: Found x86 HVM domain from Xen 4.14 > >> xc: info: Restoring domain > >> xc: error: set HVM param 9 = 0x0065 (17 = File exists): > >> Internal error > >> xc: error: Restore failed (17 = File exists): Internal error > >> libxl: error: libxl_stream_read.c:850:libxl__xc_domain_restore_done: > >> restoring domain: File exists > >> libxl: error: libxl_create.c:1581:domcreate_rebuild_done: Domain > >> 8:cannot (re-)build domain: -3 > >> libxl: error: libxl_domain.c:1182:libxl__destroy_domid: Domain > >> 8:Non-existant domain > >> libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain > >> 8:Unable to destroy guest > >> libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain > >> 8:Destruction of domain failed > >> > >> Running on staging 419cd07895891c6642f29085aee07be72413f08c > > > > CC'ing maintainers and those who've edited the code recently. > > > > What is happening is xl/libxl is selecting some viridian settings, > > applying them to the domain, and then the migrations stream has a > > different set of viridian settings. > > > > For a migrating-in VM, nothing should be set during domain build. > > Viridian state has been part of the migrate stream since before mig-v2, > > so can be considered to be everywhere relevant now. > > The fallout is likely from my changes that modified default set of viridian > settings. The relevant commits: > 983524671031fcfdb24a6c0da988203ebb47aebe > 7e5cffcd1e9300cab46a1816b5eb676caaeed2c1 > > The same config from migrated domains now implies different set of viridian > extensions then those set at the source side. That creates inconsistency in > libxl. I don't really know how to address it properly in libxl other than > don't extend the default set ever. > Surely it should be addressed properly in libxl by not messing with the viridian settings on migrate-in/resume, as Andrew says? TBH I thought this was already the case. There should be no problem with adding to the default set as this is just an xl/libxl concept; the param flags in Xen are always define the *exact* set of enabled enlightenments. Paul > Igor
Re: [PATCH] memory: fix build with COVERAGE but !HVM
On 01.02.2021 19:26, Andrew Cooper wrote: > On 01/02/2021 16:20, Jan Beulich wrote: >> Xen is heavily relying on the DCE stage to remove unused code so the >> linker doesn't throw an error because a function is not implemented >> yet we defined a prototype for it. >> >> On some GCC versions (such as 9.4 provided by Debian sid), the compiler >> DCE stage will not manage to figure that out for >> xenmem_add_to_physmap_batch(): >> >> ld: ld: prelink.o: in function `xenmem_add_to_physmap_batch': >> /xen/xen/common/memory.c:942: undefined reference to >> `xenmem_add_to_physmap_one' >> /xen/xen/common/memory.c:942:(.text+0x22145): relocation truncated >> to fit: R_X86_64_PLT32 against undefined symbol `xenmem_add_to_physmap_one' >> prelink-efi.o: in function `xenmem_add_to_physmap_batch': >> /xen/xen/common/memory.c:942: undefined reference to >> `xenmem_add_to_physmap_one' >> make[2]: *** [Makefile:215: /root/xen/xen/xen.efi] Error 1 >> make[2]: *** Waiting for unfinished jobs >> ld: /xen/xen/.xen-syms.0: hidden symbol `xenmem_add_to_physmap_one' isn't >> defined >> ld: final link failed: bad value >> >> It is not entirely clear why the compiler DCE is not detecting the >> unused code. However, cloning the check introduced by the commit below >> into xenmem_add_to_physmap_batch() does the trick. >> >> No functional change intended. >> >> Fixes: d4f699a0df6c ("x86/mm: p2m_add_foreign() is HVM-only") >> Reported-by: Oleksandr Tyshchenko >> Signed-off-by: Julien Grall >> Signed-off-by: Jan Beulich >> Release-Acked-by: Ian Jackson >> --- >> Julien, since I reused most of your patch'es description, I've kept your >> S-o-b. Please let me know if you want me to drop it. >> >> --- a/xen/common/memory.c >> +++ b/xen/common/memory.c >> @@ -904,6 +904,19 @@ static int xenmem_add_to_physmap_batch(s >> { >> union add_to_physmap_extra extra = {}; >> >> +/* >> + * While, unlike xenmem_add_to_physmap(), this function is static, there >> + * still have been cases observed where xatp_permission_check(), invoked >> + * by our caller, doesn't lead to elimination of this entire function >> when >> + * the compile time evaluation of paging_mode_translate(d) is false. >> Guard >> + * against this be replicating the same check here. >> + */ > > Acked-by: Andrew Cooper , Thanks. > but I feel this > comment can be far more precise/concise. > > /* In some configurations, (!HVM, COVERAGE), the > xenmem_add_to_physmap_one() call doesn't succumb to > dead-code-elimination. Duplicate the short-circut from > xatp_permission_check() to try and help the compiler out. */ I'm perfectly fine to take this one. I have to admit though that I first needed to look up "succumb" ... Jan
Re: [PATCH v2 3/3] x86/time: don't move TSC backwards in time_calibration_tsc_rendezvous()
On 01.02.2021 13:43, Jan Beulich wrote: > As per the comment ahead of it, the original purpose of the function was > to deal with TSCs halted in deep C states. While this probably explains > why only forward moves were ever expected, I don't see how this could > have been reliable in case CPU0 was deep-sleeping for a sufficiently > long time. My only guess here is a hidden assumption of CPU0 never being > idle for long enough. Furthermore that comment looks to be contradicting the actual use of the function: It gets installed when !RELIABLE_TSC, while the comment would suggest !NONSTOP_TSC. I suppose the comment is simply misleading, because RELIABLE_TSC implies NONSTOP_TSC according to all the places where either of the two feature bits gets played with. Plus in the !NONSTOP_TSC case we write the TSC explicitly anyway when coming back out of a (deep; see below) C-state. As an implication from the above mwait_idle_cpu_init() then looks to pointlessly clear "reliable" when "nonstop" is clear. It further looks odd that mwait_idle() (unlike acpi_processor_idle()) calls cstate_restore_tsc() independent of what C-state was active. > @@ -1719,9 +1737,12 @@ static void time_calibration_tsc_rendezv > while ( atomic_read(&r->semaphore) > total_cpus ) > cpu_relax(); > } > + > +/* Just in case a read above ended up reading zero. */ > +tsc += !tsc; > } > > -time_calibration_rendezvous_tail(r, r->master_tsc_stamp); > +time_calibration_rendezvous_tail(r, tsc, r->master_tsc_stamp); This, in particular, wouldn't be valid when !NONSTOP_TSC without cstate_restore_tsc(). We then wouldn't have a way to know whether the observed gap is because of the TSC having been halted for a while (as the comment ahead of the function - imo wrongly, as per above - suggests), or whether - like in Claudemir's case - the individual TSCs were offset against one another. Jan
Re: Question about xen and Rasp 4B
Hi Roman, Good catch. GPU works now and I can start X! Thanks! I was also able to create domU that runs Raspian OS. This is very interesting that you were able to achieve that - congrats! Now, sorry to be a bit dense -- but since this thread went into all sorts of interesting directions all at once -- I just have a very particular question: what is exact combination of versions of Xen, Linux kernel and a set of patches that went on top that allowed you to do that? I'd love to obviously see it productized in Xen upstream, but for now -- I'd love to make available to Project EVE/Xen community since there seems to be a few folks interested in EVE/Xen combo being able to do that. I have tried Xen Release 4.14.0, 4.14.1 and master (from week 4, 2021). Kernel rpi-5.9.y and rpi-5.10.y branches from https://github.com/raspberrypi/linux and U-boot (master). For the GPU to work it was enough to disable swiotlb from the kernel(s) as suggested in this thread. If you use Xen master then you need to revert the 25849c8b16f2a5b7fcd0a823e80a5f1b590291f9. Apparently v3d uses same resources and will not start. I was able to get most of the combinations to work without any big efforts. In case you use USB SSD U-boot needs a fix if you use 5.9 kernel. The 5.10 works with the lates Xen master but then you need one small workaround to the xen since there is address that Xen cannot map. Some usb address cannot be found (address was 0 if recall correctly). I just by passed the error: diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index e824ba34b0..3e8a175f2e 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -1409,7 +1409,7 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev, { printk(XENLOG_ERR "Unable to retrieve address %u for %s\n", i, dt_node_full_name(dev)); -return res; +continue; //return res; } res = map_range_to_domain(dev, addr, size, &mr_data);