Re: [Xen-devel] [PATCH 01/11] public: adjust documentation following XSA-217
>>> On 21.06.17 at 18:53, wrote: > On 21/06/17 16:54, Jan Beulich wrote: > On 21.06.17 at 17:44, wrote: >>> When you have a long series like this, could you name the files in a way >>> that makes it easier for a shell script to get their order? e.g., >>> 01-xsa217-doc.patch, 02-gnttab-xfer-xenheap.patch, &c? >>> >>> Having to manually save-and-apply the name of each patch of an >>> eleven-patch series separately is fairly annoying. If they started with >>> a number, I could save them all to the same directory and then use "for >>> patch in *.patch ; do..." to apply them. >>> >>> (Using `git send-email` would also make things a lot easier...) >> >> Well, that's kind of difficult without using git for development work. >> I can try to remember to name patches suitably, but that's >> another manual step for me then > > Well, it's either an extra manual step for you, or an extra manual step > for every person who reviews your code. That's a good point (which in fact I did think of while replying), yet with one questionable aspect: Patch reviewing doesn't generally mean patch application. Afaic, it's very rare that I find it necessary to apply a patch in order to review it. Hence I'm not convinced this is an extra manual step for everyone looking at the patch. > You don't need to justify to me > why you don't want to use git for development. However, it's not fair > to externalize the cost of your development preferences to everybody else. > > When it's just one or two patches I'm willing to make some > accommodation, but a series like this it's too much. As said - I'll try to remember next time. >> , and while I continue to attach >> files I would have hoped that the mail bodies nowadays come >> through uncorrupted (and hence I'd expect file names to be >> chosen by your mail client based on subject, which includes >> numbering - that's at least how mine behaves - and there being >> no actual need to use the attachments). > > I don't have a simple way of saving the text of the mail message without > saving the attachment in-line in the file; so without writing a special > tool just for your messages, I can't easily apply the patches into the > git tree (so I can see the change in-situ). Oh, okay. Mail programs seem to be even more different than so far I did think they would be. > It looks like the mail bodies might actually be uncorrupted now; so it's > possible that if you send the mail without attachments anymore I might > be able to use the same workflow for you as I use for everyone else > (which presumably would extend to other people who might want to review > your patches). If you send a test mail (or perhaps a series) I can give > it a try. Perhaps best if I try to remember to do this next time round when Cc-ing you on (not to small) a patch (I don't think it really needs to be a series for that purpose). Not having to attach patches would, in the end, remove one manual step for me. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3] VT-d PI: track the vcpu number in pi blocking list
>>> On 22.06.17 at 07:16, wrote: > On Fri, Jun 16, 2017 at 08:34:20AM -0600, Jan Beulich wrote: > On 24.05.17 at 08:56, wrote: >>> --- a/xen/arch/x86/hvm/vmx/vmx.c >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c >>> @@ -82,6 +82,7 @@ static int vmx_vmfunc_intercept(struct cpu_user_regs >>> *regs); >>> struct vmx_pi_blocking_vcpu { >>> struct list_head list; >>> spinlock_t lock; >>> +uint32_t counter; >> >>Is there any reason for this to be fixed width? Other than that the > > I will use 'int' instead of 'uint32_t'. Can the counter go negative? I don't think so, in which case you really want to use "unsigned int". Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/disk: don't leak stack data via response ring
>>> On 21.06.17 at 20:46, wrote: > On Wed, 21 Jun 2017, Jan Beulich wrote: >> >>> On 20.06.17 at 23:48, wrote: >> > On Tue, 20 Jun 2017, Jan Beulich wrote: >> >> @@ -36,13 +33,7 @@ struct blkif_x86_32_request_discard { >> >> blkif_sector_t sector_number;/* start sector idx on disk (r/w > only) */ >> >> uint64_t nr_sectors; /* # of contiguous sectors to >> >> discard > */ >> >> }; >> >> -struct blkif_x86_32_response { >> >> -uint64_tid; /* copied from request */ >> >> -uint8_t operation; /* copied from request */ >> >> -int16_t status; /* BLKIF_RSP_??? */ >> >> -}; >> >> typedef struct blkif_x86_32_request blkif_x86_32_request_t; >> >> -typedef struct blkif_x86_32_response blkif_x86_32_response_t; >> >> #pragma pack(pop) >> >> >> >> /* x86_64 protocol version */ >> >> @@ -62,20 +53,14 @@ struct blkif_x86_64_request_discard { >> >> blkif_sector_t sector_number;/* start sector idx on disk (r/w > only) */ >> >> uint64_t nr_sectors; /* # of contiguous sectors to >> >> discard > */ >> >> }; >> >> -struct blkif_x86_64_response { >> >> -uint64_t __attribute__((__aligned__(8))) id; >> >> -uint8_t operation; /* copied from request */ >> >> -int16_t status; /* BLKIF_RSP_??? */ >> >> -}; >> >> >> >> typedef struct blkif_x86_64_request blkif_x86_64_request_t; >> >> -typedef struct blkif_x86_64_response blkif_x86_64_response_t; >> >> >> >> DEFINE_RING_TYPES(blkif_common, struct blkif_common_request, >> >> - struct blkif_common_response); >> >> + struct blkif_response); >> >> DEFINE_RING_TYPES(blkif_x86_32, struct blkif_x86_32_request, >> >> - struct blkif_x86_32_response); >> >> + struct blkif_response QEMU_PACKED); >> > >> > In my test, the previous sizes and alignments of the response structs >> > were (on both x86_32 and x86_64): >> > >> > sizeof(blkif_x86_32_response)=12 sizeof(blkif_x86_64_response)=16 >> > align(blkif_x86_32_response)=4 align(blkif_x86_64_response)=8 >> > >> > While with these changes are now, when compiled on x86_64: >> > sizeof(blkif_x86_32_response)=11 sizeof(blkif_x86_64_response)=16 >> > align(blkif_x86_32_response)=1 align(blkif_x86_64_response)=8 >> > >> > when compiled on x86_32: >> > sizeof(blkif_x86_32_response)=11 sizeof(blkif_x86_64_response)=12 >> > align(blkif_x86_32_response)=1 align(blkif_x86_64_response)=4 >> > >> > Did I do my tests wrong? >> > >> > QEMU_PACKED is not the same as #pragma pack(push, 4). In fact, it is the >> > same as #pragma pack(push, 1), causing the struct to be densely packed, >> > leaving no padding whatsever. >> > >> > In addition, without __attribute__((__aligned__(8))), >> > blkif_x86_64_response won't be 8 bytes aligned when built on x86_32. >> > >> > Am I missing something? >> >> Well, you're mixing attribute application upon structure >> declaration with attribute application upon structure use. It's >> the latter here, and hence the attribute doesn't affect >> structure layout at all. All it does is avoid the _containing_ >> 32-bit union to become 8-byte aligned (and tail padding to be >> inserted). > > Thanks for the explanation. I admit it's the first time I see the > aligned attribute being used at structure usage only. I think it's the > first time QEMU_PACKED is used this way in QEMU too. > > Anyway, even taking that into account, things are still not completely > right: the alignment of struct blkif_x86_32_response QEMU_PACKED is 4 > bytes as you wrote, but the size of struct blkif_x86_32_response is > still 16 bytes instead of 12 bytes in my test. I suspect it worked for > you because the other member of the union (blkif_x86_32_request) is > larger than that. However, I think is not a good idea to rely on this > implementation detail. The implementation of DEFINE_RING_TYPES should be > opaque from our point of view. We shouldn't have to know that there is a > union there. I don't follow - why should we not rely on this? It is a fundamental aspect of the shared ring model that requests and responses share space. > Moreover, the other problem is still unaddressed: the size and alignment > of blkif_x86_64_response when built on x86_32 are 12 and 4 instead of 16 > and 8 bytes. Is that working also because it's relying on the other > member of the union to enforce the right alignment and bigger size? Yes. For these as well as your comments further up - sizeof() and alignof() are completely uninteresting as long as we don't instantiate objects of those types _and then use them for communication_. The patch specifically removes instantiation, switching to a purely pointer based approach. And that is ... > I think it's a bad idea to rely on that, especially given that this > obscure but important detail is not even mentioned in the commit message. ... wh
Re: [Xen-devel] [PATCH 03/14 v4] xen/arm: vpl011: Add pl011 uart emulation in Xen
Hi Andre, Thanks for your comments. >> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig >> index d46b98c..c1a0e7f 100644 >> --- a/xen/arch/arm/Kconfig >> +++ b/xen/arch/arm/Kconfig >> @@ -50,6 +50,11 @@ config HAS_ITS >> prompt "GICv3 ITS MSI controller support" if EXPERT = "y" >> depends on HAS_GICV3 >> >> +config VPL011_CONSOLE >> + bool "Emulated pl011 console support" >> + default y >> + ---help--- >> + Allows a guest to use pl011 UART as a console > > I admit that I am rather late with this comment, but I am not sure we > should advertise PL011 emulation here. > Technically what you emulate is a "SBSA Generic UART", which is indeed a > subset of the PL011, but really only a subset. You confirm this already > in patch 13/14, where you use the respective compatible name instead of > the generic arm,pl011 one. > > So while I don't dare to ask for renaming every identifier, I wonder if > we should at least keep the publicly visible part confined to "SBSA > UART". You could mention the subset nature in the help message here, for > instance. > The same reasoning applies to other parts of this series which introduce > user-visible strings (like in libxl). > >> endmenu >> I will rename the user visible part to "sbsa_uart". >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include > > I think Julien mentioned this already, but I think you should just use > "#include " here. > ok. >> + >> +static bool vpl011_reg32_check_access(struct hsr_dabt dabt) >> +{ >> +return (dabt.size != DABT_DOUBLE_WORD); >> +} >> + >> +static void vpl011_update(struct domain *d) > > Can you please rename this function to indicate that it updates the > *interrupt status*? That name as it is rather generic at the moment. > I will rename it to vpl011_update_interrupt_status. >> +{ >> +struct vpl011 *vpl011 = &d->arch.vpl011; >> + >> +/* >> + * TODO: PL011 interrupts are level triggered which means >> + * that interrupt needs to be set/clear instead of being >> + * injected. However, currently vGIC does not handle level >> + * triggered interrupts properly. This function needs to be >> + * revisited once vGIC starts handling level triggered >> + * interrupts. >> + */ >> +if ( vpl011->uartris & vpl011->uartimsc ) >> +vgic_vcpu_inject_spi(d, GUEST_VPL011_SPI); > > So my understanding is that this is using an edge triggered semantic at > the moment. While this is not what an PL011 actually implements and not > what the driver really expects, this is fine on itself for now. > BUT I think we may want to avoid injecting spurious interrupts now: > If for instance the receive interrupt condition is set and we clear the > transmit interrupt bit, then call this function, it would inject a new > interrupt, although in a edge-triggered world we really wouldn't need to > do anything. > So I believe we would need to have some kind of shadowed interrupt > state, which stores the interrupt condition the guest already knows > about. As soon as we *add* a bit to this state, we inject the SPI. > This would act as a kind of temporary bridge between the level triggered > SBSA/PL011 UART and the edge-only VGIC implementation for now. > Later when the VGIC properly handles level triggered interrupts, this > implementation can be adjusted. But this change should then be confined > to this very function. > ok. I will update the logic accordingly. >> +} >> + >> +static uint8_t vpl011_read_data(struct domain *d) >> +{ >> +unsigned long flags; >> +uint8_t data = 0; >> +struct vpl011 *vpl011 = &d->arch.vpl011; >> +struct xencons_interface *intf = vpl011->ring_buf; >> +XENCONS_RING_IDX in_cons = intf->in_cons; >> +XENCONS_RING_IDX in_prod = intf->in_prod; >> + >> +VPL011_LOCK(d, flags); >> + >> +/* >> + * It is expected that there will be data in the ring buffer when this >> + * function is called since the guest is expected to read the data >> register >> + * only if the TXFE flag is not set. >> + * If the guest still does read when TXFE bit is set then 0 will be >> returned. >> + */ >> +if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) > 0 ) >> +{ >> +data = intf->in[xencons_mask(in_cons, sizeof(intf->in))]; >> +in_cons += 1; >> +intf->in_cons = in_cons; >> +smp_mb(); >> +} >> +else >> +{ >> +gprintk(XENLOG_ERR, "vpl011: Unexpected IN ring buffer empty\n"); >> +} >> + >> +if ( xencons_queued(in_prod, in_cons, sizeof(intf->in)) == 0 ) >> +{ >> +vpl011->uartfr |= RXFE; >> +vpl011->uartris &= ~RXI; > > In a level triggered world you would need to possibly change the status > of the interrupt line here, so a call to (a renamed and fixed) > vpl011_update() function would be due here. To make the transition later > as
Re: [Xen-devel] [PATCH v3] VT-d PI: track the vcpu number in pi blocking list
On Fri, Jun 16, 2017 at 08:34:20AM -0600, Jan Beulich wrote: On 24.05.17 at 08:56, wrote: >> --- a/xen/arch/x86/hvm/vmx/vmx.c >> +++ b/xen/arch/x86/hvm/vmx/vmx.c >> @@ -82,6 +82,7 @@ static int vmx_vmfunc_intercept(struct cpu_user_regs >> *regs); >> struct vmx_pi_blocking_vcpu { >> struct list_head list; >> spinlock_t lock; >> +uint32_t counter; > >Is there any reason for this to be fixed width? Other than that the I will use 'int' instead of 'uint32_t'. >non-tracing parts look fine, but I really wonder whether the >introduction of the counter and the tracing wouldn't better be split. Will split them apart. Thanks chao ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [libvirt test] 110918: tolerable all pass - PUSHED
flight 110918 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/110918/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 13 saverestore-support-checkfail like 110512 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail like 110512 test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail like 110512 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt 13 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 11 migrate-support-checkfail never pass test-arm64-arm64-libvirt-qcow2 12 saverestore-support-checkfail never pass version targeted for testing: libvirt 296a53313f447d2f251cbea2cb050d2f695a7991 baseline version: libvirt e13e8808f9270f4b3b6f4abb8ec473eef81cc1b9 Last test of basis 110512 2017-06-17 04:20:16 Z4 days Testing same since 110918 2017-06-21 04:21:26 Z0 days1 attempts People who touched revisions under test: Andrea Bolognani Farhan Ali Ján Tomko Martin Kletzander Peter Krempa jobs: build-amd64-xsm pass build-arm64-xsm pass build-armhf-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-arm64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-arm64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-libvirt-xsm pass test-arm64-arm64-libvirt-xsm pass test-armhf-armhf-libvirt-xsm pass test-amd64-i386-libvirt-xsm pass test-amd64-amd64-libvirt pass test-arm64-arm64-libvirt pass test-armhf-armhf-libvirt pass test-amd64-i386-libvirt pass test-amd64-amd64-libvirt-pairpass test-amd64-i386-libvirt-pair pass test-arm64-arm64-libvirt-qcow2 pass test-armhf-armhf-libvirt-raw pass test-amd64-amd64-libvirt-vhd pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=sum
[Xen-devel] [xen-unstable test] 110910: regressions - FAIL
flight 110910 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/110910/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf-libvirt 5 libvirt-buildfail REGR. vs. 110465 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail REGR. vs. 110465 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-win7-amd64 17 guest-start/win.repeat fail blocked in 110465 test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail like 110465 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-start/win.repeat fail like 110465 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail like 110465 test-amd64-amd64-xl-rtds 9 debian-install fail like 110465 build-amd64-prev 6 xen-build/dist-test fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 9 windows-installfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 9 windows-installfail never pass test-arm64-arm64-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 12 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 saverestore-support-checkfail never pass build-i386-prev 6 xen-build/dist-test fail never pass test-arm64-arm64-xl-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-arm64-arm64-xl 12 migrate-support-checkfail never pass test-arm64-arm64-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-win10-i386 9 windows-install fail never pass test-amd64-i386-xl-qemuu-win10-i386 9 windows-install fail never pass test-amd64-amd64-xl-qemut-win10-i386 9 windows-installfail never pass test-amd64-amd64-xl-qemuu-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemuu-ws16-amd64 9 windows-install fail never pass test-amd64-i386-xl-qemut-ws16-amd64 9 windows-install fail never pass version targeted for testing: xen 461b2165346de236fff2d00d1c318062f1daab08 baseline version: xen 695bb5f504ab48c1d546446f104c1b6c0ead126d Last test of basis 110465 2017-06-15 09:46:33 Z6 days Failing since110484 2017-06-16 09:32:22 Z5 days7 attempts Testing same since 110910 2017-06-21 00:21:37 Z1 days1 attempts People who touched revisions under test: Dushyant Behl
[Xen-devel] [PATCH] xen-blkfront: fix mq start/stop race
When ring buf full, hw queue will be stopped. While blkif interrupt consume request and make free space in ring buf, hw queue will be started again. But since start queue is protected by spin lock while stop not, that will cause a race. interrupt: process: blkif_interrupt() blkif_queue_rq() kick_pending_request_queues_locked() blk_mq_start_stopped_hw_queues() clear_bit(BLK_MQ_S_STOPPED, &hctx->state) blk_mq_stop_hw_queue(hctx) blk_mq_run_hw_queue(hctx, async) If ring buf is made empty in this case, interrupt will never come, then the hw queue will be stopped forever, all processes waiting for the pending io in the queue will hung. Signed-off-by: Junxiao Bi Reviewed-by: Ankur Arora --- drivers/block/xen-blkfront.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 8bb160cd00e1..4767b82b2cf6 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -912,8 +912,8 @@ out_err: return BLK_MQ_RQ_QUEUE_ERROR; out_busy: - spin_unlock_irqrestore(&rinfo->ring_lock, flags); blk_mq_stop_hw_queue(hctx); + spin_unlock_irqrestore(&rinfo->ring_lock, flags); return BLK_MQ_RQ_QUEUE_BUSY; } -- 1.7.9.5 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] Is [PATCH for-4.9] Was:Re: [PATCH 1/2] xen/livepatch: Clean up arch relocation handling
On Tue, Jun 13, 2017 at 09:51:35PM +0100, Andrew Cooper wrote: > * Reduce symbol scope and initalisation as much as possible > * Annotate a fallthrough case in arm64 > * Fix switch statement style in arm32 > > No functional change. > > Signed-off-by: Andrew Cooper > --- > CC: Konrad Rzeszutek Wilk > CC: Ross Lagerwall > CC: Jan Beulich > CC: Stefano Stabellini > CC: Julien Grall > > The purpose of this patch is simply to make the following patch easier to > review. Jan Acked it and Reviewed-by: Konrad Rzeszutek Wilk and Tested-by: Konrad Rzeszutek Wilk [on x86 right now] > --- > xen/arch/arm/arm32/livepatch.c | 27 --- > xen/arch/arm/arm64/livepatch.c | 19 +++ > xen/arch/x86/livepatch.c | 13 + > 3 files changed, 24 insertions(+), 35 deletions(-) > > diff --git a/xen/arch/arm/arm32/livepatch.c b/xen/arch/arm/arm32/livepatch.c > index a7fd5e2..a328179 100644 > --- a/xen/arch/arm/arm32/livepatch.c > +++ b/xen/arch/arm/arm32/livepatch.c > @@ -224,21 +224,21 @@ int arch_livepatch_perform(struct livepatch_elf *elf, > const struct livepatch_elf_sec *rela, > bool use_rela) > { > -const Elf_RelA *r_a; > -const Elf_Rel *r; > -unsigned int symndx, i; > -uint32_t val; > -void *dest; > +unsigned int i; > int rc = 0; > > for ( i = 0; i < (rela->sec->sh_size / rela->sec->sh_entsize); i++ ) > { > +unsigned int symndx; > +uint32_t val; > +void *dest; > unsigned char type; > -s32 addend = 0; > +s32 addend; > > if ( use_rela ) > { > -r_a = rela->data + i * rela->sec->sh_entsize; > +const Elf_RelA *r_a = rela->data + i * rela->sec->sh_entsize; > + > symndx = ELF32_R_SYM(r_a->r_info); > type = ELF32_R_TYPE(r_a->r_info); > dest = base->load_addr + r_a->r_offset; /* P */ > @@ -246,10 +246,12 @@ int arch_livepatch_perform(struct livepatch_elf *elf, > } > else > { > -r = rela->data + i * rela->sec->sh_entsize; > +const Elf_Rel *r = rela->data + i * rela->sec->sh_entsize; > + > symndx = ELF32_R_SYM(r->r_info); > type = ELF32_R_TYPE(r->r_info); > dest = base->load_addr + r->r_offset; /* P */ > +addend = get_addend(type, dest); > } > > if ( symndx > elf->nsym ) > @@ -259,13 +261,11 @@ int arch_livepatch_perform(struct livepatch_elf *elf, > return -EINVAL; > } > > -if ( !use_rela ) > -addend = get_addend(type, dest); > - > val = elf->sym[symndx].sym->st_value; /* S */ > > rc = perform_rel(type, dest, val, addend); > -switch ( rc ) { > +switch ( rc ) > +{ > case -EOVERFLOW: > dprintk(XENLOG_ERR, LIVEPATCH "%s: Overflow in relocation %u in > %s for %s!\n", > elf->name, i, rela->name, base->name); > @@ -275,9 +275,6 @@ int arch_livepatch_perform(struct livepatch_elf *elf, > dprintk(XENLOG_ERR, LIVEPATCH "%s: Unhandled relocation #%x\n", > elf->name, type); > break; > - > -default: > -break; > } > > if ( rc ) > diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c > index dae64f5..63929b1 100644 > --- a/xen/arch/arm/arm64/livepatch.c > +++ b/xen/arch/arm/arm64/livepatch.c > @@ -241,19 +241,16 @@ int arch_livepatch_perform_rela(struct livepatch_elf > *elf, > const struct livepatch_elf_sec *base, > const struct livepatch_elf_sec *rela) > { > -const Elf_RelA *r; > -unsigned int symndx, i; > -uint64_t val; > -void *dest; > -bool_t overflow_check; > +unsigned int i; > > for ( i = 0; i < (rela->sec->sh_size / rela->sec->sh_entsize); i++ ) > { > +const Elf_RelA *r = rela->data + i * rela->sec->sh_entsize; > +unsigned int symndx = ELF64_R_SYM(r->r_info); > +void *dest = base->load_addr + r->r_offset; /* P */ > +bool overflow_check = true; > int ovf = 0; > - > -r = rela->data + i * rela->sec->sh_entsize; > - > -symndx = ELF64_R_SYM(r->r_info); > +uint64_t val; > > if ( symndx > elf->nsym ) > { > @@ -262,11 +259,8 @@ int arch_livepatch_perform_rela(struct livepatch_elf > *elf, > return -EINVAL; > } > > -dest = base->load_addr + r->r_offset; /* P */ > val = elf->sym[symndx].sym->st_value + r->r_addend; /* S+A */ > > -overflow_check = true; > - > /* ARM64 operations at minimum are always 32-bit. */ > if ( r->r_offset >= base->sec->sh_size || > (r->r_offset + sizeof(uint32_t)) > base->sec->sh_size ) > @@ -403,6 +397,7 @@ int
Re: [Xen-devel] [PATCH for-4.9 v2] xen/livepatch: Don't crash on encountering STN_UNDEF relocations
On Wed, Jun 21, 2017 at 07:13:36PM +0100, Andrew Cooper wrote: > A symndx of STN_UNDEF is special, and means a symbol value of 0. While > legitimate in the ELF standard, its existance in a livepatch is questionable > at best. Until a plausible usecase presents itself, reject such a relocation > with -EOPNOTSUPP. > > Additionally, perform a safety check on elf->sym[symndx].sym before > derefencing it, to avoid tripping over a NULL pointer when calculating val. > > Signed-off-by: Andrew Cooper Reviewed-by: Konrad Rzeszutek Wilk Tested-by: Konrad Rzeszutek Wilk [x86 right now, will do arm32 tomorrow] I naturally had to have "xen/livepatch: Clean up arch relocation handling" on top of this. > --- > CC: Konrad Rzeszutek Wilk > CC: Ross Lagerwall > CC: Jan Beulich > CC: Stefano Stabellini > CC: Julien Grall > > v2: > * Reject STN_UNDEF with -EOPNOTSUPP > --- > xen/arch/arm/arm32/livepatch.c | 17 +++-- > xen/arch/arm/arm64/livepatch.c | 17 +++-- > xen/arch/x86/livepatch.c | 17 +++-- > 3 files changed, 45 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/arm/arm32/livepatch.c b/xen/arch/arm/arm32/livepatch.c > index a328179..53fee91 100644 > --- a/xen/arch/arm/arm32/livepatch.c > +++ b/xen/arch/arm/arm32/livepatch.c > @@ -254,14 +254,27 @@ int arch_livepatch_perform(struct livepatch_elf *elf, > addend = get_addend(type, dest); > } > > +if ( symndx == STN_UNDEF ) > +{ > +dprintk(XENLOG_ERR, LIVEPATCH "%s: Encountered STN_UNDEF\n", > +elf->name); > +return -EOPNOTSUPP; > +} > + > if ( symndx > elf->nsym ) > { > dprintk(XENLOG_ERR, LIVEPATCH "%s: Relative symbol wants > symbol@%u which is past end!\n", > elf->name, symndx); > return -EINVAL; > } > - > -val = elf->sym[symndx].sym->st_value; /* S */ > +else if ( !elf->sym[symndx].sym ) > +{ > +dprintk(XENLOG_ERR, LIVEPATCH "%s: No relative symbol@%u\n", > +elf->name, symndx); > +return -EINVAL; > +} > +else > +val = elf->sym[symndx].sym->st_value; /* S */ > > rc = perform_rel(type, dest, val, addend); > switch ( rc ) > diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c > index 63929b1..b033763 100644 > --- a/xen/arch/arm/arm64/livepatch.c > +++ b/xen/arch/arm/arm64/livepatch.c > @@ -252,14 +252,27 @@ int arch_livepatch_perform_rela(struct livepatch_elf > *elf, > int ovf = 0; > uint64_t val; > > +if ( symndx == STN_UNDEF ) > +{ > +dprintk(XENLOG_ERR, LIVEPATCH "%s: Encountered STN_UNDEF\n", > +elf->name); > +return -EOPNOTSUPP; > +} > + > if ( symndx > elf->nsym ) > { > dprintk(XENLOG_ERR, LIVEPATCH "%s: Relative relocation wants > symbol@%u which is past end!\n", > elf->name, symndx); > return -EINVAL; > } > - > -val = elf->sym[symndx].sym->st_value + r->r_addend; /* S+A */ > +else if ( !elf->sym[symndx].sym ) > +{ > +dprintk(XENLOG_ERR, LIVEPATCH "%s: No relative symbol@%u\n", > +elf->name, symndx); > +return -EINVAL; > +} > +else > +val = elf->sym[symndx].sym->st_value + r->r_addend; /* S+A */ > > /* ARM64 operations at minimum are always 32-bit. */ > if ( r->r_offset >= base->sec->sh_size || > diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c > index 7917610..bfa576c 100644 > --- a/xen/arch/x86/livepatch.c > +++ b/xen/arch/x86/livepatch.c > @@ -170,14 +170,27 @@ int arch_livepatch_perform_rela(struct livepatch_elf > *elf, > uint8_t *dest = base->load_addr + r->r_offset; > uint64_t val; > > +if ( symndx == STN_UNDEF ) > +{ > +dprintk(XENLOG_ERR, LIVEPATCH "%s: Encountered STN_UNDEF\n", > +elf->name); > +return -EOPNOTSUPP; > +} > + > if ( symndx > elf->nsym ) > { > dprintk(XENLOG_ERR, LIVEPATCH "%s: Relative relocation wants > symbol@%u which is past end!\n", > elf->name, symndx); > return -EINVAL; > } > - > -val = r->r_addend + elf->sym[symndx].sym->st_value; > +else if ( !elf->sym[symndx].sym ) > +{ > +dprintk(XENLOG_ERR, LIVEPATCH "%s: No symbol@%u\n", > +elf->name, symndx); > +return -EINVAL; > +} > +else > +val = r->r_addend + elf->sym[symndx].sym->st_value; > > switch ( ELF64_R_TYPE(r->r_info) ) > { > -- > 2.1.4 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists
Re: [Xen-devel] [PATCH v2 1/3] xen-disk: only advertize feature-persistent if grant copy is not available
On Wed, 21 Jun 2017, Paul Durrant wrote: > If grant copy is available then it will always be used in preference to > persistent maps. In this case feature-persistent should not be advertized > to the frontend, otherwise it may needlessly copy data into persistently > granted buffers. > > Signed-off-by: Paul Durrant Reviewed-by: Stefano Stabellini > --- > Cc: Stefano Stabellini > Cc: Anthony Perard > Cc: Kevin Wolf > Cc: Max Reitz > --- > hw/block/xen_disk.c | 15 --- > 1 file changed, 8 insertions(+), 7 deletions(-) > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 3a22805fbc..9b06e3aa81 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -1023,11 +1023,18 @@ static int blk_init(struct XenDevice *xendev) > > blkdev->file_blk = BLOCK_SIZE; > > +blkdev->feature_grant_copy = > +(xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == > 0); > + > +xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n", > + blkdev->feature_grant_copy ? "enabled" : "disabled"); > + > /* fill info > * blk_connect supplies sector-size and sectors > */ > xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1); > -xenstore_write_be_int(&blkdev->xendev, "feature-persistent", 1); > +xenstore_write_be_int(&blkdev->xendev, "feature-persistent", > + !blkdev->feature_grant_copy); > xenstore_write_be_int(&blkdev->xendev, "info", info); > > blk_parse_discard(blkdev); > @@ -1202,12 +1209,6 @@ static int blk_connect(struct XenDevice *xendev) > > xen_be_bind_evtchn(&blkdev->xendev); > > -blkdev->feature_grant_copy = > -(xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == > 0); > - > -xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n", > - blkdev->feature_grant_copy ? "enabled" : "disabled"); > - > xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, " >"remote port %d, local port %d\n", >blkdev->xendev.protocol, blkdev->ring_ref, > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/3] xen-disk: add support for multi-page shared rings
On Wed, 21 Jun 2017, Paul Durrant wrote: > The blkif protocol has had provision for negotiation of multi-page shared > rings for some time now and many guest OS have support in their frontend > drivers. > > This patch makes the necessary modifications to xen-disk support a shared > ring up to order 4 (i.e. 16 pages). > > Signed-off-by: Paul Durrant Reviewed-by: Stefano Stabellini > --- > Cc: Stefano Stabellini > Cc: Anthony Perard > Cc: Kevin Wolf > Cc: Max Reitz > > v2: > - Fix memory leak in error path > - Print warning if ring-page-order exceeds limits > --- > hw/block/xen_disk.c | 144 > +--- > 1 file changed, 113 insertions(+), 31 deletions(-) > > diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c > index 9b06e3aa81..0e6513708e 100644 > --- a/hw/block/xen_disk.c > +++ b/hw/block/xen_disk.c > @@ -36,8 +36,6 @@ > > static int batch_maps = 0; > > -static int max_requests = 32; > - > /* - */ > > #define BLOCK_SIZE 512 > @@ -84,6 +82,8 @@ struct ioreq { > BlockAcctCookie acct; > }; > > +#define MAX_RING_PAGE_ORDER 4 > + > struct XenBlkDev { > struct XenDevicexendev; /* must be first */ > char*params; > @@ -94,7 +94,8 @@ struct XenBlkDev { > booldirectiosafe; > const char *fileproto; > const char *filename; > -int ring_ref; > +unsigned intring_ref[1 << MAX_RING_PAGE_ORDER]; > +unsigned intnr_ring_ref; > void*sring; > int64_t file_blk; > int64_t file_size; > @@ -110,6 +111,7 @@ struct XenBlkDev { > int requests_total; > int requests_inflight; > int requests_finished; > +unsigned intmax_requests; > > /* Persistent grants extension */ > gbooleanfeature_discard; > @@ -199,7 +201,7 @@ static struct ioreq *ioreq_start(struct XenBlkDev *blkdev) > struct ioreq *ioreq = NULL; > > if (QLIST_EMPTY(&blkdev->freelist)) { > -if (blkdev->requests_total >= max_requests) { > +if (blkdev->requests_total >= blkdev->max_requests) { > goto out; > } > /* allocate new struct */ > @@ -905,7 +907,7 @@ static void blk_handle_requests(struct XenBlkDev *blkdev) > ioreq_runio_qemu_aio(ioreq); > } > > -if (blkdev->more_work && blkdev->requests_inflight < max_requests) { > +if (blkdev->more_work && blkdev->requests_inflight < > blkdev->max_requests) { > qemu_bh_schedule(blkdev->bh); > } > } > @@ -918,15 +920,6 @@ static void blk_bh(void *opaque) > blk_handle_requests(blkdev); > } > > -/* > - * We need to account for the grant allocations requiring contiguous > - * chunks; the worst case number would be > - * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1, > - * but in order to keep things simple just use > - * 2 * max_req * max_seg. > - */ > -#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg)) > - > static void blk_alloc(struct XenDevice *xendev) > { > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > xendev); > @@ -938,11 +931,6 @@ static void blk_alloc(struct XenDevice *xendev) > if (xen_mode != XEN_EMULATE) { > batch_maps = 1; > } > -if (xengnttab_set_max_grants(xendev->gnttabdev, > -MAX_GRANTS(max_requests, BLKIF_MAX_SEGMENTS_PER_REQUEST)) < 0) { > -xen_pv_printf(xendev, 0, "xengnttab_set_max_grants failed: %s\n", > - strerror(errno)); > -} > } > > static void blk_parse_discard(struct XenBlkDev *blkdev) > @@ -1037,6 +1025,9 @@ static int blk_init(struct XenDevice *xendev) >!blkdev->feature_grant_copy); > xenstore_write_be_int(&blkdev->xendev, "info", info); > > +xenstore_write_be_int(&blkdev->xendev, "max-ring-page-order", > + MAX_RING_PAGE_ORDER); > + > blk_parse_discard(blkdev); > > g_free(directiosafe); > @@ -1058,12 +1049,25 @@ out_error: > return -1; > } > > +/* > + * We need to account for the grant allocations requiring contiguous > + * chunks; the worst case number would be > + * max_req * max_seg + (max_req - 1) * (max_seg - 1) + 1, > + * but in order to keep things simple just use > + * 2 * max_req * max_seg. > + */ > +#define MAX_GRANTS(max_req, max_seg) (2 * (max_req) * (max_seg)) > + > static int blk_connect(struct XenDevice *xendev) > { > struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, > xendev); > int pers, index, qflags; > bool readonly = true; > bool writethrough = true; > +int order, ring_ref; > +unsigned int ring_size, max_grants; > +unsigned int i; > +uint32_t *domids; > > /* read-only ? */ > if (blkdev->
[Xen-devel] [xen-4.5-testing test] 110906: tolerable FAIL - PUSHED
flight 110906 xen-4.5-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/110906/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-rtds 6 xen-boot fail like 108183 test-xtf-amd64-amd64-2 55 leak-check/check fail like 108183 test-xtf-amd64-amd64-3 55 leak-check/check fail like 108183 test-xtf-amd64-amd64-4 55 leak-check/check fail like 108183 test-xtf-amd64-amd64-5 55 leak-check/check fail like 108183 test-xtf-amd64-amd64-1 55 leak-check/check fail like 108183 test-armhf-armhf-libvirt 13 saverestore-support-checkfail like 108183 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail like 108183 test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail like 108183 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 108183 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 108183 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 108183 test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-xtf-amd64-amd64-2 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-2 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-2 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-2 41 xtf/test-hvm64-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 18 xtf/test-hvm32-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-4 41 xtf/test-hvm64-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 31 xtf/test-hvm32pae-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 37 xtf/test-hvm32pse-cpuid-faulting fail never pass test-xtf-amd64-amd64-5 41 xtf/test-hvm64-cpuid-faulting fail never pass test-xtf-amd64-amd64-1 41 xtf/test-hvm64-cpuid-faulting fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-xtf-amd64-amd64-2 54 xtf/test-hvm64-xsa-195 fail never pass test-xtf-amd64-amd64-3 54 xtf/test-hvm64-xsa-195 fail never pass test-xtf-amd64-amd64-4 54 xtf/test-hvm64-xsa-195 fail never pass test-xtf-amd64-amd64-5 54 xtf/test-hvm64-xsa-195 fail never pass test-xtf-amd64-amd64-1 54 xtf/test-hvm64-xsa-195 fail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 10 guest-start fail never pass test-armhf-armhf-libvirt-raw 10 guest-start fail never pass test-amd64-amd64-xl-qemut-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 9 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 9 windows-install fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 9 windows-installfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 9 windows-installfail never pass test-amd64-i3
Re: [Xen-devel] [PATCH v2 03/16] xen/arm: mm: Use typesafe mfn for xenheap_mfn_*
On Wed, 21 Jun 2017, Stefano Stabellini wrote: > On Mon, 19 Jun 2017, Julien Grall wrote: > > Add more safety when using xenheap_mfn_*. > > > > Signed-off-by: Julien Grall > > Reviewed-by: Stefano Stabellini > > > > --- > > > > I haven't introduced mfn_less_than() and mfn_greather_than() as > > there would be only a couple of usage. We would be able to introduce > > them and replace the open-coding easily in the future grepping > > mfn_x(). > > --- > > xen/arch/arm/mm.c| 16 > > xen/arch/arm/setup.c | 18 +- > > xen/include/asm-arm/mm.h | 11 ++- > > 3 files changed, 23 insertions(+), 22 deletions(-) > > > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > > index 7b313ca123..452c1e26c3 100644 > > --- a/xen/arch/arm/mm.c > > +++ b/xen/arch/arm/mm.c > > @@ -138,8 +138,8 @@ uint64_t init_ttbr; > > static paddr_t phys_offset; > > > > /* Limits of the Xen heap */ > > -unsigned long xenheap_mfn_start __read_mostly = ~0UL; > > -unsigned long xenheap_mfn_end __read_mostly; > > +mfn_t xenheap_mfn_start __read_mostly = INVALID_MFN; > > +mfn_t xenheap_mfn_end __read_mostly; Actually I get the following build error with gcc-linaro-4.9-2014.05-aarch64-linux-gnu-x86_64-linux-gnu: mm.c:141:1: error: initializer element is not constant mfn_t xenheap_mfn_start __read_mostly = INVALID_MFN; ^ make[4]: *** [mm.o] Error 1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 16/16] xen/arm: mm: Use __func__ rather than plain name in format string
On Mon, 19 Jun 2017, Julien Grall wrote: > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > Changes in v2: > - Patch added > --- > xen/arch/arm/mm.c | 14 +++--- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 657fee0b17..91af4c8743 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -996,7 +996,7 @@ static int create_xen_entries(enum xenmap_operation op, > { > rc = create_xen_table(entry); > if ( rc < 0 ) { > -printk("create_xen_entries: L2 failed\n"); > +printk("%s: L2 failed\n", __func__); > goto out; > } > } > @@ -1011,8 +1011,8 @@ static int create_xen_entries(enum xenmap_operation op, > case RESERVE: > if ( lpae_valid(*entry) ) > { > -printk("create_xen_entries: trying to replace an > existing mapping addr=%lx mfn=%"PRI_mfn"\n", > - addr, mfn_x(mfn)); > +printk("%s: trying to replace an existing mapping > addr=%lx mfn=%"PRI_mfn"\n", > + __func__, addr, mfn_x(mfn)); > return -EINVAL; > } > if ( op == RESERVE ) > @@ -1025,8 +1025,8 @@ static int create_xen_entries(enum xenmap_operation op, > case REMOVE: > if ( !lpae_valid(*entry) ) > { > -printk("create_xen_entries: trying to %s a non-existing > mapping addr=%lx\n", > - op == REMOVE ? "remove" : "modify", addr); > +printk("%s: trying to %s a non-existing mapping > addr=%lx\n", > + __func__, op == REMOVE ? "remove" : "modify", > addr); > return -EINVAL; > } > if ( op == REMOVE ) > @@ -1038,8 +1038,8 @@ static int create_xen_entries(enum xenmap_operation op, > pte.pt.xn = PTE_NX_MASK(ai); > if ( !pte.pt.ro && !pte.pt.xn ) > { > -printk("create_xen_entries: Incorrect combination > for addr=%lx\n", > - addr); > +printk("%s: Incorrect combination for addr=%lx\n", > + __func__, addr); > return -EINVAL; > } > } > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 15/16] xen/arm: mm: Introduce temporary variable in create_xen_entries
On Mon, 19 Jun 2017, Julien Grall wrote: > This is improving the code readability and avoid to dereference the > table every single time we need to access the entry. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > Changes in v2: > - Patch added > --- > xen/arch/arm/mm.c | 22 -- > 1 file changed, 12 insertions(+), 10 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 6241c53821..657fee0b17 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -986,28 +986,30 @@ static int create_xen_entries(enum xenmap_operation op, > { > int rc; > unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE; > -lpae_t pte; > +lpae_t pte, *entry; > lpae_t *third = NULL; > > for(; addr < addr_end; addr += PAGE_SIZE, mfn = mfn_add(mfn, 1)) > { > -if ( !lpae_table(xen_second[second_linear_offset(addr)]) ) > +entry = &xen_second[second_linear_offset(addr)]; > +if ( !lpae_table(*entry) ) > { > -rc = create_xen_table(&xen_second[second_linear_offset(addr)]); > +rc = create_xen_table(entry); > if ( rc < 0 ) { > printk("create_xen_entries: L2 failed\n"); > goto out; > } > } > > -BUG_ON(!lpae_valid(xen_second[second_linear_offset(addr)])); > +BUG_ON(!lpae_valid(*entry)); > > -third = mfn_to_virt(xen_second[second_linear_offset(addr)].pt.base); > +third = mfn_to_virt(entry->pt.base); > +entry = &third[third_table_offset(addr)]; > > switch ( op ) { > case INSERT: > case RESERVE: > -if ( lpae_valid(third[third_table_offset(addr)]) ) > +if ( lpae_valid(*entry) ) > { > printk("create_xen_entries: trying to replace an > existing mapping addr=%lx mfn=%"PRI_mfn"\n", > addr, mfn_x(mfn)); > @@ -1017,11 +1019,11 @@ static int create_xen_entries(enum xenmap_operation > op, > break; > pte = mfn_to_xen_entry(mfn, ai); > pte.pt.table = 1; > -write_pte(&third[third_table_offset(addr)], pte); > +write_pte(entry, pte); > break; > case MODIFY: > case REMOVE: > -if ( !lpae_valid(third[third_table_offset(addr)]) ) > +if ( !lpae_valid(*entry) ) > { > printk("create_xen_entries: trying to %s a non-existing > mapping addr=%lx\n", > op == REMOVE ? "remove" : "modify", addr); > @@ -1031,7 +1033,7 @@ static int create_xen_entries(enum xenmap_operation op, > pte.bits = 0; > else > { > -pte = third[third_table_offset(addr)]; > +pte = *entry; > pte.pt.ro = PTE_RO_MASK(ai); > pte.pt.xn = PTE_NX_MASK(ai); > if ( !pte.pt.ro && !pte.pt.xn ) > @@ -1041,7 +1043,7 @@ static int create_xen_entries(enum xenmap_operation op, > return -EINVAL; > } > } > -write_pte(&third[third_table_offset(addr)], pte); > +write_pte(entry, pte); > break; > default: > BUG(); > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 14/16] xen/arm: mm: Use lpae_valid and lpae_table in create_xen_entries
On Mon, 19 Jun 2017, Julien Grall wrote: > This newly introduced lpae_valid and lpae_table helpers will recude the > code and make more readable. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > Changes in v2: > - Patch added > --- > xen/arch/arm/mm.c | 9 - > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 8d34ae7279..6241c53821 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -991,8 +991,7 @@ static int create_xen_entries(enum xenmap_operation op, > > for(; addr < addr_end; addr += PAGE_SIZE, mfn = mfn_add(mfn, 1)) > { > -if ( !xen_second[second_linear_offset(addr)].pt.valid || > - !xen_second[second_linear_offset(addr)].pt.table ) > +if ( !lpae_table(xen_second[second_linear_offset(addr)]) ) > { > rc = create_xen_table(&xen_second[second_linear_offset(addr)]); > if ( rc < 0 ) { > @@ -1001,14 +1000,14 @@ static int create_xen_entries(enum xenmap_operation > op, > } > } > > -BUG_ON(!xen_second[second_linear_offset(addr)].pt.valid); > +BUG_ON(!lpae_valid(xen_second[second_linear_offset(addr)])); > > third = mfn_to_virt(xen_second[second_linear_offset(addr)].pt.base); > > switch ( op ) { > case INSERT: > case RESERVE: > -if ( third[third_table_offset(addr)].pt.valid ) > +if ( lpae_valid(third[third_table_offset(addr)]) ) > { > printk("create_xen_entries: trying to replace an > existing mapping addr=%lx mfn=%"PRI_mfn"\n", > addr, mfn_x(mfn)); > @@ -1022,7 +1021,7 @@ static int create_xen_entries(enum xenmap_operation op, > break; > case MODIFY: > case REMOVE: > -if ( !third[third_table_offset(addr)].pt.valid ) > +if ( !lpae_valid(third[third_table_offset(addr)]) ) > { > printk("create_xen_entries: trying to %s a non-existing > mapping addr=%lx\n", > op == REMOVE ? "remove" : "modify", addr); > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 13/16] xen/arm: p2m: Move lpae_* helpers in lpae.h
On Mon, 19 Jun 2017, Julien Grall wrote: > lpae_* helpers can work on any LPAE translation tables. Move them in > lpae.h to allow other part of Xen to use them. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > Cc: prosku...@sec.in.tum.de > > Changes in v2: > - Patch added > --- > xen/arch/arm/p2m.c | 23 --- > xen/include/asm-arm/lpae.h | 25 + > 2 files changed, 25 insertions(+), 23 deletions(-) > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c > index 1136d837fb..f9145f052f 100644 > --- a/xen/arch/arm/p2m.c > +++ b/xen/arch/arm/p2m.c > @@ -52,29 +52,6 @@ static const paddr_t level_masks[] = > static const uint8_t level_orders[] = > { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER }; > > -static inline bool_t lpae_valid(lpae_t pte) > -{ > -return pte.walk.valid; > -} > -/* > - * These two can only be used on L0..L2 ptes because L3 mappings set > - * the table bit and therefore these would return the opposite to what > - * you would expect. > - */ > -static inline bool_t lpae_table(lpae_t pte) > -{ > -return lpae_valid(pte) && pte.walk.table; > -} > -static inline bool_t lpae_mapping(lpae_t pte) > -{ > -return lpae_valid(pte) && !pte.walk.table; > -} > - > -static inline bool lpae_is_superpage(lpae_t pte, unsigned int level) > -{ > -return (level < 3) && lpae_mapping(pte); > -} > - > static void p2m_flush_tlb(struct p2m_domain *p2m); > > /* Unlock the flush and do a P2M TLB flush if necessary */ > diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h > index aa85cb8112..6fbf7c606c 100644 > --- a/xen/include/asm-arm/lpae.h > +++ b/xen/include/asm-arm/lpae.h > @@ -126,6 +126,31 @@ typedef union { > lpae_walk_t walk; > } lpae_t; > > +static inline bool_t lpae_valid(lpae_t pte) > +{ > +return pte.walk.valid; > +} > + > +/* > + * These two can only be used on L0..L2 ptes because L3 mappings set > + * the table bit and therefore these would return the opposite to what > + * you would expect. > + */ > +static inline bool_t lpae_table(lpae_t pte) > +{ > +return lpae_valid(pte) && pte.walk.table; > +} > + > +static inline bool_t lpae_mapping(lpae_t pte) > +{ > +return lpae_valid(pte) && !pte.walk.table; > +} > + > +static inline bool lpae_is_superpage(lpae_t pte, unsigned int level) > +{ > +return (level < 3) && lpae_mapping(pte); > +} > + > #endif /* __ASSEMBLY__ */ > > /* > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 12/16] xen/arm: p2m: Rename p2m_valid, p2m_table, p2m_mapping and p2m_is_superpage
On Mon, 19 Jun 2017, Julien Grall wrote: > The helpers p2m_valid, p2m_table, p2m_mapping and p2m_is_superpage are > not specific to the stage-2 translation tables. They can also work on > any LPAE translation tables. So rename then to lpae_* and use pte.walk > to look for the value of the field. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > Cc: prosku...@sec.in.tum.de > > Changes in v2: > - Patch added > --- > xen/arch/arm/p2m.c | 45 +++-- > 1 file changed, 23 insertions(+), 22 deletions(-) > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c > index 6c1ac70044..1136d837fb 100644 > --- a/xen/arch/arm/p2m.c > +++ b/xen/arch/arm/p2m.c > @@ -52,27 +52,27 @@ static const paddr_t level_masks[] = > static const uint8_t level_orders[] = > { ZEROETH_ORDER, FIRST_ORDER, SECOND_ORDER, THIRD_ORDER }; > > -static inline bool_t p2m_valid(lpae_t pte) > +static inline bool_t lpae_valid(lpae_t pte) > { > -return pte.p2m.valid; > +return pte.walk.valid; > } > /* > * These two can only be used on L0..L2 ptes because L3 mappings set > * the table bit and therefore these would return the opposite to what > * you would expect. > */ > -static inline bool_t p2m_table(lpae_t pte) > +static inline bool_t lpae_table(lpae_t pte) > { > -return p2m_valid(pte) && pte.p2m.table; > +return lpae_valid(pte) && pte.walk.table; > } > -static inline bool_t p2m_mapping(lpae_t pte) > +static inline bool_t lpae_mapping(lpae_t pte) > { > -return p2m_valid(pte) && !pte.p2m.table; > +return lpae_valid(pte) && !pte.walk.table; > } > > -static inline bool p2m_is_superpage(lpae_t pte, unsigned int level) > +static inline bool lpae_is_superpage(lpae_t pte, unsigned int level) > { > -return (level < 3) && p2m_mapping(pte); > +return (level < 3) && lpae_mapping(pte); > } > > static void p2m_flush_tlb(struct p2m_domain *p2m); > @@ -281,7 +281,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool > read_only, > > entry = *table + offset; > > -if ( !p2m_valid(*entry) ) > +if ( !lpae_valid(*entry) ) > { > if ( read_only ) > return GUEST_TABLE_MAP_FAILED; > @@ -292,7 +292,7 @@ static int p2m_next_level(struct p2m_domain *p2m, bool > read_only, > } > > /* The function p2m_next_level is never called at the 3rd level */ > -if ( p2m_mapping(*entry) ) > +if ( lpae_mapping(*entry) ) > return GUEST_TABLE_SUPER_PAGE; > > mfn = _mfn(entry->p2m.base); > @@ -372,7 +372,7 @@ mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn, > > entry = table[offsets[level]]; > > -if ( p2m_valid(entry) ) > +if ( lpae_valid(entry) ) > { > *t = entry.p2m.type; > > @@ -577,7 +577,7 @@ static int p2m_create_table(struct p2m_domain *p2m, > lpae_t *entry) > lpae_t *p; > lpae_t pte; > > -ASSERT(!p2m_valid(*entry)); > +ASSERT(!lpae_valid(*entry)); > > page = alloc_domheap_page(NULL, 0); > if ( page == NULL ) > @@ -645,7 +645,7 @@ enum p2m_operation { > */ > static void p2m_put_l3_page(const lpae_t pte) > { > -ASSERT(p2m_valid(pte)); > +ASSERT(lpae_valid(pte)); > > /* > * TODO: Handle other p2m types > @@ -673,11 +673,11 @@ static void p2m_free_entry(struct p2m_domain *p2m, > struct page_info *pg; > > /* Nothing to do if the entry is invalid. */ > -if ( !p2m_valid(entry) ) > +if ( !lpae_valid(entry) ) > return; > > /* Nothing to do but updating the stats if the entry is a super-page. */ > -if ( p2m_is_superpage(entry, level) ) > +if ( lpae_is_superpage(entry, level) ) > { > p2m->stats.mappings[level]--; > return; > @@ -733,7 +733,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, > lpae_t *entry, > * a superpage. > */ > ASSERT(level < target); > -ASSERT(p2m_is_superpage(*entry, level)); > +ASSERT(lpae_is_superpage(*entry, level)); > > page = alloc_domheap_page(NULL, 0); > if ( !page ) > @@ -870,7 +870,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m, > /* We need to split the original page. */ > lpae_t split_pte = *entry; > > -ASSERT(p2m_is_superpage(*entry, level)); > +ASSERT(lpae_is_superpage(*entry, level)); > > if ( !p2m_split_superpage(p2m, &split_pte, level, target, offsets) ) > { > @@ -944,12 +944,12 @@ static int __p2m_set_entry(struct p2m_domain *p2m, > * sequence when updating the translation table (D4.7.1 in ARM DDI > * 0487A.j). > */ > -if ( p2m_valid(orig_pte) ) > +if ( lpae_valid(orig_pte) ) > p2m_remove_pte(entry, p2m->clean_pte); > > if ( mfn_eq(smfn, INVALID_MFN) ) > /* Flush can be deferred if the entry is removed */ > -p2m->need_flush |= !!p2m_valid(orig_pte); > +p2m->need_flush |= !!lpae_valid(orig_pte);
Re: [Xen-devel] [PATCH v2 12/16] xen/arm: p2m: Rename p2m_valid, p2m_table, p2m_mapping and p2m_is_superpage
On Wed, 21 Jun 2017, Julien Grall wrote: > Hi Andrew, > > On 20/06/17 09:14, Andrew Cooper wrote: > > On 19/06/2017 17:57, Julien Grall wrote: > > > The helpers p2m_valid, p2m_table, p2m_mapping and p2m_is_superpage are > > > not specific to the stage-2 translation tables. They can also work on > > > any LPAE translation tables. So rename then to lpae_* and use pte.walk > > > to look for the value of the field. > > > > > > Signed-off-by: Julien Grall > > > > s/bool_t/bool/ as you go? > > AFAICT, this is the only change required on this series so far (patch #1 was > modified and pushed by Jan). So, I would suggest to send a follow-up for the > s/bool_t/bool/ if that's fine by Stefano? Sure ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 11/16] xen/arm: lpae: Fix comments coding style
On Mon, 19 Jun 2017, Julien Grall wrote: > Also adding one missing full stop + fix description > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > Cc: prosku...@sec.in.tum.de > > I haven't retained Stefano's reviewed-by because of the description > update. > >Changes in v2: > - Fix description regarding x86 page-table > --- > xen/include/asm-arm/lpae.h | 49 > ++ > 1 file changed, 32 insertions(+), 17 deletions(-) > > diff --git a/xen/include/asm-arm/lpae.h b/xen/include/asm-arm/lpae.h > index ad8c571ea5..aa85cb8112 100644 > --- a/xen/include/asm-arm/lpae.h > +++ b/xen/include/asm-arm/lpae.h > @@ -3,10 +3,12 @@ > > #ifndef __ASSEMBLY__ > > -/* WARNING! Unlike the Intel pagetable code, where l1 is the lowest > - * level and l4 is the root of the trie, the ARM pagetables follow ARM's > - * documentation: the levels are called first, second &c in the order > - * that the MMU walks them (i.e. "first" is the root of the trie). */ > +/* > + * WARNING! Unlike the x86 pagetable code, where l1 is the lowest level and > + * l4 is the root of the trie, the ARM pagetables follow ARM's documentation: > + * the levels are called first, second &c in the order that the MMU walks > them > + * (i.e. "first" is the root of the trie). > + */ > > > /** > * ARMv7-A LPAE pagetables: 3-level trie, mapping 40-bit input to > @@ -17,15 +19,18 @@ > * different place from those in leaf nodes seems to be to allow linear > * pagetable tricks. If we're not doing that then the set of permission > * bits that's not in use in a given node type can be used as > - * extra software-defined bits. */ > + * extra software-defined bits. > + */ > > typedef struct __packed { > /* These are used in all kinds of entry. */ > unsigned long valid:1; /* Valid mapping */ > unsigned long table:1; /* == 1 in 4k map entries too */ > > -/* These ten bits are only used in Block entries and are ignored > - * in Table entries. */ > +/* > + * These ten bits are only used in Block entries and are ignored > + * in Table entries. > + */ > unsigned long ai:3; /* Attribute Index */ > unsigned long ns:1; /* Not-Secure */ > unsigned long user:1; /* User-visible */ > @@ -38,30 +43,38 @@ typedef struct __packed { > unsigned long long base:36; /* Base address of block or next table */ > unsigned long sbz:4;/* Must be zero */ > > -/* These seven bits are only used in Block entries and are ignored > - * in Table entries. */ > +/* > + * These seven bits are only used in Block entries and are ignored > + * in Table entries. > + */ > unsigned long contig:1; /* In a block of 16 contiguous entries */ > unsigned long pxn:1;/* Privileged-XN */ > unsigned long xn:1; /* eXecute-Never */ > unsigned long avail:4; /* Ignored by hardware */ > > -/* These 5 bits are only used in Table entries and are ignored in > - * Block entries */ > +/* > + * These 5 bits are only used in Table entries and are ignored in > + * Block entries. > + */ > unsigned long pxnt:1; /* Privileged-XN */ > unsigned long xnt:1;/* eXecute-Never */ > unsigned long apt:2;/* Access Permissions */ > unsigned long nst:1;/* Not-Secure */ > } lpae_pt_t; > > -/* The p2m tables have almost the same layout, but some of the permission > - * and cache-control bits are laid out differently (or missing) */ > +/* > + * The p2m tables have almost the same layout, but some of the permission > + * and cache-control bits are laid out differently (or missing). > + */ > typedef struct __packed { > /* These are used in all kinds of entry. */ > unsigned long valid:1; /* Valid mapping */ > unsigned long table:1; /* == 1 in 4k map entries too */ > > -/* These ten bits are only used in Block entries and are ignored > - * in Table entries. */ > +/* > + * These ten bits are only used in Block entries and are ignored > + * in Table entries. > + */ > unsigned long mattr:4; /* Memory Attributes */ > unsigned long read:1; /* Read access */ > unsigned long write:1; /* Write access */ > @@ -73,8 +86,10 @@ typedef struct __packed { > unsigned long long base:36; /* Base address of block or next table */ > unsigned long sbz3:4; > > -/* These seven bits are only used in Block entries and are ignored > - * in Table entries. */ > +/* > + * These seven bits are only used in Block entries and are ignored > + * in Table entries. > + */ > unsigned long contig:1; /* In a block of 16 contiguous entries */ > unsigned long sbz2:1; > unsigned long xn:1; /* eXecute-Never */ > --
Re: [Xen-devel] [PATCH v2 02/16] xen/arm: setup: Remove bogus xenheap_mfn_end in setup_mm for arm64
On Mon, 19 Jun 2017, Julien Grall wrote: > xenheap_mfn_end is storing an MFN and not a physical address. Xen is not > currently using xenheap_mfn_end and the value will be reset after the > loop. So drop this bogus xenheap_mfn_end. > > Signed-off-by: Julien Grall I would drop the mention of the fact that Xen is not currently using xenheap_mfn_end entirely. I might do that as I commit. Reviewed-by: Stefano Stabellini > --- > Changes in v2: > - Update commit message > --- > xen/arch/arm/setup.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c > index f00f29a45b..ab4d8e4218 100644 > --- a/xen/arch/arm/setup.c > +++ b/xen/arch/arm/setup.c > @@ -654,8 +654,6 @@ static void __init setup_mm(unsigned long dtb_paddr, > size_t dtb_size) > if ( e > bank_end ) > e = bank_end; > > -xenheap_mfn_end = e; > - > dt_unreserved_regions(s, e, init_boot_pages, 0); > s = n; > } > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 05/16] xen/arm: mm: Redefine virt_to_mfn to support typesafe
On Mon, 19 Jun 2017, Julien Grall wrote: > The file xen/arch/arm/mm.c is using the typesafe MFN in most of the > place. This requires all caller of virt_to_mfn to prefixed by _mfn(...). > > To avoid the extra _mfn(...), re-defined virt_to_mfn within arch/arm/mm.c > to handle typesafe MFN. > > This patch also introduce __virt_to_mfn, so virt_to_mfn can be > re-defined easily. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > Changes in v2: > - Use __virt_to_mfn rather than mfn_x(virt_to_mfn()). > --- > xen/arch/arm/mm.c| 16 ++-- > xen/include/asm-arm/mm.h | 3 ++- > 2 files changed, 12 insertions(+), 7 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 452c1e26c3..5612834dfc 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -44,6 +44,10 @@ > > struct domain *dom_xen, *dom_io, *dom_cow; > > +/* Override macros from asm/page.h to make them work with mfn_t */ > +#undef virt_to_mfn > +#define virt_to_mfn(va) _mfn(__virt_to_mfn(va)) > + > /* Static start-of-day pagetables that we use before the allocators > * are up. These are used by all CPUs during bringup before switching > * to the CPUs own pagetables. > @@ -479,7 +483,7 @@ unsigned long domain_page_map_to_mfn(const void *ptr) > unsigned long offset = (va>>THIRD_SHIFT) & LPAE_ENTRY_MASK; > > if ( va >= VMAP_VIRT_START && va < VMAP_VIRT_END ) > -return virt_to_mfn(va); > +return __virt_to_mfn(va); > > ASSERT(slot >= 0 && slot < DOMHEAP_ENTRIES); > ASSERT(map[slot].pt.avail != 0); > @@ -764,7 +768,7 @@ int init_secondary_pagetables(int cpu) > * domheap mapping pages. */ > for ( i = 0; i < DOMHEAP_SECOND_PAGES; i++ ) > { > -pte = mfn_to_xen_entry(_mfn(virt_to_mfn(domheap+i*LPAE_ENTRIES)), > +pte = mfn_to_xen_entry(virt_to_mfn(domheap+i*LPAE_ENTRIES), > WRITEALLOC); > pte.pt.table = 1; > > write_pte(&first[first_table_offset(DOMHEAP_VIRT_START+i*FIRST_SIZE)], pte); > @@ -961,7 +965,7 @@ static int create_xen_table(lpae_t *entry) > if ( p == NULL ) > return -ENOMEM; > clear_page(p); > -pte = mfn_to_xen_entry(_mfn(virt_to_mfn(p)), WRITEALLOC); > +pte = mfn_to_xen_entry(virt_to_mfn(p), WRITEALLOC); > pte.pt.table = 1; > write_pte(entry, pte); > return 0; > @@ -1216,7 +1220,7 @@ int xenmem_add_to_physmap_one( > unsigned long idx, > gfn_t gfn) > { > -unsigned long mfn = 0; > +mfn_t mfn = INVALID_MFN; > int rc; > p2m_type_t t; > struct page_info *page = NULL; > @@ -1302,7 +1306,7 @@ int xenmem_add_to_physmap_one( > return -EINVAL; > } > > -mfn = page_to_mfn(page); > +mfn = _mfn(page_to_mfn(page)); > t = p2m_map_foreign; > > rcu_unlock_domain(od); > @@ -1321,7 +1325,7 @@ int xenmem_add_to_physmap_one( > } > > /* Map at new location. */ > -rc = guest_physmap_add_entry(d, gfn, _mfn(mfn), 0, t); > +rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); > > /* If we fail to add the mapping, we need to drop the reference we > * took earlier on foreign pages */ > diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h > index b2e7ea7761..6e2b3c7f2b 100644 > --- a/xen/include/asm-arm/mm.h > +++ b/xen/include/asm-arm/mm.h > @@ -264,7 +264,7 @@ static inline int gvirt_to_maddr(vaddr_t va, paddr_t *pa, > unsigned int flags) > #define __va(x) (maddr_to_virt(x)) > > /* Convert between Xen-heap virtual addresses and machine frame numbers. */ > -#define virt_to_mfn(va) (virt_to_maddr(va) >> PAGE_SHIFT) > +#define __virt_to_mfn(va) (virt_to_maddr(va) >> PAGE_SHIFT) > #define mfn_to_virt(mfn) (maddr_to_virt((paddr_t)(mfn) << PAGE_SHIFT)) > > /* > @@ -274,6 +274,7 @@ static inline int gvirt_to_maddr(vaddr_t va, paddr_t *pa, > unsigned int flags) > */ > #define mfn_to_page(mfn)__mfn_to_page(mfn) > #define page_to_mfn(pg) __page_to_mfn(pg) > +#define virt_to_mfn(va) __virt_to_mfn(va) > > /* Convert between Xen-heap virtual addresses and page-info structures. */ > static inline struct page_info *virt_to_page(const void *v) > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 04/16] xen/arm: p2m: Redefine mfn_to_page and page_to_mfn to use typesafe
On Mon, 19 Jun 2017, Julien Grall wrote: > The file xen/arch/arm/p2m.c is using typesafe MFN in most of the place. > This requires caller to mfn_to_page and page_to_mfn to use _mfn/mfn_x. > > To avoid extra _mfn/mfn_x, re-define mfn_to_page and page_to_mfn within > xen/arch/arm/p2m.c to handle typesafe MFN. > > Signed-off-by: Julien Grall > --- > The idea behind redefining locally mfn_to_page and page_to_mfn is > splitting the introduction of typesafe MFN in smaller series and > between multiple people. We know the file is typesafe ready and if > we decide to make the main helper typesafe, we would just need to > drop the definition at the top of the file. > --- > xen/arch/arm/p2m.c | 20 +--- > 1 file changed, 13 insertions(+), 7 deletions(-) > > diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c > index 266d1c3bd6..6c1ac70044 100644 > --- a/xen/arch/arm/p2m.c > +++ b/xen/arch/arm/p2m.c > @@ -38,6 +38,12 @@ static unsigned int __read_mostly max_vmid = > MAX_VMID_8_BIT; > > #define P2M_ROOT_PAGES(1< > +/* Override macros from asm/mm.h to make them work with mfn_t */ > +#undef mfn_to_page > +#define mfn_to_page(mfn) __mfn_to_page(mfn_x(mfn)) > +#undef page_to_mfn > +#define page_to_mfn(pg) _mfn(__page_to_mfn(pg)) > + > unsigned int __read_mostly p2m_ipa_bits; > > /* Helpers to lookup the properties of each level */ > @@ -115,7 +121,7 @@ void dump_p2m_lookup(struct domain *d, paddr_t addr) > printk("dom%d IPA 0x%"PRIpaddr"\n", d->domain_id, addr); > > printk("P2M @ %p mfn:0x%lx\n", > - p2m->root, page_to_mfn(p2m->root)); > + p2m->root, mfn_x(page_to_mfn(p2m->root))); __page_to_mfn > dump_pt_walk(page_to_maddr(p2m->root), addr, > P2M_ROOT_LEVEL, P2M_ROOT_PAGES); > @@ -591,7 +597,7 @@ static int p2m_create_table(struct p2m_domain *p2m, > lpae_t *entry) > * The access value does not matter because the hardware will ignore > * the permission fields for table entry. > */ > -pte = mfn_to_p2m_entry(_mfn(page_to_mfn(page)), p2m_invalid, > +pte = mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, > p2m->default_access); > > p2m_write_pte(entry, pte, p2m->clean_pte); > @@ -650,9 +656,9 @@ static void p2m_put_l3_page(const lpae_t pte) > */ > if ( p2m_is_foreign(pte.p2m.type) ) > { > -unsigned long mfn = pte.p2m.base; > +mfn_t mfn = _mfn(pte.p2m.base); > > -ASSERT(mfn_valid(_mfn(mfn))); > +ASSERT(mfn_valid(mfn)); > put_page(mfn_to_page(mfn)); > } > } > @@ -702,7 +708,7 @@ static void p2m_free_entry(struct p2m_domain *p2m, > mfn = _mfn(entry.p2m.base); > ASSERT(mfn_valid(mfn)); > > -pg = mfn_to_page(mfn_x(mfn)); > +pg = mfn_to_page(mfn); > > page_list_del(pg, &p2m->pages); > free_domheap_page(pg); > @@ -780,7 +786,7 @@ static bool p2m_split_superpage(struct p2m_domain *p2m, > lpae_t *entry, > > unmap_domain_page(table); > > -pte = mfn_to_p2m_entry(_mfn(page_to_mfn(page)), p2m_invalid, > +pte = mfn_to_p2m_entry(page_to_mfn(page), p2m_invalid, > p2m->default_access); > > /* > @@ -1443,7 +1449,7 @@ struct page_info *get_page_from_gva(struct vcpu *v, > vaddr_t va, > if ( !mfn_valid(maddr_to_mfn(maddr)) ) > goto err; > > -page = mfn_to_page(mfn_x(maddr_to_mfn(maddr))); > +page = mfn_to_page(maddr_to_mfn(maddr)); > ASSERT(page); > > if ( unlikely(!get_page(page, d)) ) > -- > 2.11.0 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 03/16] xen/arm: mm: Use typesafe mfn for xenheap_mfn_*
On Mon, 19 Jun 2017, Julien Grall wrote: > Add more safety when using xenheap_mfn_*. > > Signed-off-by: Julien Grall Reviewed-by: Stefano Stabellini > --- > > I haven't introduced mfn_less_than() and mfn_greather_than() as > there would be only a couple of usage. We would be able to introduce > them and replace the open-coding easily in the future grepping > mfn_x(). > --- > xen/arch/arm/mm.c| 16 > xen/arch/arm/setup.c | 18 +- > xen/include/asm-arm/mm.h | 11 ++- > 3 files changed, 23 insertions(+), 22 deletions(-) > > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 7b313ca123..452c1e26c3 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -138,8 +138,8 @@ uint64_t init_ttbr; > static paddr_t phys_offset; > > /* Limits of the Xen heap */ > -unsigned long xenheap_mfn_start __read_mostly = ~0UL; > -unsigned long xenheap_mfn_end __read_mostly; > +mfn_t xenheap_mfn_start __read_mostly = INVALID_MFN; > +mfn_t xenheap_mfn_end __read_mostly; > vaddr_t xenheap_virt_end __read_mostly; > #ifdef CONFIG_ARM_64 > vaddr_t xenheap_virt_start __read_mostly; > @@ -801,8 +801,8 @@ void __init setup_xenheap_mappings(unsigned long base_mfn, > > /* Record where the xenheap is, for translation routines. */ > xenheap_virt_end = XENHEAP_VIRT_START + nr_mfns * PAGE_SIZE; > -xenheap_mfn_start = base_mfn; > -xenheap_mfn_end = base_mfn + nr_mfns; > +xenheap_mfn_start = _mfn(base_mfn); > +xenheap_mfn_end = _mfn(base_mfn + nr_mfns); > } > #else /* CONFIG_ARM_64 */ > void __init setup_xenheap_mappings(unsigned long base_mfn, > @@ -816,16 +816,16 @@ void __init setup_xenheap_mappings(unsigned long > base_mfn, > mfn = base_mfn & ~((FIRST_SIZE>>PAGE_SHIFT)-1); > > /* First call sets the xenheap physical and virtual offset. */ > -if ( xenheap_mfn_start == ~0UL ) > +if ( mfn_eq(xenheap_mfn_start, INVALID_MFN) ) > { > -xenheap_mfn_start = base_mfn; > +xenheap_mfn_start = _mfn(base_mfn); > xenheap_virt_start = DIRECTMAP_VIRT_START + > (base_mfn - mfn) * PAGE_SIZE; > } > > -if ( base_mfn < xenheap_mfn_start ) > +if ( base_mfn < mfn_x(xenheap_mfn_start) ) > panic("cannot add xenheap mapping at %lx below heap start %lx", > - base_mfn, xenheap_mfn_start); > + base_mfn, mfn_x(xenheap_mfn_start)); > > end_mfn = base_mfn + nr_mfns; > > diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c > index ab4d8e4218..3b34855668 100644 > --- a/xen/arch/arm/setup.c > +++ b/xen/arch/arm/setup.c > @@ -555,8 +555,8 @@ static void __init setup_mm(unsigned long dtb_paddr, > size_t dtb_size) > * and enough mapped pages for copying the DTB. > */ > dtb_pages = (dtb_size + PAGE_SIZE-1) >> PAGE_SHIFT; > -boot_mfn_start = xenheap_mfn_end - dtb_pages - 1; > -boot_mfn_end = xenheap_mfn_end; > +boot_mfn_start = mfn_x(xenheap_mfn_end) - dtb_pages - 1; > +boot_mfn_end = mfn_x(xenheap_mfn_end); > > init_boot_pages(pfn_to_paddr(boot_mfn_start), > pfn_to_paddr(boot_mfn_end)); > > @@ -591,11 +591,11 @@ static void __init setup_mm(unsigned long dtb_paddr, > size_t dtb_size) > e = bank_end; > > /* Avoid the xenheap */ > -if ( s < pfn_to_paddr(xenheap_mfn_start+xenheap_pages) > - && pfn_to_paddr(xenheap_mfn_start) < e ) > +if ( s < mfn_to_maddr(mfn_add(xenheap_mfn_start, xenheap_pages)) > + && mfn_to_maddr(xenheap_mfn_start) < e ) > { > -e = pfn_to_paddr(xenheap_mfn_start); > -n = pfn_to_paddr(xenheap_mfn_start+xenheap_pages); > +e = mfn_to_maddr(xenheap_mfn_start); > +n = mfn_to_maddr(mfn_add(xenheap_mfn_start, xenheap_pages)); > } > > dt_unreserved_regions(s, e, init_boot_pages, 0); > @@ -610,7 +610,7 @@ static void __init setup_mm(unsigned long dtb_paddr, > size_t dtb_size) > > /* Add xenheap memory that was not already added to the boot > allocator. */ > -init_xenheap_pages(pfn_to_paddr(xenheap_mfn_start), > +init_xenheap_pages(mfn_to_maddr(xenheap_mfn_start), > pfn_to_paddr(boot_mfn_start)); > } > #else /* CONFIG_ARM_64 */ > @@ -662,8 +662,8 @@ static void __init setup_mm(unsigned long dtb_paddr, > size_t dtb_size) > total_pages += ram_size >> PAGE_SHIFT; > > xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start; > -xenheap_mfn_start = ram_start >> PAGE_SHIFT; > -xenheap_mfn_end = ram_end >> PAGE_SHIFT; > +xenheap_mfn_start = maddr_to_mfn(ram_start); > +xenheap_mfn_end = maddr_to_mfn(ram_end); > > /* > * Need enough mapped pages for copying the DTB. > diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h > index 274b1752b3..b2e7ea7761 100644 > --- a/xen/include/asm-arm/mm.h >
Re: [Xen-devel] [PATCH 06/11] ARM: simplify page type handling
On Wed, 21 Jun 2017, Jan Beulich wrote: > There's no need to have anything here on ARM other than the distinction > between writable and non-writable pages (and even that could likely be > eliminated, but with a more intrusive change). Limit type to a single > bit and drop pinned and validated flags altogether. > > Signed-off-by: Jan Beulich Reviewed-by: Stefano Stabellini > --- > Note: Compile tested only. > > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -1113,8 +1113,7 @@ void share_xen_page_with_guest(struct pa > spin_lock(&d->page_alloc_lock); > > /* The incremented type count pins as writable or read-only. */ > -page->u.inuse.type_info = (readonly ? PGT_none : PGT_writable_page); > -page->u.inuse.type_info |= PGT_validated | 1; > +page->u.inuse.type_info = (readonly ? PGT_none : PGT_writable_page) | 1; > > page_set_owner(page, d); > smp_wmb(); /* install valid domain ptr before updating refcnt. */ > --- a/xen/common/memory.c > +++ b/xen/common/memory.c > @@ -354,8 +354,10 @@ int guest_remove_page(struct domain *d, > > rc = guest_physmap_remove_page(d, _gfn(gmfn), mfn, 0); > > +#ifdef _PGT_pinned > if ( !rc && test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) ) > put_page_and_type(page); > +#endif > > /* > * With the lack of an IOMMU on some platforms, domains with DMA-capable > --- a/xen/include/asm-arm/mm.h > +++ b/xen/include/asm-arm/mm.h > @@ -77,20 +77,12 @@ struct page_info > #define PG_shift(idx) (BITS_PER_LONG - (idx)) > #define PG_mask(x, idx) (x ## UL << PG_shift(idx)) > > -#define PGT_none PG_mask(0, 4) /* no special uses of this page */ > -#define PGT_writable_page PG_mask(7, 4) /* has writable mappings? */ > -#define PGT_type_mask PG_mask(15, 4) /* Bits 28-31 or 60-63. */ > - > - /* Owning guest has pinned this page to its current type? */ > -#define _PGT_pinned PG_shift(5) > -#define PGT_pinnedPG_mask(1, 5) > - > - /* Has this page been validated for use as its current type? */ > -#define _PGT_validatedPG_shift(6) > -#define PGT_validated PG_mask(1, 6) > +#define PGT_none PG_mask(0, 1) /* no special uses of this page */ > +#define PGT_writable_page PG_mask(1, 1) /* has writable mappings? */ > +#define PGT_type_mask PG_mask(1, 1) /* Bits 31 or 63. */ > > /* Count of uses of this frame as its current type. */ > -#define PGT_count_width PG_shift(9) > +#define PGT_count_width PG_shift(2) > #define PGT_count_mask((1UL< > /* Cleared when the owning guest 'frees' this page. */ > > > > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 16/18] xen/pvcalls: implement read
On Wed, 21 Jun 2017, Boris Ostrovsky wrote: > On 06/15/2017 03:09 PM, Stefano Stabellini wrote: > > When an active socket has data available, increment the io and read > > counters, and schedule the ioworker. > > > > Implement the read function by reading from the socket, writing the data > > to the data ring. > > > > Set in_error on error. > > > > Signed-off-by: Stefano Stabellini > > CC: boris.ostrov...@oracle.com > > CC: jgr...@suse.com > > --- > > drivers/xen/pvcalls-back.c | 85 > > ++ > > 1 file changed, 85 insertions(+) > > > > diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c > > index b9a10b9..65d9eba 100644 > > --- a/drivers/xen/pvcalls-back.c > > +++ b/drivers/xen/pvcalls-back.c > > @@ -100,6 +100,81 @@ static int pvcalls_back_release_active(struct > > xenbus_device *dev, > > > > static void pvcalls_conn_back_read(void *opaque) > > { > > + struct sock_mapping *map = (struct sock_mapping *)opaque; > > + struct msghdr msg; > > + struct kvec vec[2]; > > + RING_IDX cons, prod, size, wanted, array_size, masked_prod, masked_cons; > > + int32_t error; > > + struct pvcalls_data_intf *intf = map->ring; > > + struct pvcalls_data *data = &map->data; > > + unsigned long flags; > > + int ret; > > + > > + array_size = XEN_FLEX_RING_SIZE(map->ring_order); > > I noticed that in the next patch you call this 'ring_size. Can you make > those things consistent? (There may be more than just this variable and, > in fact, perhaps some things can be factored out? There are code > fragments that look similar) Yes, I'll make them more consistent. I don't think we can actually share code between the two functions are they do different things. > > + cons = intf->in_cons; > > + prod = intf->in_prod; > > + error = intf->in_error; > > + /* read the indexes first, then deal with the data */ > > + virt_mb(); > > + > > + if (error) > > + return; > > + > > + size = pvcalls_queued(prod, cons, array_size); > > + if (size >= array_size) > > + return; > > + spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); > > + if (skb_queue_empty(&map->sock->sk->sk_receive_queue)) { > > + atomic_set(&map->read, 0); > > + spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, > > + flags); > > + return; > > + } > > + spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); > > + wanted = array_size - size; > > + masked_prod = pvcalls_mask(prod, array_size); > > + masked_cons = pvcalls_mask(cons, array_size); > > + > > + memset(&msg, 0, sizeof(msg)); > > + msg.msg_iter.type = ITER_KVEC|WRITE; > > + msg.msg_iter.count = wanted; > > + if (masked_prod < masked_cons) { > > + vec[0].iov_base = data->in + masked_prod; > > + vec[0].iov_len = wanted; > > + msg.msg_iter.kvec = vec; > > + msg.msg_iter.nr_segs = 1; > > + } else { > > + vec[0].iov_base = data->in + masked_prod; > > + vec[0].iov_len = array_size - masked_prod; > > + vec[1].iov_base = data->in; > > + vec[1].iov_len = wanted - vec[0].iov_len; > > + msg.msg_iter.kvec = vec; > > + msg.msg_iter.nr_segs = 2; > > + } > > > This is probably obvious to everyone but me but can you explain what is > going on here? ;-) We are setting up iovecs based on the "in" array (similarly the write function does the same for the "out" array). Then we are passing the iovecs to inet_recvmsg to do IO. Depending on the indexes on the array we need one iovec entry or two, in case we need to wrap around the circular buffer. > > + > > + atomic_set(&map->read, 0); > > Is this not atomic_dec() by any chance? It is meant to be atomic_set: the idea is that we are going to drain all the data. If there is any remaming data after inet_recvmsg, we'll increase map->read again. > > + ret = inet_recvmsg(map->sock, &msg, wanted, MSG_DONTWAIT); > > + WARN_ON(ret > wanted); > > + if (ret == -EAGAIN) /* shouldn't happen */ > > + return; > > + if (!ret) > > + ret = -ENOTCONN; > > + spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); > > + if (ret > 0 && !skb_queue_empty(&map->sock->sk->sk_receive_queue)) > > + atomic_inc(&map->read); > > + spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); > > + > > + /* write the data, then modify the indexes */ > > + virt_wmb(); > > + if (ret < 0) > > + intf->in_error = ret; > > + else > > + intf->in_prod = prod + ret; > > + /* update the indexes, then notify the other end */ > > + virt_wmb(); > > + notify_remote_via_irq(map->irq); > > + > > + return; > > } > > > > static int pvcalls_conn_back_write(struct sock_mapping *map) > > @@ -172,6 +247,16 @@ static void pvcalls_sk_state_change(struct sock *sock) > > > > static void pvcalls_sk_da
Re: [Xen-devel] [osstest test] 110909: tolerable FAIL - PUSHED
On 21/06/2017 23:59, Ian Jackson wrote: > osstest service owner writes ("[osstest test] 110909: tolerable FAIL - > PUSHED"): >> flight 110909 osstest real [real] >> http://logs.test-lab.xenproject.org/osstest/logs/110909/ >> >> Failures :-/ but no regressions. > ... >> Tests which did not succeed, but are not blocking: > ... >> test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail like >> 110373 > This guest had ~31G of disk and 1.5G of RAM. > > The logfile > > > http://logs.test-lab.xenproject.org/osstest/logs/110909/test-amd64-i386-xl-qemuu-win7-amd64/15.ts-guest-localmigrate.log > > seems to show that the guest is paused (state "p") following the 9th > migration. This is weird, given that xl seems to say earlier > "migration target: Domain started successsfully", which message > follows the call to libxl_domain_unpause. > > I wonder if it is possible that the domain still appears paused > briefly after xl/libxlq tries to unpause it. That is, that > XEN_DOMINF_paused might be set in the return from > xc_domain_getinfolist even after the unpause domctl returns. > > By the time log collection runs, the domain seems unpaused. XEN_DOMINF_paused is a straight reflection of d->controller_pause_count. A domain is created with 1 reference count, requiring the toolstack to call DOMCTL_unpause_domain once to cause it to start executing. Other than that, it is strictly reference counted based on pause and unpause hypercalls from toolstack components (in this case, all in dom0). One issue which XenServer has found in combination with Introspection is that any toolstack entity which can call pause/unpause (even for a short period of time) can result in XEN_DOMINF_paused being sampled as being set. The fix ^W utterly gross hack for XenServer's purposes is https://github.com/xenserver/xen-4.7.pg/blob/master/master/xen-introspection-pause.patch but I don't yet have a sensible plan for how to fix this in general. One option would be to introduce hypercall pairs per toolstack component, but that doesn't scale sensibly. In this case, what condition causes the failure? Is it simply seeing the domain as paused (in which case, there will definitely be a low-probability false negative rate if anything else in dom0 uses domain pause), or is it some other failure which prompts for the paused state check? ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [osstest test] 110909: tolerable FAIL - PUSHED
osstest service owner writes ("[osstest test] 110909: tolerable FAIL - PUSHED"): > flight 110909 osstest real [real] > http://logs.test-lab.xenproject.org/osstest/logs/110909/ > > Failures :-/ but no regressions. ... > Tests which did not succeed, but are not blocking: ... > test-amd64-i386-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail like > 110373 This guest had ~31G of disk and 1.5G of RAM. The logfile http://logs.test-lab.xenproject.org/osstest/logs/110909/test-amd64-i386-xl-qemuu-win7-amd64/15.ts-guest-localmigrate.log seems to show that the guest is paused (state "p") following the 9th migration. This is weird, given that xl seems to say earlier "migration target: Domain started successsfully", which message follows the call to libxl_domain_unpause. I wonder if it is possible that the domain still appears paused briefly after xl/libxlq tries to unpause it. That is, that XEN_DOMINF_paused might be set in the return from xc_domain_getinfolist even after the unpause domctl returns. By the time log collection runs, the domain seems unpaused. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 16/18] xen/pvcalls: implement read
On 06/15/2017 03:09 PM, Stefano Stabellini wrote: > When an active socket has data available, increment the io and read > counters, and schedule the ioworker. > > Implement the read function by reading from the socket, writing the data > to the data ring. > > Set in_error on error. > > Signed-off-by: Stefano Stabellini > CC: boris.ostrov...@oracle.com > CC: jgr...@suse.com > --- > drivers/xen/pvcalls-back.c | 85 > ++ > 1 file changed, 85 insertions(+) > > diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c > index b9a10b9..65d9eba 100644 > --- a/drivers/xen/pvcalls-back.c > +++ b/drivers/xen/pvcalls-back.c > @@ -100,6 +100,81 @@ static int pvcalls_back_release_active(struct > xenbus_device *dev, > > static void pvcalls_conn_back_read(void *opaque) > { > + struct sock_mapping *map = (struct sock_mapping *)opaque; > + struct msghdr msg; > + struct kvec vec[2]; > + RING_IDX cons, prod, size, wanted, array_size, masked_prod, masked_cons; > + int32_t error; > + struct pvcalls_data_intf *intf = map->ring; > + struct pvcalls_data *data = &map->data; > + unsigned long flags; > + int ret; > + > + array_size = XEN_FLEX_RING_SIZE(map->ring_order); I noticed that in the next patch you call this 'ring_size. Can you make those things consistent? (There may be more than just this variable and, in fact, perhaps some things can be factored out? There are code fragments that look similar) > + cons = intf->in_cons; > + prod = intf->in_prod; > + error = intf->in_error; > + /* read the indexes first, then deal with the data */ > + virt_mb(); > + > + if (error) > + return; > + > + size = pvcalls_queued(prod, cons, array_size); > + if (size >= array_size) > + return; > + spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); > + if (skb_queue_empty(&map->sock->sk->sk_receive_queue)) { > + atomic_set(&map->read, 0); > + spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, > + flags); > + return; > + } > + spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); > + wanted = array_size - size; > + masked_prod = pvcalls_mask(prod, array_size); > + masked_cons = pvcalls_mask(cons, array_size); > + > + memset(&msg, 0, sizeof(msg)); > + msg.msg_iter.type = ITER_KVEC|WRITE; > + msg.msg_iter.count = wanted; > + if (masked_prod < masked_cons) { > + vec[0].iov_base = data->in + masked_prod; > + vec[0].iov_len = wanted; > + msg.msg_iter.kvec = vec; > + msg.msg_iter.nr_segs = 1; > + } else { > + vec[0].iov_base = data->in + masked_prod; > + vec[0].iov_len = array_size - masked_prod; > + vec[1].iov_base = data->in; > + vec[1].iov_len = wanted - vec[0].iov_len; > + msg.msg_iter.kvec = vec; > + msg.msg_iter.nr_segs = 2; > + } This is probably obvious to everyone but me but can you explain what is going on here? ;-) > + > + atomic_set(&map->read, 0); Is this not atomic_dec() by any chance? -boris > + ret = inet_recvmsg(map->sock, &msg, wanted, MSG_DONTWAIT); > + WARN_ON(ret > wanted); > + if (ret == -EAGAIN) /* shouldn't happen */ > + return; > + if (!ret) > + ret = -ENOTCONN; > + spin_lock_irqsave(&map->sock->sk->sk_receive_queue.lock, flags); > + if (ret > 0 && !skb_queue_empty(&map->sock->sk->sk_receive_queue)) > + atomic_inc(&map->read); > + spin_unlock_irqrestore(&map->sock->sk->sk_receive_queue.lock, flags); > + > + /* write the data, then modify the indexes */ > + virt_wmb(); > + if (ret < 0) > + intf->in_error = ret; > + else > + intf->in_prod = prod + ret; > + /* update the indexes, then notify the other end */ > + virt_wmb(); > + notify_remote_via_irq(map->irq); > + > + return; > } > > static int pvcalls_conn_back_write(struct sock_mapping *map) > @@ -172,6 +247,16 @@ static void pvcalls_sk_state_change(struct sock *sock) > > static void pvcalls_sk_data_ready(struct sock *sock) > { > + struct sock_mapping *map = sock->sk_user_data; > + struct pvcalls_ioworker *iow; > + > + if (map == NULL) > + return; > + > + iow = &map->ioworker; > + atomic_inc(&map->read); > + atomic_inc(&map->io); > + queue_work(iow->wq, &iow->register_work); > } > > static struct sock_mapping *pvcalls_new_active_socket( ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [linux-linus test] 110908: regressions - FAIL
flight 110908 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/110908/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-i386-pvops 5 kernel-build fail REGR. vs. 110515 test-arm64-arm64-xl 9 debian-install fail REGR. vs. 110515 test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail REGR. vs. 110515 build-armhf-pvops 5 kernel-build fail REGR. vs. 110515 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl-multivcpu 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-i386-xl-xsm1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-raw1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-win10-i386 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win10-i386 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-examine 1 build-check(1) blocked n/a test-armhf-armhf-xl 1 build-check(1) blocked n/a test-amd64-i386-examine 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-ws16-amd64 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-armhf-armhf-xl-vhd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-qemut-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-armhf-armhf-xl-credit2 1 build-check(1) blocked n/a test-amd64-i386-rumprun-i386 1 build-check(1) blocked n/a test-armhf-armhf-xl-cubietruck 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ws16-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-qemut-rhel6hvm-amd 1 build-check(1) blocked n/a test-armhf-armhf-xl-rtds 1 build-check(1) blocked n/a test-armhf-armhf-xl-arndale 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-armhf-armhf-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 110515 test-amd64-amd64-xl-rtds 9 debian-install fail like 110515 test-amd64-amd64-xl-qemut-ws16-amd64 9 windows-installfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-arm64-arm64-xl-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 13 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 12 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 9 windows-installfail never pass test-arm64-arm64-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win10-i386 9 windows-installfail never pass test-amd64-amd64-xl-qemuu-win
Re: [Xen-devel] [PATCH v4 12/18] xen/pvcalls: implement poll command
On Wed, 21 Jun 2017, Boris Ostrovsky wrote: > >>> + > >>> + mappass->reqcopy = *req; > >>> + icsk = inet_csk(mappass->sock->sk); > >>> + queue = &icsk->icsk_accept_queue; > >>> + spin_lock(&queue->rskq_lock); > >>> + data = queue->rskq_accept_head != NULL; > >>> + spin_unlock(&queue->rskq_lock); > >> What is the purpose of the queue lock here? > > It is only there to protect accesses to rskq_accept_head. Functions that > > change rskq_accept_head take this lock, see for example > > net/ipv4/inet_connection_sock.c:inet_csk_reqsk_queue_add. I'll add an > > in-code comment. > > I am not sure I follow. You are not changing rskq_accept_head, you are > simply reading it under the lock. It may be set by others to NULL as > soon as you drop the lock, at which point 'data' test below will be > obsolete. > > In inet_csk_reqsk_queue_add() it is read and then, based on read result, > is written with a value so a lock is indeed need there. I think you are right. The only thing is that without the lock we might read a transitory value as the rskq_accept_head reads/writes are not guaranteed to be atomic. However, I don't think we care about it, since this is just a != NULL test and, as you wrote, the result could be obsolete immediately after. I'll drop the lock. > > > > > >>> + if (data) { > >>> + mappass->reqcopy.cmd = 0; > >>> + ret = 0; > >>> + goto out; > >>> + } > >>> + spin_unlock_irqrestore(&mappass->copy_lock, flags); > >>> + > >>> + /* Tell the caller we don't need to send back a notification yet */ > >>> + return -1; > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 13/18] xen/pvcalls: implement release command
On Tue, 20 Jun 2017, Boris Ostrovsky wrote: > > + > > +static int pvcalls_back_release_passive(struct xenbus_device *dev, > > + struct pvcalls_fedata *fedata, > > + struct sockpass_mapping *mappass) > > +{ > > + if (mappass->sock->sk != NULL) { > > + write_lock_bh(&mappass->sock->sk->sk_callback_lock); > > + mappass->sock->sk->sk_user_data = NULL; > > + mappass->sock->sk->sk_data_ready = mappass->saved_data_ready; > > + write_unlock_bh(&mappass->sock->sk->sk_callback_lock); > > + } > > + down(&fedata->socket_lock); > > + radix_tree_delete(&fedata->socketpass_mappings, mappass->id); > > + sock_release(mappass->sock); > > + flush_workqueue(mappass->wq); > > + destroy_workqueue(mappass->wq); > > + kfree(mappass); > > + up(&fedata->socket_lock); > > Can you raise the semaphore earlier, once the mapping is deleted from > the tree? Yes, I think it can. > Also, why are you not locking the tree in pvcalls_back_accept()? Good point! Although socket_lock is used to protect insertions and deletions to socketpass_mappings and socket_mappings, many of the sites that only access socket_mappings and socketpass_mappings without making modifications are left unprotected at the moment. I'll fix that, and to do it I'll move radix_tree_delete out of pvcalls_back_release_passive. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 110948: tolerable trouble: broken/pass - PUSHED
flight 110948 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/110948/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen ded7303f161fd578f0ac9fbaf52cfd9d72b4a320 baseline version: xen 93de762bcafb458b91955399d55acd5b2104490d Last test of basis 110940 2017-06-21 16:29:06 Z0 days Testing same since 110948 2017-06-21 19:02:24 Z0 days1 attempts People who touched revisions under test: Julien Grall Stefano Stabellini jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm broken test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=ded7303f161fd578f0ac9fbaf52cfd9d72b4a320 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke ded7303f161fd578f0ac9fbaf52cfd9d72b4a320 + branch=xen-unstable-smoke + revision=ded7303f161fd578f0ac9fbaf52cfd9d72b4a320 + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.9-testing + '[' xded7303f161fd578f0ac9fbaf52cfd9d72b4a320 = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : git://
Re: [Xen-devel] [PATCH v4 12/18] xen/pvcalls: implement poll command
>>> + >>> + mappass->reqcopy = *req; >>> + icsk = inet_csk(mappass->sock->sk); >>> + queue = &icsk->icsk_accept_queue; >>> + spin_lock(&queue->rskq_lock); >>> + data = queue->rskq_accept_head != NULL; >>> + spin_unlock(&queue->rskq_lock); >> What is the purpose of the queue lock here? > It is only there to protect accesses to rskq_accept_head. Functions that > change rskq_accept_head take this lock, see for example > net/ipv4/inet_connection_sock.c:inet_csk_reqsk_queue_add. I'll add an > in-code comment. I am not sure I follow. You are not changing rskq_accept_head, you are simply reading it under the lock. It may be set by others to NULL as soon as you drop the lock, at which point 'data' test below will be obsolete. In inet_csk_reqsk_queue_add() it is read and then, based on read result, is written with a value so a lock is indeed need there. -boris > > >>> + if (data) { >>> + mappass->reqcopy.cmd = 0; >>> + ret = 0; >>> + goto out; >>> + } >>> + spin_unlock_irqrestore(&mappass->copy_lock, flags); >>> + >>> + /* Tell the caller we don't need to send back a notification yet */ >>> + return -1; ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 12/18] xen/pvcalls: implement poll command
On Tue, 20 Jun 2017, Boris Ostrovsky wrote: > > @@ -499,6 +521,55 @@ static int pvcalls_back_accept(struct xenbus_device > > *dev, > > static int pvcalls_back_poll(struct xenbus_device *dev, > > struct xen_pvcalls_request *req) > > { > > + struct pvcalls_fedata *fedata; > > + struct sockpass_mapping *mappass; > > + struct xen_pvcalls_response *rsp; > > + struct inet_connection_sock *icsk; > > + struct request_sock_queue *queue; > > + unsigned long flags; > > + int ret; > > + bool data; > > + > > + fedata = dev_get_drvdata(&dev->dev); > > + > > + mappass = radix_tree_lookup(&fedata->socketpass_mappings, > > req->u.poll.id); > > + if (mappass == NULL) > > + return -EINVAL; > > + > > + /* > > +* Limitation of the current implementation: only support one > > +* concurrent accept or poll call on one socket. > > +*/ > > + spin_lock_irqsave(&mappass->copy_lock, flags); > > + if (mappass->reqcopy.cmd != 0) { > > + ret = -EINTR; > > + goto out; > > + } > > + > > + mappass->reqcopy = *req; > > + icsk = inet_csk(mappass->sock->sk); > > + queue = &icsk->icsk_accept_queue; > > + spin_lock(&queue->rskq_lock); > > + data = queue->rskq_accept_head != NULL; > > + spin_unlock(&queue->rskq_lock); > > What is the purpose of the queue lock here? It is only there to protect accesses to rskq_accept_head. Functions that change rskq_accept_head take this lock, see for example net/ipv4/inet_connection_sock.c:inet_csk_reqsk_queue_add. I'll add an in-code comment. > > + if (data) { > > + mappass->reqcopy.cmd = 0; > > + ret = 0; > > + goto out; > > + } > > + spin_unlock_irqrestore(&mappass->copy_lock, flags); > > + > > + /* Tell the caller we don't need to send back a notification yet */ > > + return -1; ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 11/18] xen/pvcalls: implement accept command
On Tue, 20 Jun 2017, Boris Ostrovsky wrote: > > static void __pvcalls_back_accept(struct work_struct *work) > > { > > + struct sockpass_mapping *mappass = container_of( > > + work, struct sockpass_mapping, register_work); > > + struct sock_mapping *map; > > + struct pvcalls_ioworker *iow; > > + struct pvcalls_fedata *fedata; > > + struct socket *sock; > > + struct xen_pvcalls_response *rsp; > > + struct xen_pvcalls_request *req; > > + int notify; > > + int ret = -EINVAL; > > + unsigned long flags; > > + > > + fedata = mappass->fedata; > > + /* > > +* __pvcalls_back_accept can race against pvcalls_back_accept. > > +* We only need to check the value of "cmd" on read. It could be > > +* done atomically, but to simplify the code on the write side, we > > +* use a spinlock. > > +*/ > > + spin_lock_irqsave(&mappass->copy_lock, flags); > > + req = &mappass->reqcopy; > > + if (req->cmd != PVCALLS_ACCEPT) { > > + spin_unlock_irqrestore(&mappass->copy_lock, flags); > > + return; > > + } > > + spin_unlock_irqrestore(&mappass->copy_lock, flags); > > + > > + sock = sock_alloc(); > > + if (sock == NULL) > > + goto out_error; > > + sock->type = mappass->sock->type; > > + sock->ops = mappass->sock->ops; > > + > > + ret = inet_accept(mappass->sock, sock, O_NONBLOCK, true); > > + if (ret == -EAGAIN) { > > + sock_release(sock); > > + goto out_error; > > + } > > + > > + map = pvcalls_new_active_socket(fedata, > > + req->u.accept.id_new, > > + req->u.accept.ref, > > + req->u.accept.evtchn, > > + sock); > > + if (!map) { > > + sock_release(sock); > > + goto out_error; > > Again, lost ret value. Yep, I'll fix > > + } > > + > > + map->sockpass = mappass; > > + iow = &map->ioworker; > > + atomic_inc(&map->read); > > + atomic_inc(&map->io); > > + queue_work(iow->wq, &iow->register_work); > > + > > +out_error: > > + rsp = RING_GET_RESPONSE(&fedata->ring, fedata->ring.rsp_prod_pvt++); > > + rsp->req_id = req->req_id; > > + rsp->cmd = req->cmd; > > + rsp->u.accept.id = req->u.accept.id; > > + rsp->ret = ret; > > + RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&fedata->ring, notify); > > + if (notify) > > + notify_remote_via_irq(fedata->irq); > > + > > + mappass->reqcopy.cmd = 0; > > } > > > > static void pvcalls_pass_sk_data_ready(struct sock *sock) > > { > > + struct sockpass_mapping *mappass = sock->sk_user_data; > > + > > + if (mappass == NULL) > > + return; > > + > > + queue_work(mappass->wq, &mappass->register_work); > > } > > > > static int pvcalls_back_bind(struct xenbus_device *dev, > > @@ -383,6 +456,43 @@ static int pvcalls_back_listen(struct xenbus_device > > *dev, > > static int pvcalls_back_accept(struct xenbus_device *dev, > >struct xen_pvcalls_request *req) > > { > > + struct pvcalls_fedata *fedata; > > + struct sockpass_mapping *mappass; > > + int ret = -EINVAL; > > Unnecessary initialization. It is not unnecessary: if mappass is NULL few lines below, it will goto out_error without setting ret. > > + struct xen_pvcalls_response *rsp; > > + unsigned long flags; > > + > > + fedata = dev_get_drvdata(&dev->dev); > > + > > + mappass = radix_tree_lookup(&fedata->socketpass_mappings, > > + req->u.accept.id); > > + if (mappass == NULL) > > + goto out_error; > > + > > + /* > > +* Limitation of the current implementation: only support one > > +* concurrent accept or poll call on one socket. > > +*/ > > + spin_lock_irqsave(&mappass->copy_lock, flags); > > + if (mappass->reqcopy.cmd != 0) { > > + spin_unlock_irqrestore(&mappass->copy_lock, flags); > > + ret = -EINTR; > > + goto out_error; > > + } > > + > > + mappass->reqcopy = *req; > > + spin_unlock_irqrestore(&mappass->copy_lock, flags); > > + queue_work(mappass->wq, &mappass->register_work); > > + > > + /* Tell the caller we don't need to send back a notification yet */ > > + return -1; > > + > > +out_error: > > + rsp = RING_GET_RESPONSE(&fedata->ring, fedata->ring.rsp_prod_pvt++); > > + rsp->req_id = req->req_id; > > + rsp->cmd = req->cmd; > > + rsp->u.accept.id = req->u.accept.id; > > + rsp->ret = ret; > > return 0; > > } > > > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 08/18] xen/pvcalls: implement connect command
On Tue, 20 Jun 2017, Boris Ostrovsky wrote: > >> + > >> static int pvcalls_back_connect(struct xenbus_device *dev, > >>struct xen_pvcalls_request *req) > >> { > >> + struct pvcalls_fedata *fedata; > >> + int ret = -EINVAL; > >> + struct socket *sock; > >> + struct sock_mapping *map; > >> + struct xen_pvcalls_response *rsp; > >> + > >> + fedata = dev_get_drvdata(&dev->dev); > >> + > >> + ret = sock_create(AF_INET, SOCK_STREAM, 0, &sock); > >> + if (ret < 0) > >> + goto out; > >> + ret = inet_stream_connect(sock, (struct sockaddr *)&req->u.connect.addr, > >> +req->u.connect.len, req->u.connect.flags); > >> + if (ret < 0) { > >> + sock_release(sock); > >> + goto out; > >> + } > >> + > >> + map = pvcalls_new_active_socket(fedata, > >> + req->u.connect.id, > >> + req->u.connect.ref, > >> + req->u.connect.evtchn, > >> + sock); > >> + if (!map) { > >> + sock_release(map->sock); > >> + goto out; > > Unnecessary goto. > > Oh, and also ret needs to be set since it will be cleared by > inet_stream_connect(). Right, thanks! > > > >> + } > >> + > >> +out: > >> + rsp = RING_GET_RESPONSE(&fedata->ring, fedata->ring.rsp_prod_pvt++); > >> + rsp->req_id = req->req_id; > >> + rsp->cmd = req->cmd; > >> + rsp->u.connect.id = req->u.connect.id; > >> + rsp->ret = ret; > >> + > >> + return ret; > > > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 08/18] xen/pvcalls: implement connect command
On Tue, 20 Jun 2017, Boris Ostrovsky wrote: > On 06/15/2017 03:09 PM, Stefano Stabellini wrote: > > Allocate a socket. Keep track of socket <-> ring mappings with a new data > > structure, called sock_mapping. Implement the connect command by calling > > inet_stream_connect, and mapping the new indexes page and data ring. > > Allocate a workqueue and a work_struct, called ioworker, to perform > > reads and writes to the socket. > > > > When an active socket is closed (sk_state_change), set in_error to > > -ENOTCONN and notify the other end, as specified by the protocol. > > > > sk_data_ready and pvcalls_back_ioworker will be implemented later. > > > > Signed-off-by: Stefano Stabellini > > CC: boris.ostrov...@oracle.com > > CC: jgr...@suse.com > > --- > > drivers/xen/pvcalls-back.c | 171 > > + > > 1 file changed, 171 insertions(+) > > > > diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c > > index 953458b..4eecd83 100644 > > --- a/drivers/xen/pvcalls-back.c > > +++ b/drivers/xen/pvcalls-back.c > > @@ -56,6 +56,39 @@ struct pvcalls_fedata { > > struct work_struct register_work; > > }; > > > > +struct pvcalls_ioworker { > > + struct work_struct register_work; > > + struct workqueue_struct *wq; > > +}; > > + > > +struct sock_mapping { > > + struct list_head list; > > + struct pvcalls_fedata *fedata; > > + struct socket *sock; > > + uint64_t id; > > + grant_ref_t ref; > > + struct pvcalls_data_intf *ring; > > + void *bytes; > > + struct pvcalls_data data; > > + uint32_t ring_order; > > + int irq; > > + atomic_t read; > > + atomic_t write; > > + atomic_t io; > > + atomic_t release; > > + void (*saved_data_ready)(struct sock *sk); > > + struct pvcalls_ioworker ioworker; > > +}; > > + > > +static irqreturn_t pvcalls_back_conn_event(int irq, void *sock_map); > > +static int pvcalls_back_release_active(struct xenbus_device *dev, > > + struct pvcalls_fedata *fedata, > > + struct sock_mapping *map); > > + > > +static void pvcalls_back_ioworker(struct work_struct *work) > > +{ > > +} > > + > > static int pvcalls_back_socket(struct xenbus_device *dev, > > struct xen_pvcalls_request *req) > > { > > @@ -84,9 +117,142 @@ static int pvcalls_back_socket(struct xenbus_device > > *dev, > > return 0; > > } > > > > +static void pvcalls_sk_state_change(struct sock *sock) > > +{ > > + struct sock_mapping *map = sock->sk_user_data; > > + struct pvcalls_data_intf *intf; > > + > > + if (map == NULL) > > + return; > > + > > + intf = map->ring; > > + intf->in_error = -ENOTCONN; > > + notify_remote_via_irq(map->irq); > > +} > > + > > +static void pvcalls_sk_data_ready(struct sock *sock) > > +{ > > +} > > + > > +static struct sock_mapping *pvcalls_new_active_socket( > > + struct pvcalls_fedata *fedata, > > + uint64_t id, > > + grant_ref_t ref, > > + uint32_t evtchn, > > + struct socket *sock) > > +{ > > + int ret; > > + struct sock_mapping *map; > > + void *page; > > + > > + map = kzalloc(sizeof(*map), GFP_KERNEL); > > + if (map == NULL) > > + return NULL; > > + > > + map->fedata = fedata; > > + map->sock = sock; > > + map->id = id; > > + map->ref = ref; > > + > > + ret = xenbus_map_ring_valloc(fedata->dev, &ref, 1, &page); > > + if (ret < 0) > > + goto out; > > + map->ring = page; > > + map->ring_order = map->ring->ring_order; > > + /* first read the order, then map the data ring */ > > + virt_rmb(); > > + if (map->ring_order > MAX_RING_ORDER) { > > + pr_warn("%s frontend requested ring_order %u, which is > MAX > > (%u)\n", > > + __func__, map->ring_order, MAX_RING_ORDER); > > + goto out; > > + } > > + ret = xenbus_map_ring_valloc(fedata->dev, map->ring->ref, > > +(1 << map->ring_order), &page); > > + if (ret < 0) > > + goto out; > > + map->bytes = page; > > + > > + ret = bind_interdomain_evtchn_to_irqhandler(fedata->dev->otherend_id, > > + evtchn, > > + pvcalls_back_conn_event, > > + 0, > > + "pvcalls-backend", > > + map); > > + if (ret < 0) > > + goto out; > > + map->irq = ret; > > + > > + map->data.in = map->bytes; > > + map->data.out = map->bytes + XEN_FLEX_RING_SIZE(map->ring_order); > > + > > + map->ioworker.wq = alloc_workqueue("pvcalls_io", WQ_UNBOUND, 1); > > + if (!map->ioworker.wq) > > + goto out; > > + atomic_set(&map->io, 1); > > + INIT_WORK(&map->ioworker.register_work, pvcalls_back_ioworker); > > + > > + down(&fedata->socket_lock); > > + lis
Re: [Xen-devel] [PATCH v4 07/18] xen/pvcalls: implement socket command
On Tue, 20 Jun 2017, Roger Pau Monné wrote: > On Thu, Jun 15, 2017 at 12:09:36PM -0700, Stefano Stabellini wrote: > > Just reply with success to the other end for now. Delay the allocation > > of the actual socket to bind and/or connect. > > > > Signed-off-by: Stefano Stabellini > > CC: boris.ostrov...@oracle.com > > CC: jgr...@suse.com > > --- > > drivers/xen/pvcalls-back.c | 27 +++ > > 1 file changed, 27 insertions(+) > > > > diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c > > index 437c2ad..953458b 100644 > > --- a/drivers/xen/pvcalls-back.c > > +++ b/drivers/xen/pvcalls-back.c > > @@ -12,12 +12,17 @@ > > * GNU General Public License for more details. > > */ > > > > +#include > > #include > > #include > > #include > > #include > > #include > > #include > > +#include > > +#include > > +#include > > +#include > > > > #include > > #include > > @@ -54,6 +59,28 @@ struct pvcalls_fedata { > > static int pvcalls_back_socket(struct xenbus_device *dev, > > struct xen_pvcalls_request *req) > > { > > + struct pvcalls_fedata *fedata; > > + int ret; > > + struct xen_pvcalls_response *rsp; > > + > > + fedata = dev_get_drvdata(&dev->dev); > > + > > + if (req->u.socket.domain != AF_INET || > > + req->u.socket.type != SOCK_STREAM || > > + (req->u.socket.protocol != IPPROTO_IP && > > +req->u.socket.protocol != AF_INET)) > > + ret = -EAFNOSUPPORT; > > Sorry for jumping into this out of the blue, but shouldn't all the > constants used above be part of the protocol? AF_INET/SOCK_STREAM/... > are all part of POSIX, but their specific value is not defined in the > standard, hence we should have XEN_AF_INET/XEN_SOCK_STREAM/... Or am I > just missing something? The values of these constants for the pvcalls protocol are defined by docs/misc/pvcalls.markdown under "Socket families and address format". They happen to be the same as the ones defined by Linux as AF_INET, SOCK_STREAM, etc, so in Linux I am just using those, but that is just an implementation detail internal to the Linux kernel driver. What is important from the protocol ABI perspective are the values defined by docs/misc/pvcalls.markdown.___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 08/36] x86/mm: Add support to enable SME in early boot processing
On Wed, 21 Jun 2017, Tom Lendacky wrote: > On 6/21/2017 10:38 AM, Thomas Gleixner wrote: > > /* > > * Sanitize CPU configuration and retrieve the modifier > > * for the initial pgdir entry which will be programmed > > * into CR3. Depends on enabled SME encryption, normally 0. > > */ > > call __startup_secondary_64 > > > > addq$(init_top_pgt - __START_KERNEL_map), %rax > > > > You can hide that stuff in C-code nicely without adding any cruft to the > > ASM code. > > > > Moving the call to verify_cpu into the C-code might be quite a bit of > change. Currently, the verify_cpu code is included code and not a > global function. Ah. Ok. I missed that. > I can still do the __startup_secondary_64() function and then look to > incorporate verify_cpu into both __startup_64() and > __startup_secondary_64() as a post-patch to this series. Yes, just having __startup_secondary_64() for now and there the extra bits for that encryption stuff is fine. > At least the secondary path will have a base C routine to which > modifications can be made in the future if needed. How does that sound? Sounds like a plan. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/disk: don't leak stack data via response ring
On Wed, 21 Jun 2017, Jan Beulich wrote: > >>> On 20.06.17 at 23:48, wrote: > > On Tue, 20 Jun 2017, Jan Beulich wrote: > >> @@ -36,13 +33,7 @@ struct blkif_x86_32_request_discard { > >> blkif_sector_t sector_number;/* start sector idx on disk (r/w > >> only) */ > >> uint64_t nr_sectors; /* # of contiguous sectors to > >> discard */ > >> }; > >> -struct blkif_x86_32_response { > >> -uint64_tid; /* copied from request */ > >> -uint8_t operation; /* copied from request */ > >> -int16_t status; /* BLKIF_RSP_??? */ > >> -}; > >> typedef struct blkif_x86_32_request blkif_x86_32_request_t; > >> -typedef struct blkif_x86_32_response blkif_x86_32_response_t; > >> #pragma pack(pop) > >> > >> /* x86_64 protocol version */ > >> @@ -62,20 +53,14 @@ struct blkif_x86_64_request_discard { > >> blkif_sector_t sector_number;/* start sector idx on disk (r/w > >> only) */ > >> uint64_t nr_sectors; /* # of contiguous sectors to > >> discard */ > >> }; > >> -struct blkif_x86_64_response { > >> -uint64_t __attribute__((__aligned__(8))) id; > >> -uint8_t operation; /* copied from request */ > >> -int16_t status; /* BLKIF_RSP_??? */ > >> -}; > >> > >> typedef struct blkif_x86_64_request blkif_x86_64_request_t; > >> -typedef struct blkif_x86_64_response blkif_x86_64_response_t; > >> > >> DEFINE_RING_TYPES(blkif_common, struct blkif_common_request, > >> - struct blkif_common_response); > >> + struct blkif_response); > >> DEFINE_RING_TYPES(blkif_x86_32, struct blkif_x86_32_request, > >> - struct blkif_x86_32_response); > >> + struct blkif_response QEMU_PACKED); > > > > In my test, the previous sizes and alignments of the response structs > > were (on both x86_32 and x86_64): > > > > sizeof(blkif_x86_32_response)=12 sizeof(blkif_x86_64_response)=16 > > align(blkif_x86_32_response)=4 align(blkif_x86_64_response)=8 > > > > While with these changes are now, when compiled on x86_64: > > sizeof(blkif_x86_32_response)=11 sizeof(blkif_x86_64_response)=16 > > align(blkif_x86_32_response)=1 align(blkif_x86_64_response)=8 > > > > when compiled on x86_32: > > sizeof(blkif_x86_32_response)=11 sizeof(blkif_x86_64_response)=12 > > align(blkif_x86_32_response)=1 align(blkif_x86_64_response)=4 > > > > Did I do my tests wrong? > > > > QEMU_PACKED is not the same as #pragma pack(push, 4). In fact, it is the > > same as #pragma pack(push, 1), causing the struct to be densely packed, > > leaving no padding whatsever. > > > > In addition, without __attribute__((__aligned__(8))), > > blkif_x86_64_response won't be 8 bytes aligned when built on x86_32. > > > > Am I missing something? > > Well, you're mixing attribute application upon structure > declaration with attribute application upon structure use. It's > the latter here, and hence the attribute doesn't affect > structure layout at all. All it does is avoid the _containing_ > 32-bit union to become 8-byte aligned (and tail padding to be > inserted). Thanks for the explanation. I admit it's the first time I see the aligned attribute being used at structure usage only. I think it's the first time QEMU_PACKED is used this way in QEMU too. Anyway, even taking that into account, things are still not completely right: the alignment of struct blkif_x86_32_response QEMU_PACKED is 4 bytes as you wrote, but the size of struct blkif_x86_32_response is still 16 bytes instead of 12 bytes in my test. I suspect it worked for you because the other member of the union (blkif_x86_32_request) is larger than that. However, I think is not a good idea to rely on this implementation detail. The implementation of DEFINE_RING_TYPES should be opaque from our point of view. We shouldn't have to know that there is a union there. Moreover, the other problem is still unaddressed: the size and alignment of blkif_x86_64_response when built on x86_32 are 12 and 4 instead of 16 and 8 bytes. Is that working also because it's relying on the other member of the union to enforce the right alignment and bigger size? I think it's a bad idea to rely on that, especially given that this obscure but important detail is not even mentioned in the commit message. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable-smoke test] 110940: tolerable trouble: broken/pass - PUSHED
flight 110940 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/110940/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass version targeted for testing: xen 93de762bcafb458b91955399d55acd5b2104490d baseline version: xen d8f1b48fd665d7aad1711de2f073540d07d2d041 Last test of basis 110907 2017-06-21 00:20:54 Z0 days Testing same since 110937 2017-06-21 14:01:21 Z0 days2 attempts People who touched revisions under test: Jan Beulich Juergen Gross Wei Liu jobs: build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm broken test-amd64-amd64-xl-qemuu-debianhvm-i386 pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : + branch=xen-unstable-smoke + revision=93de762bcafb458b91955399d55acd5b2104490d + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x '!=' x/home/osstest/repos/lock ']' ++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock ++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-smoke 93de762bcafb458b91955399d55acd5b2104490d + branch=xen-unstable-smoke + revision=93de762bcafb458b91955399d55acd5b2104490d + . ./cri-lock-repos ++ . ./cri-common +++ . ./cri-getconfig +++ umask 002 +++ getrepos getconfig Repos perl -e ' use Osstest; readglobalconfig(); print $c{"Repos"} or die $!; ' +++ local repos=/home/osstest/repos +++ '[' -z /home/osstest/repos ']' +++ '[' '!' -d /home/osstest/repos ']' +++ echo /home/osstest/repos ++ repos=/home/osstest/repos ++ repos_lock=/home/osstest/repos/lock ++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']' + . ./cri-common ++ . ./cri-getconfig ++ umask 002 + select_xenbranch + case "$branch" in + tree=xen + xenbranch=xen-unstable-smoke + qemuubranch=qemu-upstream-unstable + '[' xxen = xlinux ']' + linuxbranch= + '[' xqemu-upstream-unstable = x ']' + select_prevxenbranch ++ ./cri-getprevxenbranch xen-unstable-smoke + prevxenbranch=xen-4.9-testing + '[' x93de762bcafb458b91955399d55acd5b2104490d = x ']' + : tested/2.6.39.x + . ./ap-common ++ : osst...@xenbits.xen.org +++ getconfig OsstestUpstream +++ perl -e ' use Osstest; readglobalconfig(); print $c{"OsstestUpstream"} or die $!; ' ++ : ++ : git://xenbits.xen.org/xen.git ++ : osst...@xenbits.xen.org:/home/xen/git/xen.git ++ : git://xenbits.xen.org/qemu-xen-traditional.git ++ : git://git.kernel.org ++ : git://git.kernel.org/pub/scm/linux/kernel/git ++ : git ++ : git://xenbits.xen.org/xtf.git ++ : osst...@xenbits.xen.org:/home/xen/git/xtf.git ++ : git://xenbits.xen.org/xtf.git ++ : git://xenbits.xen.org/libvirt.git ++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git ++ : git://xenbits.xen.org/libvirt.git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : git ++ : git://xenbits.xen.org/osstest/rumprun.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/rumprun.git ++ : git://git.seabios.org/seabios.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/seabios.git ++ : git://xenbits.xen.org/osstest/seabios.git ++ : https://github.com/tianocore/edk2.git ++ : osst...@xenbits.xen.org:/home/xen/git/osstest/ovmf.git ++ : git://xenbits.xen.org/osstest/ovmf.git ++ : g
Re: [Xen-devel] [PATCH v7 08/36] x86/mm: Add support to enable SME in early boot processing
On 6/21/2017 10:38 AM, Thomas Gleixner wrote: On Wed, 21 Jun 2017, Tom Lendacky wrote: On 6/21/2017 2:16 AM, Thomas Gleixner wrote: Why is this an unconditional function? Isn't the mask simply 0 when the MEM ENCRYPT support is disabled? I made it unconditional because of the call from head_64.S. I can't make use of the C level static inline function and since the mask is not a variable if CONFIG_AMD_MEM_ENCRYPT is not configured (#defined to 0) I can't reference the variable directly. I could create a #define in head_64.S that changes this to load rax with the variable if CONFIG_AMD_MEM_ENCRYPT is configured or a zero if it's not or add a #ifdef at that point in the code directly. Thoughts on that? See below. That does not make any sense. Neither the call to sme_encrypt_kernel() nor the following call to sme_get_me_mask(). __startup_64() is already C code, so why can't you simply call that from __startup_64() in C and return the mask from there? I was trying to keep it explicit as to what was happening, but I can move those calls into __startup_64(). That's much preferred. And the return value wants to be documented in both C and ASM code. Will do. I'll still need the call to sme_get_me_mask() in the secondary_startup_64 path, though (depending on your thoughts to the above response). call verify_cpu movq$(init_top_pgt - __START_KERNEL_map), %rax So if you make that: /* * Sanitize CPU configuration and retrieve the modifier * for the initial pgdir entry which will be programmed * into CR3. Depends on enabled SME encryption, normally 0. */ call __startup_secondary_64 addq$(init_top_pgt - __START_KERNEL_map), %rax You can hide that stuff in C-code nicely without adding any cruft to the ASM code. Moving the call to verify_cpu into the C-code might be quite a bit of change. Currently, the verify_cpu code is included code and not a global function. I can still do the __startup_secondary_64() function and then look to incorporate verify_cpu into both __startup_64() and __startup_secondary_64() as a post-patch to this series. At least the secondary path will have a base C routine to which modifications can be made in the future if needed. How does that sound? Thanks, Tom Thanks, tglx ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-4.8-testing test] 110904: regressions - FAIL
flight 110904 xen-4.8-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/110904/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-xtf-amd64-amd64-4 45 xtf/test-hvm64-lbr-tsx-vmentry fail REGR. vs. 110437 Tests which did not succeed, but are not blocking: test-xtf-amd64-amd64-1 45 xtf/test-hvm64-lbr-tsx-vmentry fail like 110437 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stopfail like 110437 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 110437 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail like 110437 test-amd64-amd64-xl-rtds 9 debian-install fail like 110437 build-amd64-prev 6 xen-build/dist-test fail never pass build-i386-prev 6 xen-build/dist-test fail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 9 windows-installfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 9 windows-installfail never pass test-arm64-arm64-xl-credit2 12 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 saverestore-support-checkfail never pass test-arm64-arm64-xl 12 migrate-support-checkfail never pass test-arm64-arm64-xl 13 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 13 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-amd64-i386-xl-qemut-ws16-amd64 9 windows-install fail never pass test-amd64-amd64-xl-qemut-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 9 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 9 windows-install fail never pass test-amd64-i386-xl-qemuu-ws16-amd64 9 windows-install fail never pass version targeted for testing: xen aedaa82c2f671544a84009b53154477f4431c629 baseline version: xen c427a81dee142a0f7155b8ed7074e1f489336637 Last test of basis 110437 2017-06-14 11:01:50 Z7 days
[Xen-devel] [PATCH for-4.9 v2] xen/livepatch: Don't crash on encountering STN_UNDEF relocations
A symndx of STN_UNDEF is special, and means a symbol value of 0. While legitimate in the ELF standard, its existance in a livepatch is questionable at best. Until a plausible usecase presents itself, reject such a relocation with -EOPNOTSUPP. Additionally, perform a safety check on elf->sym[symndx].sym before derefencing it, to avoid tripping over a NULL pointer when calculating val. Signed-off-by: Andrew Cooper --- CC: Konrad Rzeszutek Wilk CC: Ross Lagerwall CC: Jan Beulich CC: Stefano Stabellini CC: Julien Grall v2: * Reject STN_UNDEF with -EOPNOTSUPP --- xen/arch/arm/arm32/livepatch.c | 17 +++-- xen/arch/arm/arm64/livepatch.c | 17 +++-- xen/arch/x86/livepatch.c | 17 +++-- 3 files changed, 45 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/arm32/livepatch.c b/xen/arch/arm/arm32/livepatch.c index a328179..53fee91 100644 --- a/xen/arch/arm/arm32/livepatch.c +++ b/xen/arch/arm/arm32/livepatch.c @@ -254,14 +254,27 @@ int arch_livepatch_perform(struct livepatch_elf *elf, addend = get_addend(type, dest); } +if ( symndx == STN_UNDEF ) +{ +dprintk(XENLOG_ERR, LIVEPATCH "%s: Encountered STN_UNDEF\n", +elf->name); +return -EOPNOTSUPP; +} + if ( symndx > elf->nsym ) { dprintk(XENLOG_ERR, LIVEPATCH "%s: Relative symbol wants symbol@%u which is past end!\n", elf->name, symndx); return -EINVAL; } - -val = elf->sym[symndx].sym->st_value; /* S */ +else if ( !elf->sym[symndx].sym ) +{ +dprintk(XENLOG_ERR, LIVEPATCH "%s: No relative symbol@%u\n", +elf->name, symndx); +return -EINVAL; +} +else +val = elf->sym[symndx].sym->st_value; /* S */ rc = perform_rel(type, dest, val, addend); switch ( rc ) diff --git a/xen/arch/arm/arm64/livepatch.c b/xen/arch/arm/arm64/livepatch.c index 63929b1..b033763 100644 --- a/xen/arch/arm/arm64/livepatch.c +++ b/xen/arch/arm/arm64/livepatch.c @@ -252,14 +252,27 @@ int arch_livepatch_perform_rela(struct livepatch_elf *elf, int ovf = 0; uint64_t val; +if ( symndx == STN_UNDEF ) +{ +dprintk(XENLOG_ERR, LIVEPATCH "%s: Encountered STN_UNDEF\n", +elf->name); +return -EOPNOTSUPP; +} + if ( symndx > elf->nsym ) { dprintk(XENLOG_ERR, LIVEPATCH "%s: Relative relocation wants symbol@%u which is past end!\n", elf->name, symndx); return -EINVAL; } - -val = elf->sym[symndx].sym->st_value + r->r_addend; /* S+A */ +else if ( !elf->sym[symndx].sym ) +{ +dprintk(XENLOG_ERR, LIVEPATCH "%s: No relative symbol@%u\n", +elf->name, symndx); +return -EINVAL; +} +else +val = elf->sym[symndx].sym->st_value + r->r_addend; /* S+A */ /* ARM64 operations at minimum are always 32-bit. */ if ( r->r_offset >= base->sec->sh_size || diff --git a/xen/arch/x86/livepatch.c b/xen/arch/x86/livepatch.c index 7917610..bfa576c 100644 --- a/xen/arch/x86/livepatch.c +++ b/xen/arch/x86/livepatch.c @@ -170,14 +170,27 @@ int arch_livepatch_perform_rela(struct livepatch_elf *elf, uint8_t *dest = base->load_addr + r->r_offset; uint64_t val; +if ( symndx == STN_UNDEF ) +{ +dprintk(XENLOG_ERR, LIVEPATCH "%s: Encountered STN_UNDEF\n", +elf->name); +return -EOPNOTSUPP; +} + if ( symndx > elf->nsym ) { dprintk(XENLOG_ERR, LIVEPATCH "%s: Relative relocation wants symbol@%u which is past end!\n", elf->name, symndx); return -EINVAL; } - -val = r->r_addend + elf->sym[symndx].sym->st_value; +else if ( !elf->sym[symndx].sym ) +{ +dprintk(XENLOG_ERR, LIVEPATCH "%s: No symbol@%u\n", +elf->name, symndx); +return -EINVAL; +} +else +val = r->r_addend + elf->sym[symndx].sym->st_value; switch ( ELF64_R_TYPE(r->r_info) ) { -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/11] public: adjust documentation following XSA-217
On Wed, 21 Jun 2017, George Dunlap wrote: > On 21/06/17 16:54, Jan Beulich wrote: > On 21.06.17 at 17:44, wrote: > >> When you have a long series like this, could you name the files in a way > >> that makes it easier for a shell script to get their order? e.g., > >> 01-xsa217-doc.patch, 02-gnttab-xfer-xenheap.patch, &c? > >> > >> Having to manually save-and-apply the name of each patch of an > >> eleven-patch series separately is fairly annoying. If they started with > >> a number, I could save them all to the same directory and then use "for > >> patch in *.patch ; do..." to apply them. > >> > >> (Using `git send-email` would also make things a lot easier...) > > > > Well, that's kind of difficult without using git for development work. > > I can try to remember to name patches suitably, but that's > > another manual step for me then > > Well, it's either an extra manual step for you, or an extra manual step > for every person who reviews your code. You don't need to justify to me > why you don't want to use git for development. However, it's not fair > to externalize the cost of your development preferences to everybody else. > > When it's just one or two patches I'm willing to make some > accommodation, but a series like this it's too much. I don't use git either (I use guilt). My solution to this problem is to `guilt push' all my patches, then export them back as patches from the commits using `git format-patch'. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-4.7-testing test] 110902: trouble: broken/fail/pass
flight 110902 xen-4.7-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/110902/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: test-armhf-armhf-xl 3 host-install(3)broken REGR. vs. 110430 Regressions which are regarded as allowable (not blocking): test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stopfail REGR. vs. 110430 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail like 110430 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stopfail like 110430 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 110430 test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail like 110430 test-armhf-armhf-libvirt 13 saverestore-support-checkfail like 110430 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeatfail like 110430 test-amd64-amd64-xl-qemut-ws16-amd64 9 windows-installfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 9 windows-installfail never pass test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-arm64-arm64-xl-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 13 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 saverestore-support-checkfail never pass test-arm64-arm64-xl 12 migrate-support-checkfail never pass test-arm64-arm64-xl 13 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 12 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-amd64-i386-xl-qemuu-win10-i386 9 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 9 windows-installfail never pass test-amd64-amd64-xl-qemut-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemut-ws16-amd64 9 windows-install fail never pass test-amd64-i386-xl-qemut-win10-i386 9 windows-install fail never pass test-amd64-i386-xl-qemuu-ws16-amd64 9 windows-install fail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass version targeted for testing: xen db2a8fe8b2f0a65ddc32ad758b7274ac36152209 baseline version: xen 84cd8d3fbdfbc0655ad242da1d2fdadddf5be89e Last test of basis 110430 2017-06-14 06:46:55 Z7 days Testing same since 110902 2017-06-21 00:13:41 Z0 days1 attempts People who touched revisions under test: Andrew Cooper George Dunlap Jan Beulich Julien Gr
Re: [Xen-devel] [RFC 6/6] acpi:arm64: Add support for parsing IORT table
On 12/06/17 14:44, Jan Beulich wrote: On 12.06.17 at 15:36, wrote: >> On 09/06/17 12:15, Robin Murphy wrote: >>> On 08/06/17 20:30, Sameer Goel wrote: >>> [...] /** - * iort_iommu_configure - Set-up IOMMU configuration for a device. + * iort_iommu_configure - Set-up IOMMU configuration for a device. This + * function sets up the fwspec as needed for a given device. Only PCI + * devices are supported for now. * * @dev: device to configure * - * Returns: iommu_ops pointer on configuration success - * NULL on configuration failure + * Returns: Appropriate acpi_status */ -const struct iommu_ops *iort_iommu_configure(struct device *dev) +acpi_status iort_iommu_configure(struct device *dev) { struct acpi_iort_node *node, *parent; - const struct iommu_ops *ops = NULL; u32 streamid = 0; + acpi_status status = AE_OK; if (dev_is_pci(dev)) { - struct pci_bus *bus = to_pci_dev(dev)->bus; + struct pci_dev *pci_device = to_pci_dev(dev); u32 rid; - pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, - &rid); + rid = PCI_BDF2(pci_device->bus,pci_device->devfn); >>> >>> Beware that the Linux code isn't actually correct to begin with[1]. I >>> don't know how much Xen deals with PCI bridges and quirks, but as it >>> stands you should be able to trivially expose the flaw here by plugging >>> in a Marvell 88SE912x-based SATA card and watching either DMA or MSIs >>> (or even both) kick up stream table faults. >>> >>> Robin. >>> >>> [1]:http://www.spinics.net/lists/linux-acpi/msg74844.html >> >> I am not sure how x86 is handling that. The closest I can find would be >> domain_context_mapping. I have CCed x86 folks to get more feedback here. > > I'm lacking enough context to know whether this is the issue > with what we call phantom devices, or something else. Phantom functions, PCIe-PCI bridges, basically anything that can result in traffic from a single endpoint appearing on multiple RIDs, or a RID other than the device's BDF, at the root complex (and IOMMUs/MSI doorbells beyond). Robin. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/11] public: adjust documentation following XSA-217
On 21/06/17 16:54, Jan Beulich wrote: On 21.06.17 at 17:44, wrote: >> When you have a long series like this, could you name the files in a way >> that makes it easier for a shell script to get their order? e.g., >> 01-xsa217-doc.patch, 02-gnttab-xfer-xenheap.patch, &c? >> >> Having to manually save-and-apply the name of each patch of an >> eleven-patch series separately is fairly annoying. If they started with >> a number, I could save them all to the same directory and then use "for >> patch in *.patch ; do..." to apply them. >> >> (Using `git send-email` would also make things a lot easier...) > > Well, that's kind of difficult without using git for development work. > I can try to remember to name patches suitably, but that's > another manual step for me then Well, it's either an extra manual step for you, or an extra manual step for every person who reviews your code. You don't need to justify to me why you don't want to use git for development. However, it's not fair to externalize the cost of your development preferences to everybody else. When it's just one or two patches I'm willing to make some accommodation, but a series like this it's too much. > , and while I continue to attach > files I would have hoped that the mail bodies nowadays come > through uncorrupted (and hence I'd expect file names to be > chosen by your mail client based on subject, which includes > numbering - that's at least how mine behaves - and there being > no actual need to use the attachments). I don't have a simple way of saving the text of the mail message without saving the attachment in-line in the file; so without writing a special tool just for your messages, I can't easily apply the patches into the git tree (so I can see the change in-situ). It looks like the mail bodies might actually be uncorrupted now; so it's possible that if you send the mail without attachments anymore I might be able to use the same workflow for you as I use for everyone else (which presumably would extend to other people who might want to review your patches). If you send a test mail (or perhaps a series) I can give it a try. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch()
> -Original Message- > From: Andrew Cooper > Sent: 21 June 2017 17:15 > To: Paul Durrant ; Xen-devel de...@lists.xen.org> > Cc: Jan Beulich > Subject: Re: [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch() > > On 21/06/17 17:04, Paul Durrant wrote: > >> -Original Message- > >> From: Andrew Cooper [mailto:andrew.coop...@citrix.com] > >> Sent: 21 June 2017 16:12 > >> To: Xen-devel > >> Cc: Andrew Cooper ; Jan Beulich > >> ; Paul Durrant > >> Subject: [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch() > >> > >> Force insn_off to a single byte, as offset can wrap around or truncate with > >> respect to sh_ctxt->insn_buf_eip under a number of normal > circumstances. > >> > >> Furthermore, don't use an ASSERT() for bounds checking the write into > >> hvmemul_ctxt->insn_buf[]. > >> > >> Signed-off-by: Andrew Cooper > >> --- > >> CC: Jan Beulich > >> CC: Paul Durrant > >> > >> For anyone wondering, this is way to explot XSA-186 > >> --- > >> xen/arch/x86/hvm/emulate.c | 15 +-- > >> 1 file changed, 13 insertions(+), 2 deletions(-) > >> > >> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > >> index 11e4aba..495e312 100644 > >> --- a/xen/arch/x86/hvm/emulate.c > >> +++ b/xen/arch/x86/hvm/emulate.c > >> @@ -939,7 +939,8 @@ int hvmemul_insn_fetch( > >> { > >> struct hvm_emulate_ctxt *hvmemul_ctxt = > >> container_of(ctxt, struct hvm_emulate_ctxt, ctxt); > >> -unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip; > >> +/* Careful, as offset can wrap or truncate WRT insn_buf_eip. */ > >> +uint8_t insn_off = offset - hvmemul_ctxt->insn_buf_eip; > > Why the change to a uint8_t? > > XSA-186 caused problems because offset was truncated at 16 bits, but all > calculations here are performed at 64 bits wide, then truncated down to > 32bits wide. As a result, insn_off could become massively positive. > > insn_off needs to be less wide than the minimum truncation width of > incoming parameters for it to work correctly. > > Code hitting the emulator can legitimately cause offset to truncate at > 32bits WRT EIP, and the only reason we aren't still vulnerable is > because insn_off is unsigned int. If it were unsigned long, we'd have > another privilege escalation vulnerability. Ok. Reviewed-by: Paul Durrant > > ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH V3] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
Fixed an issue where the maximum index allowed (31) goes beyond the actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. Coverity-ID: 1412966 Signed-off-by: Razvan Cojocaru Reviewed-by: Andrew Cooper --- Changes since V2: - Removed stale comment. - Indentation. - Added Reviewed-by. --- xen/arch/x86/monitor.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c index bedf13c..764195a 100644 --- a/xen/arch/x86/monitor.c +++ b/xen/arch/x86/monitor.c @@ -132,8 +132,8 @@ int arch_monitor_domctl_event(struct domain *d, unsigned int ctrlreg_bitmask; bool_t old_status; -/* sanity check: avoid left-shift undefined behavior */ -if ( unlikely(mop->u.mov_to_cr.index > 31) ) +if ( unlikely(mop->u.mov_to_cr.index >= + ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) ) return -EINVAL; if ( unlikely(mop->u.mov_to_cr.pad1 || mop->u.mov_to_cr.pad2) ) -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] passthrough: give XEN_DOMCTL_test_assign_device more sane semantics
On 21/06/17 16:59, Jan Beulich wrote: On 21.06.17 at 16:38, wrote: >> On 21/06/17 11:08, Jan Beulich wrote: >>> So far callers of the libxc interface passed in a domain ID which was >>> then ignored in the hypervisor. Instead, make the hypervisor honor it >>> (accepting DOMID_INVALID to obtain original behavior), allowing to >>> query whether a device is assigned to a particular domain. Ignore the >>> passed in domain ID at the libxc layer instead, in order to not break >>> existing callers. New libxc functions would need to be added if callers >>> wanted to leverage the new functionality. >> >> I don't think your modified description matches the name of the call at all. >> >> It looks like the callers expect "test_assign_device" to answer the >> question: "Can I assign a device to this domain"? > > I don't think so - the question being answered by the original > operation is "Is this device assigned to any domain?" with the > implied inverse "Is this device available to be assigned to some > domain (i.e. it is currently unassigned or owned by Dom0)?" If the question were "Is this device assigned to any domain?", then I would expect: 1. The return value to be a boolean 2. It would always return, "No it's not assigned" in the case where there is no IOMMU. However, that's not what happens: 1. It returns "success" if there is an IOMMU and the device is *not* assigned, and returns an error if the device is assigned 2. It returns an error if there is no IOMMU. The only place in the code this is called 'for real' in the tree is in libxl_pci.c:libxl__device_pci_add() if (libxl__domain_type(gc, domid) == LIBXL_DOMAIN_TYPE_HVM) { rc = xc_test_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev)); if (rc) { LOGD(ERROR, domid, "PCI device %04x:%02x:%02x.%u %s?", pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func, errno == ENOSYS ? "cannot be assigned - no IOMMU" : "already assigned to a different guest"); goto out; } } Here 'domid' is the domain to which libxl wants to assign the device. So libxl is now asking Xen, "Am I allowed to assign device $bdf to domain $domain?" Your description provides the *algorithm* by which Xen normally provides an answer: that is, normally the only thing Xen cares about is that it hasn't already been assigned to a domain. But it still remains the case that what libxl is asking is, "Can I assign X to Y?" >> It looks like it's meant to be used in XSM environments, to allow a >> policy to permit or forbid specific guests to have access to specific >> devices. On a default (non-XSM) system, the answer to that question >> doesn't depend on the domain it's being assigned to, but only whether >> the device is already assigned to another domain; but on XSM systems the >> logic can presumably be more complicated. >> >> That sounds like a perfectly sane semantic to me, and this patch removes >> that ability. > > And again I don't think so: Prior to the patch, do_domctl() at its > very top makes sure to entirely ignore the passed in domain ID. > This code sits ahead of the XSM check, so XSM has no way of > knowing which domain has been specified by the caller. Right, I see that now. Still, I assert that the original hypercall semantics is a very useful one, and what you're doing is changing the hypercall such that the question can no longer be asked. It would be better to extend things so that XSM can actually deny device assignment based on both the bdf and the domain. Do you have a particular use case in mind for your alternate hypercall? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
(Re-sent with CCs preserved). On 06/21/2017 07:06 PM, Jan Beulich wrote: On 21.06.17 at 16:56, wrote: >> --- a/xen/arch/x86/monitor.c >> +++ b/xen/arch/x86/monitor.c >> @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d, >> bool_t old_status; >> >> /* sanity check: avoid left-shift undefined behavior */ >> -if ( unlikely(mop->u.mov_to_cr.index > 31) ) >> +if ( unlikely(mop->u.mov_to_cr.index >= >> + ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) ) > > Indentation. Right, that should have matched the end of the "unlikely(" above. I'll modify it, remove the comment Wei commented on and submit V3. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [xen-4.9-testing test] 110903: regressions - FAIL
flight 110903 xen-4.9-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/110903/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-pygrub 9 debian-di-installfail REGR. vs. 110550 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemuu-win7-amd64 17 guest-start/win.repeat fail like 110542 test-amd64-i386-xl-qemut-win7-amd64 17 guest-start/win.repeat fail like 110550 test-amd64-amd64-xl-qemut-win7-amd64 15 guest-localmigrate/x10 fail like 110550 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail like 110550 test-amd64-amd64-xl-rtds 9 debian-install fail like 110550 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-xl-qemuu-ws16-amd64 9 windows-installfail never pass test-amd64-i386-libvirt-xsm 12 migrate-support-checkfail never pass test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass build-amd64-prev 6 xen-build/dist-test fail never pass test-arm64-arm64-xl-credit2 12 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 13 saverestore-support-checkfail never pass test-arm64-arm64-xl 12 migrate-support-checkfail never pass test-arm64-arm64-xl 13 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-xl-qemut-ws16-amd64 9 windows-installfail never pass test-arm64-arm64-xl-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 13 saverestore-support-checkfail never pass build-i386-prev 6 xen-build/dist-test fail never pass test-amd64-amd64-libvirt-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-xsm 12 migrate-support-checkfail never pass test-armhf-armhf-xl-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 13 saverestore-support-checkfail never pass test-amd64-i386-libvirt 12 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 11 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 12 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 12 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 12 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 12 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 13 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 12 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 13 saverestore-support-checkfail never pass test-armhf-armhf-xl 12 migrate-support-checkfail never pass test-armhf-armhf-xl 13 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 11 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 12 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2 fail never pass test-amd64-amd64-xl-qemut-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemuu-win10-i386 9 windows-install fail never pass test-amd64-amd64-xl-qemuu-win10-i386 9 windows-installfail never pass test-amd64-i386-xl-qemut-win10-i386 9 windows-install fail never pass test-amd64-i386-xl-qemut-ws16-amd64 9 windows-install fail never pass test-amd64-i386-xl-qemuu-ws16-amd64 9 windows-install fail never pass version targeted for testing: xen 408f16ddb3b05077e60c377a19d9df1b5ec11fa9 baseline version: xen e197d29514165202308fe65db6effc4835aabfeb Last test of basis 110550 2017-06-18 21:49:42 Z2 days Failing since110568 2017-06-19 13:14:32 Z2 days2 attempts Testing same since 110903 2017-06-21 00:17:01 Z0 days1 attempts ---
Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On 06/21/2017 07:06 PM, Jan Beulich wrote: On 21.06.17 at 16:56, wrote: >> --- a/xen/arch/x86/monitor.c >> +++ b/xen/arch/x86/monitor.c >> @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d, >> bool_t old_status; >> >> /* sanity check: avoid left-shift undefined behavior */ >> -if ( unlikely(mop->u.mov_to_cr.index > 31) ) >> +if ( unlikely(mop->u.mov_to_cr.index >= >> + ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) ) > > Indentation. Right, that should have matched the end of the "unlikely(" above. I'll modify it, remove the comment Wei commented on and submit V3. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 6/6] x86/hvm: Implement hvmemul_write() using real mappings
> -Original Message- > From: Andrew Cooper [mailto:andrew.coop...@citrix.com] > Sent: 21 June 2017 16:13 > To: Xen-devel > Cc: Andrew Cooper ; Jan Beulich > ; Paul Durrant ; Razvan > Cojocaru ; Mihai Donțu > > Subject: [PATCH 6/6] x86/hvm: Implement hvmemul_write() using real > mappings > > An access which crosses a page boundary is performed atomically by x86 > hardware, albeit with a severe performance penalty. An important corner > case > is when a straddled access hits two pages which differ in whether a > translation exists, or in net access rights. > > The use of hvm_copy*() in hvmemul_write() is problematic, because it > performs > a translation then completes the partial write, before moving onto the next > translation. > > If an individual emulated write straddles two pages, the first of which is > writable, and the second of which is not, the first half of the write will > complete before #PF is raised from the second half. > > This results in guest state corruption as a side effect of emulation, which > has been observed to cause windows to crash while under introspection. > > Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an > entire contents of a linear access, and vmap() the underlying frames to > provide a contiguous virtual mapping for the emulator to use. This is the > same mechanism as used by the shadow emulation code. > > This will catch any translation issues and abort the emulation before any > modifications occur. > > Signed-off-by: Andrew Cooper > --- > CC: Jan Beulich > CC: Paul Durrant > CC: Razvan Cojocaru > CC: Mihai Donțu > > While the maximum size of linear mapping is capped at 1 page, the logic in > the > helpers is written to work properly as hvmemul_ctxt->mfn[] gets longer, > specifically with XSAVE instruction emulation in mind. > > This has only had light testing so far. > --- > xen/arch/x86/hvm/emulate.c| 179 > ++ > xen/include/asm-x86/hvm/emulate.h | 7 ++ > 2 files changed, 169 insertions(+), 17 deletions(-) > > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > index 384ad0b..804bea5 100644 > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -498,6 +498,159 @@ static int hvmemul_do_mmio_addr(paddr_t > mmio_gpa, > } > > /* > + * Map the frame(s) covering an individual linear access, for writeable > + * access. May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other > errors > + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings. > + * > + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is > + * clean before use, and poisions unused slots with INVALID_MFN. > + */ > +static void *hvmemul_map_linear_addr( > +unsigned long linear, unsigned int bytes, uint32_t pfec, > +struct hvm_emulate_ctxt *hvmemul_ctxt) > +{ > +struct vcpu *curr = current; > +void *err, *mapping; > + > +/* First and final gfns which need mapping. */ > +unsigned long frame = linear >> PAGE_SHIFT, first = frame; > +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT; Do we need to worry about linear + bytes overflowing here? Also, is this ever legitimately called with bytes == 0? > + > +/* > + * mfn points to the next free slot. All used slots have a page > reference > + * held on them. > + */ > +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > + > +/* > + * The caller has no legitimate reason for trying a zero-byte write, but > + * final is calculate to fail safe in release builds. > + * > + * The maximum write size depends on the number of adjacent mfns[] > which > + * can be vmap()'d, accouting for possible misalignment within the > region. > + * The higher level emulation callers are responsible for ensuring that > + * mfns[] is large enough for the requested write size. > + */ > +if ( bytes == 0 || > + final - first > ARRAY_SIZE(hvmemul_ctxt->mfn) - 1 ) > +{ I guess not, so why the weird looking calculation for final? It's value will not be used when bytes == 0. > +ASSERT_UNREACHABLE(); > +goto unhandleable; > +} > + > +do { > +enum hvm_translation_result res; > +struct page_info *page; > +pagefault_info_t pfinfo; > +p2m_type_t p2mt; > + > +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec, > + &pfinfo, &page, NULL, &p2mt); > + > +switch ( res ) > +{ > +case HVMTRANS_okay: > +break; > + > +case HVMTRANS_bad_linear_to_gfn: > +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, > &hvmemul_ctxt->ctxt); > +err = ERR_PTR(~(long)X86EMUL_EXCEPTION); > +goto out; > + > +case HVMTRANS_bad_gfn_to_mfn: > +err = NULL; > +goto out; > + > +case HVMTRANS_gfn_paged_out: > +case HVMTRANS_gfn_shared
Re: [Xen-devel] [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch()
On 21/06/17 17:04, Paul Durrant wrote: >> -Original Message- >> From: Andrew Cooper [mailto:andrew.coop...@citrix.com] >> Sent: 21 June 2017 16:12 >> To: Xen-devel >> Cc: Andrew Cooper ; Jan Beulich >> ; Paul Durrant >> Subject: [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch() >> >> Force insn_off to a single byte, as offset can wrap around or truncate with >> respect to sh_ctxt->insn_buf_eip under a number of normal circumstances. >> >> Furthermore, don't use an ASSERT() for bounds checking the write into >> hvmemul_ctxt->insn_buf[]. >> >> Signed-off-by: Andrew Cooper >> --- >> CC: Jan Beulich >> CC: Paul Durrant >> >> For anyone wondering, this is way to explot XSA-186 >> --- >> xen/arch/x86/hvm/emulate.c | 15 +-- >> 1 file changed, 13 insertions(+), 2 deletions(-) >> >> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c >> index 11e4aba..495e312 100644 >> --- a/xen/arch/x86/hvm/emulate.c >> +++ b/xen/arch/x86/hvm/emulate.c >> @@ -939,7 +939,8 @@ int hvmemul_insn_fetch( >> { >> struct hvm_emulate_ctxt *hvmemul_ctxt = >> container_of(ctxt, struct hvm_emulate_ctxt, ctxt); >> -unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip; >> +/* Careful, as offset can wrap or truncate WRT insn_buf_eip. */ >> +uint8_t insn_off = offset - hvmemul_ctxt->insn_buf_eip; > Why the change to a uint8_t? XSA-186 caused problems because offset was truncated at 16 bits, but all calculations here are performed at 64 bits wide, then truncated down to 32bits wide. As a result, insn_off could become massively positive. insn_off needs to be less wide than the minimum truncation width of incoming parameters for it to work correctly. Code hitting the emulator can legitimately cause offset to truncate at 32bits WRT EIP, and the only reason we aren't still vulnerable is because insn_off is unsigned int. If it were unsigned long, we'd have another privilege escalation vulnerability. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
>>> On 21.06.17 at 16:56, wrote: > --- a/xen/arch/x86/monitor.c > +++ b/xen/arch/x86/monitor.c > @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d, > bool_t old_status; > > /* sanity check: avoid left-shift undefined behavior */ > -if ( unlikely(mop->u.mov_to_cr.index > 31) ) > +if ( unlikely(mop->u.mov_to_cr.index >= > + ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) ) Indentation. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch()
> -Original Message- > From: Andrew Cooper [mailto:andrew.coop...@citrix.com] > Sent: 21 June 2017 16:12 > To: Xen-devel > Cc: Andrew Cooper ; Jan Beulich > ; Paul Durrant > Subject: [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch() > > Force insn_off to a single byte, as offset can wrap around or truncate with > respect to sh_ctxt->insn_buf_eip under a number of normal circumstances. > > Furthermore, don't use an ASSERT() for bounds checking the write into > hvmemul_ctxt->insn_buf[]. > > Signed-off-by: Andrew Cooper > --- > CC: Jan Beulich > CC: Paul Durrant > > For anyone wondering, this is way to explot XSA-186 > --- > xen/arch/x86/hvm/emulate.c | 15 +-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c > index 11e4aba..495e312 100644 > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -939,7 +939,8 @@ int hvmemul_insn_fetch( > { > struct hvm_emulate_ctxt *hvmemul_ctxt = > container_of(ctxt, struct hvm_emulate_ctxt, ctxt); > -unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip; > +/* Careful, as offset can wrap or truncate WRT insn_buf_eip. */ > +uint8_t insn_off = offset - hvmemul_ctxt->insn_buf_eip; Why the change to a uint8_t? Paul > > /* > * Fall back if requested bytes are not in the prefetch cache. > @@ -953,7 +954,17 @@ int hvmemul_insn_fetch( > > if ( rc == X86EMUL_OKAY && bytes ) > { > -ASSERT(insn_off + bytes <= sizeof(hvmemul_ctxt->insn_buf)); > +/* > + * Will we overflow insn_buf[]? This shouldn't be able to > happen, > + * which means something went wrong with instruction decoding... > + */ > +if ( insn_off > sizeof(hvmemul_ctxt->insn_buf) || > + (insn_off + bytes) > sizeof(hvmemul_ctxt->insn_buf) ) > +{ > +ASSERT_UNREACHABLE(); > +return X86EMUL_UNHANDLEABLE; > +} > + > memcpy(&hvmemul_ctxt->insn_buf[insn_off], p_data, bytes); > hvmemul_ctxt->insn_buf_bytes = insn_off + bytes; > } > -- > 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] passthrough: give XEN_DOMCTL_test_assign_device more sane semantics
>>> On 21.06.17 at 16:38, wrote: > On 21/06/17 11:08, Jan Beulich wrote: >> So far callers of the libxc interface passed in a domain ID which was >> then ignored in the hypervisor. Instead, make the hypervisor honor it >> (accepting DOMID_INVALID to obtain original behavior), allowing to >> query whether a device is assigned to a particular domain. Ignore the >> passed in domain ID at the libxc layer instead, in order to not break >> existing callers. New libxc functions would need to be added if callers >> wanted to leverage the new functionality. > > I don't think your modified description matches the name of the call at all. > > It looks like the callers expect "test_assign_device" to answer the > question: "Can I assign a device to this domain"? I don't think so - the question being answered by the original operation is "Is this device assigned to any domain?" with the implied inverse "Is this device available to be assigned to some domain (i.e. it is currently unassigned or owned by Dom0)?" > It looks like it's meant to be used in XSM environments, to allow a > policy to permit or forbid specific guests to have access to specific > devices. On a default (non-XSM) system, the answer to that question > doesn't depend on the domain it's being assigned to, but only whether > the device is already assigned to another domain; but on XSM systems the > logic can presumably be more complicated. > > That sounds like a perfectly sane semantic to me, and this patch removes > that ability. And again I don't think so: Prior to the patch, do_domctl() at its very top makes sure to entirely ignore the passed in domain ID. This code sits ahead of the XSM check, so XSM has no way of knowing which domain has been specified by the caller. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/11] public: adjust documentation following XSA-217
>>> On 21.06.17 at 17:44, wrote: > When you have a long series like this, could you name the files in a way > that makes it easier for a shell script to get their order? e.g., > 01-xsa217-doc.patch, 02-gnttab-xfer-xenheap.patch, &c? > > Having to manually save-and-apply the name of each patch of an > eleven-patch series separately is fairly annoying. If they started with > a number, I could save them all to the same directory and then use "for > patch in *.patch ; do..." to apply them. > > (Using `git send-email` would also make things a lot easier...) Well, that's kind of difficult without using git for development work. I can try to remember to name patches suitably, but that's another manual step for me then, and while I continue to attach files I would have hoped that the mail bodies nowadays come through uncorrupted (and hence I'd expect file names to be chosen by your mail client based on subject, which includes numbering - that's at least how mine behaves - and there being no actual need to use the attachments). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2] libxc: add xc_domain_add_to_physmap_batch to wrap XENMEM_add_to_physmap_batch
On Wed, Jun 21, 2017 at 01:29:26AM +0800, Zhongze Liu wrote: > This is a preparation for the proposal "allow setting up shared memory areas > between VMs from xl config file". See: > V2: https://lists.xen.org/archives/html/xen-devel/2017-06/msg02256.html > V1: https://lists.xen.org/archives/html/xen-devel/2017-05/msg01288.html > > The plan is to use XENMEM_add_to_physmap_batch in xl to map foregin pages from > one DomU to another so that the page could be shared. But currently there is > no > wrapper for XENMEM_add_to_physmap_batch in libxc, so we just add a wrapper for > it. > > Signed-off-by: Zhongze Liu > --- > +int xc_domain_add_to_physmap_batch(xc_interface *xch, > + domid_t domid, > + domid_t foreign_domid, > + unsigned int space, > + unsigned int size, > + xen_ulong_t *idxs, > + xen_pfn_t *gpfns, > + int *errs) > +{ > +int rc; > +DECLARE_HYPERCALL_BOUNCE(idxs, size * sizeof(*idxs), > XC_HYPERCALL_BUFFER_BOUNCE_IN); > +DECLARE_HYPERCALL_BOUNCE(gpfns, size * sizeof(*gpfns), > XC_HYPERCALL_BUFFER_BOUNCE_IN); > +DECLARE_HYPERCALL_BOUNCE(errs, size * sizeof(*errs), > XC_HYPERCALL_BUFFER_BOUNCE_OUT); > + > +struct xen_add_to_physmap_batch xatp_batch = { > +.domid = domid, > +.space = space, > +.size = size, > +.u = {.foreign_domid = foreign_domid} Coding style issue. Just a note, the struct is different for pre-4.7 and post-4.7 Xen. You don't need to implement a version of this function for pre-4.7 Xen. > +}; > + > +if ( xc_hypercall_bounce_pre(xch, idxs) || > + xc_hypercall_bounce_pre(xch, gpfns) || > + xc_hypercall_bounce_pre(xch, errs) ) > +{ > +PERROR("Could not bounce memory for XENMEM_add_to_physmap_batch"); > +goto out; rc will be uninitialised in this exit path. > +} > + > +set_xen_guest_handle(xatp_batch.idxs, idxs); > +set_xen_guest_handle(xatp_batch.gpfns, gpfns); > +set_xen_guest_handle(xatp_batch.errs, errs); > + > +rc = do_memory_op(xch, XENMEM_add_to_physmap_batch, > + &xatp_batch, sizeof(xatp_batch)); > + > +out: > +xc_hypercall_bounce_post(xch, idxs); > +xc_hypercall_bounce_post(xch, gpfns); > +xc_hypercall_bounce_post(xch, errs); > + > +return rc; > +} > + > int xc_domain_claim_pages(xc_interface *xch, > uint32_t domid, > unsigned long nr_pages) > -- > 2.13.1 > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 01/11] public: adjust documentation following XSA-217
Jan, When you have a long series like this, could you name the files in a way that makes it easier for a shell script to get their order? e.g., 01-xsa217-doc.patch, 02-gnttab-xfer-xenheap.patch, &c? Having to manually save-and-apply the name of each patch of an eleven-patch series separately is fairly annoying. If they started with a number, I could save them all to the same directory and then use "for patch in *.patch ; do..." to apply them. (Using `git send-email` would also make things a lot easier...) -George On 21/06/17 10:30, Jan Beulich wrote: > Signed-off-by: Jan Beulich > > --- a/xen/include/public/grant_table.h > +++ b/xen/include/public/grant_table.h > @@ -411,12 +411,13 @@ typedef struct gnttab_dump_table gnttab_ > DEFINE_XEN_GUEST_HANDLE(gnttab_dump_table_t); > > /* > - * GNTTABOP_transfer_grant_ref: Transfer to a foreign domain. The > - * foreign domain has previously registered its interest in the transfer via > - * . > + * GNTTABOP_transfer: Transfer to a foreign domain. The foreign > domain > + * has previously registered its interest in the transfer via . > * > * Note that, even if the transfer fails, the specified page no longer > belongs > * to the calling domain *unless* the error is GNTST_bad_page. > + * > + * Note further that only PV guests can use this operation. > */ > struct gnttab_transfer { > /* IN parameters. */ > --- a/xen/include/public/memory.h > +++ b/xen/include/public/memory.h > @@ -102,6 +102,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_memory_reser > * Returns zero on complete success, otherwise a negative error code. > * On complete success then always @nr_exchanged == @in.nr_extents. > * On partial success @nr_exchanged indicates how much work was done. > + * > + * Note that only PV guests can use this operation. > */ > #define XENMEM_exchange 11 > struct xen_memory_exchange { > > > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/mm: drop redundant domain parameter from get_page_from_gfn_p2m()
On 21/06/17 11:12, Jan Beulich wrote: > It can always be read from the passed p2m. Take the opportunity and > also rename the function, making the "p2m" suffix a prefix, to match > other p2m functions, and convert the "gfn" parameter's type. > > Signed-off-by: Jan Beulich Looks good, thanks: Reviewed-by: George Dunlap > > --- a/xen/arch/x86/mm/hap/guest_walk.c > +++ b/xen/arch/x86/mm/hap/guest_walk.c > @@ -55,13 +55,13 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA > void *top_map; > p2m_type_t p2mt; > walk_t gw; > -unsigned long top_gfn; > +gfn_t top_gfn; > struct page_info *top_page; > > /* Get the top-level table's MFN */ > -top_gfn = cr3 >> PAGE_SHIFT; > -top_page = get_page_from_gfn_p2m(p2m->domain, p2m, top_gfn, > - &p2mt, NULL, P2M_ALLOC | P2M_UNSHARE); > +top_gfn = _gfn(cr3 >> PAGE_SHIFT); > +top_page = p2m_get_page_from_gfn(p2m, top_gfn, &p2mt, NULL, > + P2M_ALLOC | P2M_UNSHARE); > if ( p2m_is_paging(p2mt) ) > { > ASSERT(p2m_is_hostp2m(p2m)); > @@ -100,8 +100,9 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA > { > gfn_t gfn = guest_walk_to_gfn(&gw); > struct page_info *page; > -page = get_page_from_gfn_p2m(p2m->domain, p2m, gfn_x(gfn), &p2mt, > - NULL, P2M_ALLOC | P2M_UNSHARE); > + > +page = p2m_get_page_from_gfn(p2m, gfn, &p2mt, NULL, > + P2M_ALLOC | P2M_UNSHARE); > if ( page ) > put_page(page); > if ( p2m_is_paging(p2mt) ) > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -472,8 +472,8 @@ void __put_gfn(struct p2m_domain *p2m, u > } > > /* Atomically look up a GFN and take a reference count on the backing page. > */ > -struct page_info *get_page_from_gfn_p2m( > -struct domain *d, struct p2m_domain *p2m, unsigned long gfn, > +struct page_info *p2m_get_page_from_gfn( > +struct p2m_domain *p2m, gfn_t gfn, > p2m_type_t *t, p2m_access_t *a, p2m_query_t q) > { > struct page_info *page = NULL; > @@ -489,7 +489,7 @@ struct page_info *get_page_from_gfn_p2m( > { > /* Fast path: look up and get out */ > p2m_read_lock(p2m); > -mfn = __get_gfn_type_access(p2m, gfn, t, a, 0, NULL, 0); > +mfn = __get_gfn_type_access(p2m, gfn_x(gfn), t, a, 0, NULL, 0); > if ( p2m_is_any_ram(*t) && mfn_valid(mfn) > && !((q & P2M_UNSHARE) && p2m_is_shared(*t)) ) > { > @@ -497,11 +497,12 @@ struct page_info *get_page_from_gfn_p2m( > if ( unlikely(p2m_is_foreign(*t)) ) > { > struct domain *fdom = page_get_owner_and_reference(page); > -ASSERT(fdom != d); > + > +ASSERT(fdom != p2m->domain); > if ( fdom == NULL ) > page = NULL; > } > -else if ( !get_page(page, d) && > +else if ( !get_page(page, p2m->domain) && >/* Page could be shared */ >(!p2m_is_shared(*t) || !get_page(page, dom_cow)) ) > page = NULL; > @@ -517,14 +518,14 @@ struct page_info *get_page_from_gfn_p2m( > } > > /* Slow path: take the write lock and do fixups */ > -mfn = get_gfn_type_access(p2m, gfn, t, a, q, NULL); > +mfn = get_gfn_type_access(p2m, gfn_x(gfn), t, a, q, NULL); > if ( p2m_is_ram(*t) && mfn_valid(mfn) ) > { > page = mfn_to_page(mfn); > -if ( !get_page(page, d) ) > +if ( !get_page(page, p2m->domain) ) > page = NULL; > } > -put_gfn(d, gfn); > +put_gfn(p2m->domain, gfn_x(gfn)); > > return page; > } > @@ -1900,7 +1901,7 @@ void *map_domain_gfn(struct p2m_domain * > } > > /* Translate the gfn, unsharing if shared. */ > -page = get_page_from_gfn_p2m(p2m->domain, p2m, gfn_x(gfn), p2mt, NULL, > q); > +page = p2m_get_page_from_gfn(p2m, gfn, p2mt, NULL, q); > if ( p2m_is_paging(*p2mt) ) > { > ASSERT(p2m_is_hostp2m(p2m)); > --- a/xen/include/asm-x86/p2m.h > +++ b/xen/include/asm-x86/p2m.h > @@ -465,9 +465,7 @@ static inline mfn_t get_gfn_query_unlock > * and should be used by any path that intends to write to the backing page. > * Returns NULL if the page is not backed by RAM. > * The caller is responsible for calling put_page() afterwards. */ > -struct page_info *get_page_from_gfn_p2m(struct domain *d, > -struct p2m_domain *p2m, > -unsigned long gfn, > +struct page_info *p2m_get_page_from_gfn(struct p2m_domain *p2m, gfn_t gfn, > p2m_type_t *t, p2m_access_t *a, > p2m_query_t q); > > @@ -477,7 +475,7 @@ static inline struct page_info *get_page > struct page_info *page; >
Re: [Xen-devel] [PATCH v7 08/36] x86/mm: Add support to enable SME in early boot processing
On Wed, 21 Jun 2017, Tom Lendacky wrote: > On 6/21/2017 2:16 AM, Thomas Gleixner wrote: > > Why is this an unconditional function? Isn't the mask simply 0 when the MEM > > ENCRYPT support is disabled? > > I made it unconditional because of the call from head_64.S. I can't make > use of the C level static inline function and since the mask is not a > variable if CONFIG_AMD_MEM_ENCRYPT is not configured (#defined to 0) I > can't reference the variable directly. > > I could create a #define in head_64.S that changes this to load rax with > the variable if CONFIG_AMD_MEM_ENCRYPT is configured or a zero if it's > not or add a #ifdef at that point in the code directly. Thoughts on > that? See below. > > That does not make any sense. Neither the call to sme_encrypt_kernel() nor > > the following call to sme_get_me_mask(). > > > > __startup_64() is already C code, so why can't you simply call that from > > __startup_64() in C and return the mask from there? > > I was trying to keep it explicit as to what was happening, but I can > move those calls into __startup_64(). That's much preferred. And the return value wants to be documented in both C and ASM code. > I'll still need the call to sme_get_me_mask() in the secondary_startup_64 > path, though (depending on your thoughts to the above response). call verify_cpu movq$(init_top_pgt - __START_KERNEL_map), %rax So if you make that: /* * Sanitize CPU configuration and retrieve the modifier * for the initial pgdir entry which will be programmed * into CR3. Depends on enabled SME encryption, normally 0. */ call __startup_secondary_64 addq$(init_top_pgt - __START_KERNEL_map), %rax You can hide that stuff in C-code nicely without adding any cruft to the ASM code. Thanks, tglx ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 25/36] swiotlb: Add warnings for use of bounce buffers with SME
On 6/21/2017 5:50 AM, Borislav Petkov wrote: On Fri, Jun 16, 2017 at 01:54:36PM -0500, Tom Lendacky wrote: Add warnings to let the user know when bounce buffers are being used for DMA when SME is active. Since the bounce buffers are not in encrypted memory, these notifications are to allow the user to determine some appropriate action - if necessary. Actions can range from utilizing an IOMMU, replacing the device with another device that can support 64-bit DMA, ignoring the message if the device isn't used much, etc. Signed-off-by: Tom Lendacky --- include/linux/dma-mapping.h | 11 +++ include/linux/mem_encrypt.h |8 lib/swiotlb.c |3 +++ 3 files changed, 22 insertions(+) diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 4f3eece..ee2307e 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -10,6 +10,7 @@ #include #include #include +#include /** * List of possible attributes associated with a DMA mapping. The semantics @@ -577,6 +578,11 @@ static inline int dma_set_mask(struct device *dev, u64 mask) if (!dev->dma_mask || !dma_supported(dev, mask)) return -EIO; + + /* Since mask is unsigned, this can only be true if SME is active */ + if (mask < sme_dma_mask()) + dev_warn(dev, "SME is active, device will require DMA bounce buffers\n"); + *dev->dma_mask = mask; return 0; } @@ -596,6 +602,11 @@ static inline int dma_set_coherent_mask(struct device *dev, u64 mask) { if (!dma_supported(dev, mask)) return -EIO; + + /* Since mask is unsigned, this can only be true if SME is active */ + if (mask < sme_dma_mask()) + dev_warn(dev, "SME is active, device will require DMA bounce buffers\n"); Looks to me like those two checks above need to be a: void sme_check_mask(struct device *dev, u64 mask) { if (!sme_me_mask) return; /* Since mask is unsigned, this can only be true if SME is active */ if (mask < (((u64)sme_me_mask << 1) - 1)) dev_warn(dev, "SME is active, device will require DMA bounce buffers\n"); } which gets called and sme_dma_mask() is not really needed. Makes a lot of sense, I'll update the patch. Thanks, Tom ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/2] x86/p2m: make p2m_alloc_ptp() return an MFN
On 21/06/17 11:25, Jan Beulich wrote: > None of the callers really needs the struct page_info pointer. > > Signed-off-by: Jan Beulich Acked-by: George Dunlap > > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -569,7 +569,7 @@ int p2m_set_entry(struct p2m_domain *p2m > return rc; > } > > -struct page_info *p2m_alloc_ptp(struct p2m_domain *p2m, unsigned long type) > +mfn_t p2m_alloc_ptp(struct p2m_domain *p2m, unsigned long type) > { > struct page_info *pg; > > @@ -577,13 +577,13 @@ struct page_info *p2m_alloc_ptp(struct p > ASSERT(p2m->domain); > ASSERT(p2m->domain->arch.paging.alloc_page); > pg = p2m->domain->arch.paging.alloc_page(p2m->domain); > -if (pg == NULL) > -return NULL; > +if ( !pg ) > +return INVALID_MFN; > > page_list_add_tail(pg, &p2m->pages); > pg->u.inuse.type_info = type | 1 | PGT_validated; > > -return pg; > +return page_to_mfn(pg); > } > > void p2m_free_ptp(struct p2m_domain *p2m, struct page_info *pg) > @@ -609,7 +609,7 @@ void p2m_free_ptp(struct p2m_domain *p2m > */ > int p2m_alloc_table(struct p2m_domain *p2m) > { > -struct page_info *p2m_top; > +mfn_t top_mfn; > struct domain *d = p2m->domain; > int rc = 0; > > @@ -632,14 +632,14 @@ int p2m_alloc_table(struct p2m_domain *p > > P2M_PRINTK("allocating p2m table\n"); > > -p2m_top = p2m_alloc_ptp(p2m, PGT_l4_page_table); > -if ( p2m_top == NULL ) > +top_mfn = p2m_alloc_ptp(p2m, PGT_l4_page_table); > +if ( mfn_eq(top_mfn, INVALID_MFN) ) > { > p2m_unlock(p2m); > return -ENOMEM; > } > > -p2m->phys_table = pagetable_from_mfn(page_to_mfn(p2m_top)); > +p2m->phys_table = pagetable_from_mfn(top_mfn); > > if ( hap_enabled(d) ) > iommu_share_p2m_table(d); > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -225,16 +225,16 @@ static void ept_p2m_type_to_flags(struct > /* Fill in middle levels of ept table */ > static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t > *ept_entry) > { > -struct page_info *pg; > +mfn_t mfn; > ept_entry_t *table; > unsigned int i; > > -pg = p2m_alloc_ptp(p2m, 0); > -if ( pg == NULL ) > +mfn = p2m_alloc_ptp(p2m, 0); > +if ( mfn_eq(mfn, INVALID_MFN) ) > return 0; > > ept_entry->epte = 0; > -ept_entry->mfn = page_to_mfn(pg); > +ept_entry->mfn = mfn_x(mfn); > ept_entry->access = p2m->default_access; > > ept_entry->r = ept_entry->w = ept_entry->x = 1; > @@ -243,7 +243,7 @@ static int ept_set_middle_entry(struct p > > ept_entry->suppress_ve = 1; > > -table = __map_domain_page(pg); > +table = map_domain_page(mfn); > > for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ ) > table[i].suppress_ve = 1; > --- a/xen/arch/x86/mm/p2m-pt.c > +++ b/xen/arch/x86/mm/p2m-pt.c > @@ -195,7 +195,7 @@ p2m_next_level(struct p2m_domain *p2m, v > l1_pgentry_t *p2m_entry; > l1_pgentry_t new_entry; > void *next; > -struct page_info *pg; > +mfn_t mfn; > unsigned int i, flags; > unsigned long pfn; > > @@ -206,12 +206,11 @@ p2m_next_level(struct p2m_domain *p2m, v > /* PoD/paging: Not present doesn't imply empty. */ > if ( !l1e_get_flags(*p2m_entry) ) > { > -pg = p2m_alloc_ptp(p2m, type); > -if ( pg == NULL ) > +mfn = p2m_alloc_ptp(p2m, type); > +if ( mfn_eq(mfn, INVALID_MFN) ) > return -ENOMEM; > > -new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)), > - P2M_BASE_FLAGS | _PAGE_RW); > +new_entry = l1e_from_pfn(mfn_x(mfn), P2M_BASE_FLAGS | _PAGE_RW); > > switch ( type ) { > case PGT_l3_page_table: > @@ -239,11 +238,11 @@ p2m_next_level(struct p2m_domain *p2m, v > /* split 1GB pages into 2MB pages */ > if ( type == PGT_l2_page_table && (flags & _PAGE_PSE) ) > { > -pg = p2m_alloc_ptp(p2m, PGT_l2_page_table); > -if ( pg == NULL ) > +mfn = p2m_alloc_ptp(p2m, PGT_l2_page_table); > +if ( mfn_eq(mfn, INVALID_MFN) ) > return -ENOMEM; > > -l1_entry = __map_domain_page(pg); > +l1_entry = map_domain_page(mfn); > for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ ) > { > new_entry = l1e_from_pfn(pfn | (i * L1_PAGETABLE_ENTRIES), > flags); > @@ -251,8 +250,7 @@ p2m_next_level(struct p2m_domain *p2m, v > p2m->write_p2m_entry(p2m, gfn, l1_entry + i, new_entry, 2); > } > unmap_domain_page(l1_entry); > -new_entry = l1e_from_pfn(mfn_x(page_to_mfn(pg)), > - P2M_BASE_FLAGS | _PAGE_RW); /* disable PSE > */ > +new_entry = l1e_from_pfn(mfn_x(mfn), P2M_BASE_FLAGS | _PAGE_RW); > p2m_add_iommu_flags(&new_entry, 2, IOMMUF_readable|IOMMUF_writable); > p2m->write_p2m_entry(p2m, gfn, p2m_entr
[Xen-devel] [xen-unstable-smoke test] 110937: regressions - trouble: blocked/broken/fail/pass
flight 110937 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/110937/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf 3 host-install(3)broken REGR. vs. 110907 test-amd64-amd64-xl-qemuu-debianhvm-i386 15 guest-localmigrate/x10 fail REGR. vs. 110907 Tests which did not succeed, but are not blocking: test-armhf-armhf-xl 1 build-check(1) blocked n/a test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt 12 migrate-support-checkfail never pass version targeted for testing: xen 93de762bcafb458b91955399d55acd5b2104490d baseline version: xen d8f1b48fd665d7aad1711de2f073540d07d2d041 Last test of basis 110907 2017-06-21 00:20:54 Z0 days Testing same since 110937 2017-06-21 14:01:21 Z0 days1 attempts People who touched revisions under test: Jan Beulich Juergen Gross Wei Liu jobs: build-amd64 pass build-armhf broken build-amd64-libvirt pass test-armhf-armhf-xl blocked test-arm64-arm64-xl-xsm broken test-amd64-amd64-xl-qemuu-debianhvm-i386 fail test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary broken-step build-armhf host-install(3) Not pushing. commit 93de762bcafb458b91955399d55acd5b2104490d Author: Wei Liu Date: Mon Jun 5 16:04:57 2017 +0100 x86/traps: remove now unused inclusion of emulate.h Signed-off-by: Wei Liu Acked-by: Jan Beulich commit bb705e608d0b6d41eae717c2af594a2cfa796c2a Author: Wei Liu Date: Mon Jun 5 13:07:16 2017 +0100 x86: move PV invalid op emulation code Move the code to pv/emul-inv-op.c. Both functions are unchanged. Provide pv_emulate_invalid_op and use it in traps.c. Signed-off-by: Wei Liu Acked-by: Jan Beulich commit d3dfd92e592898891af6ba861296f35e9e02bbf1 Author: Wei Liu Date: Mon Jun 5 12:59:07 2017 +0100 x86: move PV gate op emulation code Move the code to pv/emul-gate-op.c. Prefix emulate_gate_op with pv_ and export it via pv/traps.h. Pure code motion except for the rename. Signed-off-by: Wei Liu Acked-by: Jan Beulich commit ecaba067e1e433c7b2d7e309f8eee6f2fdb5c8a0 Author: Wei Liu Date: Mon Jun 5 12:44:51 2017 +0100 x86: move PV privileged instruction emulation code Move the code to pv/emul-priv-op.c. Prefix emulate_privileged_op with pv_ and export it via pv/traps.h. Also move gpr_switch.S since it is used by the privileged instruction emulation code only. Code motion only except for the rename. Cleanup etc will come later. Signed-off-by: Wei Liu Acked-by: Jan Beulich commit 46b488cb661bca81559b24883ffd22602d71a65a Author: Wei Liu Date: Mon Jun 5 12:26:18 2017 +0100 x86: factor out common PV emulation code We're going to split PV emulation code into several files. This patch extracts the functions needed by them into a dedicated file. The functions are now prefixed with "pv_emul_" and exported via a local header file. While at it, change bool_t to bool. Signed-off-by: Wei Liu Acked-by: Jan Beulich commit 48d0c822640f8ce4754de16f1bee5c995bac7078 Author: Juergen Gross Date: Thu Jun 15 11:58:27 2017 +0200 tools/xen-detect: try sysfs node for obtaining guest type Instead of relying on cpuid instruction behaviour to tell which domain type we are just try asking the kernel via the appropriate sysfs node (added in Linux kernel 4.13). Keep the old detection logic as a fallback for older kernels. Signed-off-by: Juergen Gross (qemu changes not included) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/2] x86/p2m: simplify p2m_next_level()
On 21/06/17 11:25, Jan Beulich wrote: > Calculate entry PFN and flags just once, making the respective > variables (and also pg) function wide. Take the opportunity and also > make the induction variable unsigned. Hmm -- I'm a fan of keeping the scope of variables limited to where they're actually used, to make sure stale values don't end up being used. pg in particular I think should be kept local to the if() statements; there's absolutely no benefit to making it function-wide. I don't really see much benefit in making 'flags' and 'pfn' function-wide either; it's not easier to read, IMHO, it just might save a tiny bit of extra code, at the expense of removing some safety. If you want to avoid code duplication, it seems like merging the two if() statements (of which at most one can be true) would be a better approach. Something like the attached (build-tested only)? -George > > Signed-off-by: Jan Beulich > > --- a/xen/arch/x86/mm/p2m-pt.c > +++ b/xen/arch/x86/mm/p2m-pt.c > @@ -195,7 +195,9 @@ p2m_next_level(struct p2m_domain *p2m, v > l1_pgentry_t *p2m_entry; > l1_pgentry_t new_entry; > void *next; > -int i; > +struct page_info *pg; > +unsigned int i, flags; > +unsigned long pfn; > > if ( !(p2m_entry = p2m_find_entry(*table, gfn_remainder, gfn, >shift, max)) ) > @@ -204,8 +206,6 @@ p2m_next_level(struct p2m_domain *p2m, v > /* PoD/paging: Not present doesn't imply empty. */ > if ( !l1e_get_flags(*p2m_entry) ) > { > -struct page_info *pg; > - > pg = p2m_alloc_ptp(p2m, type); > if ( pg == NULL ) > return -ENOMEM; > @@ -232,21 +232,17 @@ p2m_next_level(struct p2m_domain *p2m, v > } > } > > -ASSERT(l1e_get_flags(*p2m_entry) & (_PAGE_PRESENT|_PAGE_PSE)); > +flags = l1e_get_flags(*p2m_entry); > +pfn = l1e_get_pfn(*p2m_entry); > +ASSERT(flags & (_PAGE_PRESENT|_PAGE_PSE)); > > /* split 1GB pages into 2MB pages */ > -if ( type == PGT_l2_page_table && (l1e_get_flags(*p2m_entry) & > _PAGE_PSE) ) > +if ( type == PGT_l2_page_table && (flags & _PAGE_PSE) ) > { > -unsigned long flags, pfn; > -struct page_info *pg; > - > pg = p2m_alloc_ptp(p2m, PGT_l2_page_table); > if ( pg == NULL ) > return -ENOMEM; > > -flags = l1e_get_flags(*p2m_entry); > -pfn = l1e_get_pfn(*p2m_entry); > - > l1_entry = __map_domain_page(pg); > for ( i = 0; i < L2_PAGETABLE_ENTRIES; i++ ) > { > @@ -263,19 +259,14 @@ p2m_next_level(struct p2m_domain *p2m, v > > > /* split single 2MB large page into 4KB page in P2M table */ > -if ( type == PGT_l1_page_table && (l1e_get_flags(*p2m_entry) & > _PAGE_PSE) ) > +if ( type == PGT_l1_page_table && (flags & _PAGE_PSE) ) > { > -unsigned long flags, pfn; > -struct page_info *pg; > - > pg = p2m_alloc_ptp(p2m, PGT_l1_page_table); > if ( pg == NULL ) > return -ENOMEM; > > /* New splintered mappings inherit the flags of the old superpage, > * with a little reorganisation for the _PAGE_PSE_PAT bit. */ > -flags = l1e_get_flags(*p2m_entry); > -pfn = l1e_get_pfn(*p2m_entry); > if ( pfn & 1 ) /* ==> _PAGE_PSE_PAT was set */ > pfn -= 1;/* Clear it; _PAGE_PSE becomes _PAGE_PAT */ > else > > > From 68931738e1f555c929cc77d9338043dc38f4610e Mon Sep 17 00:00:00 2001 From: George Dunlap Date: Wed, 21 Jun 2017 16:08:31 +0100 Subject: [PATCH] x86/p2m-pt: simplify p2m_next_level() [Insert description here] Signed-off-by: George Dunlap --- xen/arch/x86/mm/p2m-pt.c | 77 +--- 1 file changed, 33 insertions(+), 44 deletions(-) diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c index 06e64b8..4e1028e 100644 --- a/xen/arch/x86/mm/p2m-pt.c +++ b/xen/arch/x86/mm/p2m-pt.c @@ -195,7 +195,7 @@ p2m_next_level(struct p2m_domain *p2m, void **table, l1_pgentry_t *p2m_entry; l1_pgentry_t new_entry; void *next; -int i; +unsigned int i; if ( !(p2m_entry = p2m_find_entry(*table, gfn_remainder, gfn, shift, max)) ) @@ -232,12 +232,10 @@ p2m_next_level(struct p2m_domain *p2m, void **table, } } -ASSERT(l1e_get_flags(*p2m_entry) & (_PAGE_PRESENT|_PAGE_PSE)); - /* split 1GB pages into 2MB pages */ -if ( type == PGT_l2_page_table && (l1e_get_flags(*p2m_entry) & _PAGE_PSE) ) +if ( l1e_get_flags(*p2m_entry) & _PAGE_PSE ) { -unsigned long flags, pfn; +unsigned long flags, pfn, level; struct page_info *pg; pg = p2m_alloc_ptp(p2m, PGT_l2_page_table); @@ -247,54 +245,45 @@ p2m_next_level(struct p2m_domain *p2m, void **table, flags = l1e_get_flags(*p2m_entry); pfn = l1e_get_pfn(*p2m_entry); -
Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On Wed, Jun 21, 2017 at 06:12:47PM +0300, Razvan Cojocaru wrote: > On 06/21/2017 06:10 PM, Wei Liu wrote: > > On Wed, Jun 21, 2017 at 05:56:02PM +0300, Razvan Cojocaru wrote: > >> Fixed an issue where the maximum index allowed (31) goes beyond the > >> actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. > >> Coverity-ID: 1412966 > >> > >> Signed-off-by: Razvan Cojocaru > >> > >> --- > >> Changes since V1: > >> - Changed '3' to 'ARRAY_SIZE(...)'. > >> --- > >> xen/arch/x86/monitor.c | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c > >> index bedf13c..af68a79 100644 > >> --- a/xen/arch/x86/monitor.c > >> +++ b/xen/arch/x86/monitor.c > >> @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d, > >> bool_t old_status; > >> > >> /* sanity check: avoid left-shift undefined behavior */ > > > > This comment should be deleted now. > > It technically continues to be correct, but if you'd like I can send V3 > - otherwise (and if it's not too much hassle) it can be deleted on > commit. I'm happy to accomodate either scenario. > I don't think I care enough really. :-) Since Andrew has reviewed this patch, it can be committed (by him) at some point. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 4/9] xen/pci: split code to size BARs from pci_add_device
On Fri, May 19, 2017 at 07:56:55AM -0600, Jan Beulich wrote: > >>> On 27.04.17 at 16:35, wrote: > > @@ -663,38 +708,13 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn, > > seg, bus, slot, func, i); > > continue; > > } > > -pci_conf_write32(seg, bus, slot, func, idx, ~0); > > -if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == > > - PCI_BASE_ADDRESS_MEM_TYPE_64 ) > > -{ > > -if ( i >= PCI_SRIOV_NUM_BARS ) > > -{ > > -printk(XENLOG_WARNING > > - "SR-IOV device %04x:%02x:%02x.%u with > > 64-bit" > > - " vf BAR in last slot\n", > > - seg, bus, slot, func); > > -break; > > -} > > -hi = pci_conf_read32(seg, bus, slot, func, idx + 4); > > -pci_conf_write32(seg, bus, slot, func, idx + 4, ~0); > > -} > > -pdev->vf_rlen[i] = pci_conf_read32(seg, bus, slot, func, > > idx) & > > - PCI_BASE_ADDRESS_MEM_MASK; > > -if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == > > - PCI_BASE_ADDRESS_MEM_TYPE_64 ) > > -{ > > -pdev->vf_rlen[i] |= (u64)pci_conf_read32(seg, bus, > > - slot, func, > > - idx + 4) << > > 32; > > -pci_conf_write32(seg, bus, slot, func, idx + 4, hi); > > -} > > -else if ( pdev->vf_rlen[i] ) > > -pdev->vf_rlen[i] |= (u64)~0 << 32; > > -pci_conf_write32(seg, bus, slot, func, idx, bar); > > -pdev->vf_rlen[i] = -pdev->vf_rlen[i]; > > -if ( (bar & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == > > - PCI_BASE_ADDRESS_MEM_TYPE_64 ) > > -++i; > > +ret = pci_size_bar(seg, bus, slot, func, pos + > > PCI_SRIOV_BAR, > > + PCI_SRIOV_NUM_BARS, &i, &addr, > > + &pdev->vf_rlen[i]); > > +if ( ret ) > > +dprintk(XENLOG_WARNING, > > +"%04x:%02x:%02x.%u: failed to size SR-IOV > > BAR%u\n", > > +seg, bus, slot, func, i); > > You shouldn't log two messages for the same problem (the called > function already logs one). > > A final more general remark: With you intending to call this function > from other than pci_add_device() context, some further care may / > will be needed. For example, are all to be added callers such that > you playing with config space won't interfere with what Dom0 does? > Are you sure you can get away without disabling memory decode > while fiddling with the BARs? So far I've been able to get away, but you are right that callers should disable memory decode before trying to size the BARs. I will do this in the callers however. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 08/36] x86/mm: Add support to enable SME in early boot processing
On 6/21/2017 2:16 AM, Thomas Gleixner wrote: On Fri, 16 Jun 2017, Tom Lendacky wrote: diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index a105796..988b336 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -15,16 +15,24 @@ #ifndef __ASSEMBLY__ +#include + #ifdef CONFIG_AMD_MEM_ENCRYPT extern unsigned long sme_me_mask; +void __init sme_enable(void); + #else /* !CONFIG_AMD_MEM_ENCRYPT */ #define sme_me_mask 0UL +static inline void __init sme_enable(void) { } + #endif/* CONFIG_AMD_MEM_ENCRYPT */ +unsigned long sme_get_me_mask(void); Why is this an unconditional function? Isn't the mask simply 0 when the MEM ENCRYPT support is disabled? I made it unconditional because of the call from head_64.S. I can't make use of the C level static inline function and since the mask is not a variable if CONFIG_AMD_MEM_ENCRYPT is not configured (#defined to 0) I can't reference the variable directly. I could create a #define in head_64.S that changes this to load rax with the variable if CONFIG_AMD_MEM_ENCRYPT is configured or a zero if it's not or add a #ifdef at that point in the code directly. Thoughts on that? diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 6225550..ef12729 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -78,7 +78,29 @@ startup_64: call__startup_64 popq%rsi - movq $(early_top_pgt - __START_KERNEL_map), %rax + /* +* Encrypt the kernel if SME is active. +* The real_mode_data address is in %rsi and that register can be +* clobbered by the called function so be sure to save it. +*/ + push%rsi + callsme_encrypt_kernel + pop %rsi That does not make any sense. Neither the call to sme_encrypt_kernel() nor the following call to sme_get_me_mask(). __startup_64() is already C code, so why can't you simply call that from __startup_64() in C and return the mask from there? I was trying to keep it explicit as to what was happening, but I can move those calls into __startup_64(). I'll still need the call to sme_get_me_mask() in the secondary_startup_64 path, though (depending on your thoughts to the above response). @@ -98,7 +120,20 @@ ENTRY(secondary_startup_64) /* Sanitize CPU configuration */ call verify_cpu - movq $(init_top_pgt - __START_KERNEL_map), %rax + /* +* Get the SME encryption mask. +* The encryption mask will be returned in %rax so we do an ADD +* below to be sure that the encryption mask is part of the +* value that will stored in %cr3. +* +* The real_mode_data address is in %rsi and that register can be +* clobbered by the called function so be sure to save it. +*/ + push%rsi + callsme_get_me_mask + pop %rsi Do we really need a call here? The mask is established at this point, so it's either 0 when the encryption stuff is not compiled in or it can be retrieved from a variable which is accessible at this point. Same as above, this can be updated based on the decided approach. Thanks, Tom + + addq$(init_top_pgt - __START_KERNEL_map), %rax 1: /* Enable PAE mode, PGE and LA57 */ Thanks, tglx ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 3/6] x86/shadow: Use ERR_PTR infrastructure for sh_emulate_map_dest()
sh_emulate_map_dest() predates the introduction of the generic ERR_PTR() infrasturcture, but take the opportunity to avoid opencoding it. The chosen error constants require need to be negative to work with IS_ERR(), but no other changes. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Tim Deegan v2: * Use ~(long)X86EMUL rather than -X86EMUL so MAPPING_SILENT_FAIL is considered an error to IS_ERR() --- xen/arch/x86/mm/shadow/multi.c | 8 xen/arch/x86/mm/shadow/private.h | 7 +++ 2 files changed, 7 insertions(+), 8 deletions(-) diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c index f65ffc6..74c3641 100644 --- a/xen/arch/x86/mm/shadow/multi.c +++ b/xen/arch/x86/mm/shadow/multi.c @@ -4752,8 +4752,8 @@ sh_x86_emulate_write(struct vcpu *v, unsigned long vaddr, void *src, return X86EMUL_UNHANDLEABLE; addr = sh_emulate_map_dest(v, vaddr, bytes, sh_ctxt); -if ( sh_emulate_map_dest_failed(addr) ) -return (long)addr; +if ( IS_ERR(addr) ) +return ~PTR_ERR(addr); paging_lock(v->domain); memcpy(addr, src, bytes); @@ -4794,8 +4794,8 @@ sh_x86_emulate_cmpxchg(struct vcpu *v, unsigned long vaddr, return X86EMUL_UNHANDLEABLE; addr = sh_emulate_map_dest(v, vaddr, bytes, sh_ctxt); -if ( sh_emulate_map_dest_failed(addr) ) -return (long)addr; +if ( IS_ERR(addr) ) +return ~PTR_ERR(addr); paging_lock(v->domain); switch ( bytes ) diff --git a/xen/arch/x86/mm/shadow/private.h b/xen/arch/x86/mm/shadow/private.h index 472676c..a59ff7a 100644 --- a/xen/arch/x86/mm/shadow/private.h +++ b/xen/arch/x86/mm/shadow/private.h @@ -395,10 +395,9 @@ void shadow_unhook_mappings(struct domain *d, mfn_t smfn, int user_only); /* Returns a mapped pointer to write to, or one of the following error * indicators. */ -#define MAPPING_UNHANDLEABLE ((void *)(unsigned long)X86EMUL_UNHANDLEABLE) -#define MAPPING_EXCEPTION((void *)(unsigned long)X86EMUL_EXCEPTION) -#define MAPPING_SILENT_FAIL ((void *)(unsigned long)X86EMUL_OKAY) -#define sh_emulate_map_dest_failed(rc) ((unsigned long)(rc) <= 3) +#define MAPPING_UNHANDLEABLE ERR_PTR(~(long)X86EMUL_UNHANDLEABLE) +#define MAPPING_EXCEPTIONERR_PTR(~(long)X86EMUL_EXCEPTION) +#define MAPPING_SILENT_FAIL ERR_PTR(~(long)X86EMUL_OKAY) void *sh_emulate_map_dest(struct vcpu *v, unsigned long vaddr, unsigned int bytes, struct sh_emulate_ctxt *sh_ctxt); void sh_emulate_unmap_dest(struct vcpu *v, void *addr, unsigned int bytes, -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/6] [RFC] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result
Signed-off-by: Andrew Cooper --- CC: Jan Beulich Name definitely open to improvement. Perhaps better considered in the context of the following patch. --- xen/arch/x86/hvm/dom0_build.c | 2 +- xen/arch/x86/hvm/emulate.c| 40 ++-- xen/arch/x86/hvm/hvm.c| 56 +++ xen/arch/x86/hvm/intercept.c | 20 +++--- xen/arch/x86/hvm/svm/nestedsvm.c | 5 ++-- xen/arch/x86/hvm/svm/svm.c| 2 +- xen/arch/x86/hvm/viridian.c | 2 +- xen/arch/x86/hvm/vmsi.c | 2 +- xen/arch/x86/hvm/vmx/realmode.c | 2 +- xen/arch/x86/hvm/vmx/vvmx.c | 14 +- xen/arch/x86/mm/shadow/common.c | 12 - xen/common/libelf/libelf-loader.c | 4 +-- xen/include/asm-x86/hvm/support.h | 40 ++-- 13 files changed, 101 insertions(+), 100 deletions(-) diff --git a/xen/arch/x86/hvm/dom0_build.c b/xen/arch/x86/hvm/dom0_build.c index 020c355..e8f746c 100644 --- a/xen/arch/x86/hvm/dom0_build.c +++ b/xen/arch/x86/hvm/dom0_build.c @@ -238,7 +238,7 @@ static int __init pvh_setup_vmx_realmode_helpers(struct domain *d) if ( !pvh_steal_ram(d, HVM_VM86_TSS_SIZE, 128, GB(4), &gaddr) ) { if ( hvm_copy_to_guest_phys(gaddr, NULL, HVM_VM86_TSS_SIZE, v) != - HVMCOPY_okay ) + HVMTRANS_okay ) printk("Unable to zero VM86 TSS area\n"); d->arch.hvm_domain.params[HVM_PARAM_VM86_TSS_SIZED] = VM86_TSS_UPDATED | ((uint64_t)HVM_VM86_TSS_SIZE << 32) | gaddr; diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 495e312..384ad0b 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -100,7 +100,7 @@ static int ioreq_server_read(const struct hvm_io_handler *io_handler, uint32_t size, uint64_t *data) { -if ( hvm_copy_from_guest_phys(data, addr, size) != HVMCOPY_okay ) +if ( hvm_copy_from_guest_phys(data, addr, size) != HVMTRANS_okay ) return X86EMUL_UNHANDLEABLE; return X86EMUL_OKAY; @@ -892,18 +892,18 @@ static int __hvmemul_read( switch ( rc ) { -case HVMCOPY_okay: +case HVMTRANS_okay: break; -case HVMCOPY_bad_gva_to_gfn: +case HVMTRANS_bad_linear_to_gfn: x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); return X86EMUL_EXCEPTION; -case HVMCOPY_bad_gfn_to_mfn: +case HVMTRANS_bad_gfn_to_mfn: if ( access_type == hvm_access_insn_fetch ) return X86EMUL_UNHANDLEABLE; return hvmemul_linear_mmio_read(addr, bytes, p_data, pfec, hvmemul_ctxt, 0); -case HVMCOPY_gfn_paged_out: -case HVMCOPY_gfn_shared: +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: return X86EMUL_RETRY; default: return X86EMUL_UNHANDLEABLE; @@ -1011,15 +1011,15 @@ static int hvmemul_write( switch ( rc ) { -case HVMCOPY_okay: +case HVMTRANS_okay: break; -case HVMCOPY_bad_gva_to_gfn: +case HVMTRANS_bad_linear_to_gfn: x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); return X86EMUL_EXCEPTION; -case HVMCOPY_bad_gfn_to_mfn: +case HVMTRANS_bad_gfn_to_mfn: return hvmemul_linear_mmio_write(addr, bytes, p_data, pfec, hvmemul_ctxt, 0); -case HVMCOPY_gfn_paged_out: -case HVMCOPY_gfn_shared: +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: return X86EMUL_RETRY; default: return X86EMUL_UNHANDLEABLE; @@ -1383,7 +1383,7 @@ static int hvmemul_rep_movs( return rc; } -rc = HVMCOPY_okay; +rc = HVMTRANS_okay; } else /* @@ -1393,16 +1393,16 @@ static int hvmemul_rep_movs( */ rc = hvm_copy_from_guest_phys(buf, sgpa, bytes); -if ( rc == HVMCOPY_okay ) +if ( rc == HVMTRANS_okay ) rc = hvm_copy_to_guest_phys(dgpa, buf, bytes, current); xfree(buf); -if ( rc == HVMCOPY_gfn_paged_out ) +if ( rc == HVMTRANS_gfn_paged_out ) return X86EMUL_RETRY; -if ( rc == HVMCOPY_gfn_shared ) +if ( rc == HVMTRANS_gfn_shared ) return X86EMUL_RETRY; -if ( rc != HVMCOPY_okay ) +if ( rc != HVMTRANS_okay ) { gdprintk(XENLOG_WARNING, "Failed memory-to-memory REP MOVS: sgpa=%" PRIpaddr" dgpa=%"PRIpaddr" reps=%lu bytes_per_rep=%u\n", @@ -1512,10 +1512,10 @@ static int hvmemul_rep_stos( switch ( rc ) { -case HVMCOPY_gfn_paged_out: -case HVMCOPY_gfn_shared: +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: return X86EMUL_RETRY; -case HVMCOPY_okay: +case HVMTRANS_okay: return X86EMUL_OKAY; } @@ -2171,7 +2171,7 @@ void hvm_emulate_init_per_insn( &addr) && hvm_fetch_from_guest_linear(hvme
[Xen-devel] [PATCH 5/6] x86/hvm: Break out __hvm_copy()'s translation logic
It will be reused by later changes. Signed-off-by: Andrew Cooper --- CC: Jan Beulich --- xen/arch/x86/hvm/hvm.c| 141 ++ xen/include/asm-x86/hvm/support.h | 12 2 files changed, 95 insertions(+), 58 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index c822d3b..0a5697a 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3072,6 +3072,80 @@ void hvm_task_switch( hvm_unmap_entry(nptss_desc); } +enum hvm_translation_result hvm_translate_get_page( +struct vcpu *v, unsigned long addr, bool linear, uint32_t pfec, +pagefault_info_t *pfinfo, struct page_info **_page, +gfn_t *__gfn, p2m_type_t *_p2mt) +{ +struct page_info *page; +p2m_type_t p2mt; +gfn_t gfn; + +if ( linear ) +{ +gfn = _gfn(paging_gva_to_gfn(v, addr, &pfec)); + +if ( gfn_eq(gfn, INVALID_GFN) ) +{ +if ( pfec & PFEC_page_paged ) +return HVMTRANS_gfn_paged_out; + +if ( pfec & PFEC_page_shared ) +return HVMTRANS_gfn_shared; + +if ( pfinfo ) +{ +pfinfo->linear = addr; +pfinfo->ec = pfec & ~PFEC_implicit; +} + +return HVMTRANS_bad_linear_to_gfn; +} +} +else +gfn = _gfn(addr >> PAGE_SHIFT); + +/* + * No need to do the P2M lookup for internally handled MMIO, benefiting + * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses, + * - newer Windows (like Server 2012) for HPET accesses. + */ +if ( v == current && is_hvm_vcpu(v) + && !nestedhvm_vcpu_in_guestmode(v) + && hvm_mmio_internal(gfn_x(gfn) << PAGE_SHIFT) ) +return HVMTRANS_bad_gfn_to_mfn; + +page = get_page_from_gfn(v->domain, gfn_x(gfn), &p2mt, P2M_UNSHARE); + +if ( !page ) +return HVMTRANS_bad_gfn_to_mfn; + +if ( p2m_is_paging(p2mt) ) +{ +put_page(page); +p2m_mem_paging_populate(v->domain, gfn_x(gfn)); +return HVMTRANS_gfn_paged_out; +} +if ( p2m_is_shared(p2mt) ) +{ +put_page(page); +return HVMTRANS_gfn_shared; +} +if ( p2m_is_grant(p2mt) ) +{ +put_page(page); +return HVMTRANS_unhandleable; +} + +*_page = page; +if ( __gfn ) +*__gfn = gfn; +if ( _p2mt ) +*_p2mt = p2mt; + +return HVMTRANS_okay; +} + #define HVMCOPY_from_guest (0u<<0) #define HVMCOPY_to_guest (1u<<0) #define HVMCOPY_phys (0u<<2) @@ -3080,7 +3154,7 @@ static enum hvm_translation_result __hvm_copy( void *buf, paddr_t addr, int size, struct vcpu *v, unsigned int flags, uint32_t pfec, pagefault_info_t *pfinfo) { -unsigned long gfn; +gfn_t gfn; struct page_info *page; p2m_type_t p2mt; char *p; @@ -3106,65 +3180,15 @@ static enum hvm_translation_result __hvm_copy( while ( todo > 0 ) { +enum hvm_translation_result res; paddr_t gpa = addr & ~PAGE_MASK; count = min_t(int, PAGE_SIZE - gpa, todo); -if ( flags & HVMCOPY_linear ) -{ -gfn = paging_gva_to_gfn(v, addr, &pfec); -if ( gfn == gfn_x(INVALID_GFN) ) -{ -if ( pfec & PFEC_page_paged ) -return HVMTRANS_gfn_paged_out; -if ( pfec & PFEC_page_shared ) -return HVMTRANS_gfn_shared; -if ( pfinfo ) -{ -pfinfo->linear = addr; -pfinfo->ec = pfec & ~PFEC_implicit; -} -return HVMTRANS_bad_linear_to_gfn; -} -gpa |= (paddr_t)gfn << PAGE_SHIFT; -} -else -{ -gfn = addr >> PAGE_SHIFT; -gpa = addr; -} - -/* - * No need to do the P2M lookup for internally handled MMIO, benefiting - * - 32-bit WinXP (& older Windows) on AMD CPUs for LAPIC accesses, - * - newer Windows (like Server 2012) for HPET accesses. - */ -if ( v == current && is_hvm_vcpu(v) - && !nestedhvm_vcpu_in_guestmode(v) - && hvm_mmio_internal(gpa) ) -return HVMTRANS_bad_gfn_to_mfn; - -page = get_page_from_gfn(v->domain, gfn, &p2mt, P2M_UNSHARE); - -if ( !page ) -return HVMTRANS_bad_gfn_to_mfn; - -if ( p2m_is_paging(p2mt) ) -{ -put_page(page); -p2m_mem_paging_populate(v->domain, gfn); -return HVMTRANS_gfn_paged_out; -} -if ( p2m_is_shared(p2mt) ) -{ -put_page(page); -return HVMTRANS_gfn_shared; -} -if ( p2m_is_grant(p2mt) ) -{ -put_page(page); -return HVMTRANS_unhandleable; -} +res = hvm_translate_get_page(v, addr, flags & HVMCOPY_linear, + pf
[Xen-devel] [PATCH 1/6] x86/hvm: Fixes to hvmemul_insn_fetch()
Force insn_off to a single byte, as offset can wrap around or truncate with respect to sh_ctxt->insn_buf_eip under a number of normal circumstances. Furthermore, don't use an ASSERT() for bounds checking the write into hvmemul_ctxt->insn_buf[]. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Paul Durrant For anyone wondering, this is way to explot XSA-186 --- xen/arch/x86/hvm/emulate.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 11e4aba..495e312 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -939,7 +939,8 @@ int hvmemul_insn_fetch( { struct hvm_emulate_ctxt *hvmemul_ctxt = container_of(ctxt, struct hvm_emulate_ctxt, ctxt); -unsigned int insn_off = offset - hvmemul_ctxt->insn_buf_eip; +/* Careful, as offset can wrap or truncate WRT insn_buf_eip. */ +uint8_t insn_off = offset - hvmemul_ctxt->insn_buf_eip; /* * Fall back if requested bytes are not in the prefetch cache. @@ -953,7 +954,17 @@ int hvmemul_insn_fetch( if ( rc == X86EMUL_OKAY && bytes ) { -ASSERT(insn_off + bytes <= sizeof(hvmemul_ctxt->insn_buf)); +/* + * Will we overflow insn_buf[]? This shouldn't be able to happen, + * which means something went wrong with instruction decoding... + */ +if ( insn_off > sizeof(hvmemul_ctxt->insn_buf) || + (insn_off + bytes) > sizeof(hvmemul_ctxt->insn_buf) ) +{ +ASSERT_UNREACHABLE(); +return X86EMUL_UNHANDLEABLE; +} + memcpy(&hvmemul_ctxt->insn_buf[insn_off], p_data, bytes); hvmemul_ctxt->insn_buf_bytes = insn_off + bytes; } -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 6/6] x86/hvm: Implement hvmemul_write() using real mappings
An access which crosses a page boundary is performed atomically by x86 hardware, albeit with a severe performance penalty. An important corner case is when a straddled access hits two pages which differ in whether a translation exists, or in net access rights. The use of hvm_copy*() in hvmemul_write() is problematic, because it performs a translation then completes the partial write, before moving onto the next translation. If an individual emulated write straddles two pages, the first of which is writable, and the second of which is not, the first half of the write will complete before #PF is raised from the second half. This results in guest state corruption as a side effect of emulation, which has been observed to cause windows to crash while under introspection. Introduce the hvmemul_{,un}map_linear_addr() helpers, which translate an entire contents of a linear access, and vmap() the underlying frames to provide a contiguous virtual mapping for the emulator to use. This is the same mechanism as used by the shadow emulation code. This will catch any translation issues and abort the emulation before any modifications occur. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Paul Durrant CC: Razvan Cojocaru CC: Mihai Donțu While the maximum size of linear mapping is capped at 1 page, the logic in the helpers is written to work properly as hvmemul_ctxt->mfn[] gets longer, specifically with XSAVE instruction emulation in mind. This has only had light testing so far. --- xen/arch/x86/hvm/emulate.c| 179 ++ xen/include/asm-x86/hvm/emulate.h | 7 ++ 2 files changed, 169 insertions(+), 17 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index 384ad0b..804bea5 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -498,6 +498,159 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa, } /* + * Map the frame(s) covering an individual linear access, for writeable + * access. May return NULL for MMIO, or ERR_PTR(~X86EMUL_*) for other errors + * including ERR_PTR(~X86EMUL_OKAY) for write-discard mappings. + * + * In debug builds, map() checks that each slot in hvmemul_ctxt->mfn[] is + * clean before use, and poisions unused slots with INVALID_MFN. + */ +static void *hvmemul_map_linear_addr( +unsigned long linear, unsigned int bytes, uint32_t pfec, +struct hvm_emulate_ctxt *hvmemul_ctxt) +{ +struct vcpu *curr = current; +void *err, *mapping; + +/* First and final gfns which need mapping. */ +unsigned long frame = linear >> PAGE_SHIFT, first = frame; +unsigned long final = (linear + bytes - !!bytes) >> PAGE_SHIFT; + +/* + * mfn points to the next free slot. All used slots have a page reference + * held on them. + */ +mfn_t *mfn = &hvmemul_ctxt->mfn[0]; + +/* + * The caller has no legitimate reason for trying a zero-byte write, but + * final is calculate to fail safe in release builds. + * + * The maximum write size depends on the number of adjacent mfns[] which + * can be vmap()'d, accouting for possible misalignment within the region. + * The higher level emulation callers are responsible for ensuring that + * mfns[] is large enough for the requested write size. + */ +if ( bytes == 0 || + final - first > ARRAY_SIZE(hvmemul_ctxt->mfn) - 1 ) +{ +ASSERT_UNREACHABLE(); +goto unhandleable; +} + +do { +enum hvm_translation_result res; +struct page_info *page; +pagefault_info_t pfinfo; +p2m_type_t p2mt; + +res = hvm_translate_get_page(curr, frame << PAGE_SHIFT, true, pfec, + &pfinfo, &page, NULL, &p2mt); + +switch ( res ) +{ +case HVMTRANS_okay: +break; + +case HVMTRANS_bad_linear_to_gfn: +x86_emul_pagefault(pfinfo.ec, pfinfo.linear, &hvmemul_ctxt->ctxt); +err = ERR_PTR(~(long)X86EMUL_EXCEPTION); +goto out; + +case HVMTRANS_bad_gfn_to_mfn: +err = NULL; +goto out; + +case HVMTRANS_gfn_paged_out: +case HVMTRANS_gfn_shared: +err = ERR_PTR(~(long)X86EMUL_RETRY); +goto out; + +default: +goto unhandleable; +} + +/* Error checking. Confirm that the current slot is clean. */ +ASSERT(mfn_x(*mfn) == 0); + +*mfn++ = _mfn(page_to_mfn(page)); +frame++; + +if ( p2m_is_discard_write(p2mt) ) +{ +err = ERR_PTR(~(long)X86EMUL_OKAY); +goto out; +} + +} while ( frame < final ); + +/* Entire access within a single frame? */ +if ( first == final ) +mapping = map_domain_page(hvmemul_ctxt->mfn[0]) + (linear & ~PAGE_MASK); +/* Multiple frames? Need to vmap(). */ +else if ( (mapping = vmap(hvmemul_ctxt->mfn, + mfn - hvme
[Xen-devel] [PATCH 2/6] x86/shadow: Fixes to hvm_emulate_insn_fetch()
Zero-legnth reads are jump-target segmentation checks; never serve them from the cache. Force insn_off to a single byte, as offset can wrap around or truncate with respect to sh_ctxt->insn_buf_eip under a number of normal circumstances. Signed-off-by: Andrew Cooper --- CC: Tim Deegan CC: Jan Beulich --- xen/arch/x86/mm/shadow/common.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c index 2e64a77..deea03a 100644 --- a/xen/arch/x86/mm/shadow/common.c +++ b/xen/arch/x86/mm/shadow/common.c @@ -235,12 +235,16 @@ hvm_emulate_insn_fetch(enum x86_segment seg, { struct sh_emulate_ctxt *sh_ctxt = container_of(ctxt, struct sh_emulate_ctxt, ctxt); -unsigned int insn_off = offset - sh_ctxt->insn_buf_eip; +/* Careful, as offset can wrap or truncate WRT insn_buf_eip. */ +uint8_t insn_off = offset - sh_ctxt->insn_buf_eip; ASSERT(seg == x86_seg_cs); -/* Fall back if requested bytes are not in the prefetch cache. */ -if ( unlikely((insn_off + bytes) > sh_ctxt->insn_buf_bytes) ) +/* + * Fall back if requested bytes are not in the prefetch cache, but always + * perform the zero-length read for segmentation purposes. + */ +if ( !bytes || unlikely((insn_off + bytes) > sh_ctxt->insn_buf_bytes) ) return hvm_read(seg, offset, p_data, bytes, hvm_access_insn_fetch, sh_ctxt); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 0/6] Various XSA followups
XSA-219 was discovered while trying to implement the bugfix in patch 6. Andrew Cooper (6): x86/hvm: Fixes to hvmemul_insn_fetch() x86/shadow: Fixes to hvm_emulate_insn_fetch() x86/shadow: Use ERR_PTR infrastructure for sh_emulate_map_dest() [RFC] x86/hvm: Rename enum hvm_copy_result to hvm_translation_result x86/hvm: Break out __hvm_copy()'s translation logic x86/hvm: Implement hvmemul_write() using real mappings xen/arch/x86/hvm/dom0_build.c | 2 +- xen/arch/x86/hvm/emulate.c| 224 -- xen/arch/x86/hvm/hvm.c| 181 +- xen/arch/x86/hvm/intercept.c | 20 ++-- xen/arch/x86/hvm/svm/nestedsvm.c | 5 +- xen/arch/x86/hvm/svm/svm.c| 2 +- xen/arch/x86/hvm/viridian.c | 2 +- xen/arch/x86/hvm/vmsi.c | 2 +- xen/arch/x86/hvm/vmx/realmode.c | 2 +- xen/arch/x86/hvm/vmx/vvmx.c | 14 +-- xen/arch/x86/mm/shadow/common.c | 22 ++-- xen/arch/x86/mm/shadow/multi.c| 8 +- xen/arch/x86/mm/shadow/private.h | 7 +- xen/common/libelf/libelf-loader.c | 4 +- xen/include/asm-x86/hvm/emulate.h | 7 ++ xen/include/asm-x86/hvm/support.h | 52 + 16 files changed, 379 insertions(+), 175 deletions(-) -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On 06/21/2017 06:10 PM, Wei Liu wrote: > On Wed, Jun 21, 2017 at 05:56:02PM +0300, Razvan Cojocaru wrote: >> Fixed an issue where the maximum index allowed (31) goes beyond the >> actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. >> Coverity-ID: 1412966 >> >> Signed-off-by: Razvan Cojocaru >> >> --- >> Changes since V1: >> - Changed '3' to 'ARRAY_SIZE(...)'. >> --- >> xen/arch/x86/monitor.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c >> index bedf13c..af68a79 100644 >> --- a/xen/arch/x86/monitor.c >> +++ b/xen/arch/x86/monitor.c >> @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d, >> bool_t old_status; >> >> /* sanity check: avoid left-shift undefined behavior */ > > This comment should be deleted now. It technically continues to be correct, but if you'd like I can send V3 - otherwise (and if it's not too much hassle) it can be deleted on commit. I'm happy to accomodate either scenario. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2]Proposal to allow setting up shared memory areas between VMs from xl config file
On 21/06/17 16:09, Wei Liu wrote: On Wed, Jun 21, 2017 at 01:18:38AM +0800, Zhongze Liu wrote: For example: In xl config file of vm1: static_shared_mem = ["id = ID1, begin = gmfn1, end = gmfn2, granularity = 4k, prot = RO, master”, "id = ID2, begin = gmfn3, end = gmfn4, I think you mean "gpfn" here and below. It would be better to use gfn in that case to follow the convention of the hypervisor (see xen/include/memory.h). Cheers, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On Wed, Jun 21, 2017 at 05:56:02PM +0300, Razvan Cojocaru wrote: > Fixed an issue where the maximum index allowed (31) goes beyond the > actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. > Coverity-ID: 1412966 > > Signed-off-by: Razvan Cojocaru > > --- > Changes since V1: > - Changed '3' to 'ARRAY_SIZE(...)'. > --- > xen/arch/x86/monitor.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c > index bedf13c..af68a79 100644 > --- a/xen/arch/x86/monitor.c > +++ b/xen/arch/x86/monitor.c > @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d, > bool_t old_status; > > /* sanity check: avoid left-shift undefined behavior */ This comment should be deleted now. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2]Proposal to allow setting up shared memory areas between VMs from xl config file
On Wed, Jun 21, 2017 at 01:18:38AM +0800, Zhongze Liu wrote: > > 1. Motivation and Description > > Virtual machines use grant table hypercalls to setup a share page for > inter-VMs communications. These hypercalls are used by all PV > protocols today. However, very simple guests, such as baremetal > applications, might not have the infrastructure to handle the grant table. > This project is about setting up several shared memory areas for inter-VMs > communications directly from the VM config file. > So that the guest kernel doesn't have to have grant table support (in the > embedded space, this is not unusual) to be able to communicate with > other guests. > > > 2. Implementation Plan: > > > == > 2.1 Introduce a new VM config option in xl: > == > The shared areas should be shareable among several (>=2) VMs, so > every shared physical memory area is assigned to a set of VMs. > Therefore, a “token” or “identifier” should be used here to uniquely > identify a backing memory area. > > The backing area would be taken from one domain, which we will regard > as the "master domain", and this domain should be created prior to any > other "slave domain"s. Again, we have to use some kind of tag to tell who > is the "master domain". > > And the ability to specify the attributes of the pages (say, WO/RO/X) > to be shared should be also given to the user. For the master domain, > these attributes often describes the maximum permission allowed for the > shared pages, and for the slave domains, these attributes are often used > to describe with what permissions this area will be mapped. > This information should also be specified in the xl config entry. > I don't quite get the attribute settings. If you only insert a backing page into guest physical address space with XENMEM hypercall, how do you audit the attributes when the guest tries to map the page? > To handle all these, I would suggest using an unsigned integer to serve as the > identifier, and using a "master" tag in the master domain's xl config entry > to announce that she will provide the backing memory pages. A separate > entry would be used to describe the attributes of the shared memory area, of > the form "prot=RW". I think using an integer is too limiting. You would need the user to know if a particular number is already used. Maybe using a number is good enough for the use case you have in mind, but it is not future proof. I don't know how sophisticated we want this to be, though. > For example: > > In xl config file of vm1: > > static_shared_mem = ["id = ID1, begin = gmfn1, end = gmfn2, > granularity = 4k, prot = RO, master”, > "id = ID2, begin = gmfn3, end = gmfn4, I think you mean "gpfn" here and below. > granularity = 4k, prot = RW, master”] > > In xl config file of vm2: > > static_shared_mem = ["id = ID1, begin = gmfn5, end = gmfn6, > granularity = 4k, prot = RO”] > > In xl config file of vm3: > > static_shared_mem = ["id = ID2, begin = gmfn7, end = gmfn8, > granularity = 4k, prot = RW”] > > gmfn's above are all hex of the form "0x2". > > In the example above. A memory area ID1 will be shared between vm1 and vm2. > This area will be taken from vm1 and mapped into vm2's stage-2 page table. > The parameter "prot=RO" means that this memory area are offered with read-only > permission. vm1 can access this area using gmfn1~gmfn2, and vm2 using > gmfn5~gmfn6. > Likewise, a memory area ID will be shared between vm1 and vm3 with read and > write permissions. vm1 is the master and vm2 the slave. vm1 can access the > area using gmfn3~gmfn4 and vm3 using gmfn7~gmfn8. > > The "granularity" is optional in the slaves' config entries. But if it's > presented in the slaves' config entry, it has to be the same with its > master's. > Besides, the size of the gmfn range must also match. And overlapping backing > memory areas are well defined. > What do you mean by "well defined"? Why is inserting a sub-range not allowed? > Note that the "master" tag in vm1 for both ID1 and ID2 indicates that vm1 > should be created prior to both vm2 and vm3, for they both rely on the pages > backed by vm1. If one tries to create vm2 or vm3 prior to vm1, she will get > an error. And in vm1's config file, the "prot=RO" parameter of ID1 indicates > that if one tries to share this page with vm1 with, say, "WR" permission, > she will get an error, too. > ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On 21/06/17 15:56, Razvan Cojocaru wrote: > Fixed an issue where the maximum index allowed (31) goes beyond the > actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. > Coverity-ID: 1412966 > > Signed-off-by: Razvan Cojocaru Reviewed-by: Andrew Cooper ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/xen-detect: try sysfs node for obtaining guest type
Wei Liu writes ("Re: [Xen-devel] [PATCH] tools/xen-detect: try sysfs node for obtaining guest type"): > On Wed, Jun 21, 2017 at 04:24:33PM +0200, Olaf Hering wrote: > > On Thu, Jun 15, Juergen Gross wrote: > > > +++ b/tools/misc/xen-detect.c > > > > > +asprintf(&ver, "V%u.%u", > > > + (uint16_t)(regs[0] >> 16), (uint16_t)regs[0]); > > > > > +asprintf(&ver, "V%s.%s", str, tmp); > > > > This fails to build: > > > > xen-detect.c:196:17: error: ignoring return value of 'asprintf', declared > > with attribute warn_unused_result [-Werror=unused-result] > > xen-detect.c:93:17: error: ignoring return value of 'asprintf', declared > > with attribute warn_unused_result [-Werror=unused-result] > > Hmm... your gcc seems to be rather strict. :-) gcc is correct. asprintf can fail if its malloc fails. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH V2] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
Fixed an issue where the maximum index allowed (31) goes beyond the actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. Coverity-ID: 1412966 Signed-off-by: Razvan Cojocaru --- Changes since V1: - Changed '3' to 'ARRAY_SIZE(...)'. --- xen/arch/x86/monitor.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c index bedf13c..af68a79 100644 --- a/xen/arch/x86/monitor.c +++ b/xen/arch/x86/monitor.c @@ -133,7 +133,8 @@ int arch_monitor_domctl_event(struct domain *d, bool_t old_status; /* sanity check: avoid left-shift undefined behavior */ -if ( unlikely(mop->u.mov_to_cr.index > 31) ) +if ( unlikely(mop->u.mov_to_cr.index >= + ARRAY_SIZE(ad->monitor.write_ctrlreg_mask)) ) return -EINVAL; if ( unlikely(mop->u.mov_to_cr.pad1 || mop->u.mov_to_cr.pad2) ) -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On 21/06/17 15:33, Andrew Cooper wrote: > On 21/06/17 15:28, Razvan Cojocaru wrote: >> Fixed an issue where the maximum index allowed (31) goes beyond the >> actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. >> Coverity-ID: 1412966 >> >> Signed-off-by: Razvan Cojocaru >> --- >> xen/arch/x86/monitor.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c >> index bedf13c..4620b15 100644 >> --- a/xen/arch/x86/monitor.c >> +++ b/xen/arch/x86/monitor.c >> @@ -133,7 +133,7 @@ int arch_monitor_domctl_event(struct domain *d, >> bool_t old_status; >> >> /* sanity check: avoid left-shift undefined behavior */ >> -if ( unlikely(mop->u.mov_to_cr.index > 31) ) >> +if ( unlikely(mop->u.mov_to_cr.index > 3) ) >> = ARRAY_SIZE(ad->monitor.write_ctrlreg_mask) > ? Sorry - that should be ">= ARRAY" when it doesn't get interpreted as an email quotation. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On 06/21/2017 05:33 PM, Andrew Cooper wrote: > On 21/06/17 15:28, Razvan Cojocaru wrote: >> Fixed an issue where the maximum index allowed (31) goes beyond the >> actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. >> Coverity-ID: 1412966 >> >> Signed-off-by: Razvan Cojocaru >> --- >> xen/arch/x86/monitor.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c >> index bedf13c..4620b15 100644 >> --- a/xen/arch/x86/monitor.c >> +++ b/xen/arch/x86/monitor.c >> @@ -133,7 +133,7 @@ int arch_monitor_domctl_event(struct domain *d, >> bool_t old_status; >> >> /* sanity check: avoid left-shift undefined behavior */ >> -if ( unlikely(mop->u.mov_to_cr.index > 31) ) >> +if ( unlikely(mop->u.mov_to_cr.index > 3) ) > >> = ARRAY_SIZE(ad->monitor.write_ctrlreg_mask) > > ? Yes, that'd be the right way to do it. :) V2 coming up in a second. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH] xen-detect: handle asprintf error
Otherwise gcc with -Wunused will complain the return value is not used. Reported-by: Olaf Hering Signed-off-by: Wei Liu --- Cc: Ian Jackson Cc: Olaf Hering Cc: Juergen Gross --- tools/misc/xen-detect.c | 21 + 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/tools/misc/xen-detect.c b/tools/misc/xen-detect.c index a62747d316..fd187a4be6 100644 --- a/tools/misc/xen-detect.c +++ b/tools/misc/xen-detect.c @@ -90,8 +90,15 @@ static int check_for_xen(int pv_context) found: cpuid(base + 1, regs, pv_context); if ( regs[0] ) -asprintf(&ver, "V%u.%u", - (uint16_t)(regs[0] >> 16), (uint16_t)regs[0]); +{ +int r = asprintf(&ver, "V%u.%u", (uint16_t)(regs[0] >> 16), + (uint16_t)regs[0]); +if ( r < 0 ) +{ +perror("asprintf failed\n"); +exit(EXIT_FAILURE); +} +} return regs[0]; } @@ -193,8 +200,14 @@ static enum guest_type check_sysfs(void) if ( tmp ) tmp[strlen(tmp) - 1] = 0; if ( str && tmp ) -asprintf(&ver, "V%s.%s", str, tmp); -else +{ +int r = asprintf(&ver, "V%s.%s", str, tmp); +if ( r < 0 ) +{ +perror("asprintf failed\n"); +exit(EXIT_FAILURE); +} +} else ver = strdup("unknown version"); free(tmp); -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
On 21/06/17 15:28, Razvan Cojocaru wrote: > Fixed an issue where the maximum index allowed (31) goes beyond the > actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. > Coverity-ID: 1412966 > > Signed-off-by: Razvan Cojocaru > --- > xen/arch/x86/monitor.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c > index bedf13c..4620b15 100644 > --- a/xen/arch/x86/monitor.c > +++ b/xen/arch/x86/monitor.c > @@ -133,7 +133,7 @@ int arch_monitor_domctl_event(struct domain *d, > bool_t old_status; > > /* sanity check: avoid left-shift undefined behavior */ > -if ( unlikely(mop->u.mov_to_cr.index > 31) ) > +if ( unlikely(mop->u.mov_to_cr.index > 3) ) >= ARRAY_SIZE(ad->monitor.write_ctrlreg_mask) ? ~Andrew > return -EINVAL; > > if ( unlikely(mop->u.mov_to_cr.pad1 || mop->u.mov_to_cr.pad2) ) ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] passthrough: give XEN_DOMCTL_test_assign_device more sane semantics
On 21/06/17 11:08, Jan Beulich wrote: > So far callers of the libxc interface passed in a domain ID which was > then ignored in the hypervisor. Instead, make the hypervisor honor it > (accepting DOMID_INVALID to obtain original behavior), allowing to > query whether a device is assigned to a particular domain. Ignore the > passed in domain ID at the libxc layer instead, in order to not break > existing callers. New libxc functions would need to be added if callers > wanted to leverage the new functionality. I don't think your modified description matches the name of the call at all. It looks like the callers expect "test_assign_device" to answer the question: "Can I assign a device to this domain"? It looks like it's meant to be used in XSM environments, to allow a policy to permit or forbid specific guests to have access to specific devices. On a default (non-XSM) system, the answer to that question doesn't depend on the domain it's being assigned to, but only whether the device is already assigned to another domain; but on XSM systems the logic can presumably be more complicated. That sounds like a perfectly sane semantic to me, and this patch removes that ability. I think if you want a hypercall that answers the question, "Is device X assigned to domain Y", I think you should add a new one with a name more like, "test_device_is_assigned_to" or something. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/xen-detect: try sysfs node for obtaining guest type
On Wed, Jun 21, Wei Liu wrote: > Hmm... your gcc seems to be rather strict. :-) Yes, all of them: https://build.opensuse.org/package/show/home:olh:xen-unstable/xen All red. :-) Olaf signature.asc Description: PGP signature ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH] x86/monitor: Fixed CID 1412966: Memory - corruptions (OVERRUN)
Fixed an issue where the maximum index allowed (31) goes beyond the actual number of array elements (4) of ad->monitor.write_ctrlreg_mask. Coverity-ID: 1412966 Signed-off-by: Razvan Cojocaru --- xen/arch/x86/monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/arch/x86/monitor.c b/xen/arch/x86/monitor.c index bedf13c..4620b15 100644 --- a/xen/arch/x86/monitor.c +++ b/xen/arch/x86/monitor.c @@ -133,7 +133,7 @@ int arch_monitor_domctl_event(struct domain *d, bool_t old_status; /* sanity check: avoid left-shift undefined behavior */ -if ( unlikely(mop->u.mov_to_cr.index > 31) ) +if ( unlikely(mop->u.mov_to_cr.index > 3) ) return -EINVAL; if ( unlikely(mop->u.mov_to_cr.pad1 || mop->u.mov_to_cr.pad2) ) -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/xen-detect: try sysfs node for obtaining guest type
On Wed, Jun 21, 2017 at 04:24:33PM +0200, Olaf Hering wrote: > On Thu, Jun 15, Juergen Gross wrote: > > > +++ b/tools/misc/xen-detect.c > > > +asprintf(&ver, "V%u.%u", > > + (uint16_t)(regs[0] >> 16), (uint16_t)regs[0]); > > > +asprintf(&ver, "V%s.%s", str, tmp); > > This fails to build: > > xen-detect.c:196:17: error: ignoring return value of 'asprintf', declared > with attribute warn_unused_result [-Werror=unused-result] > xen-detect.c:93:17: error: ignoring return value of 'asprintf', declared with > attribute warn_unused_result [-Werror=unused-result] Hmm... your gcc seems to be rather strict. :-) > > Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/xen-detect: try sysfs node for obtaining guest type
On Thu, Jun 15, Juergen Gross wrote: > +++ b/tools/misc/xen-detect.c > +asprintf(&ver, "V%u.%u", > + (uint16_t)(regs[0] >> 16), (uint16_t)regs[0]); > +asprintf(&ver, "V%s.%s", str, tmp); This fails to build: xen-detect.c:196:17: error: ignoring return value of 'asprintf', declared with attribute warn_unused_result [-Werror=unused-result] xen-detect.c:93:17: error: ignoring return value of 'asprintf', declared with attribute warn_unused_result [-Werror=unused-result] Olaf signature.asc Description: PGP signature ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/2] x86/monitor: add masking support for write_ctrlreg events
On Wed, Jun 21, 2017 at 05:13:29PM +0300, Razvan Cojocaru wrote: > On 06/21/2017 04:58 PM, Wei Liu wrote: > > On Mon, Jun 19, 2017 at 03:24:38PM +0300, Petre Pircalabu wrote: > >> Add support for filtering out the write_ctrlreg monitor events if they > >> are generated only by changing certains bits. > >> A new parameter (bitmask) was added to the xc_monitor_write_ctrlreg > >> function in order to mask the event generation if the changed bits are > >> set. > >> > >> Signed-off-by: Petre Pircalabu > >> Acked-by: Tamas K Lengyel > > > > Coverity isn't happy with this patch. > > > > It seems to me there is indeed a risk to overrun the buffer (4 in size) > > because > > the caller can specify index up to 31. > > > > ** CID 1412966: Memory - corruptions (OVERRUN) > > > > > > /xen/arch/x86/monitor.c: 162 in arch_monitor_domctl_event() > > > > > > > > > > > > > > > > > > > > > > *** CID 1412966: Memory - corruptions (OVERRUN) > > > > > > /xen/arch/x86/monitor.c: 162 in arch_monitor_domctl_event() > > > > > > 156 ad->monitor.write_ctrlreg_onchangeonly |= > > ctrlreg_bitmask; > > > > 157 else > > > > > > 158 ad->monitor.write_ctrlreg_onchangeonly &= > > ~ctrlreg_bitmask; > > > > 159 > > > > > > 160 if ( requested_status ) > > > > > > 161 { > > > > > CID 1412966: Memory - corruptions (OVERRUN) > > > Overrunning array "ad->monitor.write_ctrlreg_mask" of 4 8-byte > elements at element index 31 (byte offset 248) using index > "mop->u.mov_to_cr.index" > > (which evaluates to 31). > > > > > > 162 ad->monitor.write_ctrlreg_mask[mop->u.mov_to_cr.index] > > = mop->u.mov_to_cr.bitmask; > > > > 163 ad->monitor.write_ctrlreg_enabled |= ctrlreg_bitmask; > > > > > > 164 } > > > > > > 165 else > > > > > > 166 { > > > > > > 167 ad->monitor.write_ctrlreg_mask[mop->u.mov_to_cr.index] > > = 0; > > I vaguely remember that 31 was introduced simply as a "reserved" > precaution - we can probably safely please Coverity by simply patching > that code to not go over 3 as an index. > > To Petre's credit, he did notice and propose that we change this value > but I've suggested that we keep the check as-is for the future. My bad. :) > OK. Please submit a patch to fix this. Using 3 should be fine. Please include Coverity-ID: 1412966 in the commit message. > > Thanks, > Razvan > > > ___
[Xen-devel] [OSSTEST PATCH 06/15] sg-run-job: Execute prep-job/RECIPE if it exists
This allows a job to do its own host allocation, for example. Signed-off-by: Ian Jackson --- v2: New patch --- sg-run-job | 3 +++ 1 file changed, 3 insertions(+) diff --git a/sg-run-job b/sg-run-job index 8acb902..cfbc34b 100755 --- a/sg-run-job +++ b/sg-run-job @@ -73,6 +73,9 @@ proc run-job {job} { if {$ok} { setstatus running } +if {$ok && ![catch { info args prep-job/$jobinfo(recipe) }]} \ +{ catching-otherwise fail prep-job/$jobinfo(recipe) } + per-host-ts broken host-install/@(*) ts-host-install-twice per-host-prep -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [OSSTEST PATCH 12/15] sg-run-job: Abolish `anyfailed'
This is very similar to !$ok. anyfailed is set to 1 only by catching-otherwise, which also then sets ok to 0. The differences are solely some sites which set ok to 0 without setting anyfailed to 1: (i) In run-job, if there are escaped steps (ii) In run-job, if any steps were not run due to the skip_testids or truncate_testids runvars job is being truncated due to (iii) If a per-host-ts step fails The only places where anyfailed was used were: * To decide whether to do log capture in build jobs. This happens before (i) and (ii); and (iii) is not applicable because it only applies if $need_xen_hosts is nonempty which is never true if $need_build_host is false. We can simply use $ok instead, without any functional change. * In printing a message to stdout at the end of the job. We update this use site to use $ok and change the message accordingly. So overall, no functional change other than to a message in the transcript. Signed-off-by: Ian Jackson --- sg-run-job | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/sg-run-job b/sg-run-job index 654ba83..b1f94f4 100755 --- a/sg-run-job +++ b/sg-run-job @@ -40,13 +40,12 @@ proc per-host-finish {} { } proc run-job {job} { -global jobinfo builds flight ok truncate need_xen_hosts anyfailed +global jobinfo builds flight ok truncate need_xen_hosts global nested_layers_hosts truncate_globs skip_globs anyskipped set ok 1 set truncate 0 set anyskipped 0 -set anyfailed 0 jobdb::prepare $job set truncate_globs [jobdb::read-runvar $flight $job truncate_testids] @@ -96,7 +95,7 @@ proc run-job {job} { set need_xen_hosts [lunappend nested_layers_hosts] } -if {$need_build_host && $anyfailed} { +if {$need_build_host && !$ok} { run-ts !broken capture-logs ts-logs-capture + host } @@ -119,8 +118,8 @@ proc run-job {job} { if {$need_build_host && $ok} { jobdb::preserve-task 90 } -if {$anyfailed} { -jobdb::logputs stdout "at least one test failed" +if {!$ok} { +jobdb::logputs stdout "job not ok" } } @@ -128,9 +127,9 @@ proc catching-otherwise {failst script} { # Executes $script. # If job is already a failure (ie not $ok), skips it (ie does nothing). # If any Tcl exception is thrown, declares the job a failure. -# (ie sets job status to $failst, and sets anyfailed to 1 and ok to 0) +# (ie sets job status to $failst, and sets ok to 0) -global anyfailed flight jobinfo ok +global flight jobinfo ok if {!$ok} return @@ -139,7 +138,6 @@ proc catching-otherwise {failst script} { } emsg]} { jobdb::logputs stderr "$flight.$jobinfo(job) $script failed: $emsg" set ok 0 -set anyfailed 1 setstatus $failst } } -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [OSSTEST PATCH 10/15] sg-run-job: Spawn ts-syslog-server
Every job now gets a syslog server, which starts up after host allocation. The server's address and port are recorded by ts-syslog-server in a runvar and can be used by subsequent test steps. Signed-off-by: Ian Jackson --- v2: Change mistaken TESTID `.' to `='. `.' means literally `.'. Do not start ts-syslog-server if !$ok. Do not start ts-syslog-server if job is being truncated now. Use new | (stdin pipe) spawn-ts feature; avoids total failure. --- sg-run-job | 6 ++ 1 file changed, 6 insertions(+) diff --git a/sg-run-job b/sg-run-job index 620d89a..66c2c19 100755 --- a/sg-run-job +++ b/sg-run-job @@ -76,6 +76,8 @@ proc run-job {job} { if {$ok && ![catch { info args prep-job/$jobinfo(recipe) }]} \ { catching-otherwise fail prep-job/$jobinfo(recipe) } +if {$ok} { set syslog [spawn-ts broken = | ts-syslog-server] } + if {$ok && $need_build_host} \ { catching-otherwise broken prepare-build-host } @@ -99,6 +101,10 @@ proc run-job {job} { run-ts !broken capture-logs ts-logs-capture + host } +if {[info exists syslog]} { +reap-ts $syslog +} + if {$ok} { if {[jobdb::job-check-escaped-steps $flight $job]} { set ok 0 -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel