Re: [Xen-devel] [PATCH 3/3] AMD/IOMMU: replace a few literal numbers

2020-02-17 Thread Jan Beulich
On 17.02.2020 20:06, Andrew Cooper wrote:
> On 17/02/2020 13:09, Jan Beulich wrote:
>> On 10.02.2020 15:28, Andrew Cooper wrote:
>>> On 05/02/2020 09:43, Jan Beulich wrote:
 Introduce IOMMU_PDE_NEXT_LEVEL_{MIN,MAX} to replace literal 1, 6, and 7
 instances. While doing so replace two uses of memset() by initializers.

 Signed-off-by: Jan Beulich 
>>> This does not look to be an improvement.  IOMMU_PDE_NEXT_LEVEL_MIN is
>>> definitely bogus, and in all cases, a literal 1 is better, because that
>>> is how we describe pagetable levels.
>> I disagree.
> 
> A pagetable walking function which does:
> 
> while ( level > 1 )
> {
>     ...
>     level--;
> }
> 
> is far clearer and easier to follow than hiding 1 behind a constant
> which isn't obviously 1. Something like LEVEL_4K would at least be
> something that makes sense in context, but a literal 1 is less verbose.
> 
>>  The device table entry's mode field is bounded by 1
>> (min) and 6 (max) for the legitimate values to put there.
> 
> If by 1, you mean 0, then yes.

I don't, no. A value of zero means "translation disabled".

>  Coping properly with a mode of 0 looks
> to be easier than putting in an arbitrary restriction.

Coping with this mode is entirely orthogonal imo.
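
For readers without the patch at hand, a minimal sketch of the bounds being
discussed (the macro names come from the patch subject, everything else is
assumed for illustration only):

#define IOMMU_PDE_NEXT_LEVEL_MIN 1   /* smallest legitimate DTE mode */
#define IOMMU_PDE_NEXT_LEVEL_MAX 6   /* largest legitimate DTE mode */

/* Sketch only: a mode of 0 ("translation disabled") deliberately falls
 * outside this range and would need separate handling. */
static bool dte_mode_is_valid(unsigned int mode)
{
    return mode >= IOMMU_PDE_NEXT_LEVEL_MIN &&
           mode <= IOMMU_PDE_NEXT_LEVEL_MAX;
}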

>>> Something to replace literal 6/7 probably is ok, but doesn't want to be
>>> done like this.
>>>
>>> The majority of the problems here are caused by iommu_pde_from_dfn()'s
>>> silly ABI.  The pt_mfn[] array is problematic (because it is used as a
>>> 1-based array, not 0-based) and useless because both callers only want
>>> the 4k-equivalent mfn.  Fixing the ABI gets rid of quite a lot of wasted
>>> stack space, every use of '1', and every upper bound other than the
>>> BUG_ON() and amd_iommu_get_paging_mode().
>> I didn't mean to alter that function's behavior, at the very least
>> not until being certain there wasn't a reason it was coded with this
>> array approach. IOW the alternative to going with this patch
>> (subject to corrections of course) is for me to drop it altogether,
>> keeping the hard-coded numbers in place. Just let me know.
> 
> If you don't want to change the API, then I'll put it on my todo list.
> 
> As previously expressed, this patch on its own is not an improvement IMO.

We disagree here, quite obviously, but well, we'll have to live
with the literal numbers then. I'll drop the patch.

>>> and the IVRS table in Type 10.
>> Which may in turn be absent, i.e. the question of what to use as
>> a default merely gets shifted.
> 
> One of Type 10 or 11 is mandatory for each IOMMU in the system.  One way
> or another, the information is present.

Even for type 10 the description for the field says "If IVinfo[EFRSup] = 0,
this field is Reserved."

Jan


Re: [Xen-devel] [PATCH v2 0/6] x86: fixes/improvements for scratch cpumask

2020-02-17 Thread Jürgen Groß

On 17.02.20 19:43, Roger Pau Monne wrote:

Hello,

Commit:

5500d265a2a8fa63d60c08beb549de8ec82ff7a5
x86/smp: use APIC ALLBUT destination shorthand when possible

Introduced a bogus usage of the scratch cpumask: it was used in a
function that could be called from interrupt context, and hence using
the scratch cpumask there is not safe. Patch #5 is a fix for that usage,
together with preventing the use of any per-CPU variables when
send_IPI_mask is called from #MC or #NMI context. Previous patches are
preparatory changes.

Patch #6 adds some debug infrastructure to make sure the scratch cpumask
is used in the right context, and hence should prevent further misuses.


I wonder whether it wouldn't be better to have a common percpu scratch
cpumask handling instead of introducing local ones all over the
hypervisor.

So basically an array of percpu cpumasks allocated when bringing up a
cpu (this spares memory as the masks wouldn't need to cover NR_CPUS
cpus), a percpu counter of the next free index, and get_ and put_
functions acting in a LIFO manner.

This would help remove all the still existing on-stack cpumasks, and
any illegal nesting would be avoided. The only remaining question
would be the size of the array.
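
A rough sketch of the kind of interface I mean (plain C for illustration
only; all names, the stack depth and the per-cpu plumbing are made up):

/* Illustration only: one LIFO stack of scratch cpumasks per CPU.
 * Real code would use Xen's per-cpu infrastructure and cpumask_t. */
#include <assert.h>

#define SCRATCH_STACK_DEPTH 4             /* the open question: array size */
#define MAX_CPUS            64            /* stand-in for the real CPU count */

typedef unsigned long cpumask_t;          /* stand-in for Xen's cpumask_t */

struct scratch_stack {
    cpumask_t masks[SCRATCH_STACK_DEPTH]; /* would be allocated at CPU bringup */
    unsigned int next;                    /* index of the next free mask */
};

static struct scratch_stack scratch[MAX_CPUS];

static cpumask_t *get_scratch_cpumask(unsigned int cpu)
{
    struct scratch_stack *s = &scratch[cpu];

    assert(s->next < SCRATCH_STACK_DEPTH);              /* catch illegal nesting */
    return &s->masks[s->next++];
}

static void put_scratch_cpumask(unsigned int cpu, cpumask_t *m)
{
    struct scratch_stack *s = &scratch[cpu];

    assert(s->next > 0 && m == &s->masks[s->next - 1]); /* enforce LIFO order */
    s->next--;
}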

Thoughts?


Juergen


[Xen-devel] [linux-linus test] 147157: regressions - FAIL

2020-02-17 Thread osstest service owner
flight 147157 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147157/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-shadow12 guest-start  fail REGR. vs. 133580
 test-amd64-amd64-xl-shadow   12 guest-start  fail REGR. vs. 133580
 test-amd64-amd64-xl-credit2  12 guest-start  fail REGR. vs. 133580
 test-amd64-amd64-xl-multivcpu 12 guest-start fail REGR. vs. 133580
 test-amd64-amd64-xl-credit1  12 guest-start  fail REGR. vs. 133580
 test-arm64-arm64-xl-credit1  12 guest-start  fail REGR. vs. 133580
 test-amd64-amd64-libvirt-pair 24 guest-migrate/dst_host/src_host/debian.repeat 
fail REGR. vs. 133580
 test-arm64-arm64-xl-credit2  12 guest-start  fail REGR. vs. 133580
 test-armhf-armhf-xl-multivcpu 12 guest-start fail REGR. vs. 133580
 test-armhf-armhf-xl-credit1  12 guest-start  fail REGR. vs. 133580
 test-armhf-armhf-xl-credit2  12 guest-start  fail REGR. vs. 133580
 test-arm64-arm64-xl 16 guest-start/debian.repeat fail REGR. vs. 133580
 test-arm64-arm64-xl-xsm 16 guest-start/debian.repeat fail REGR. vs. 133580
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
133580
 test-arm64-arm64-libvirt-xsm 16 guest-start/debian.repeat fail REGR. vs. 133580
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 133580

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 12 guest-start  fail REGR. vs. 133580
 test-armhf-armhf-xl-rtds 12 guest-start  fail REGR. vs. 133580

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-seattle 16 guest-start/debian.repeat fail baseline untested
 test-arm64-arm64-xl-thunderx 16 guest-start/debian.repeat fail baseline 
untested
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 133580
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 133580
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 133580
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 133580
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 133580
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 133580
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 133580
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 133580
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop 

Re: [Xen-devel] CPU Lockup bug with the credit2 scheduler

2020-02-17 Thread Jürgen Groß

On 18.02.20 01:39, Glen wrote:

Hello Sander -

If I might chime in, I'm also experiencing what we believe is the same
problem, and hope I'm not breaking any protocol by sharing a few quick
details...

On Mon, Feb 17, 2020 at 3:46 PM Sander Eikelenboom  wrote:

On 17/02/2020 20:58, Sarah Newman wrote:

On 1/7/20 6:25 AM, Alastair Browne wrote:

So in conclusion, the tests indicate that credit2 might be unstable.
For the time being, we are using credit as the chosen scheduler. We

I don't think there are, but have there been any patches since the 4.13.0 
release which might have fixed problems with credit 2 scheduler? If not,
what would the next step be to isolating the problem - a debug build of Xen or 
something else?
If there are no merged or proposed fixes soon, it may be worth considering 
making the credit scheduler the default again until problems with the
credit2 scheduler are resolved.

I did take a look at Alastair Browne's report you replied to
(https://lists.xen.org/archives/html/xen-devel/2020-01/msg00361.html)
and I do see some differences:
 - Alastair's machine has multiple sockets, my machines don't.
 - It seems Alastair's config is using ballooning ? 
(dom0_mem=4096M,max:16384M), for me that has been a source of trouble in the 
past, so my configs don't.


My configuration has ballooning disabled, we do not use it, and we
still have the problem.


 - kernels tested are quite old (4.19.67 (latest upstream is 4.19.104), 
4.9.189 (latest upstream is 4.9.214)) and no really new kernel is tested
   (5.4 is available in Debian backport for buster).
 - Alastair, are you using pv, hvm or pvh guests? The report seems to miss 
the Guest configs (I'm primarily using PVH, and few HVM's, no PV except for 
dom0) ?


The problem appears to occur for both HVM and PV guests.

A report by Tomas
https://lists.xenproject.org/archives/html/xen-users/2020-02/msg00015.html
provides his config for his HVM setup.

My initial report
https://lists.xenproject.org/archives/html/xen-users/2020-02/msg00018.html
contains my PV guest config.


Anyhow, it could be worthwhile to test without ballooning, and test a recent 
kernel to rule out an issue with (missing) kernel backports.


Thanks to guidance from Sarah, we've had lots of discussion on the
users lists about this, especially this past week (pasting in
https://lists.xenproject.org/archives/html/xen-users/2020-02/ just for
your clicking convenience since I'm there as I type this) and it seems
like we've been able to narrow things down a bit:

* Alastair's config is on very large machines.  Tomas can duplicate
this on a much smaller scale, and I can duplicate it on a single DomU
running as the only guest on a Dom0 host.   So overall host
size/capacity doesn't seem to be very important, nor does number of
guests on the host.

* I'm using the Linux 4.12.14 kernel on both host and guest with Xen
4.12.1. - for me, the act of just going to a previous version of Xen
(in my case to Xen 4.10) eliminates the problem.  Tomas is on
4.14.159, and he reports that even moving back just to Xen 4.11
resolves his issue, whereas the issue seems to still exist in Xen
4.13.  So changing Xen versions without changing kernel versions seems
to resolve this.

* We've had another user mention that "When I switched to openSUSE Xen
4.13.0_04 packages with KernelStable (atm, 5.5.3-25.gd654690), Guests
of all 'flavors' became *much* better behaved.", so we think maybe
something in very recent Xen 4.13 might have helped (or possibly that
latest kernel, although from our limited point of view the changing of
Xen versions back to pre-4.12 solving this without any kernel changes
seems compelling.)

* Tomas has already tested, and I am still testing, Xen 4.12 with just
the sched=credit change.  For him that has eliminated the problem as
well, I am still stress-testing my guest under Xen 4.12 sched=credit,
so I cannot report, but I am hopeful.

I believe this is why Sarah asked about patches to 4.13... it is
looking to us just on the user level like this is possibly
kernel-independent, but at least Xen-version-dependent, and likely
credit-scheduler-dependent.

I apologize if I should be doing something different here, but it is
looking like a few more of us are having what we believe to be the
same problem and, based only on what I've seen, I've already changed
over all of my production hosts (I run about 20) to sched=credit as a
precautionary measure.

Any thoughts, insights or guidance would be greatly appreciated!


Can you check whether all vcpus of a hanging guest are consuming time
(via xl vcpu-list) ?

It would be interesting to see where the vcpus are running around. Can
you please copy the domU's /boot/System.map- to dom0
and then issue:

/usr/lib/xen/bin/xenctx -C -S -s  

This should give a backtrace for all vcpus of . To recognize a
loop you should issue that multiple times.


Juergen


Re: [Xen-devel] [PATCH 2/8] xen: add using domlist_read_lock in keyhandlers

2020-02-17 Thread Tian, Kevin
> From: Juergen Gross 
> Sent: Thursday, February 13, 2020 8:55 PM
> 
> Using for_each_domain() without holding the domlist_read_lock is
> fragile, so add the lock in the keyhandlers where it is missing.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Kevin Tian 
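
For reference, a sketch of the locking pattern the patch adds around the
affected keyhandlers (illustrative only, not the actual hunks):

static void dump_something(unsigned char key)
{
    struct domain *d;

    /* for_each_domain() walks the RCU-protected domain list, so the
     * read lock must be held across the traversal. */
    rcu_read_lock(&domlist_read_lock);

    for_each_domain ( d )
    {
        /* ... dump per-domain state for this key ... */
    }

    rcu_read_unlock(&domlist_read_lock);
}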


Re: [Xen-devel] [PATCH] xen: do live patching only from main idle loop

2020-02-17 Thread Tian, Kevin
> From: Juergen Gross 
> Sent: Tuesday, February 11, 2020 5:31 PM
> 
> One of the main design goals of core scheduling is to avoid actions
> which are not directly related to the domain currently running on a
> given cpu or core. Live patching is one of those actions which are
> allowed to take place on a cpu only when the idle scheduling unit is
> active on that cpu.
> 
> Unfortunately live patching tries to force the cpus into the idle loop
> just by raising the schedule softirq, which will no longer be
> guaranteed to work with core scheduling active. Additionally there are
> still some places in the hypervisor calling check_for_livepatch_work()
> without being in the idle loop.
> 
> It is easy to force a cpu into the main idle loop by scheduling a
> tasklet on it. So switch live patching to use tasklets for switching to
> idle and raising scheduling events. Additionally the calls of
> check_for_livepatch_work() outside the main idle loop can be dropped.
> 
> As tasklets are only running on idle vcpus and stop_machine_run()
> is activating tasklets on all cpus but the one it has been called on
> to rendezvous, it is mandatory for stop_machine_run() to be called on
> an idle vcpu, too, as otherwise there is no way for scheduling to
> activate the idle vcpu for the tasklet on the sibling of the cpu
> stop_machine_run() has been called on.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Kevin Tian 
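
A minimal sketch of the mechanism described above (assumed shape, not the
actual patch): scheduling a tasklet on a CPU forces that CPU through the
idle vcpu, i.e. into the main idle loop where livepatch work is checked for.

#include <xen/tasklet.h>

/* Sketch only: the real code coordinates per-cpu work and
 * stop_machine_run() rather than using a single static tasklet. */
static void idle_nudge_fn(void *unused)
{
    /* Nothing to do: merely running proves this CPU went through the
     * idle scheduling unit. */
}

static DECLARE_TASKLET(idle_nudge, idle_nudge_fn, NULL);

static void force_into_idle(unsigned int cpu)
{
    tasklet_schedule_on_cpu(&idle_nudge, cpu);
}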


[Xen-devel] HOW TO USE XENTRACE TO FIND DOM0 INFORMATION

2020-02-17 Thread SHREYA JAIN
I am working on a project related to the hypervisor. I used the commands
xentrace -d -e 0xf000 -T 20 trace.bin
xenalyze trace.bin > x.txt

How do I analyze this file, and how do I find out which hypercall each of
these hypercall numbers corresponds to?
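
For reference, the numbers in square brackets in the xenalyze output below
are hypercall numbers from Xen's public ABI; the mapping (excerpted from
xen/include/public/xen.h, listing only the entries that appear here) is:

#define __HYPERVISOR_mmu_update            1
#define __HYPERVISOR_stack_switch          3
#define __HYPERVISOR_memory_op            12
#define __HYPERVISOR_multicall            13
#define __HYPERVISOR_update_va_mapping    14
#define __HYPERVISOR_xen_version          17
#define __HYPERVISOR_grant_table_op       20
#define __HYPERVISOR_iret                 23
#define __HYPERVISOR_vcpu_op              24
#define __HYPERVISOR_set_segment_base     25
#define __HYPERVISOR_mmuext_op            26
#define __HYPERVISOR_xsm_op               27   /* reported as acm_op below */
#define __HYPERVISOR_sched_op             29
#define __HYPERVISOR_event_channel_op     32   /* reported as evtchn_op below */
#define __HYPERVISOR_physdev_op           33
#define __HYPERVISOR_sysctl               35
#define __HYPERVISOR_domctl               36
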
Total time: 9.98 seconds (using cpu speed 2.40 GHz)
--- Log volume summary ---
 - cpu 0 -
 gen   :480
 sched : 931788
 +-verbose: 665004
 hvm   : 12
 mem   : 96
 pv:7387440
 hw: 235524
|-- Domain 0 --|
 Runstates:
   blocked: 293  0.15s 1232141 {105240|7148524|31202268}
  full run: 324  1.11s 8212027 {21120712|71848352|71848352}
  partial contention:2844  1.23s 1041739 { 48504|8711020|16668336}
  concurrency_hazard:5100  7.47s 3515838 {7625856|71845720|71845720}
  full_contention:1063  0.01s  25305 {  9708|2692036|9898548}
 Grant table ops:
  Done by:
  Done for:
 Populate-on-demand:
  Populated:
  Reclaim order:
  Reclaim contexts:
-- v0 --
 Runstates:
   running: 496  2.00s 9666478 {196660|7196|71848352}
  runnable: 638  7.64s 28745046 {93745884|800175728|1332287024}
wake:  58  0.51s 21159872 {688272344|688272344|688272344}
 preempt:  18  0.24s 32003641 {575114260|575114260|575114260}
   other: 562  6.89s 29423489 {56741048|1332287024|1332287024}
   blocked: 437  0.21s 1131642 {2227148|31202268|31202268}
   offline:   2  0.00s  14342 { 14908| 14908| 14908}
 cpu affinity:   1 23934153168 {23934153168|23934153168|23934153168}
   [0]:   1 23934153168 {23934153168|23934153168|23934153168}
PV events:
  page_fault  1005
  math state restore  18
  ptwr  437
  hypercall  18940
mmu_update   [ 1]:   1093
stack_switch [ 3]:   2139
multicall[13]:  1
update_va_mapping[14]:  1
xen_version  [17]: 25
iret [23]:  10501
vcpu_op  [24]:   1161
set_segment_base [25]:   1569
mmuext_op[26]:   1341
sched_op [29]:451
evtchn_op[32]:548
physdev_op   [33]:110
  *hypercall (subcall)  2*
-- v1 --
 Runstates:
   running: 674  2.32s 8253727 {23224640|71845964|71955400}
  runnable:1161  6.11s 12624440 {16853932|769289692|1010395252}
wake: 301  0.29s 2296535 {62882020|571958924|571958924}
 preempt:  13  0.15s 28203032 {279082744|279082744|279082744}
   other: 847  5.67s 16055582 {21473244|229039264|1010395252}
   blocked: 604  0.29s 1158752 {1609692|4310964|9690508}
   offline:   1  0.00s  16020 { 16020| 16020| 16020}
 cpu affinity:   1 20919890036 {20919890036|20919890036|20919890036}
   [0]:   1 20919890036 {20919890036|20919890036|20919890036}
PV events:
  page_fault  17979
  emulate privop  692
  math state restore  43
  ptwr  9947
  hypercall  *150054*
mmu_update   [ 1]:  61475
stack_switch [ 3]:   4861
memory_op[12]:  3
multicall[13]:975
update_va_mapping[14]:   4561
xen_version  [17]:152
grant_table_op   [20]:  3
iret [23]:  58997
vcpu_op  [24]:   1197
set_segment_base [25]:   4252
mmuext_op[26]:  11353
acm_op   [27]:  9
sched_op [29]:617
evtchn_op[32]:965
physdev_op   [33]:604
sysctl   [35]: 21
domctl   [36]:  9
  hypercall (subcall) * 7924*
-- v2 --
 Runstates:
   running: 106  0.27s 6001367 {32532916|71841040|71843144}
  runnable: 402  4.65s 27763825 {176048840|726964484|3539128564}
wake:  24  0.18s 18023801 {285953912|285953912|285953912}
 preempt:  13  0.01s 1569969 {20284976|20284976|20284976}
   other: 365  4.46s 29337197 {78401584|3539128564|3539128564}
   blocked:  86  0.05s 1427216 {3750688|7241324|43684308}
 cpu affinity:   1 12016972924 {12016972924|12016972924|12016972924}
   [0]:   1 12016972924 {12016972924|12016972924|12016972924}
PV events:
  page_fault  1236
  math state restore  10
  ptwr  615
  hypercall  16813
mmu_update   [ 1]:   7607
stack_switch [ 3]:643
multicall[13]:189
update_va_mapping[14]:598
xen_version  [17]:  9
iret [23]:   5205
vcpu_op  [24]:164
set_segment_base [25]:534
mmuext_op[26]:   1506
sched_op [29]: 99
evtchn_op[32]:107
physdev_op   [33]:152
  hypercall (subcall)  *952*
-- v3 --
 

Re: [Xen-devel] [PATCH] x86/p2m: drop p2m_access_t parameter from set_mmio_p2m_entry()

2020-02-17 Thread Tian, Kevin
> From: Jan Beulich 
> Sent: Thursday, February 6, 2020 11:20 PM
> 
> Both callers request the host P2M's default access, which can as well be
> done inside the function. While touching this anyway, make the "gfn"
> parameter type-safe as well.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Kevin Tian 
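
In sketch form, the shape of the change being described (prototypes assumed
for illustration; see the patch itself for the real declarations):

/* Old (sketch): callers passed a raw gfn plus the host p2m's default access:
 *   int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
 *                          unsigned int order, p2m_access_t access);
 */

/* New (sketch): the default access is applied inside, gfn is type-safe. */
int set_mmio_p2m_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
                       unsigned int order);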

Re: [Xen-devel] [PATCH] VT-d: drop stray "list" field from struct user_rmrr

2020-02-17 Thread Tian, Kevin
> From: Jan Beulich 
> Sent: Thursday, February 6, 2020 11:35 PM
> 
> The field looks to have been bogusly added by the patch introducing the
> struct (431685e8deb6 "VT-d: add command line option for extra rmrrs").
> 
> Signed-off-by: Jan Beulich 
> 

Reviewed-by: Kevin Tian 

Re: [Xen-devel] [PATCH 2/2] VT-d: adjust logging of RMRRs

2020-02-17 Thread Tian, Kevin
> From: Jan Beulich 
> Sent: Thursday, February 6, 2020 9:31 PM
> To: xen-devel@lists.xenproject.org
> Cc: Tian, Kevin 
> Subject: [PATCH 2/2] VT-d: adjust logging of RMRRs
> 
> Consistently use [,] range representation, shrink leading double blanks
> to a single one, and slightly adjust text in some cases.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Kevin Tian 

Re: [Xen-devel] [PATCH 1/2] VT-d: check all of an RMRR for being E820-reserved

2020-02-17 Thread Tian, Kevin
> From: Jan Beulich 
> Sent: Thursday, February 6, 2020 9:31 PM
> 
> Checking just the first and last page is not sufficient (and redundant
> for single-page regions). As we don't need to care about IA64 anymore,
> use an x86-specific function to get this done without looping over each
> individual page.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Kevin Tian 
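
Conceptually, the check described above amounts to the sketch below
(illustration only; the actual patch uses an existing x86 helper instead of
open-coding a loop over each page or region):

/* Xen-context sketch: is the whole RMRR [base, end] (end inclusive)
 * covered by a single E820_RESERVED entry? */
static bool rmrr_is_e820_reserved(uint64_t base, uint64_t end)
{
    unsigned int i;

    for ( i = 0; i < e820.nr_map; i++ )
    {
        const struct e820entry *entry = &e820.map[i];

        if ( entry->type != E820_RESERVED )
            continue;

        if ( base >= entry->addr && end < entry->addr + entry->size )
            return true;
    }

    return false;
}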

Re: [Xen-devel] CPU Lockup bug with the credit2 scheduler

2020-02-17 Thread Glen
Hello Sander -

If I might chime in, I'm also experiencing what we believe is the same
problem, and hope I'm not breaking any protocol by sharing a few quick
details...

On Mon, Feb 17, 2020 at 3:46 PM Sander Eikelenboom  wrote:
> On 17/02/2020 20:58, Sarah Newman wrote:
> > On 1/7/20 6:25 AM, Alastair Browne wrote:
> >> So in conclusion, the tests indicate that credit2 might be unstable.
> >> For the time being, we are using credit as the chosen scheduler. We
> > I don't think there are, but have there been any patches since the 4.13.0 
> > release which might have fixed problems with credit 2 scheduler? If not,
> > what would the next step be to isolating the problem - a debug build of Xen 
> > or something else?
> > If there are no merged or proposed fixes soon, it may be worth considering 
> > making the credit scheduler the default again until problems with the
> > credit2 scheduler are resolved.
> I did take a look at Alastair Browne's report you replied to 
> (https://lists.xen.org/archives/html/xen-devel/2020-01/msg00361.html)
> and I do see some differences:
> - Alastair's machine has multiple sockets, my machines don't.
> - It seems Alastair's config is using ballooning ? 
> (dom0_mem=4096M,max:16384M), for me that has been a source of trouble in the 
> past, so my configs don't.

My configuration has ballooning disabled, we do not use it, and we
still have the problem.

> - kernels tested are quite old (4.19.67 (latest upstream is 4.19.104), 
> 4.9.189 (latest upstream is 4.9.214)) and no really new kernel is tested
>   (5.4 is available in Debian backport for buster).
> - Alastair, are you using pv, hvm or pvh guests? The report seems to miss 
> the Guest configs (I'm primarily using PVH, and few HVM's, no PV except for 
> dom0) ?

The problem appears to occur for both HVM and PV guests.

A report by Tomas
https://lists.xenproject.org/archives/html/xen-users/2020-02/msg00015.html
provides his config for his HVM setup.

My initial report
https://lists.xenproject.org/archives/html/xen-users/2020-02/msg00018.html
contains my PV guest config.

> Anyhow, it could be worthwhile to test without ballooning, and test a recent 
> kernel to rule out an issue with (missing) kernel backports.

Thanks to guidance from Sarah, we've had lots of discussion on the
users lists about this, especially this past week (pasting in
https://lists.xenproject.org/archives/html/xen-users/2020-02/ just for
your clicking convenience since I'm there as I type this) and it seems
like we've been able to narrow things down a bit:

* Alastair's config is on very large machines.  Tomas can duplicate
this on a much smaller scale, and I can duplicate it on a single DomU
running as the only guest on a Dom0 host.   So overall host
size/capacity doesn't seem to be very important, nor does number of
guests on the host.

* I'm using the Linux 4.12.14 kernel on both host and guest with Xen
4.12.1. - for me, the act of just going to a previous version of Xen
(in my case to Xen 4.10) eliminates the problem.  Tomas is on
4.14.159, and he reports that even moving back just to Xen 4.11
resolves his issue, whereas the issue seems to still exist in Xen
4.13.  So changing Xen versions without changing kernel versions seems
to resolve this.

* We've had another user mention that "When I switched to openSUSE Xen
4.13.0_04 packages with KernelStable (atm, 5.5.3-25.gd654690), Guests
of all 'flavors' became *much* better behaved.", so we think maybe
something in very recent Xen 4.13 might have helped (or possibly that
latest kernel, although from our limited point of view the changing of
Xen versions back to pre-4.12 solving this without any kernel changes
seems compelling.)

* Tomas has already tested, and I am still testing, Xen 4.12 with just
the sched=credit change.  For him that has eliminated the problem as
well, I am still stress-testing my guest under Xen 4.12 sched=credit,
so I cannot report, but I am hopeful.

I believe this is why Sarah asked about patches to 4.13... it is
looking to us just on the user level like this is possibly
kernel-independent, but at least Xen-version-dependent, and likely
credit-scheduler-dependent.

I apologize if I should be doing something different here, but it is
looking like a few more of us are having what we believe to be the
same problem and, based only on what I've seen, I've already changed
over all of my production hosts (I run about 20) to sched=credit as a
precautionary measure.

Any thoughts, insights or guidance would be greatly appreciated!

Respectfully,
Glen


Re: [Xen-devel] [PATCH v4 1/7] SVM: drop asm/hvm/emulate.h inclusion from vmcb.h

2020-02-17 Thread Tian, Kevin
> From: Jan Beulich 
> Sent: Saturday, February 1, 2020 12:42 AM
> 
> It's not needed there and introduces a needless, almost global
> dependency. Include the file (or in some cases just xen/err.h) where
> actually needed, or - in one case - simply forward-declare a struct. In
> microcode*.c take the opportunity and also re-order a few other
> #include-s.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Kevin Tian 
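
The forward-declaration case mentioned above is the usual C idiom; a hedged
sketch (struct and function names assumed, not taken from the patch):

/* A header that only passes pointers around does not need the full type
 * definition, so a forward declaration replaces the #include. */
struct hvm_emulate_ctxt;

void frob_emulation_state(struct hvm_emulate_ctxt *ctxt);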

[Xen-devel] [ovmf test] 147160: regressions - FAIL

2020-02-17 Thread osstest service owner
flight 147160 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147160/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 
145767
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 145767

version targeted for testing:
 ovmf f1d78c489a39971b5aac5d2fc8a39bfa925c3c5d
baseline version:
 ovmf 70911f1f4aee0366b6122f2b90d367ec0f066beb

Last test of basis   145767  2020-01-08 00:39:09 Z   41 days
Failing since145774  2020-01-08 02:50:20 Z   41 days  126 attempts
Testing same since   147093  2020-02-15 16:19:17 Z2 days2 attempts


People who touched revisions under test:
  Aaron Li 
  Albecki, Mateusz 
  Amol N Sukerkar 
  Anthony PERARD 
  Antoine Coeur 
  Ard Biesheuvel 
  Ashish Singhal 
  Bob Feng 
  Bret Barkelew 
  Brian R Haug 
  Chasel Chiu 
  Dandan Bi 
  Eric Dong 
  Fan, ZhijuX 
  Felix Polyudov 
  Guo Dong 
  GuoMinJ 
  Hao A Wu 
  Heng Luo 
  Jason Voelz 
  Jeff Brasen 
  Jian J Wang 
  Kinney, Michael D 
  Krzysztof Koch 
  Laszlo Ersek 
  Leif Lindholm 
  Li, Aaron 
  Liming Gao 
  Liu, Zhiguang 
  Mateusz Albecki 
  Matthew Carlson 
  Michael D Kinney 
  Michael Kubacki 
  Pavana.K 
  Philippe Mathieu-Daud? 
  Philippe Mathieu-Daude 
  Philippe Mathieu-Daudé 
  Philippe Mathieu-Daudé 
  Pierre Gondois 
  Ray Ni 
  Sami Mujawar 
  Sean Brogan 
  Siyuan Fu 
  Siyuan, Fu 
  Steven 
  Steven Shi 
  Sudipto Paul 
  Vitaly Cheptsov 
  Vitaly Cheptsov via Groups.Io 
  Wei6 Xu 
  Xu, Wei6 
  Zhichao Gao 
  Zhiguang Liu 
  Zhiju.Fan 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 fail
 test-amd64-i386-xl-qemuu-ovmf-amd64  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 5317 lines long.)


[Xen-devel] [xen-unstable-smoke test] 147218: tolerable all pass - PUSHED

2020-02-17 Thread osstest service owner
flight 147218 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147218/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  c9727280da893b57a4eb33de26fbc6669410eabb
baseline version:
 xen  8171e0796542e11c2d5067f86cc69201c2584501

Last test of basis   147213  2020-02-17 20:03:31 Z0 days
Testing same since   147218  2020-02-17 23:07:32 Z0 days1 attempts


People who touched revisions under test:
  George Dunlap 
  Julien Grall 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   8171e07965..c9727280da  c9727280da893b57a4eb33de26fbc6669410eabb -> smoke


Re: [Xen-devel] [BUG] panic: "IO-APIC + timer doesn't work" - several people have reproduced

2020-02-17 Thread Andrew Cooper

On 17/02/2020 20:41, Jason Andryuk wrote:

On Mon, Feb 17, 2020 at 2:46 PM Andrew Cooper  wrote:

On 17/02/2020 19:19, Jason Andryuk wrote:

On Tue, Dec 31, 2019 at 5:43 AM Aaron Janse  wrote:

On Tue, Dec 31, 2019, at 12:27 AM, Andrew Cooper wrote:

Is there any full boot log in the bad case?  Debugging via divination
isn't an effective way to get things done.

Agreed. I included some more verbose logs towards the end of the email (typed 
up by hand).

Attached are pictures from a slow-motion video of my laptop booting. Note that 
I also included a picture of a stack trace that happens immediately before 
reboot. It doesn't look related, but I wanted to include it anyway.

I think the original email should have said "4.8.5" instead of "4.0.5." 
Regardless, everyone on this mailing list can now see all the boot logs that I've seen.

Attaching a serial console seems like it would be difficult to do on this 
laptop, otherwise I would have sent the logs as a txt file.

I'm seeing Xen panic: "IO-APIC + timer doesn't work" on a Dell
Latitude 7200 2-in-1.  Fedora 31 Live USB image boots successfully.
No way to get serial output.  I manually recreated the output before
from the vga display.

We have multiple bugs.

First and foremost, Xen seems totally broken when running in ExtINT
mode.  This needs addressing, and ought to be sufficient to let Xen
boot, at which point we can try to figure out why it is trying to fall
back into 486(ish) compatibility mode.


I tested Linux with intel_iommu=on and that booted successfully.
Under Xen, this system sets iommu_x2apic_enabled = true, so
force_iommu is set and iommu=0 cannot disable the iommu, so that still
fails.  I can, however, disable x2apic and then disable the iommu:

x2apic=1 -> failure above
x2apic=0 iommu=0 -> failure above
clocksource=acpi -> doesn't help
clocksource=pit -> hangs after "load tracking window length 1073741824 ns"

None of these are surprising, given that Xen can't make any interrupts
work at all.


noapic -> BUG in init_bsp_APIC

This is a surprise.  It's clearly a bug in Xen.  (OTOH, I've been
threatening to rip all of that logic out, because there is no such thing
as a 64bit capable system without an integrated APIC.)

It's a GPF [error_code=] at init_bsp_APIC+0x53 which is
0x82d080428f86 <+64>:  je     0x82d080428fc9
0x82d080428f88 <+66>:  or     $0xff,%al
0x82d080428f8a <+68>:  test   %sil,%sil
0x82d080428f8d <+71>:  je     0x82d080428fd8
0x82d080428f8f <+73>:  mov    $0x80f,%ecx
0x82d080428f94 <+78>:  mov    $0x0,%edx
0x82d080428f99 <+83>:  wrmsr

RAX is 0x3ff

This is immediately after Xen prints "Switched to APIC driver x2apic_cluster"


Hmm, in which case it isn't a BUG specifically, but merely a crash. 
0x3ff to SPIV is trying to set reserved bits, so it is no surprise that 
there is a #GP.


In which case this can safely be filed under "even more collateral 
damage from failing to set up any kind of interrupt handling".



One other thing that might be noteworthy.  Linux only prints ACPI IRQ0
and IRQ9 used by override where Xen lists IRQ 0, 2 & 9.

Huh - this is supposed to come directly from the ACPI tables, so Linux
and Xen should be using the same source of information.


Below is the re-constructed Xen console output.  The SMBIOS line is
the first thing displayed on the VGA output.

Yes - it is the first thing printed after vesa_init() which I think is a
manifestation of a previous EFI bug I've reported.  Does booting with
-basevideo help?  (No need to transcribe the output, manually.  Just
need to know if it lets you see the full log.)

I'm booting grub->xen.gz so -basevideo isn't directly applicable.  My
attempt at setting a boot entry failed, so I'll have to try that
again.


Ah ok.  One thing which Xen(.gz) needs to do is to take video details 
from the bootloader rather than trying to figure them out itself.


By default, Xen.gz will try and write into the legacy vga range which 
most likely isn't working in an EFI system.


(As a slight tangent, It is possible to test xen.efi via grub with a 
suitable chainloader stanza, but xen.efi is deficient in enough 
important ways that I'd avoid it unless absolutely necessary.)


~Andrew


[Xen-devel] [linux-4.19 test] 147144: regressions - FAIL

2020-02-17 Thread osstest service owner
flight 147144 linux-4.19 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147144/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
142932
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 142932

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop fail like 142932
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass

version targeted for testing:
 linux    9b15f7fae677336e04b9e026ff91854e43165455
baseline version:
 linux    c3038e718a19fc596f7b1baba0f83d5146dc7784

Last test of basis   142932  2019-10-19 23:17:10 Z  121 days
Failing since143326  2019-10-29 08:49:29 Z  111 days   11 attempts
Testing same since   147075  2020-02-15 05:44:56 Z2 days2 attempts


1786 people touched 

[Xen-devel] [xen-unstable test] 147140: tolerable FAIL

2020-02-17 Thread osstest service owner
flight 147140 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147140/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 147069
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 147069
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 147069
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 147069
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 147069
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail like 
147069
 test-armhf-armhf-xl-rtds 16 guest-start/debian.repeatfail  like 147069
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 147069
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 147069
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 147069
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 147069
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 147069
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  707db77a380b96025bae8bc4322da0b64819d3b7
baseline version:
 xen  707db77a380b96025bae8bc4322da0b64819d3b7

Last test of basis   147140  2020-02-16 14:57:28 Z1 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 

Re: [Xen-devel] CPU Lockup bug with the credit2 scheduler

2020-02-17 Thread Sander Eikelenboom
On 17/02/2020 20:58, Sarah Newman wrote:
> On 1/7/20 6:25 AM, Alastair Browne wrote:
>>
>> CONCLUSION
>>
>> So in conclusion, the tests indicate that credit2 might be unstable.
>>
>> For the time being, we are using credit as the chosen scheduler. We
>> are booting the kernel with a parameter "sched=credit" to ensure that
>> the correct scheduler is used.
>>
>> After the tests, we decided to stick with 4.9.0.9 kernel and 4.12 Xen
>> for production use running credit1 as the default scheduler.
> 
> One person CC'ed appears to be having the same experience, where the credit2 
> scheduler leads to lockups (in this case in the domU, not the dom0) under 
> relatively heavy load. It seems possible they may have the same root cause.
> 
> I don't think there are, but have there been any patches since the 4.13.0 
> release which might have fixed problems with credit 2 scheduler? If not, 
> what would the next step be to isolating the problem - a debug build of Xen 
> or something else?
> 
> If there are no merged or proposed fixes soon, it may be worth considering 
> making the credit scheduler the default again until problems with the 
> credit2 scheduler are resolved.
> 
> Thanks, Sarah
> 
> 

Hi Sarah / Alastair,

I can only provide my n=1 (OK, I'm running a bunch of boxes, some of which 
pretty over-committed CPU wise), 
but I haven't seen any issues (lately) with credit2.

I did take a look at Alastair Browne's report you replied to 
(https://lists.xen.org/archives/html/xen-devel/2020-01/msg00361.html)
and I do see some differences:
- Alastair's machine has multiple sockets, my machines don't.
- It seems Alastair's config is using ballooning ? 
(dom0_mem=4096M,max:16384M), for me that has been a source of trouble in the 
past, so my configs don't.
- kernels tested are quite old (4.19.67 (latest upstream is 4.19.104), 
4.9.189 (latest upstream is 4.9.214)) and no really new kernel is tested
  (5.4 is available in Debian backport for buster). 
- Alastair, are you using pv, hvm or pvh guests? The report seems to miss 
the Guest configs (I'm primarily using PVH, and few HVM's, no PV except for 
dom0) ?

Anyhow, it could be worthwhile to test without ballooning, and test a recent 
kernel to rule out an issue with (missing) kernel backports.

--
Sander


Re: [Xen-devel] [OSSTEST PATCH V2] build: fix configuration of libvirt

2020-02-17 Thread Jim Fehlig

On 2/14/20 10:47 AM, Ian Jackson wrote:

Jim Fehlig writes ("[OSSTEST PATCH V2] build: fix configuration of libvirt"):

libvirt.git commit 2621d48f00 removed the last traces of gnulib, which
also removed the '--no-git' option from autogen.sh. Unknown options are
now passed to the configure script, which quickly fails with

   configure: error: unrecognized option: `--no-git'

Remove the gnulib handling from ts-libvirt-build, including the '--no-git'
option to autogen.sh. While at it remove configure options no longer
supported by the libvirt configure script.


Harmf.  Thanks for looking into this and trying to fix this mess.

I think there is a problem with your patch, which is that 2621d48f00
is recent enough that we might want still to be able to build with
earlier versions.


Ah, good point.


Is there an easy way to tell (by looking at the tree after checkout,
maybe) whether to do the old or the new thing ?


There would be no gnulib directory in a tree checked out after commit 
2621d48f00. Another option is to check for the 'bootstrap' script in the root of 
the tree, which was removed by 2621d48f00.



Your perl code looks good to me for what it is trying to do.


I'm afraid my perl is too weak to quickly hack something up to support both pre 
and post gnulib builds :-(. I'll add this task to my list if you don't have time 
for it.


Regards,
Jim


Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks for PM suspend and hibernation

2020-02-17 Thread Anchal Agarwal
On Mon, Feb 17, 2020 at 11:05:09AM +0100, Roger Pau Monné wrote:
> On Fri, Feb 14, 2020 at 11:25:34PM +, Anchal Agarwal wrote:
> > From: Munehisa Kamata 
> > Add freeze, thaw and restore callbacks for PM suspend and hibernation
> > support. All frontend drivers that need to use PM_HIBERNATION/PM_SUSPEND
> > events, need to implement these xenbus_driver callbacks.
> > The freeze handler stops a block-layer queue and disconnects the
> > frontend from the backend while freeing ring_info and associated resources.
> > The restore handler re-allocates ring_info and re-connects to the
> > backend, so the rest of the kernel can continue to use the block device
> > transparently. Also, the handlers are used for both PM suspend and
> > hibernation so that we can keep the existing suspend/resume callbacks for
> > Xen suspend without modification. Before disconnecting from backend,
> > we need to prevent any new IO from being queued and wait for existing
> > IO to complete.
> 
> This is different from Xen (xenstore) initiated suspension, as in that
> case Linux doesn't flush the rings or disconnects from the backend.
Yes, AFAIK in Xen-initiated suspension the backend takes care of it. 
> 
> This is done so that in case suspensions fails the recovery doesn't
> need to reconnect the PV devices, and in order to speed up suspension
> time (ie: waiting for all queues to be flushed can take time as Linux
> supports multiqueue, multipage rings and indirect descriptors), and
> the backend could be contended if there's a lot of IO pressure from
> guests.
> 
> Linux already keeps a shadow of the ring contents, so in-flight
> requests can be re-issued after the frontend has reconnected during
> resume.
> 
> > Freeze/unfreeze of the queues will guarantee that there
> > are no requests in use on the shared ring.
> > 
> > Note: For older backends, if a backend doesn't have commit '12ea729645ace'
> > xen/blkback: unmap all persistent grants when frontend gets disconnected,
> > the frontend may see massive amount of grant table warning when freeing
> > resources.
> > [   36.852659] deferring g.e. 0xf9 (pfn 0x)
> > [   36.855089] xen:grant_table: WARNING: g.e. 0x112 still in use!
> > 
> > In this case, persistent grants would need to be disabled.
> > 
> > [Anchal Changelog: Removed timeout/request during blkfront freeze.
> > Fixed major part of the code to work with blk-mq]
> > Signed-off-by: Anchal Agarwal 
> > Signed-off-by: Munehisa Kamata 
> > ---
> >  drivers/block/xen-blkfront.c | 119 ---
> >  1 file changed, 112 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> > index 478120233750..d715ed3cb69a 100644
> > --- a/drivers/block/xen-blkfront.c
> > +++ b/drivers/block/xen-blkfront.c
> > @@ -47,6 +47,8 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -79,6 +81,8 @@ enum blkif_state {
> > BLKIF_STATE_DISCONNECTED,
> > BLKIF_STATE_CONNECTED,
> > BLKIF_STATE_SUSPENDED,
> > +   BLKIF_STATE_FREEZING,
> > +   BLKIF_STATE_FROZEN
> >  };
> >  
> >  struct grant {
> > @@ -220,6 +224,7 @@ struct blkfront_info
> > struct list_head requests;
> > struct bio_list bio_list;
> > struct list_head info_list;
> > +   struct completion wait_backend_disconnected;
> >  };
> >  
> >  static unsigned int nr_minors;
> > @@ -261,6 +266,7 @@ static DEFINE_SPINLOCK(minor_lock);
> >  static int blkfront_setup_indirect(struct blkfront_ring_info *rinfo);
> >  static void blkfront_gather_backend_features(struct blkfront_info *info);
> >  static int negotiate_mq(struct blkfront_info *info);
> > +static void __blkif_free(struct blkfront_info *info);
> >  
> >  static int get_id_from_freelist(struct blkfront_ring_info *rinfo)
> >  {
> > @@ -995,6 +1001,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, 
> > u16 sector_size,
> > info->sector_size = sector_size;
> > info->physical_sector_size = physical_sector_size;
> > blkif_set_queue_limits(info);
> > +   init_completion(&info->wait_backend_disconnected);
> >  
> > return 0;
> >  }
> > @@ -1218,6 +1225,8 @@ static void xlvbd_release_gendisk(struct 
> > blkfront_info *info)
> >  /* Already hold rinfo->ring_lock. */
> >  static inline void kick_pending_request_queues_locked(struct 
> > blkfront_ring_info *rinfo)
> >  {
> > +   if (unlikely(rinfo->dev_info->connected == BLKIF_STATE_FREEZING))
> > +   return;
> > if (!RING_FULL(&rinfo->ring))
> > blk_mq_start_stopped_hw_queues(rinfo->dev_info->rq, true);
> >  }
> > @@ -1341,8 +1350,6 @@ static void blkif_free_ring(struct blkfront_ring_info 
> > *rinfo)
> >  
> >  static void blkif_free(struct blkfront_info *info, int suspend)
> >  {
> > -   unsigned int i;
> > -
> > /* Prevent new requests being issued until we fix things up. */
> > info->connected = suspend ?
> > BLKIF_STATE_SUSPENDED : BLKIF_STATE_DISCONNECTED;
> 

[Xen-devel] [xen-unstable-smoke test] 147213: tolerable all pass - PUSHED

2020-02-17 Thread osstest service owner
flight 147213 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147213/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  8171e0796542e11c2d5067f86cc69201c2584501
baseline version:
 xen  707db77a380b96025bae8bc4322da0b64819d3b7

Last test of basis   147063  2020-02-14 21:10:52 Z3 days
Testing same since   147213  2020-02-17 20:03:31 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Ian Jackson 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   707db77a38..8171e07965  8171e0796542e11c2d5067f86cc69201c2584501 -> smoke


[Xen-devel] [PATCH] xen/arm: Workaround clang/armclang support for register allocation

2020-02-17 Thread Julien Grall
Clang 8.0 (see [1]) and by extension some versions of armclang do not
support register allocation using the syntax rN.

Thankfully, both GCC [2] and clang are able to support the xN syntax for
Arm64. Introduce a new macro ASM_REG() and use it in common code for
register allocation.

[1] https://reviews.llvm.org/rL328829
[2] https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html
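
As a usage sketch (hypothetical example, not part of the patch; the function
identifier value is made up), ASM_REG() ties a local variable to a fixed
general-purpose register so it can be fed to an inline asm statement:

    /* Hypothetical SMC invocation; 'fid' is an arbitrary illustrative value. */
    unsigned long fid = 0x84000000UL;
    register unsigned long r0 ASM_REG(0) = fid; /* asm("r0") on arm32, asm("x0") on arm64 */

    asm volatile ("smc #0" : "+r" (r0) : : "memory");
    /* r0 now holds the first return value of the call. */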

Cc: Andrii Anisov 
Signed-off-by: Julien Grall 
---
 xen/include/asm-arm/asm_defns.h |  8 +++-
 xen/include/asm-arm/smccc.h | 74 -
 2 files changed, 44 insertions(+), 38 deletions(-)

diff --git a/xen/include/asm-arm/asm_defns.h b/xen/include/asm-arm/asm_defns.h
index b4fbcdae1d..29a9dbb002 100644
--- a/xen/include/asm-arm/asm_defns.h
+++ b/xen/include/asm-arm/asm_defns.h
@@ -7,11 +7,17 @@
 #endif
 #include 
 
-/* For generic assembly code: use macros to define operand sizes. */
+/* Macros for generic assembly code */
 #if defined(CONFIG_ARM_32)
 # define __OP32
+# define ASM_REG(index) asm("r" # index)
 #elif defined(CONFIG_ARM_64)
 # define __OP32 "w"
+/*
+ * Clang < 8.0 doesn't support register allocation using the syntax rN.
+ * See https://reviews.llvm.org/rL328829.
+ */
+# define ASM_REG(index) asm("x" # index)
 #else
 # error "unknown ARM variant"
 #endif
diff --git a/xen/include/asm-arm/smccc.h b/xen/include/asm-arm/smccc.h
index 126399dd70..9d94beb3df 100644
--- a/xen/include/asm-arm/smccc.h
+++ b/xen/include/asm-arm/smccc.h
@@ -120,59 +120,59 @@ struct arm_smccc_res {
 #define __constraint_read_6 __constraint_read_5, "r" (r6)
 #define __constraint_read_7 __constraint_read_6, "r" (r7)
 
-#define __declare_arg_0(a0, res)\
-struct arm_smccc_res*___res = res;  \
-register unsigned long  r0 asm("r0") = (uint32_t)a0;\
-register unsigned long  r1 asm("r1");   \
-register unsigned long  r2 asm("r2");   \
-register unsigned long  r3 asm("r3")
-
-#define __declare_arg_1(a0, a1, res)\
-typeof(a1) __a1 = a1;   \
-struct arm_smccc_res*___res = res;  \
-register unsigned long  r0 asm("r0") = (uint32_t)a0;\
-register unsigned long  r1 asm("r1") = __a1;\
-register unsigned long  r2 asm("r2");   \
-register unsigned long  r3 asm("r3")
-
-#define __declare_arg_2(a0, a1, a2, res)\
-typeof(a1) __a1 = a1;   \
-typeof(a2) __a2 = a2;   \
-struct arm_smccc_res*___res = res; \
-register unsigned long  r0 asm("r0") = (uint32_t)a0;\
-register unsigned long  r1 asm("r1") = __a1;\
-register unsigned long  r2 asm("r2") = __a2;\
-register unsigned long  r3 asm("r3")
-
-#define __declare_arg_3(a0, a1, a2, a3, res)\
-typeof(a1) __a1 = a1;   \
-typeof(a2) __a2 = a2;   \
-typeof(a3) __a3 = a3;   \
-struct arm_smccc_res*___res = res;  \
-register unsigned long  r0 asm("r0") = (uint32_t)a0;\
-register unsigned long  r1 asm("r1") = __a1;\
-register unsigned long  r2 asm("r2") = __a2;\
-register unsigned long  r3 asm("r3") = __a3
+#define __declare_arg_0(a0, res)\
+struct arm_smccc_res*___res = res;  \
+register unsigned long  r0 ASM_REG(0) = (uint32_t)a0;   \
+register unsigned long  r1 ASM_REG(1);  \
+register unsigned long  r2 ASM_REG(2);  \
+register unsigned long  r3 ASM_REG(3)
+
+#define __declare_arg_1(a0, a1, res)\
+typeof(a1) __a1 = a1;   \
+struct arm_smccc_res*___res = res;  \
+register unsigned long  r0 ASM_REG(0) = (uint32_t)a0;   \
+register unsigned long  r1 ASM_REG(1) = __a1;   \
+register unsigned long  r2 ASM_REG(2);  \
+register unsigned long  r3 ASM_REG(3)
+
+#define __declare_arg_2(a0, a1, a2, res)\
+typeof(a1) __a1 = a1;   \
+typeof(a2) __a2 = a2;   \
+struct arm_smccc_res*___res = res; \
+register unsigned long  r0 ASM_REG(0) = (uint32_t)a0;   \
+register unsigned long  r1 ASM_REG(1) = __a1;   \
+register unsigned long  r2 ASM_REG(2) = __a2;   \
+register unsigned long  r3 ASM_REG(3)
+
+#define __declare_arg_3(a0, a1, a2, a3, res)\
+typeof(a1) __a1 = a1;   \
+typeof(a2) __a2 = a2;   \
+typeof(a3) __a3 = a3;   \
+struct arm_smccc_res*___res = res;  \
+register unsigned long  

Re: [Xen-devel] [PATCH] xen/x86: p2m: Don't initialize slot 0 of the P2M

2020-02-17 Thread Julien Grall

Hi George,

On 06/02/2020 12:08, George Dunlap wrote:

On 2/3/20 4:58 PM, Julien Grall wrote:

From: Julien Grall 

It is not entirely clear why slot 0 of each p2m should be populated
with empty page-tables. The commit introducing it, 759af8e3800 ("[HVM]
Fix 64-bit HVM domain creation."), does not contain a meaningful
explanation beyond stating that it was necessary for shadow.

As we don't seem to have a good explanation why this is there, drop the
code completely.

This was tested by successfully booting an HVM guest with shadow enabled.

Signed-off-by: Julien Grall 


Since nobody knows why it's here, and it doesn't look like it should
have any effect:

Acked-by: George Dunlap 


Thank you! I have now committed the patch.

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v5 2/4] arm: rename BIT_WORD to BITOP_WORD

2020-02-17 Thread Julien Grall

Hi Roger,

Thank you for the renaming.

On 17/02/2020 11:45, Roger Pau Monne wrote:

So BIT_WORD can be imported from Linux. The difference between the current
Linux implementation of BIT_WORD and the Xen one is that Linux uses a long
integer as the word unit, while Xen hardcodes it to 32 bits.

Current users of BITOP_WORD on Arm (which considers a word a long
integer) are switched to use the generic BIT_WORD which also operates
on long integers.

No functional change intended.
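
For illustration, the intent is roughly the following pair of definitions
(a sketch of the end state, not the literal hunks):

    /* Arm-specific bitop helper: word unit is 32 bits (BITS_PER_WORD). */
    #define BITOP_WORD(nr)  ((nr) / BITS_PER_WORD)

    /* Generic, Linux-compatible helper: word unit is an unsigned long. */
    #define BIT_WORD(nr)    ((nr) / BITS_PER_LONG)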

Suggested-by: Julien Grall 
Suggested-by: Jan Beulich 
Signed-off-by: Roger Pau Monné 
---
Changes since v4:
  - New in this version.
---
  xen/arch/arm/arm32/lib/bitops.c|  4 ++--
  xen/arch/arm/arm64/lib/bitops.c|  4 ++--
  xen/arch/arm/arm64/lib/find_next_bit.c | 10 --
  xen/include/asm-arm/bitops.h   | 10 +-
  xen/include/xen/bitops.h   |  2 ++
  5 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/arm32/lib/bitops.c b/xen/arch/arm/arm32/lib/bitops.c
index 3dca769bf0..82d935ce33 100644
--- a/xen/arch/arm/arm32/lib/bitops.c
+++ b/xen/arch/arm/arm32/lib/bitops.c
@@ -33,7 +33,7 @@
  static always_inline bool int_##name(int nr, volatile void *p, bool timeout,\
   unsigned int max_try)  \
  {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
  const uint32_t mask = BIT_MASK((unsigned int)nr);   \
  unsigned long res, tmp; \
  \
@@ -71,7 +71,7 @@ bool name##_timeout(int nr, volatile void *p, unsigned int 
max_try) \
  static always_inline bool int_##name(int nr, volatile void *p, int *oldbit, \
   bool timeout, unsigned int max_try)\
  {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
  unsigned int bit = (unsigned int)nr % BITS_PER_WORD;\
  const uint32_t mask = BIT_MASK(bit);\
  unsigned long res, tmp; \
diff --git a/xen/arch/arm/arm64/lib/bitops.c b/xen/arch/arm/arm64/lib/bitops.c
index 27688e5418..f5128c58f5 100644
--- a/xen/arch/arm/arm64/lib/bitops.c
+++ b/xen/arch/arm/arm64/lib/bitops.c
@@ -32,7 +32,7 @@
  static always_inline bool int_##name(int nr, volatile void *p, bool timeout,\
   unsigned int max_try)  \
  {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
  const uint32_t mask = BIT_MASK((unsigned int)nr);   \
  unsigned long res, tmp; \
  \
@@ -67,7 +67,7 @@ bool name##_timeout(int nr, volatile void *p, unsigned int 
max_try) \
  static always_inline bool int_##name(int nr, volatile void *p, int *oldbit, \
   bool timeout, unsigned int max_try)\
  {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
  unsigned int bit = (unsigned int)nr % BITS_PER_WORD;\
  const uint32_t mask = BIT_MASK(bit);\
  unsigned long res, tmp; \
diff --git a/xen/arch/arm/arm64/lib/find_next_bit.c 
b/xen/arch/arm/arm64/lib/find_next_bit.c
index 17cb176266..8ebf8bfe97 100644
--- a/xen/arch/arm/arm64/lib/find_next_bit.c
+++ b/xen/arch/arm/arm64/lib/find_next_bit.c
@@ -12,8 +12,6 @@
  #include 
  #include 
  
-#define BITOP_WORD(nr)		((nr) / BITS_PER_LONG)

-
  #ifndef find_next_bit
  /*
   * Find the next set bit in a memory region.
@@ -21,7 +19,7 @@
  unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
unsigned long offset)
  {
-   const unsigned long *p = addr + BITOP_WORD(offset);
+   const unsigned long *p = addr + BIT_WORD(offset);
unsigned long result = offset & ~(BITS_PER_LONG-1);
unsigned long tmp;
  
@@ -67,7 +65,7 @@ EXPORT_SYMBOL(find_next_bit);

  unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long 
size,
   

Re: [Xen-devel] [BUG] panic: "IO-APIC + timer doesn't work" - several people have reproduced

2020-02-17 Thread Jason Andryuk
On Mon, Feb 17, 2020 at 2:46 PM Andrew Cooper  wrote:
>
> On 17/02/2020 19:19, Jason Andryuk wrote:
> > enabling vecOn Tue, Dec 31, 2019 at 5:43 AM Aaron Janse  
> > wrote:
> >> On Tue, Dec 31, 2019, at 12:27 AM, Andrew Cooper wrote:
> >>> Is there any full boot log in the bad case?  Debugging via divination
> >>> isn't an effective way to get things done.
> >> Agreed. I included some more verbose logs towards the end of the email 
> >> (typed up by hand).
> >>
> >> Attached are pictures from a slow-motion video of my laptop booting. Note 
> >> that I also included a picture of a stack trace that happens immediately 
> >> before reboot. It doesn't look related, but I wanted to include it anyway.
> >>
> >> I think the original email should have said "4.8.5" instead of "4.0.5." 
> >> Regardless, everyone on this mailing list can now see all the boot logs 
> >> that I've seen.
> >>
> >> Attaching a serial console seems like it would be difficult to do on this 
> >> laptop, otherwise I would have sent the logs as a txt file.
> > I'm seeing Xen panic: "IO-APIC + timer doesn't work" on a Dell
> > Latitude 7200 2-in-1.  Fedora 31 Live USB image boots successfully.
> > No way to get serial output.  I manually recreated the output before
> > from the vga display.
>
> We have multiple bugs.
>
> First and foremost, Xen seems totally broken when running in ExtINT
> mode.  This needs addressing, and ought to be sufficient to let Xen
> boot, at which point we can try to figure out why it is trying to fall
> back into 486(ish) compatibility mode.
>
> > I tested Linux with intel_iommu=on and that booted successfully.
> > Under Xen, this system sets iommu_x2apic_enabled = true, so
> > force_iommu is set and iommu=0 cannot disable the iommu.
> > fails.  Oh, I can disable x2apic and then disable iommu
> >
> > x2apic=1 -> failure above
> > x2apic=0 iommu=0 -> failure above
> > clocksource=acpi -> doesn't help
> > clocksource=pit -> hangs after "load tracking window length 1073741824 ns"
>
> None of these are surprising, given that Xen can't make any interrupts
> work at all.
>
> > noapic -> BUG in init_bsp_APIC
>
> This is a surprise.  Its clearly a bug in Xen.  (OTOH, I've been
> threatening to rip all of that logic out, because there is no such thing
> as a 64bit capable system without an integrated APIC.)

It's a GPF [error_code=] at init_bsp_APIC+0x53 which is
   0x82d080428f86 <+64>:je 0x82d080428fc9 
   0x82d080428f88 <+66>:or $0xff,%al
   0x82d080428f8a <+68>:test   %sil,%sil
   0x82d080428f8d <+71>:je 0x82d080428fd8 
   0x82d080428f8f <+73>:mov$0x80f,%ecx
   0x82d080428f94 <+78>:mov$0x0,%edx
   0x82d080428f99 <+83>:wrmsr

RAX is 0x3ff

This is immediately after Xen prints "Switched to APIC driver x2apic_cluster"

> > One other thing that might be noteworthy.  Linux only prints ACPI IRQ0
> > and IRQ9 used by override where Xen lists IRQ 0, 2 & 9.
>
> Huh - this is supposed to come directly from the ACPI tables, so Linux
> and Xen should be using the same source of information.
>
> >
> > Below is the re-constructed Xen console output.  The SMBIOS line is
> > the first thing displayed on the VGA output.
>
> Yes - it is the first thing printed after vesa_init() which I think is a
> manifestation of a previous EFI bug I've reported.  Does booting with
> -basevideo help?  (No need to transcribe the output, manually.  Just
> need to know if it lets you see the full log.)

I'm booting grub->xen.gz so -basevideo isn't directly applicable.  My
attempt at setting a boot entry failed, so I'll have to try that
again.

> >   I skipped the full EFI
> > memory map dump since it is quite long.
> >
> > I've also attached the Linux dmesg output.  Any pointers or
> > suggestions are most welcome.
>
> Lets start with getting Xen able to limp along to a full boot.  After
> that, we can figure out how to stop it making silly decisions during boot.
>
> ~Andrew

Thanks for taking a look.

-Jason

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3 3/3] x86/hyperv: L0 assisted TLB flush

2020-02-17 Thread Michael Kelley
From: Wei Liu  On Behalf Of Wei Liu

[snip]

> diff --git a/xen/arch/x86/guest/hyperv/util.c 
> b/xen/arch/x86/guest/hyperv/util.c
> new file mode 100644
> index 00..0abb37b05f
> --- /dev/null
> +++ b/xen/arch/x86/guest/hyperv/util.c
> @@ -0,0 +1,74 @@
> +/**
> 
> + * arch/x86/guest/hyperv/util.c
> + *
> + * Hyper-V utility functions
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see https://www.gnu.org/licenses/.
> + *
> + * Copyright (c) 2020 Microsoft.
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +
> +#include "private.h"
> +
> +int cpumask_to_vpset(struct hv_vpset *vpset,
> + const cpumask_t *mask)
> +{
> +int nr = 1;
> +unsigned int cpu, vcpu_bank, vcpu_offset;
> +unsigned int max_banks = ms_hyperv.max_vp_index / 64;
> +
> +/* Up to 64 banks can be represented by valid_bank_mask */
> +if ( max_banks > 64 )
> +return -E2BIG;
> +
> +/* Clear all banks to avoid flushing unwanted CPUs */
> +for ( vcpu_bank = 0; vcpu_bank < max_banks; vcpu_bank++ )
> +vpset->bank_contents[vcpu_bank] = 0;
> +
> +vpset->valid_bank_mask = 0;
> +vpset->format = HV_GENERIC_SET_SPARSE_4K;
> +
> +for_each_cpu ( cpu, mask )
> +{
> +unsigned int vcpu = hv_vp_index(cpu);
> +
> +vcpu_bank = vcpu / 64;
> +vcpu_offset = vcpu % 64;
> +
> +__set_bit(vcpu_offset, &vpset->bank_contents[vcpu_bank]);
> +__set_bit(vcpu_bank, &vpset->valid_bank_mask);

This approach to setting the bits in the valid_bank_mask causes a bug.
If an entire 64-bit word in the bank_contents array is zero because there
are no CPUs in that range, the corresponding bit in valid_bank_mask still
must be set to tell Hyper-V that the 64-bit word is present in the array
and should be processed, even though the content is zero.  A zero bit
in valid_bank_mask indicates that the corresponding 64-bit word in the
array is not present, and every 64-bit word above it has been shifted down.
That's why the similar Linux function sets valid_bank_mask the way that
it does.
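
A sketch of the suggested adjustment (illustrative only, reusing the names
from the quoted patch and mirroring the Linux cpumask_to_vpset() logic):

    for_each_cpu ( cpu, mask )
    {
        unsigned int vcpu = hv_vp_index(cpu);

        __set_bit(vcpu % 64, &vpset->bank_contents[vcpu / 64]);
        if ( vcpu / 64 >= nr )
            nr = vcpu / 64 + 1;
    }

    /* Mark every bank up to the highest one used as present, even if a
     * bank's contents happen to be zero. */
    vpset->valid_bank_mask = (nr < 64) ? (1ULL << nr) - 1 : ~0ULL;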

Michael

> +
> +if ( vcpu_bank >= nr )
> +nr = vcpu_bank + 1;
> +}
> +
> +return nr;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> --
> 2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] CPU Lockup bug with the credit2 scheduler

2020-02-17 Thread Sarah Newman

On 1/7/20 6:25 AM, Alastair Browne wrote:


CONCLUSION

So in conclusion, the tests indicate that credit2 might be unstable.

For the time being, we are using credit as the chosen scheduler. We
are booting the kernel with a parameter "sched=credit" to ensure that
the correct scheduler is used.

After the tests, we decided to stick with 4.9.0.9 kernel and 4.12 Xen
for production use running credit1 as the default scheduler.


One person CC'ed appears to be having the same experience, where the credit2 scheduler leads to lockups (in this case in the domU, not the dom0) under 
relatively heavy load. It seems possible they may have the same root cause.


I don't think there are, but have there been any patches since the 4.13.0 release which might have fixed problems with the credit2 scheduler? If not, 
what would the next step be for isolating the problem - a debug build of Xen or something else?


If there are no merged or proposed fixes soon, it may be worth considering making the credit scheduler the default again until problems with the 
credit2 scheduler are resolved.


Thanks, Sarah

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [linux-5.4 test] 147129: regressions - FAIL

2020-02-17 Thread osstest service owner
flight 147129 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147129/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-shadow23 leak-check/check fail REGR. vs. 146121
 test-amd64-amd64-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 
146121
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 146121
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 
146121

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail REGR. vs. 146121
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 146121

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass

version targeted for testing:
 linux

Re: [Xen-devel] [BUG] panic: "IO-APIC + timer doesn't work" - several people have reproduced

2020-02-17 Thread Andrew Cooper
On 17/02/2020 19:19, Jason Andryuk wrote:
> enabling vecOn Tue, Dec 31, 2019 at 5:43 AM Aaron Janse  
> wrote:
>> On Tue, Dec 31, 2019, at 12:27 AM, Andrew Cooper wrote:
>>> Is there any full boot log in the bad case?  Debugging via divination
>>> isn't an effective way to get things done.
>> Agreed. I included some more verbose logs towards the end of the email 
>> (typed up by hand).
>>
>> Attached are pictures from a slow-motion video of my laptop booting. Note 
>> that I also included a picture of a stack trace that happens immediately 
>> before reboot. It doesn't look related, but I wanted to include it anyway.
>>
>> I think the original email should have said "4.8.5" instead of "4.0.5." 
>> Regardless, everyone on this mailing list can now see all the boot logs that 
>> I've seen.
>>
>> Attaching a serial console seems like it would be difficult to do on this 
>> laptop, otherwise I would have sent the logs as a txt file.
> I'm seeing Xen panic: "IO-APIC + timer doesn't work" on a Dell
> Latitude 7200 2-in-1.  Fedora 31 Live USB image boots successfully.
> No way to get serial output.  I manually recreated the output before
> from the vga display.

We have multiple bugs.

First and foremost, Xen seems totally broken when running in ExtINT
mode.  This needs addressing, and ought to be sufficient to let Xen
boot, at which point we can try to figure out why it is trying to fall
back into 486(ish) compatibility mode.

> I tested Linux with intel_iommu=on and that booted successfully.
> Under Xen, this system sets iommu_x2apic_enabled = true, so
> force_iommu is set and iommu=0 cannot disable the iommu.
> fails.  Oh, I can disable x2apic and then disable iommu
>
> x2apic=1 -> failure above
> x2apic=0 iommu=0 -> failure above
> clocksource=acpi -> doesn't help
> clocksource=pit -> hangs after "load tracking window length 1073741824 ns"

None of these are surprising, given that Xen can't make any interrupts
work at all.

> noapic -> BUG in init_bsp_APIC

This is a surprise.  Its clearly a bug in Xen.  (OTOH, I've been
threatening to rip all of that logic out, because there is no such thing
as a 64bit capable system without an integrated APIC.)

> One other thing that might be noteworthy.  Linux only prints ACPI IRQ0
> and IRQ9 used by override where Xen lists IRQ 0, 2 & 9.

Huh - this is supposed to come directly from the ACPI tables, so Linux
and Xen should be using the same source of information.

>
> Below is the re-constructed Xen console output.  The SMBIOS line is
> the first thing displayed on the VGA output.

Yes - it is the first thing printed after vesa_init() which I think is a
manifestation of a previous EFI bug I've reported.  Does booting with
-basevideo help?  (No need to transcribe the output, manually.  Just
need to know if it lets you see the full log.)

>   I skipped the full EFI
> memory map dump since it is quite long.
>
> I've also attached the Linux dmesg output.  Any pointers or
> suggestions are most welcome.

Lets start with getting Xen able to limp along to a full boot.  After
that, we can figure out how to stop it making silly decisions during boot.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 3/6] x86: track when in #MC context

2020-02-17 Thread Julien Grall

Hi Roger,

On 17/02/2020 18:43, Roger Pau Monne wrote:

Add helpers to track when executing in #MC context. This is modeled
after the in_irq helpers.

Note that there are no users of in_mc() introduced by the change,
further users will be added by followup changes.

Signed-off-by: Roger Pau Monné 
---
  xen/arch/x86/cpu/mcheck/mce.c | 2 ++
  xen/include/asm-x86/hardirq.h | 5 +
  xen/include/xen/irq_cpustat.h | 1 +
  3 files changed, 8 insertions(+)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index d61e582af3..93ed5752ac 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -93,7 +93,9 @@ void x86_mce_vector_register(x86_mce_vector_t hdlr)
  
  void do_machine_check(const struct cpu_user_regs *regs)

  {
+mc_enter();
  _machine_check_vector(regs);
+mc_exit();
  }
  
  /*

diff --git a/xen/include/asm-x86/hardirq.h b/xen/include/asm-x86/hardirq.h
index 34e1b49260..af3eab6a4d 100644
--- a/xen/include/asm-x86/hardirq.h
+++ b/xen/include/asm-x86/hardirq.h
@@ -8,6 +8,7 @@ typedef struct {
unsigned int __softirq_pending;
unsigned int __local_irq_count;
unsigned int __nmi_count;
+   unsigned int mc_count;
bool_t __mwait_wakeup;
  } __cacheline_aligned irq_cpustat_t;
  
@@ -18,6 +19,10 @@ typedef struct {

  #define irq_enter()   (local_irq_count(smp_processor_id())++)
  #define irq_exit()(local_irq_count(smp_processor_id())--)
  
+#define in_mc() 	(mc_count(smp_processor_id()) != 0)

+#define mc_enter() (mc_count(smp_processor_id())++)
+#define mc_exit()  (mc_count(smp_processor_id())--)
+
  void ack_bad_irq(unsigned int irq);
  
  extern void apic_intr_init(void);

diff --git a/xen/include/xen/irq_cpustat.h b/xen/include/xen/irq_cpustat.h
index 73629f6ec8..12b932fc39 100644
--- a/xen/include/xen/irq_cpustat.h
+++ b/xen/include/xen/irq_cpustat.h
@@ -26,5 +26,6 @@ extern irq_cpustat_t irq_stat[];
  #define local_irq_count(cpu)  __IRQ_STAT((cpu), __local_irq_count)
  #define nmi_count(cpu)__IRQ_STAT((cpu), __nmi_count)
  #define mwait_wakeup(cpu) __IRQ_STAT((cpu), __mwait_wakeup)
+#define mc_count(cpu)  __IRQ_STAT((cpu), mc_count)


The header is only meant to contain arch-independent IRQ stats (see 
comment a few lines above). This is unlikely to be used on Arm, so can 
you move this into an x86 specific header?


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [BUG] panic: "IO-APIC + timer doesn't work" - several people have reproduced

2020-02-17 Thread Jason Andryuk
enabling vecOn Tue, Dec 31, 2019 at 5:43 AM Aaron Janse  wrote:
>
> On Tue, Dec 31, 2019, at 12:27 AM, Andrew Cooper wrote:
> > Is there any full boot log in the bad case?  Debugging via divination
> > isn't an effective way to get things done.
>
> Agreed. I included some more verbose logs towards the end of the email (typed 
> up by hand).
>
> Attached are pictures from a slow-motion video of my laptop booting. Note 
> that I also included a picture of a stack trace that happens immediately 
> before reboot. It doesn't look related, but I wanted to include it anyway.
>
> I think the original email should have said "4.8.5" instead of "4.0.5." 
> Regardless, everyone on this mailing list can now see all the boot logs that 
> I've seen.
>
> Attaching a serial console seems like it would be difficult to do on this 
> laptop, otherwise I would have sent the logs as a txt file.

I'm seeing Xen panic: "IO-APIC + timer doesn't work" on a Dell
Latitude 7200 2-in-1.  Fedora 31 Live USB image boots successfully.
No way to get serial output.  I manually recreated the output before
from the vga display.

Comparing the Linux and Xen, Xen does:
(XEN) I/O Virtualisation enabled
(XEN)  - Dom0 mode: Relaxed
(XEN) Interrupt remapping enabled
(XEN) nr_sockets: 1
(XEN) Getting VERSION: 1060015
(XEN) Getting VERSION: 1060015
(XEN) Enabled directed EOI with ioapic_ack_old on!
(XEN) Getting ID: 0
(XEN) Getting LVT0: 700
(XEN) Getting LVT1: 400
(XEN) Suppress EOI broadcast on CPU#0
(XEN) enabled ExtINT on CPU#0
(XEN) ESR value before enabling vector: 0x40 after: 0
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using old ACK method
(XEN) init IO_APIC IRQs
(XEN)  IO-APIC (apicid-pin) 2-0, 2-16, 2-17, .. 2-119 not connected.
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) ..MP-BIOS bug: 8254 timer not connected to IO-APIC
(XEN) ...trying to set up timer (IRQ0) through 8259A ... failed
(XEN) ...trying to set up timer as Virtual Wire IRQ... failed.
(XEN) ...trying to set up timer as ExtINT IRQ...spurious 8259A interrupt: IRQ7.
(XEN) CPU0: no irq handler for vector e7 (IRQ -8)
(XEN) IRQ7 a=0001[0001,] v=60[] t=IO-APIC-edge s=0002
(XEN)  failed :(.

while linux apic=debug does:
kernel: ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
kernel: clocksource: tsc-early: mask: 0x max_cycles:
0x1e44fb6c2ab, max_idle_ns: 440795206594 ns
kernel: Calibrating delay loop (skipped), value calculated using timer
frequency.. 4199.88 BogoMIPS (lpj=2099944)
.. and continues onward

Since linux doesn't print "...trying to set up timer (IRQ0) through
the 8259A ..." that seems to indicate Linux is seeing the timer
interrupt properly.
https://elixir.bootlin.com/linux/v5.3.7/source/arch/x86/kernel/apic/io_apic.c#L2198

I tested Linux with intel_iommu=on and that booted successfully.
Under Xen, this system sets iommu_x2apic_enabled = true, so
force_iommu is set and iommu=0 cannot disable the iommu.
fails.  Oh, I can disable x2apic and then disable iommu

x2apic=1 -> failure above
x2apic=0 iommu=0 -> failure above
clocksource=acpi -> doesn't help
clocksource=pit -> hangs after "load tracking window length 1073741824 ns"
noapic -> BUG in init_bsp_APIC

One other thing that might be noteworthy.  Linux only prints ACPI IRQ0
and IRQ9 used by override where Xen lists IRQ 0, 2 & 9.

Below is the re-constructed Xen console output.  The SMBIOS line is
the first thing displayed on the VGA output.  I skipped the full EFI
memory map dump since it is quite long.

I've also attached the Linux dmesg output.  Any pointers or
suggestions are most welcome.

Thanks,
Jason

(XEN) SMBIOS 3.2 present.
(XEN) APIC boot state is `xapic`
(XEN) Using APIC driver default
(XEN) XSM Framework v1.0.0 initialized
(XEN) Flask: 128 avtab hash slots, 283 rules.
(XEN) Flask: 128 avtab hash slots, 283 rules.
(XEN) Flask:  4 users, 3 roles, 38 types, 2 bools
(XEN) Flask:  13 classes, 283 rules
(XEN) Flask:  Starting in enforcing mode.
(XEN) ACPI: PM-Timer IO Port: 0x1808 (32 bits)
(XEN) ACPI: v5 SLEEP INFO: control[1:1804], status[1:1800]
(XEN) ACPI: Invalid sleep control/status register data: 0:0x8:0x3 0:0x8:0x3
(XEN) ACPI: SLEEP INFO: pm1x_cnt[1:1804,1:0], pm1x_evt[1:1800,1:0]
(XEN) ACPI: 32/64X FACS address mismatch in FADT -
38c80c00/, using 32
(XEN) ACPI: wakeup_vec[38c80c0c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee0
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x01] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x03] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x05] enabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x07] enabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
(XEN) ACPI: LAPIC_NMI 

Re: [Xen-devel] [PATCH 3/3] AMD/IOMMU: replace a few literal numbers

2020-02-17 Thread Andrew Cooper
On 17/02/2020 13:09, Jan Beulich wrote:
> On 10.02.2020 15:28, Andrew Cooper wrote:
>> On 05/02/2020 09:43, Jan Beulich wrote:
>>> Introduce IOMMU_PDE_NEXT_LEVEL_{MIN,MAX} to replace literal 1, 6, and 7
>>> instances. While doing so replace two uses of memset() by initializers.
>>>
>>> Signed-off-by: Jan Beulich 
>> This does not look to be an improvement.  IOMMU_PDE_NEXT_LEVEL_MIN is
>> definitely bogus, and in all cases, a literal 1 is better, because that
>> is how we describe pagetable levels.
> I disagree.

A pagetable walking function which does:

while ( level > 1 )
{
    ...
    level--;
}

is far clearer and easier to follow than hiding 1 behind a constant
which isn't obviously 1.    Something like LEVEL_4K would at least be
something that makes sense in context, but a literal 1 is less verbose.

>  The device table entry's mode field is bounded by 1
> (min) and 6 (max) for the legitimate values to put there.

If by 1, you mean 0, then yes.  Coping properly with a mode of 0 looks
to be easier than putting in an arbitrary restriction.

OTOH, you intended to restrict to just values we expect to find in a Xen
setup, then the answers are 3 and 4 only.  (The "correctness" of this
function depends on only running on Xen-written tables.  It doesn't
actually read the next-level field out of the PTE, and assumes that it
is a standard pagetable hierarchy.  Things will go wrong if it
encounters a superpage, or a next-level-7 entry.)

>
>> Something to replace literal 6/7 probably is ok, but doesn't want to be
>> done like this.
>>
>> The majority of the problems here are caused by iommu_pde_from_dfn()'s
>> silly ABI.  The pt_mfn[] array is problematic (because it is used as a
>> 1-based array, not 0-based) and useless because both callers only want
>> the 4k-equivalent mfn.  Fixing the ABI gets rid of quite a lot of wasted
>> stack space, every use of '1', and every upper bound other than the bug
>> on and amd_iommu_get_paging_mode().
> I didn't mean to alter that function's behavior, at the very least
> not until being certain there wasn't a reason it was coded with this
> array approach. IOW the alternative to going with this patch
> (subject to corrections of course) is for me to drop it altogether,
> keeping the hard-coded numbers in place. Just let me know.

If you don't want to change the API, then I'll put it on my todo list.

As previously expressed, this patch on its own is not an improvement IMO.

>>> ---
>>> TBD: We should really honor the hats field of union
>>>  amd_iommu_ext_features, but the specification (or at least the
>>>  parts I did look at in the course of putting together this patch)
>>>  is unclear about the maximum valid value in case EFRSup is clear.
>> It is available from PCI config space (Misc0 register, cap+0x10) even on
>> first gen IOMMUs,
> I don't think any of the address size fields there matches what
> HATS is about (limiting of the values valid to put in a DTE's
> mode field). In fact I'm having some difficulty bringing the
> two in (sensible) sync.

It will confirm whether 4-levels is available or not, but TBH, we know
that anyway by virtue of being 64bit.

Higher levels really don't matter because we don't support using them. 
Were we to support using them (and I do have one usecase in mind), it
would be entirely reasonable to restrict usage to systems which had EFR.

>
>> and the IVRS table in Type 10.
> Which may in turn be absent, i.e. the question of what to use as
> a default merely gets shifted.

One of Type 10 or 11 is mandatory for each IOMMU in the system.  One way
or another, the information is present.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 3/6] x86: track when in #MC context

2020-02-17 Thread Roger Pau Monne
Add helpers to track when executing in #MC context. This is modeled
after the in_irq helpers.

Note that there are no users of in_mc() introduced by the change,
further users will be added by followup changes.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/cpu/mcheck/mce.c | 2 ++
 xen/include/asm-x86/hardirq.h | 5 +
 xen/include/xen/irq_cpustat.h | 1 +
 3 files changed, 8 insertions(+)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index d61e582af3..93ed5752ac 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -93,7 +93,9 @@ void x86_mce_vector_register(x86_mce_vector_t hdlr)
 
 void do_machine_check(const struct cpu_user_regs *regs)
 {
+mc_enter();
 _machine_check_vector(regs);
+mc_exit();
 }
 
 /*
diff --git a/xen/include/asm-x86/hardirq.h b/xen/include/asm-x86/hardirq.h
index 34e1b49260..af3eab6a4d 100644
--- a/xen/include/asm-x86/hardirq.h
+++ b/xen/include/asm-x86/hardirq.h
@@ -8,6 +8,7 @@ typedef struct {
unsigned int __softirq_pending;
unsigned int __local_irq_count;
unsigned int __nmi_count;
+   unsigned int mc_count;
bool_t __mwait_wakeup;
 } __cacheline_aligned irq_cpustat_t;
 
@@ -18,6 +19,10 @@ typedef struct {
 #define irq_enter()(local_irq_count(smp_processor_id())++)
 #define irq_exit() (local_irq_count(smp_processor_id())--)
 
+#define in_mc()(mc_count(smp_processor_id()) != 0)
+#define mc_enter() (mc_count(smp_processor_id())++)
+#define mc_exit()  (mc_count(smp_processor_id())--)
+
 void ack_bad_irq(unsigned int irq);
 
 extern void apic_intr_init(void);
diff --git a/xen/include/xen/irq_cpustat.h b/xen/include/xen/irq_cpustat.h
index 73629f6ec8..12b932fc39 100644
--- a/xen/include/xen/irq_cpustat.h
+++ b/xen/include/xen/irq_cpustat.h
@@ -26,5 +26,6 @@ extern irq_cpustat_t irq_stat[];
 #define local_irq_count(cpu)   __IRQ_STAT((cpu), __local_irq_count)
 #define nmi_count(cpu) __IRQ_STAT((cpu), __nmi_count)
 #define mwait_wakeup(cpu)  __IRQ_STAT((cpu), __mwait_wakeup)
+#define mc_count(cpu)  __IRQ_STAT((cpu), mc_count)
 
 #endif /* __irq_cpustat_h */
-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 4/6] x86: track when in #NMI context

2020-02-17 Thread Roger Pau Monne
Add helpers to track when running in #NMI context. This is modeled
after the in_irq helpers, but does not support reentry.

Note that there are no users of in_nmi() introduced by the change;
further users will be added by follow-up changes.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/traps.c  |  6 ++
 xen/include/asm-x86/hardirq.h | 18 +-
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 3dbc66bb64..f4f2c13ae9 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1692,9 +1692,13 @@ void do_nmi(const struct cpu_user_regs *regs)
 bool handle_unknown = false;
 
 this_cpu(nmi_count)++;
+nmi_enter();
 
 if ( nmi_callback(regs, cpu) )
+{
+nmi_exit();
 return;
+}
 
 /*
  * Accessing port 0x61 may trap to SMM which has been actually
@@ -1720,6 +1724,8 @@ void do_nmi(const struct cpu_user_regs *regs)
 if ( !(reason & 0xc0) && handle_unknown )
 unknown_nmi_error(regs, reason);
 }
+
+nmi_exit();
 }
 
 nmi_callback_t *set_nmi_callback(nmi_callback_t *callback)
diff --git a/xen/include/asm-x86/hardirq.h b/xen/include/asm-x86/hardirq.h
index af3eab6a4d..8bcae99eac 100644
--- a/xen/include/asm-x86/hardirq.h
+++ b/xen/include/asm-x86/hardirq.h
@@ -2,12 +2,14 @@
 #define __ASM_HARDIRQ_H
 
 #include 
+#include 
+#include 
 #include 
 
 typedef struct {
unsigned int __softirq_pending;
unsigned int __local_irq_count;
-   unsigned int __nmi_count;
+   bool in_nmi;
unsigned int mc_count;
bool_t __mwait_wakeup;
 } __cacheline_aligned irq_cpustat_t;
@@ -23,6 +25,20 @@ typedef struct {
 #define mc_enter() (mc_count(smp_processor_id())++)
 #define mc_exit()  (mc_count(smp_processor_id())--)
 
+#define in_nmi()   __IRQ_STAT(smp_processor_id(), in_nmi)
+
+static inline void nmi_enter(void)
+{
+ASSERT(!in_nmi());
+in_nmi() = true;
+}
+
+static inline void nmi_exit(void)
+{
+ASSERT(in_nmi());
+in_nmi() = false;
+}
+
 void ack_bad_irq(unsigned int irq);
 
 extern void apic_intr_init(void);
-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 5/6] x86/smp: use a dedicated scratch cpumask in send_IPI_mask

2020-02-17 Thread Roger Pau Monne
Using scratch_cpumask in send_IPI_mask is not safe because it can be
called from interrupt context, and hence Xen would have to make sure
all the users of the scratch cpumask disable interrupts while using
it.

Instead introduce a new cpumask to be used by send_IPI_mask, and
disable interrupts while using it.

Fixes: 5500d265a2a8 ('x86/smp: use APIC ALLBUT destination shorthand when 
possible')
Reported-by: Sander Eikelenboom 
Signed-off-by: Roger Pau Monné 
---
Changes since v1:
 - Don't use the shorthand when in #MC or #NMI context.
---
 xen/arch/x86/smp.c | 26 +-
 xen/arch/x86/smpboot.c |  9 -
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index c7caf5bc26..0a9a9e7f02 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -59,6 +59,7 @@ static void send_IPI_shortcut(unsigned int shortcut, int 
vector,
 apic_write(APIC_ICR, cfg);
 }
 
+DECLARE_PER_CPU(cpumask_var_t, send_ipi_cpumask);
 /*
  * send_IPI_mask(cpumask, vector): sends @vector IPI to CPUs in @cpumask,
  * excluding the local CPU. @cpumask may be empty.
@@ -67,7 +68,20 @@ static void send_IPI_shortcut(unsigned int shortcut, int 
vector,
 void send_IPI_mask(const cpumask_t *mask, int vector)
 {
 bool cpus_locked = false;
-cpumask_t *scratch = this_cpu(scratch_cpumask);
+cpumask_t *scratch = this_cpu(send_ipi_cpumask);
+unsigned long flags;
+
+if ( in_mc() || in_nmi() )
+{
+/*
+ * When in #MC or #NMI context Xen cannot use the per-CPU scratch mask
+ * because we have no way to avoid reentry, so do not use the APIC
+ * shorthand.
+ */
+alternative_vcall(genapic.send_IPI_mask, mask, vector);
+return;
+}
+
 
 /*
  * This can only be safely used when no CPU hotplug or unplug operations
@@ -81,7 +95,15 @@ void send_IPI_mask(const cpumask_t *mask, int vector)
  local_irq_is_enabled() && (cpus_locked = get_cpu_maps()) &&
  (park_offline_cpus ||
   cpumask_equal(&cpu_online_map, &cpu_present_map)) )
+{
+/*
+ * send_IPI_mask can be called from interrupt context, and hence we
+ * need to disable interrupts in order to protect the per-cpu
+ * send_ipi_cpumask while being used.
+ */
+local_irq_save(flags);
 cpumask_or(scratch, mask, cpumask_of(smp_processor_id()));
+}
 else
 {
 if ( cpus_locked )
@@ -89,6 +111,7 @@ void send_IPI_mask(const cpumask_t *mask, int vector)
 put_cpu_maps();
 cpus_locked = false;
 }
+local_irq_save(flags);
 cpumask_clear(scratch);
 }
 
@@ -97,6 +120,7 @@ void send_IPI_mask(const cpumask_t *mask, int vector)
 else
 alternative_vcall(genapic.send_IPI_mask, mask, vector);
 
+local_irq_restore(flags);
 if ( cpus_locked )
 put_cpu_maps();
 }
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index e83e4564a4..82e89201b3 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -57,6 +57,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
 static cpumask_t scratch_cpu0mask;
 
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, send_ipi_cpumask);
+static cpumask_t send_ipi_cpu0mask;
+
 cpumask_t cpu_online_map __read_mostly;
 EXPORT_SYMBOL(cpu_online_map);
 
@@ -930,6 +933,8 @@ static void cpu_smpboot_free(unsigned int cpu, bool remove)
 FREE_CPUMASK_VAR(per_cpu(cpu_core_mask, cpu));
 if ( per_cpu(scratch_cpumask, cpu) != &scratch_cpu0mask )
 FREE_CPUMASK_VAR(per_cpu(scratch_cpumask, cpu));
+if ( per_cpu(send_ipi_cpumask, cpu) != &send_ipi_cpu0mask )
+FREE_CPUMASK_VAR(per_cpu(send_ipi_cpumask, cpu));
 }
 
 cleanup_cpu_root_pgt(cpu);
@@ -1034,7 +1039,8 @@ static int cpu_smpboot_alloc(unsigned int cpu)
 
 if ( !(cond_zalloc_cpumask_var(&per_cpu(cpu_sibling_mask, cpu)) &&
        cond_zalloc_cpumask_var(&per_cpu(cpu_core_mask, cpu)) &&
-       cond_alloc_cpumask_var(&per_cpu(scratch_cpumask, cpu))) )
+       cond_alloc_cpumask_var(&per_cpu(scratch_cpumask, cpu)) &&
+       cond_alloc_cpumask_var(&per_cpu(send_ipi_cpumask, cpu))) )
 goto out;
 
 rc = 0;
@@ -1175,6 +1181,7 @@ void __init smp_prepare_boot_cpu(void)
 cpumask_set_cpu(cpu, _present_map);
 #if NR_CPUS > 2 * BITS_PER_LONG
 per_cpu(scratch_cpumask, cpu) = &scratch_cpu0mask;
+per_cpu(send_ipi_cpumask, cpu) = &send_ipi_cpu0mask;
 #endif
 
 get_cpu_info()->use_pv_cr3 = false;
-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 0/6] x86: fixes/improvements for scratch cpumask

2020-02-17 Thread Roger Pau Monne
Hello,

Commit:

5500d265a2a8fa63d60c08beb549de8ec82ff7a5
x86/smp: use APIC ALLBUT destination shorthand when possible

Introduced a bogus usage of the scratch cpumask: it was used in a
function that could be called from interrupt context, and hence using
the scratch cpumask there is not safe. Patch #5 is a fix for that usage,
together with also preventing the usage of any per-CPU variables when
send_IPI_mask is called from #MC or #NMI context. Previous patches are
preparatory changes.

Patch #6 adds some debug infrastructure to make sure the scratch cpumask
is used in the right context, and hence should prevent further misuses.
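
To illustrate the kind of misuse being addressed (hypothetical interleaving,
not an actual Xen call path; some_mask and some_fn are made-up placeholders):

    cpumask_t *mask = this_cpu(scratch_cpumask);

    cpumask_and(mask, &cpu_online_map, some_mask);
    /* <- interrupt arrives here; if the handler ends up in send_IPI_mask(),
     *    which (before this series) also wrote this_cpu(scratch_cpumask),
     *    the contents of 'mask' are silently clobbered ... */
    on_selected_cpus(mask, some_fn, NULL, 0);  /* ... and may target the wrong CPUs */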

Thanks, Roger.

Roger Pau Monne (6):
  x86/smp: unify header includes in smp.h
  x86: introduce a nmi_count tracking variable
  x86: track when in #MC context
  x86: track when in #NMI context
  x86/smp: use a dedicated scratch cpumask in send_IPI_mask
  x86: add accessors for scratch cpu mask

 xen/arch/x86/cpu/mcheck/mce.c |  2 ++
 xen/arch/x86/io_apic.c|  6 +++--
 xen/arch/x86/irq.c| 13 ++---
 xen/arch/x86/mm.c | 30 ++---
 xen/arch/x86/msi.c|  4 ++-
 xen/arch/x86/nmi.c| 11 
 xen/arch/x86/smp.c| 51 ++-
 xen/arch/x86/smpboot.c| 10 +--
 xen/arch/x86/traps.c  | 10 ++-
 xen/include/asm-x86/hardirq.h | 23 +++-
 xen/include/asm-x86/nmi.h |  2 ++
 xen/include/asm-x86/smp.h | 15 ---
 xen/include/xen/irq_cpustat.h |  1 +
 13 files changed, 148 insertions(+), 30 deletions(-)

-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 1/6] x86/smp: unify header includes in smp.h

2020-02-17 Thread Roger Pau Monne
Unify the two adjacent header includes that are both gated with ifndef
__ASSEMBLY__.

No functional change intended.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Wei Liu 
Acked-by: Jan Beulich 
---
 xen/include/asm-x86/smp.h | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/include/asm-x86/smp.h b/xen/include/asm-x86/smp.h
index 1aa55d41e1..92d69a5ea0 100644
--- a/xen/include/asm-x86/smp.h
+++ b/xen/include/asm-x86/smp.h
@@ -5,13 +5,10 @@
  * We need the APIC definitions automatically as part of 'smp.h'
  */
 #ifndef __ASSEMBLY__
+#include 
 #include 
 #include 
 #include 
-#endif
-
-#ifndef __ASSEMBLY__
-#include 
 #include 
 #endif
 
-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 6/6] x86: add accessors for scratch cpu mask

2020-02-17 Thread Roger Pau Monne
Current usage of the per-CPU scratch cpumask is dangerous since
there's no way to figure out if the mask is already being used except
for manual code inspection of all the callers and possible call paths.

This is unsafe and not reliable, so introduce a minimal get/put
infrastructure to prevent nested usage of the scratch mask and usage
in interrupt context.

Move the declaration of scratch_cpumask to smp.c in order to place the
declaration and the accessors as close as possible.
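
A minimal sketch of what such accessors could look like (assumed shape only;
helper names other than get_scratch_cpumask()/put_scratch_cpumask(), such as
scratch_cpumask_use, are illustrative and the actual smp.c hunk may differ):

    DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
    static DEFINE_PER_CPU(void *, scratch_cpumask_use); /* records the current user */

    cpumask_t *get_scratch_cpumask(void)
    {
        /* Interrupt, #MC and #NMI context must not use the scratch mask. */
        ASSERT(!in_irq() && !in_mc() && !in_nmi());
        /* Catch nested usage on this CPU. */
        ASSERT(!this_cpu(scratch_cpumask_use));
        this_cpu(scratch_cpumask_use) = __builtin_return_address(0);

        return this_cpu(scratch_cpumask);
    }

    void put_scratch_cpumask(void)
    {
        ASSERT(this_cpu(scratch_cpumask_use));
        this_cpu(scratch_cpumask_use) = NULL;
    }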

Signed-off-by: Roger Pau Monné 
---
Changes since v1:
 - Use __builtin_return_address(0) instead of __func__.
 - Move declaration of scratch_cpumask and scratch_cpumask accessor to
   smp.c.
 - Do not allow usage in #MC or #NMI context.
---
 xen/arch/x86/io_apic.c|  6 --
 xen/arch/x86/irq.c| 13 ++---
 xen/arch/x86/mm.c | 30 +-
 xen/arch/x86/msi.c|  4 +++-
 xen/arch/x86/smp.c| 25 +
 xen/arch/x86/smpboot.c|  1 -
 xen/include/asm-x86/smp.h | 10 ++
 7 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index e98e08e9c8..4ee261b632 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -2236,10 +2236,11 @@ int io_apic_set_pci_routing (int ioapic, int pin, int 
irq, int edge_level, int a
 entry.vector = vector;
 
 if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
-cpumask_t *mask = this_cpu(scratch_cpumask);
+cpumask_t *mask = get_scratch_cpumask();
 
 cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
 SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
+put_scratch_cpumask();
 } else {
 printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
irq, CPUMASK_PR(desc->arch.cpu_mask), CPUMASK_PR(TARGET_CPUS));
@@ -2433,10 +2434,11 @@ int ioapic_guest_write(unsigned long physbase, unsigned 
int reg, u32 val)
 
 if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
 {
-cpumask_t *mask = this_cpu(scratch_cpumask);
+cpumask_t *mask = get_scratch_cpumask();
 
 cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
 SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
+put_scratch_cpumask();
 }
 else
 {
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index cc2eb8e925..7ecf5376e3 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -196,7 +196,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
 {
 unsigned int cpu, old_vector, irq = desc->irq;
 unsigned int vector = desc->arch.vector;
-cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
+cpumask_t *tmp_mask = get_scratch_cpumask();
 
 BUG_ON(!valid_irq_vector(vector));
 
@@ -223,7 +223,10 @@ static void _clear_irq_vector(struct irq_desc *desc)
 trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, tmp_mask);
 
 if ( likely(!desc->arch.move_in_progress) )
+{
+put_scratch_cpumask();
 return;
+}
 
 /* If we were in motion, also clear desc->arch.old_vector */
 old_vector = desc->arch.old_vector;
@@ -236,6 +239,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
 per_cpu(vector_irq, cpu)[old_vector] = ~irq;
 }
 
+put_scratch_cpumask();
 release_old_vec(desc);
 
 desc->arch.move_in_progress = 0;
@@ -1152,10 +1156,11 @@ static void irq_guest_eoi_timer_fn(void *data)
 break;
 
 case ACKTYPE_EOI:
-cpu_eoi_map = this_cpu(scratch_cpumask);
+cpu_eoi_map = get_scratch_cpumask();
 cpumask_copy(cpu_eoi_map, action->cpu_eoi_map);
 spin_unlock_irq(&desc->lock);
 on_selected_cpus(cpu_eoi_map, set_eoi_ready, desc, 0);
+put_scratch_cpumask();
 return;
 }
 
@@ -2531,12 +2536,12 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
 unsigned int irq;
 static int warned;
 struct irq_desc *desc;
+cpumask_t *affinity = get_scratch_cpumask();
 
 for ( irq = 0; irq < nr_irqs; irq++ )
 {
 bool break_affinity = false, set_affinity = true;
 unsigned int vector;
-cpumask_t *affinity = this_cpu(scratch_cpumask);
 
 if ( irq == 2 )
 continue;
@@ -2640,6 +2645,8 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
irq, CPUMASK_PR(affinity));
 }
 
+put_scratch_cpumask();
+
 /* That doesn't seem sufficient.  Give it 1ms. */
 local_irq_enable();
 mdelay(1);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index edc238e51a..75b6114c1c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1261,7 +1261,7 @@ void put_page_from_l1e(l1_pgentry_t l1e, struct domain 
*l1e_owner)
  (l1e_owner == pg_owner) )
 {
 struct vcpu *v;
-cpumask_t *mask = this_cpu(scratch_cpumask);
+cpumask_t *mask = get_scratch_cpumask();
 
 cpumask_clear(mask);
 
@@ -1278,6 +1278,7 

[Xen-devel] [PATCH v2 2/6] x86: introduce a nmi_count tracking variable

2020-02-17 Thread Roger Pau Monne
This is modeled after the irq_count variable, and is used to account
for all the NMIs handled by the system.

This will allow repurposing the nmi_count() helper so it can be used
in a similar manner to local_irq_count(): accounting for the NMIs
currently in service.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/nmi.c| 11 +--
 xen/arch/x86/traps.c  |  4 +++-
 xen/include/asm-x86/nmi.h |  2 ++
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index a5c6bdd0ce..e286ceeb40 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -151,15 +151,14 @@ int nmi_active;
 
 static void __init wait_for_nmis(void *p)
 {
-unsigned int cpu = smp_processor_id();
-unsigned int start_count = nmi_count(cpu);
+unsigned int start_count = this_cpu(nmi_count);
 unsigned long ticks = 10 * 1000 * cpu_khz / nmi_hz;
 unsigned long s, e;
 
 s = rdtsc();
 do {
 cpu_relax();
-if ( nmi_count(cpu) >= start_count + 2 )
+if ( this_cpu(nmi_count) >= start_count + 2 )
 break;
 e = rdtsc();
 } while( e - s < ticks );
@@ -177,7 +176,7 @@ void __init check_nmi_watchdog(void)
 printk("Testing NMI watchdog on all CPUs:");
 
 for_each_online_cpu ( cpu )
-prev_nmi_count[cpu] = nmi_count(cpu);
+prev_nmi_count[cpu] = per_cpu(nmi_count, cpu);
 
 /*
  * Wait at most 10 ticks for 2 watchdog NMIs on each CPU.
@@ -188,7 +187,7 @@ void __init check_nmi_watchdog(void)
 
 for_each_online_cpu ( cpu )
 {
-if ( nmi_count(cpu) - prev_nmi_count[cpu] < 2 )
+if ( per_cpu(nmi_count, cpu) - prev_nmi_count[cpu] < 2 )
 {
 printk(" %d", cpu);
 ok = false;
@@ -593,7 +592,7 @@ static void do_nmi_stats(unsigned char key)
 
 printk("CPU\tNMI\n");
 for_each_online_cpu ( i )
-printk("%3d\t%3d\n", i, nmi_count(i));
+printk("%3d\t%3u\n", i, per_cpu(nmi_count, i));
 
 if ( ((d = hardware_domain) == NULL) || (d->vcpu == NULL) ||
  ((v = d->vcpu[0]) == NULL) )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 56067f85d1..3dbc66bb64 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1683,13 +1683,15 @@ static int dummy_nmi_callback(const struct 
cpu_user_regs *regs, int cpu)
 
 static nmi_callback_t *nmi_callback = dummy_nmi_callback;
 
+DEFINE_PER_CPU(unsigned int, nmi_count);
+
 void do_nmi(const struct cpu_user_regs *regs)
 {
 unsigned int cpu = smp_processor_id();
 unsigned char reason = 0;
 bool handle_unknown = false;
 
-++nmi_count(cpu);
+this_cpu(nmi_count)++;
 
 if ( nmi_callback(regs, cpu) )
 return;
diff --git a/xen/include/asm-x86/nmi.h b/xen/include/asm-x86/nmi.h
index f9dfca6afb..a288f02a50 100644
--- a/xen/include/asm-x86/nmi.h
+++ b/xen/include/asm-x86/nmi.h
@@ -31,5 +31,7 @@ nmi_callback_t *set_nmi_callback(nmi_callback_t *callback);
  * Remove the handler previously set.
  */
 void unset_nmi_callback(void);
+
+DECLARE_PER_CPU(unsigned int, nmi_count);
  
 #endif /* ASM_NMI_H */
-- 
2.25.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 5/6] tools/libx[cl]: Don't use HVM_PARAM_PAE_ENABLED as a function parameter

2020-02-17 Thread Ian Jackson
Andrew Cooper writes ("[PATCH v2 5/6] tools/libx[cl]: Don't use 
HVM_PARAM_PAE_ENABLED as a function parameter"):
> HVM_PARAM_PAE_ENABLED is set and consumed by the toolstack only.  It is in
> practice a complicated and non-standard way of passing a boolean parameter
> into xc_cpuid_apply_policy().
> 
> This is silly.  Pass PAE as a regular parameter instead.
> 
> In libxl__cpuid_legacy(), leave a rather better explanation of why only HVM
> guests have a choice in PAE setting.
> 
> No change in how a guest is constructed.

Acked-by: Ian Jackson 

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v2 5/6] tools/libx[cl]: Don't use HVM_PARAM_PAE_ENABLED as a function parameter

2020-02-17 Thread Andrew Cooper
HVM_PARAM_PAE_ENABLED is set and consumed by the toolstack only.  It is in
practice a complicated and non-standard way of passing a boolean parameter
into xc_cpuid_apply_policy().

This is silly.  Pass PAE as a regular parameter instead.

In libxl__cpuid_legacy(), leave a rather better explanation of why only HVM
guests have a choice in PAE setting.

No change in how a guest is constructed.

Signed-off-by: Andrew Cooper 
---
CC: Ian Jackson 
CC: Wei Liu 
CC: Anthony PERARD 

v2:
 * Rewrite commit message and comments.
---
 tools/libxc/include/xenctrl.h |  2 +-
 tools/libxc/xc_cpuid_x86.c| 15 +++
 tools/libxl/libxl_cpuid.c | 16 +++-
 3 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 311df1ef0f..4eb4f4c2c6 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1807,7 +1807,7 @@ int xc_cpuid_set(xc_interface *xch,
 int xc_cpuid_apply_policy(xc_interface *xch,
   uint32_t domid,
   const uint32_t *featureset,
-  unsigned int nr_features);
+  unsigned int nr_features, bool pae);
 int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
 int xc_mca_op_inject_v2(xc_interface *xch, unsigned int flags,
 xc_cpumap_t cpumap, unsigned int nr_cpus);
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 2540aa1e1c..21b15b86ec 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -455,7 +455,8 @@ int xc_cpuid_set(
 }
 
 int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
-  const uint32_t *featureset, unsigned int nr_features)
+  const uint32_t *featureset, unsigned int nr_features,
+  bool pae)
 {
 int rc;
 xc_dominfo_t di;
@@ -579,8 +580,6 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
 }
 else
 {
-uint64_t val;
-
 /*
  * Topology for HVM guests is entirely controlled by Xen.  For now, we
  * hardcode APIC_ID = vcpu_id * 2 to give the illusion of no SMT.
@@ -634,15 +633,7 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t 
domid,
 break;
 }
 
-/*
- * HVM_PARAM_PAE_ENABLED is a parameter to this function, stashed in
- * Xen.  Nothing else has ever taken notice of the value.
- */
-rc = xc_hvm_param_get(xch, domid, HVM_PARAM_PAE_ENABLED, );
-if ( rc )
-goto out;
-
-p->basic.pae = val;
+p->basic.pae = pae;
 
 /*
  * These settings are necessary to cause earlier HVM_PARAM_NESTEDHVM /
diff --git a/tools/libxl/libxl_cpuid.c b/tools/libxl/libxl_cpuid.c
index 49d3ca5b26..062750102e 100644
--- a/tools/libxl/libxl_cpuid.c
+++ b/tools/libxl/libxl_cpuid.c
@@ -416,8 +416,22 @@ void libxl__cpuid_legacy(libxl_ctx *ctx, uint32_t domid,
 libxl_cpuid_policy_list cpuid = info->cpuid;
 int i;
 char *cpuid_res[4];
+bool pae = true;
+
+/*
+ * For PV guests, PAE is Xen-controlled (it is the 'p' that differentiates
+ * the xen-3.0-x86_32 and xen-3.0-x86_32p ABIs).  It is mandatory as Xen
+ * is 64bit only these days.
+ *
+ * For PVH guests, there is no top-level PAE control in the domain config,
+ * so is treated as always available.
+ *
+ * HVM guests get a top-level choice of whether PAE is available.
+ */
+if (info->type == LIBXL_DOMAIN_TYPE_HVM)
+pae = libxl_defbool_val(info->u.hvm.pae);
 
-xc_cpuid_apply_policy(ctx->xch, domid, NULL, 0);
+xc_cpuid_apply_policy(ctx->xch, domid, NULL, 0, pae);
 
 if (!cpuid)
 return;
-- 
2.11.0



Re: [Xen-devel] [PATCH v5 7/7] xl: allow domid to be preserved on save/restore or migrate

2020-02-17 Thread Ian Jackson
Paul Durrant writes ("[PATCH v5 7/7] xl: allow domid to be preserved on 
save/restore or migrate"):
> This patch adds a '-D' command line option to save and migrate to allow
> the domain id to be incorporated into the saved domain configuration and
> hence be preserved.
> 
> NOTE: Logically it may seem as though preservation of domid should be
>   dealt with by libxl, but the libxl migration stream has no record
>   in which to transfer domid and remote domain creation occurs before
>   the migration stream is parsed. Hence this patch modifies xl rather
>   then libxl.

Thanks.

I think I am satisfied that this is the best we can do without
tremendous amounts of reorganisation.

Acked-by: Ian Jackson 

Regards,
Ian.


Re: [Xen-devel] [PATCH v5 5/7] libxl: allow creation of domains with a specified or random domid

2020-02-17 Thread Ian Jackson
Paul Durrant writes ("[PATCH v5 5/7] libxl: allow creation of domains with a 
specified or random domid"):
> This patch adds a 'domid' field to libxl_domain_create_info and then
> modifies libxl__domain_make() to have Xen use that value if it is valid.
> If the domid value is invalid then Xen will choose the domid, as before,
> unless the value is the new special RANDOM_DOMID value added to the API.
> This value instructs libxl__domain_make() to choose a random domid value
> for Xen to use.
> 
> If Xen determines that a domid specified to or chosen by
> libxl__domain_make() co-incides with an existing domain then the create
> operation will fail. In this case, if RANDOM_DOMID was specified to
> libxl__domain_make() then a new random value will be chosen and the create
> operation will be re-tried, otherwise libxl__domain_make() will fail.
> 
> After Xen has successfully created a new domain, libxl__domain_make() will
> check whether its domid matches any recently used domid values. If it does
> then the domain will be destroyed. If the domid used in creation was
> specified to libxl__domain_make() then it will fail at this point,
> otherwise the create operation will be re-tried with either a new random
> or Xen-selected domid value.
> 
> NOTE: libxl__logv() is also modified to only log valid domid values in
>   messages rather than any domid, valid or otherwise, that is not
>   INVALID_DOMID.
> 
> Signed-off-by: Paul Durrant 
> ---
> Cc: Ian Jackson 
> Cc: Wei Liu 
> Cc: Anthony PERARD 
> Cc: Andrew Cooper 
> Cc: George Dunlap 
> Cc: Jan Beulich 
> Cc: Julien Grall 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Stefano Stabellini 
> Cc: Jason Andryuk 
> 
> v5:
>  - Flattened nested loops
> 
> v4:
>  - Not added Jason's R-b because of substantial change
>  - Check for recent domid *after* creation
>  - Re-worked commit comment
> 
> v3:
>  - Added DOMID_MASK definition used to mask randomized values
>  - Use stack variable to avoid assuming endianness
> 
> v2:
>  - Re-worked to use a value from libxl_domain_create_info
> ---
>  tools/libxl/libxl.h  |  9 +
>  tools/libxl/libxl_create.c   | 67 
>  tools/libxl/libxl_internal.c |  2 +-
>  tools/libxl/libxl_types.idl  |  1 +
>  xen/include/public/xen.h |  3 ++
>  5 files changed, 74 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 1d235ecb1c..31c6f4b11a 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -1268,6 +1268,14 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
> const libxl_mac *src);
>   */
>  #define LIBXL_HAVE_DOMAIN_NEED_MEMORY_CONFIG
>  
> +/*
> + * LIBXL_HAVE_CREATEINFO_DOMID
> + *
> + * libxl_domain_create_new() and libxl_domain_create_restore() will use
> + * a domid specified in libxl_domain_create_info().
> + */
> +#define LIBXL_HAVE_CREATEINFO_DOMID
> +
>  typedef char **libxl_string_list;
>  void libxl_string_list_dispose(libxl_string_list *sl);
>  int libxl_string_list_length(const libxl_string_list *sl);
> @@ -1528,6 +1536,7 @@ int libxl_ctx_free(libxl_ctx *ctx /* 0 is OK */);
>  /* domain related functions */
>  
>  #define INVALID_DOMID ~0
> +#define RANDOM_DOMID (INVALID_DOMID - 1)
>  
>  /* If the result is ERROR_ABORTED, the domain may or may not exist
>   * (in a half-created state).  *domid will be valid and will be the
> diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
> index 3a7364e2ac..7fd4d713e7 100644
> --- a/tools/libxl/libxl_create.c
> +++ b/tools/libxl/libxl_create.c
> @@ -555,8 +555,6 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config 
> *d_config,
>  libxl_domain_create_info *info = _config->c_info;
>  libxl_domain_build_info *b_info = _config->b_info;
>  
> -assert(soft_reset || *domid == INVALID_DOMID);
> -
>  uuid_string = libxl__uuid2string(gc, info->uuid);
>  if (!uuid_string) {
>  rc = ERROR_NOMEM;
> @@ -600,11 +598,66 @@ int libxl__domain_make(libxl__gc *gc, 
> libxl_domain_config *d_config,
>  goto out;
>  }
>  
> -ret = xc_domain_create(ctx->xch, domid, );
> -if (ret < 0) {
> -LOGED(ERROR, *domid, "domain creation fail");
> -rc = ERROR_FAIL;
> -goto out;
> +for (;;) {
> +bool recent;
> +
> +if (info->domid == RANDOM_DOMID) {
> +uint16_t v;
> +
> +ret = libxl__random_bytes(gc, (void *), sizeof(v));
> +if (ret < 0)
> +break;
> +
> +v &= DOMID_MASK;
> +if (!libxl_domid_valid_guest(v))
> +continue;
> +
> +*domid = v;
> +} else
> +*domid = info->domid;

Style: { } on all or none of the same `if' series.  (CODING_STYLE)
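
For reference, the fully-braced form being asked for here would be
roughly (sketch only, based on the hunk quoted above):

    if (info->domid == RANDOM_DOMID) {
        uint16_t v;

        ret = libxl__random_bytes(gc, (void *)&v, sizeof(v));
        if (ret < 0)
            break;

        v &= DOMID_MASK;
        if (!libxl_domid_valid_guest(v))
            continue;

        *domid = v;
    } else {
        *domid = info->domid;
    }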

> +/* The domid is not recent, so we're done */
> +if (!recent)
> +break;
> +
> +/*
> + * If the 

[Xen-devel] [linux-4.4 bisection] complete test-amd64-i386-xl-qemuu-ovmf-amd64

2020-02-17 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job test-amd64-i386-xl-qemuu-ovmf-amd64
testid debian-hvm-install

Tree: linux 
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  ovmf git://xenbits.xen.org/osstest/ovmf.git
  Bug introduced:  999463c865d3768a8432a89508096ae6a43873a5
  Bug not present: a5abd9cc2cebe7fac001f7bb7b647c47cf54af1a
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/147201/


  commit 999463c865d3768a8432a89508096ae6a43873a5
  Author: Hao A Wu 
  Date:   Thu Dec 19 13:36:24 2019 +0800
  
  UefiCpuPkg/MpInitLib: Collect processors' CPUID & Platform ID info
  
  REF:https://bugzilla.tianocore.org/show_bug.cgi?id=2429
  
  This commit will collect the CPUID and Platform ID information for each
  processor within system. They will be stored in the CPU_AP_DATA structure.
  
  These information will be used in the next commit to decide whether a
  microcode patch will be loaded into memory.
  
  Cc: Eric Dong 
  Cc: Ray Ni 
  Cc: Laszlo Ersek 
  Cc: Star Zeng 
  Cc: Siyuan Fu 
  Cc: Michael D Kinney 
  Signed-off-by: Hao A Wu 
  Reviewed-by: Ray Ni 
  Reviewed-by: Eric Dong 


For bisection revision-tuple graph see:
   
http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-4.4/test-amd64-i386-xl-qemuu-ovmf-amd64.debian-hvm-install.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/linux-4.4/test-amd64-i386-xl-qemuu-ovmf-amd64.debian-hvm-install
 --summary-out=tmp/147201.bisection-summary --basis-template=139698 
--blessings=real,real-bisect linux-4.4 test-amd64-i386-xl-qemuu-ovmf-amd64 
debian-hvm-install
Searching for failure / basis pass:
 147111 fail [host=pinot1] / 143846 [host=debina0] 143646 ok.
Failure / basis pass flights: 147111 / 143646
(tree with no url: minios)
Tree: linux 
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 76e5c6fd6d163f1aa63969cc982e79be1fee87a7 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
70911f1f4aee0366b6122f2b90d367ec0f066beb 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
76551856b28d227cb0386a1ab0e774329b941f7d 
6c47c37b9b40d6fe40bce8c8fd39135f6d549c8c
Basis pass da259d0284b69e084d65200b69462bed9b86a4c7 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
b15646484eaffcf7cc464fdea0214498f26addc2 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
c1ab7d7ed5306641784a9ed8972db5151a49a1a1 
518c935fac4d30b3ec35d4b6add82b17b7d7aca3
Generating revisions with ./adhoc-revtuple-generator  
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git#da259d0284b69e084d65200b69462bed9b86a4c7-76e5c6fd6d163f1aa63969cc982e79be1fee87a7
 
git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860
 
git://xenbits.xen.org/osstest/ovmf.git#b15646484eaffcf7cc464fdea0214498f26addc2-70911f1f4aee0366b6122f2b90d367ec0f066beb
 git://xenbits.xen.org/qemu-xen-traditional\
 
.git#d0d8ad39ecb51cd7497cd524484fe09f50876798-d0d8ad39ecb51cd7497cd524484fe09f50876798
 
git://xenbits.xen.org/qemu-xen.git#933ebad2470a169504799a1d95b8e410bd9847ef-933ebad2470a169504799a1d95b8e410bd9847ef
 
git://xenbits.xen.org/osstest/seabios.git#c1ab7d7ed5306641784a9ed8972db5151a49a1a1-76551856b28d227cb0386a1ab0e774329b941f7d
 
git://xenbits.xen.org/xen.git#518c935fac4d30b3ec35d4b6add82b17b7d7aca3-6c47c37b9b40d6fe40bce8c8fd39135f6d549c8c
Use of uninitialized value $parents in array dereference at 
./adhoc-revtuple-generator line 465.
Use of uninitialized value in concatenation (.) or string at 
./adhoc-revtuple-generator line 465.
Loaded 13143 nodes in revision graph
Searching for test results:
 146915 fail d6ccbff9be43dbb6113a6a3f107c3d066052097e 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
70911f1f4aee0366b6122f2b90d367ec0f066beb 
d0d8ad39ecb51cd7497cd524484fe09f50876798 
933ebad2470a169504799a1d95b8e410bd9847ef 
76551856b28d227cb0386a1ab0e774329b941f7d 
72dbcf0c065037dddb591a072c4f8f16fe888ea8
 146860 []
 146992 [host=pinot0]
 147055 [host=pinot0]
 147076 [host=pinot0]
 147059 [host=pinot0]
 147062 

Re: [Xen-devel] [PATCH v5 4/7] libxl: add infrastructure to track and query 'recent' domids

2020-02-17 Thread Ian Jackson
Paul Durrant writes ("[PATCH v5 4/7] libxl: add infrastructure to track and 
query 'recent' domids"):
> A domid is considered recent if the domain it represents was destroyed
> less than a specified number of seconds ago. For debugging and/or testing
> purposes the number can be set using the environment variable
> LIBXL_DOMID_REUSE_TIMEOUT. If the variable does not exist then a default
> value of 60s is used.
> 
> Whenever a domain is destroyed, a time-stamped record will be written into
> a history file (/var/run/xen/domid-history). To avoid the history file
> growing too large, any records with time-stamps that indicate that the
> age of a domid has exceeded the re-use timeout will also be purged.
> 
> A new utility function, libxl__is_recent_domid(), has been added. This
> function reads the same history file checking whether a specified domid
> has a record that does not exceed the re-use timeout. Since this utility
> function does not write to the file, no records are actually purged by it.
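
For context, each record in this history file is simply a timestamp and
a domid on one line, matching the "%lu %u" format used by the helpers
discussed below, e.g. (illustrative values only):

    1581955200 42
    1581955231 17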

Thanks for this.  Sorry for the delay in reviewing it.

I'm afraid I still have some comments about error handling etc.

> +int libxl_clear_domid_history(libxl_ctx *ctx);

I think this needs a clear doc comment saying it is for use in host
initialisation only.  If it is run with any domains running, or
concurrent libxl processes, things may malfunction.
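
Something along these lines, perhaps (wording is only a suggestion):

    /*
     * libxl_clear_domid_history: wipe the domid history file.
     *
     * For use during host initialisation only.  If this is called while
     * any domain is running, or while another libxl process is active,
     * domid re-use tracking may misbehave.
     */
    int libxl_clear_domid_history(libxl_ctx *ctx);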

> +static bool libxl__read_recent(FILE *f, unsigned long *sec,
> +   unsigned int *domid)
> +{
> +int n;
> +
> +assert(f);
> +
> +n = fscanf(f, "%lu %u", sec, domid);
> +if (n == EOF)
> +return false;

Missing error handling in case of read error.

> +else if (n != 2) /* malformed entry */
> +*domid = INVALID_DOMID;

Both call sites for this function have open-coded checks for this
return case, where they just go round again.  I think
libxl__read_recent should handle this itself, factoring the common
code into this function and avoiding that special case.

> +return true;

I think this function should return an rc.  It could signal EOF by
setting *domid to INVALID_DOMID maybe, and errors by returning
ERROR_FAIL.
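
A sketch of that shape (illustrative only -- it assumes the usual
libxl_internal.h environment for LOGE/ERROR_FAIL/INVALID_DOMID and is
not meant as a drop-in replacement):

    static int libxl__read_recent(libxl__gc *gc, FILE *f,
                                  unsigned long *sec, unsigned int *domid)
    {
        char line[64];

        for (;;) {
            if (!fgets(line, sizeof(line), f)) {
                if (ferror(f)) {
                    LOGE(ERROR, "failed to read domid history");
                    return ERROR_FAIL;
                }
                *domid = INVALID_DOMID;   /* EOF */
                return 0;
            }
            /* skip malformed entries here rather than at each call site */
            if (sscanf(line, "%lu %u", sec, domid) == 2)
                return 0;
        }
    }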

> +static bool libxl__write_recent(FILE *f, unsigned long sec,
> +unsigned int domid)
> +{
> +assert(f);

This is rather pointless.  Please drop it.

> +assert(libxl_domid_valid_guest(domid));

I doubt this is really needed but I don't mind it if you must.

> +return fprintf(f, "%lu %u\n", sec, domid) > 0;

Wrong error handling.  This function should return rc.  fprintf
doesn't return a boolean.  Something should log errno (with LOGE
probably) if fprintf fails.
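
I.e. something like (again only a sketch):

    static int libxl__write_recent(libxl__gc *gc, FILE *f,
                                   unsigned long sec, unsigned int domid)
    {
        if (fprintf(f, "%lu %u\n", sec, domid) < 0) {
            LOGE(ERROR, "failed to write domid history entry");
            return ERROR_FAIL;
        }

        return 0;
    }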

> +static int libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
> +{
> +long timeout = libxl__get_domid_reuse_timeout();
> +libxl__flock *lock;

Please initialise lock = NULL so that it is easy to see that the out
block is correct.

(See tools/libxl/CODING_STYLE where this is discussed.)

> +char *old, *new;
> +FILE *of = NULL, *nf = NULL;
> +struct timespec ts;
> +int rc = ERROR_FAIL;

Please do not set rc to ERROR_FAIL like this.  Leave it undefined.
Set it on each exit path.  (If you are calling a function that returns
an rc, you can put it in rc, and then test rc and goto out without
assignment.)

(Again, see tools/libxl/CODING_STYLE where this is discussed.)
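
The requested pattern is roughly the following (sketch; the helper name
is made up):

    int rc;                              /* deliberately left undefined */
    libxl__flock *lock = NULL;

    lock = libxl__lock_domid_history(gc);
    if (!lock) {
        LOGED(ERROR, domid, "failed to acquire lock");
        rc = ERROR_FAIL;                 /* set on this exit path */
        goto out;
    }

    rc = libxl__some_helper(gc, domid);  /* hypothetical rc-returning helper */
    if (rc) goto out;                    /* rc propagated, no assignment */

    rc = 0;                              /* explicit success path */
 out:
    /* unlock / cleanup */
    return rc;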

> +lock = libxl__lock_domid_history(gc);
> +if (!lock) {
> +LOGED(ERROR, domid, "failed to acquire lock");
> +goto out;
> +}
> +
> +old = libxl__domid_history_path(gc, NULL);
> +of = fopen(old, "r");
> +if (!of && errno != ENOENT)
> +LOGED(WARN, domid, "failed to open '%s'", old);

This fopen code and its error handling is still duplicated between
libxl__mark_domid_recent and libxl__is_domid_recent.  I meant for you
to factor it out.  Likewise the other duplicated code in these two
functions.  I want there to be nothing duplicated that can be written
once.

Also failure to open the file should be an error, resulting failure of
this function and the whole surrounding operation, not simply produce
a warning in some logfile where it will be ignored.

> +while (libxl__read_recent(of, , )) {
> +if (!libxl_domid_valid_guest(val))
> +continue; /* Ignore invalid entries */
> +
> +if (ts.tv_sec - sec > timeout)
> +continue; /* Ignore expired entries */
> +
> +if (!libxl__write_recent(nf, sec, val)) {
> +LOGED(ERROR, domid, "failed to write to '%s'", new);
> +goto out;
> +}
> +}
> +if (ferror(of)) {
> +LOGED(ERROR, domid, "failed to read from '%s'", old);
> +goto out;
> +}

Oh, wait, here is one of the missing pieces of error handling ?
Please put it where it belongs, next to the corresponding call.

> +if (of && fclose(of) == EOF) {
> +LOGED(ERROR, domid, "failed to close '%s'", 

[Xen-devel] [linux-4.14 bisection] complete test-armhf-armhf-xl

2020-02-17 Thread osstest service owner
branch xen-unstable
xenbranch xen-unstable
job test-armhf-armhf-xl
testid xen-boot

Tree: linux 
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  linux 
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  Bug introduced:  7b72dc2f100d1fe8e969d645050c8ee64b5dd301
  Bug not present: 00843344c6871cde6b8c85bf88bd2197d6eb1da6
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/147202/


  commit 7b72dc2f100d1fe8e969d645050c8ee64b5dd301
  Author: Marek Szyprowski 
  Date:   Thu Sep 6 17:41:35 2018 +0200
  
  ARM: dts: exynos: Disable pull control for S5M8767 PMIC
  
  [ Upstream commit ef2ecab9af5feae97c47b7f61cdd96f7f49b2c23 ]
  
  S5M8767 PMIC interrupt line on Exynos5250-based Arndale board has
  external pull-up resistors, so disable any pull control for it in
  in controller node. This fixes support for S5M8767 interrupts and
  enables operation of wakeup from S5M8767 RTC alarm.
  
  Signed-off-by: Marek Szyprowski 
  Signed-off-by: Krzysztof Kozlowski 
  Signed-off-by: Sasha Levin 


For bisection revision-tuple graph see:
   
http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-4.14/test-armhf-armhf-xl.xen-boot.html
Revision IDs in each graph node refer, respectively, to the Trees above.


Running cs-bisection-step 
--graph-out=/home/logs/results/bisect/linux-4.14/test-armhf-armhf-xl.xen-boot 
--summary-out=tmp/147202.bisection-summary --basis-template=142849 
--blessings=real,real-bisect linux-4.14 test-armhf-armhf-xl xen-boot
Searching for failure / basis pass:
 147094 fail [host=arndale-westfield] / 143911 [host=cubietruck-gleizes] 143834 
[host=cubietruck-picasso] 143610 [host=arndale-bluewater] 143513 
[host=arndale-metrocentre] 143409 [host=arndale-lakeside] 143327 ok.
Failure / basis pass flights: 147094 / 143327
Tree: linux 
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: ovmf git://xenbits.xen.org/osstest/ovmf.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: seabios git://xenbits.xen.org/osstest/seabios.git
Tree: xen git://xenbits.xen.org/xen.git
Latest 98db2bf27b9ed2d5ed0b6c9c8a4bfcb127a19796 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
70911f1f4aee0366b6122f2b90d367ec0f066beb 
933ebad2470a169504799a1d95b8e410bd9847ef 
76551856b28d227cb0386a1ab0e774329b941f7d 
6c47c37b9b40d6fe40bce8c8fd39135f6d549c8c
Basis pass ddef1e8e3f6eb26034833b7255e3fa584d54a230 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
9e639c1cb6abd5ffed0f9017de26f93d2ee99eac 
933ebad2470a169504799a1d95b8e410bd9847ef 
120996f147131eca8af90e30c900bc14bc824d9f 
518c935fac4d30b3ec35d4b6add82b17b7d7aca3
Generating revisions with ./adhoc-revtuple-generator  
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git#ddef1e8e3f6eb26034833b7255e3fa584d54a230-98db2bf27b9ed2d5ed0b6c9c8a4bfcb127a19796
 
git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860
 
git://xenbits.xen.org/osstest/ovmf.git#9e639c1cb6abd5ffed0f9017de26f93d2ee99eac-70911f1f4aee0366b6122f2b90d367ec0f066beb
 git://xenbits.xen.org/qemu-xen.git#933ebad\
 2470a169504799a1d95b8e410bd9847ef-933ebad2470a169504799a1d95b8e410bd9847ef 
git://xenbits.xen.org/osstest/seabios.git#120996f147131eca8af90e30c900bc14bc824d9f-76551856b28d227cb0386a1ab0e774329b941f7d
 
git://xenbits.xen.org/xen.git#518c935fac4d30b3ec35d4b6add82b17b7d7aca3-6c47c37b9b40d6fe40bce8c8fd39135f6d549c8c
Use of uninitialized value $parents in array dereference at 
./adhoc-revtuple-generator line 465.
Use of uninitialized value in concatenation (.) or string at 
./adhoc-revtuple-generator line 465.
Loaded 13418 nodes in revision graph
Searching for test results:
 143327 pass ddef1e8e3f6eb26034833b7255e3fa584d54a230 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
9e639c1cb6abd5ffed0f9017de26f93d2ee99eac 
933ebad2470a169504799a1d95b8e410bd9847ef 
120996f147131eca8af90e30c900bc14bc824d9f 
518c935fac4d30b3ec35d4b6add82b17b7d7aca3
 143409 [host=arndale-lakeside]
 143513 [host=arndale-metrocentre]
 143610 [host=arndale-bluewater]
 143834 [host=cubietruck-picasso]
 143911 [host=cubietruck-gleizes]
 146857 fail e0f8b8a65a473a8baa439cf865a694bbeb83fe90 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
70911f1f4aee0366b6122f2b90d367ec0f066beb 
933ebad2470a169504799a1d95b8e410bd9847ef 
76551856b28d227cb0386a1ab0e774329b941f7d 
72dbcf0c065037dddb591a072c4f8f16fe888ea8
 146905 fail e0f8b8a65a473a8baa439cf865a694bbeb83fe90 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 

Re: [Xen-devel] [PATCH v3 3/3] x86/hyperv: L0 assisted TLB flush

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 01:55:17PM +, Wei Liu wrote:
> Implement L0 assisted TLB flush for Xen on Hyper-V. It takes advantage
> of several hypercalls:
> 
>  * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST
>  * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX
>  * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE
>  * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX
> 
> Pick the most efficient hypercalls available.
> 
> Signed-off-by: Wei Liu 

Just two comments below.

> ---
> v3:
> 1. Address more comments.
> 2. Fix usage of max_vp_index.
> 3. Use the fill_gva_list algorithm from Linux.
> 
> v2:
> 1. Address Roger and Jan's comments re types etc.
> 2. Fix pointer arithmetic.
> 3. Misc improvement to code.
> ---
>  xen/arch/x86/guest/hyperv/Makefile  |   1 +
>  xen/arch/x86/guest/hyperv/private.h |   9 ++
>  xen/arch/x86/guest/hyperv/tlb.c | 173 +++-
>  xen/arch/x86/guest/hyperv/util.c|  74 
>  4 files changed, 256 insertions(+), 1 deletion(-)
>  create mode 100644 xen/arch/x86/guest/hyperv/util.c
> 
> diff --git a/xen/arch/x86/guest/hyperv/Makefile 
> b/xen/arch/x86/guest/hyperv/Makefile
> index 18902c33e9..0e39410968 100644
> --- a/xen/arch/x86/guest/hyperv/Makefile
> +++ b/xen/arch/x86/guest/hyperv/Makefile
> @@ -1,2 +1,3 @@
>  obj-y += hyperv.o
>  obj-y += tlb.o
> +obj-y += util.o
> diff --git a/xen/arch/x86/guest/hyperv/private.h 
> b/xen/arch/x86/guest/hyperv/private.h
> index 509bedaafa..79a77930a0 100644
> --- a/xen/arch/x86/guest/hyperv/private.h
> +++ b/xen/arch/x86/guest/hyperv/private.h
> @@ -24,12 +24,21 @@
>  
>  #include 
>  #include 
> +#include 

Do you still need to include types.h?

None of the additions to this header done in this patch seems to
require it AFAICT.

>  
>  DECLARE_PER_CPU(void *, hv_input_page);
>  DECLARE_PER_CPU(void *, hv_vp_assist);
>  DECLARE_PER_CPU(unsigned int, hv_vp_index);
>  
> +static inline unsigned int hv_vp_index(unsigned int cpu)
> +{
> +return per_cpu(hv_vp_index, cpu);
> +}
> +
>  int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
>   unsigned int flags);
>  
> +/* Returns number of banks, -ev if error */
> +int cpumask_to_vpset(struct hv_vpset *vpset, const cpumask_t *mask);
> +
>  #endif /* __XEN_HYPERV_PRIVIATE_H__  */
> diff --git a/xen/arch/x86/guest/hyperv/tlb.c b/xen/arch/x86/guest/hyperv/tlb.c
> index 48f527229e..8cd1c6f0ed 100644
> --- a/xen/arch/x86/guest/hyperv/tlb.c
> +++ b/xen/arch/x86/guest/hyperv/tlb.c
> @@ -19,17 +19,188 @@
>   * Copyright (c) 2020 Microsoft.
>   */
>  
> +#include 
>  #include 
>  #include 
>  
> +#include 
> +#include 
> +#include 
> +
>  #include "private.h"
>  
> +/*
> + * It is possible to encode up to 4096 pages using the lower 12 bits
> + * in an element of gva_list
> + */
> +#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)
> +
> +static unsigned int fill_gva_list(uint64_t *gva_list, const void *va,
> +  unsigned int order)
> +{
> +unsigned long cur = (unsigned long)va;
> +/* end is 1 past the range to be flushed */
> +unsigned long end = cur + (PAGE_SIZE << order);
> +unsigned int n = 0;
> +
> +do {
> +unsigned long diff = end - cur;
> +
> +gva_list[n] = cur & PAGE_MASK;
> +
> +/*
> + * Use lower 12 bits to encode the number of additional pages
> + * to flush
> + */
> +if ( diff >= HV_TLB_FLUSH_UNIT )
> +{
> +gva_list[n] |= ~PAGE_MASK;
> +cur += HV_TLB_FLUSH_UNIT;
> +}
> +else
> +{
> +gva_list[n] |= (diff - 1) >> PAGE_SHIFT;
> +cur = end;
> +}
> +
> +n++;
> +} while ( cur < end );
> +
> +return n;
> +}
> +
> +static uint64_t flush_tlb_ex(const cpumask_t *mask, const void *va,
> + unsigned int flags)
> +{
> +struct hv_tlb_flush_ex *flush = this_cpu(hv_input_page);
> +int nr_banks;
> +unsigned int max_gvas, order = flags & FLUSH_ORDER_MASK;
> +uint64_t *gva_list;
> +
> +if ( !flush || local_irq_is_enabled() )
> +{
> +ASSERT_UNREACHABLE();
> +return ~0ULL;
> +}
> +
> +if ( !(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED) )
> +return ~0ULL;
> +
> +flush->address_space = 0;
> +flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
> +if ( !(flags & FLUSH_TLB_GLOBAL) )
> +flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
> +
> +nr_banks = cpumask_to_vpset(>hv_vp_set, mask);
> +if ( nr_banks < 0 )
> +return ~0ULL;
> +
> +max_gvas =
> +(PAGE_SIZE - sizeof(*flush) - nr_banks *
> + sizeof(flush->hv_vp_set.bank_contents[0])) /
> +sizeof(uint64_t);   /* gva is represented as uint64_t */
> +
> +/*
> + * Flush the entire address space if va is NULL or if there is not
> + * enough space for gva_list.
> + */
> +if ( !va || (PAGE_SIZE << order) / HV_TLB_FLUSH_UNIT > max_gvas )
> +return 

Re: [Xen-devel] [PATCH] AMD/IOMMU: Common the #732/#733 errata handling in iommu_read_log()

2020-02-17 Thread Jan Beulich
On 14.02.2020 19:55, Andrew Cooper wrote:
> There is no need to have both helpers implement the same workaround.  The size
> and layout of the Event and PPR logs (and others for that matter) share a
> lot of commonality.
> 
> Use MASK_EXTR() to locate the code field, and use ACCESS_ONCE() rather than
> barrier() to prevent hoisting of the repeated read.
> 
> Avoid unnecessary zeroing by only clobbering the 'code' field - this alone is
> sufficient to spot the errata when the rings wrap.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 
with one remark / adjustment request:

> @@ -319,11 +319,36 @@ static int iommu_read_log(struct amd_iommu *iommu,
>  
>  while ( tail != log->head )
>  {
> -/* read event log entry */
> -entry = log->buffer + log->head;
> +uint32_t *entry = log->buffer + log->head;
> +unsigned int count = 0;
> +
> +/* Event and PPR logs have their code field in the same position. */
> +unsigned int code = MASK_EXTR(entry[1], IOMMU_EVENT_CODE_MASK);
> +
> +/*
> + * Workaround for errata #732, #733:
> + *
> + * It can happen that the tail pointer is updated before the actual
> + * entry got written. As suggested by RevGuide, we initialize the
> + * buffer to all zeros and clear entries after processing them.

I don't think "clear entries" is applicable anymore with ...

> + */
> +while ( unlikely(code == 0) )
> +{
> +if ( unlikely(++count == IOMMU_LOG_ENTRY_TIMEOUT) )
> +{
> +AMD_IOMMU_DEBUG("AMD-Vi: No entry written to %s Log\n",
> +log == >event_log ? "Event" : "PPR");
> +return 0;
> +}
> +udelay(1);
> +code = MASK_EXTR(ACCESS_ONCE(entry[1]), IOMMU_EVENT_CODE_MASK);
> +}
>  
>  parse_func(iommu, entry);
>  
> +/* Clear 'code' to be able to spot the erratum when the ring wraps. 
> */
> +ACCESS_ONCE(entry[1]) = 0;

... this. Perhaps at least add "sufficiently"?

Jan


Re: [Xen-devel] [PATCH 5/6] tools/libx[cl]: Don't use HVM_PARAM_PAE_ENABLED as a function parameter

2020-02-17 Thread Ian Jackson
Andrew Cooper writes ("[PATCH 5/6] tools/libx[cl]: Don't use 
HVM_PARAM_PAE_ENABLED as a function parameter"):
> The sole use of HVM_PARAM_PAE_ENABLED is as a non-standard calling
> convention for xc_cpuid_apply_policy().  Pass PAE as a regular
> parameter instead.

Following our conversation on irc, I have tried to write an
explanation in my own words of what this commit is doing.

  The value of HVM_PARAM_PAE_ENABLED is set by the toolstack.  And the
  only place that reads it is also in the toolstack, in
  xc_cpuid_apply_policy.  Effectively, this hypervisor domain
  parameter is used solely as a way to pass this boolean parameter
  from one part of the toolstack to another.

  This is not sensible.

  Replace its use in xc_cpuid_apply_policy with a plain boolean
  parameter, passed directly by the one (in-tree) caller.
  The now-redundant code to set the value in the hypervisor will be
  deleted in the next patch.
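
Concretely, the one in-tree call site changes like so (this hunk is taken
from the v2 patch elsewhere in this archive):

    -    xc_cpuid_apply_policy(ctx->xch, domid, NULL, 0);
    +    xc_cpuid_apply_policy(ctx->xch, domid, NULL, 0, pae);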

> Leave a rather better explanation of why only HVM guests have a
> choice in PAE setting.

I approve of this part of the commit message.

> No functional change.

I would write

   No overall functional change.  The new code for calculating the
   `pae' value (in libxl__cpuid_legacy) is isomorphic to the
   obsolescent code (in libxl_x86.c).

I had a look to see whether this was true and it seemed to be.

>  /*
> - * HVM_PARAM_PAE_ENABLED is a parameter to this function, stashed in
> - * Xen.  Nothing else has ever taken notice of the value.
> + * PAE used to be a parameter passed to this function by
> + * HVM_PARAM_PAE_ENABLED.  It is now passed normally.
>   */

I find this phrasing confusing, particularly this very loose use of
the word `parameter'.  I would drop this comment entirely and let the
commit message stand as the historical documentation.

>  char *cpuid_res[4];
> +bool pae = true;
> +
> +/*
> + * PAE is a Xen-controlled for PV guests (it is the 'p' that causes the
> + * difference between the xen-3.0-x86_32 and xen-3.0-x86_32p ABIs).  It\
 is
> + * mandatory as Xen is running in 64bit mode.
> + *
> + * PVH guests don't have a top-level PAE control, and is treated as
> + * available.
> + */

I approve of putting a new comment here with an explanation.  However,
it should be wrapped rather more tightly (eg, in my MUA it is now
suffering from wrap damage as I demonstrate above) and it seems to
have some problems with the grammar ?  And I think the 2nd sentence
"It is mandatory" could usefully be re-qualified "for PV guests".  Or
you could write something like this.

   PV guests: PAE is Xen-controlled (it is the 'p' that causes the
   difference between the xen-3.0-x86_32 and xen-3.0-x86_32p ABIs);
   Xen is in 64-bit mode so PAE is mandatory.

   PVH guests: there is no top-level PAE control in the libxl domain
   config; we always make it available.

   So this test only applies to HVM guests:

Thanks,
Ian.


Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Jürgen Groß

On 17.02.20 15:23, Igor Druzhinin wrote:

On 17/02/2020 12:30, Igor Druzhinin wrote:

On 17/02/2020 12:28, Jürgen Groß wrote:

On 17.02.20 13:26, Igor Druzhinin wrote:

On 17/02/2020 07:20, Juergen Gross wrote:

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets this requires scheduling of
idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq() called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus no longer requiring any
scheduling activity.

As there already is a rcu softirq, reuse that for the synchronization.

Finally switch rcu_barrier() to return void as it now can never fail.



Would this implementation guarantee progress as previous implementation
guaranteed?


Yes.


Thanks, I'll put it to test today to see if it solves our use case.


Just manually tried it - gives infinite (up to stack size) trace like:

(XEN) [1.496520][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496561][] F 
process_pending_softirqs+0x35/0x37
(XEN) [1.496600][] F 
rcupdate.c#rcu_process_callbacks+0x1df/0x1f6
(XEN) [1.496643][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496685][] F 
process_pending_softirqs+0x35/0x37
(XEN) [1.496726][] F 
rcupdate.c#rcu_process_callbacks+0x1df/0x1f6
(XEN) [1.496766][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496806][] F 
process_pending_softirqs+0x35/0x37
(XEN) [1.496847][] F 
rcupdate.c#rcu_process_callbacks+0x1df/0x1f6
(XEN) [1.496887][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496927][] F 
process_pending_softirqs+0x35/0x37


Interesting I didn't run into this problem. Obviously I managed to
forget handling the case of recursion.
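
The kind of re-entrancy guard alluded to here could look roughly like the
sketch below -- purely illustrative, with invented names, and not taken
from any posted patch:

    static DEFINE_PER_CPU(bool, rcu_barrier_active);

    static void rcu_barrier_handler(void)          /* hypothetical */
    {
        bool *active = &this_cpu(rcu_barrier_active);

        if ( *active )
            return;    /* nested via process_pending_softirqs() -- bail */

        *active = true;
        /* ... drain this CPU's pending RCU callbacks, signal completion ... */
        *active = false;
    }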


Juergen



[Xen-devel] [PATCH V2] iommu/arm: Don't allow the same micro-TLB to be shared between domains

2020-02-17 Thread Oleksandr Tyshchenko
From: Oleksandr Tyshchenko 

For the IPMMU-VMSA we need to prevent the use cases where devices
which use the same micro-TLB are assigned to different Xen domains
(micro-TLB cannot be shared between multiple Xen domains, since it
points to the context bank to use for the page walk).

As each Xen domain uses individual context bank pointed by context_id,
we can potentially recognize that use case by comparing current and new
context_id for the already enabled micro-TLB and prevent different
context bank from being set.

Signed-off-by: Oleksandr Tyshchenko 

---

CC: Julien Grall 
CC: Stefano Stabellini 
CC: Volodymyr Babchuk 
CC: Yoshihiro Shimoda 

Changes V1 [1] -> V2:
   - Rename "data" to "imuctr" in ipmmu_utlb_enable()
   - Disable already enabled uTLBs in ipmmu_attach_device()
 in case of error

[1] https://patchwork.kernel.org/patch/11356303/

---
 xen/drivers/passthrough/arm/ipmmu-vmsa.c | 49 
 1 file changed, 43 insertions(+), 6 deletions(-)

diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c 
b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
index 9cfae7e..b2a65df 100644
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -257,6 +257,7 @@ static DEFINE_SPINLOCK(ipmmu_devices_lock);
 #define IMUCTR_TTSEL_MMU(n)((n) << 4)
 #define IMUCTR_TTSEL_PMB   (8 << 4)
 #define IMUCTR_TTSEL_MASK  (15 << 4)
+#define IMUCTR_TTSEL_SHIFT 4
 #define IMUCTR_FLUSH   (1 << 1)
 #define IMUCTR_MMUEN   (1 << 0)
 
@@ -434,19 +435,45 @@ static void ipmmu_tlb_invalidate(struct ipmmu_vmsa_domain 
*domain)
 }
 
 /* Enable MMU translation for the micro-TLB. */
-static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain,
-  unsigned int utlb)
+static int ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain,
+ unsigned int utlb)
 {
 struct ipmmu_vmsa_device *mmu = domain->mmu;
+uint32_t imuctr;
+
+/*
+ * We need to prevent the use cases where devices which use the same
+ * micro-TLB are assigned to different Xen domains (micro-TLB cannot be
+ * shared between multiple Xen domains, since it points to the context bank
+ * to use for the page walk).
+ * As each Xen domain uses individual context bank pointed by context_id,
+ * we can potentially recognize that use case by comparing current and new
+ * context_id for already enabled micro-TLB and prevent different context
+ * bank from being set.
+ */
+imuctr = ipmmu_read(mmu, IMUCTR(utlb));
+if ( imuctr & IMUCTR_MMUEN )
+{
+unsigned int context_id;
+
+context_id = (imuctr & IMUCTR_TTSEL_MASK) >> IMUCTR_TTSEL_SHIFT;
+if ( domain->context_id != context_id )
+{
+dev_err(mmu->dev, "Micro-TLB %u already assigned to IPMMU context 
%u\n",
+utlb, context_id);
+return -EINVAL;
+}
+}
 
 /*
  * TODO: Reference-count the micro-TLB as several bus masters can be
- * connected to the same micro-TLB. Prevent the use cases where
- * the same micro-TLB could be shared between multiple Xen domains.
+ * connected to the same micro-TLB.
  */
 ipmmu_write(mmu, IMUASID(utlb), 0);
-ipmmu_write(mmu, IMUCTR(utlb), ipmmu_read(mmu, IMUCTR(utlb)) |
+ipmmu_write(mmu, IMUCTR(utlb), imuctr |
 IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_MMUEN);
+
+return 0;
 }
 
 /* Disable MMU translation for the micro-TLB. */
@@ -671,7 +698,17 @@ static int ipmmu_attach_device(struct ipmmu_vmsa_domain 
*domain,
 dev_info(dev, "Reusing IPMMU context %u\n", domain->context_id);
 
 for ( i = 0; i < fwspec->num_ids; ++i )
-ipmmu_utlb_enable(domain, fwspec->ids[i]);
+{
+int ret = ipmmu_utlb_enable(domain, fwspec->ids[i]);
+
+if ( ret )
+{
+while ( i-- )
+ipmmu_utlb_disable(domain, fwspec->ids[i]);
+
+return ret;
+}
+}
 
 return 0;
 }
-- 
2.7.4



Re: [Xen-devel] live migration from 4.12 to 4.13 fails due to qemu-xen bug

2020-02-17 Thread Olaf Hering
Am Mon, 27 Jan 2020 11:54:37 +
schrieb "Durrant, Paul" :

> I suppose. Could we have "pc-i440fx" as the default, which libxl prefix 
> matches against qemu's supported versions to select the latest? I guess that 
> would work.

This can not be fixed in libxl because libxl can not possibly know what is 
inside the domU.

With '-machine xenfv' the PCI device is at :00:02.0/platform, while with 
'-machine pc-i440fx-3.1,accel=xen -device xen-platform' the PCI device is 
somewhere else. As a result the receiving host rejects this approach:

qemu-system-i386: Unknown savevm section or instance ':00:02.0/platform' 0. 
Make sure that your current VM setup matches your saved VM setup, including any 
hotplugged devices

In my earlier testing I forced -machine pc-i440fx* on the sending side, and did 
not spot the flaw in this patch for libxl.

For short: we are doomed...


Olaf



Re: [Xen-devel] [PATCH v5 0/7] xl/libxl: domid allocation/preservation changes

2020-02-17 Thread Durrant, Paul
Ping?

> -Original Message-
> From: Paul Durrant 
> Sent: 31 January 2020 15:02
> To: xen-devel@lists.xenproject.org
> Cc: Durrant, Paul ; Andrew Cooper
> ; Anthony PERARD ;
> George Dunlap ; Ian Jackson
> ; Jan Beulich ; Jason
> Andryuk ; Julien Grall ; Konrad
> Rzeszutek Wilk ; Stefano Stabellini
> ; Wei Liu 
> Subject: [PATCH v5 0/7] xl/libxl: domid allocation/preservation changes
> 
> Paul Durrant (7):
>   libxl: add definition of INVALID_DOMID to the API
>   libxl_create: make 'soft reset' explicit
>   libxl: generalise libxl__domain_userdata_lock()
>   libxl: add infrastructure to track and query 'recent' domids
>   libxl: allow creation of domains with a specified or random domid
>   xl.conf: introduce 'domid_policy'
>   xl: allow domid to be preserved on save/restore or migrate
> 
>  docs/man/xl.1.pod.in  |  14 +++
>  docs/man/xl.conf.5.pod|  10 ++
>  tools/examples/xl.conf|   4 +
>  tools/helpers/xen-init-dom0.c |  30 +
>  tools/libxl/libxl.h   |  15 ++-
>  tools/libxl/libxl_create.c| 125 ++-
>  tools/libxl/libxl_device.c|   4 +-
>  tools/libxl/libxl_disk.c  |  12 +-
>  tools/libxl/libxl_dm.c|   2 +-
>  tools/libxl/libxl_dom.c   |  12 +-
>  tools/libxl/libxl_domain.c| 218 --
>  tools/libxl/libxl_internal.c  |  67 +++
>  tools/libxl/libxl_internal.h  |  30 +++--
>  tools/libxl/libxl_mem.c   |   8 +-
>  tools/libxl/libxl_pci.c   |   4 +-
>  tools/libxl/libxl_types.idl   |   1 +
>  tools/libxl/libxl_usb.c   |   8 +-
>  tools/xl/xl.c |  10 ++
>  tools/xl/xl.h |   2 +
>  tools/xl/xl_cmdtable.c|   6 +-
>  tools/xl/xl_migrate.c |  15 ++-
>  tools/xl/xl_saverestore.c |  19 ++-
>  tools/xl/xl_utils.h   |   2 -
>  tools/xl/xl_vmcontrol.c   |   3 +
>  xen/include/public/xen.h  |   3 +
>  25 files changed, 517 insertions(+), 107 deletions(-)
> ---
> Cc: Andrew Cooper 
> Cc: Anthony PERARD 
> Cc: George Dunlap 
> Cc: Ian Jackson 
> Cc: Jan Beulich 
> Cc: Jason Andryuk 
> Cc: Julien Grall 
> Cc: Konrad Rzeszutek Wilk 
> Cc: Stefano Stabellini 
> Cc: Wei Liu 
> --
> 2.20.1



Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Igor Druzhinin
On 17/02/2020 12:30, Igor Druzhinin wrote:
> On 17/02/2020 12:28, Jürgen Groß wrote:
>> On 17.02.20 13:26, Igor Druzhinin wrote:
>>> On 17/02/2020 07:20, Juergen Gross wrote:
 Today rcu_barrier() is calling stop_machine_run() to synchronize all
 physical cpus in order to ensure all pending rcu calls have finished
 when returning.

 As stop_machine_run() is using tasklets this requires scheduling of
 idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
 cpus only in case of core scheduling being active, as otherwise a
 scheduling deadlock would occur.

 There is no need at all to do the syncing of the cpus in tasklets, as
 rcu activity is started in __do_softirq() called whenever softirq
 activity is allowed. So rcu_barrier() can easily be modified to use
 softirq for synchronization of the cpus no longer requiring any
 scheduling activity.

 As there already is a rcu softirq, reuse that for the synchronization.

 Finally switch rcu_barrier() to return void as it now can never fail.

>>>
>>> Would this implementation guarantee progress as previous implementation
>>> guaranteed?
>>
>> Yes.
> 
> Thanks, I'll put it to test today to see if it solves our use case.

Just manually tried it - gives infinite (up to stack size) trace like:

(XEN) [1.496520][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496561][] F 
process_pending_softirqs+0x35/0x37
(XEN) [1.496600][] F 
rcupdate.c#rcu_process_callbacks+0x1df/0x1f6
(XEN) [1.496643][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496685][] F 
process_pending_softirqs+0x35/0x37
(XEN) [1.496726][] F 
rcupdate.c#rcu_process_callbacks+0x1df/0x1f6
(XEN) [1.496766][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496806][] F 
process_pending_softirqs+0x35/0x37
(XEN) [1.496847][] F 
rcupdate.c#rcu_process_callbacks+0x1df/0x1f6
(XEN) [1.496887][] F softirq.c#__do_softirq+0x85/0x90
(XEN) [1.496927][] F 
process_pending_softirqs+0x35/0x37

Igor


Re: [Xen-devel] [PATCH v3 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Durrant, Paul
> -Original Message-
> From: Wei Liu  On Behalf Of Wei Liu
> Sent: 17 February 2020 13:55
> To: Xen Development List 
> Cc: Michael Kelley ; Durrant, Paul
> ; Wei Liu ; Roger Pau Monné
> ; Wei Liu ; Jan Beulich
> ; Andrew Cooper 
> Subject: [PATCH v3 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> 
> Implement a basic hook for L0 assisted TLB flush. The hook needs to
> check if prerequisites are met. If they are not met, it returns an error
> number to fall back to native flushes.
> 
> Introduce a new variable to indicate if hypercall page is ready.
> 
> Signed-off-by: Wei Liu 
> Reviewed-by: Roger Pau Monné 

Reviewed-by: Paul Durrant 


Re: [Xen-devel] [PATCH V2] x86/altp2m: Hypercall to set altp2m view visibility

2020-02-17 Thread Jan Beulich
On 30.01.2020 14:07, Alexandru Stefan ISAILA wrote:
> @@ -4814,6 +4815,30 @@ static int do_altp2m_op(
>  break;
>  }
>  
> +case HVMOP_altp2m_set_visibility:
> +{
> +uint16_t altp2m_idx = a.u.set_visibility.altp2m_idx;
> +
> +if ( a.u.set_visibility.pad || a.u.set_visibility.pad2 )
> +rc = -EINVAL;
> +else
> +{
> +if ( !altp2m_active(d) || !hap_enabled(d) )

Doesn't altp2m_active() imply hap_enabled()? At the very least
there's no other use of hap_enabled() in do_altp2m_op().

> +{
> +rc = -EOPNOTSUPP;
> +break;
> +}
> +
> +if ( a.u.set_visibility.visible )
> +d->arch.altp2m_working_eptp[altp2m_idx] =
> +d->arch.altp2m_eptp[altp2m_idx];
> +else
> +d->arch.altp2m_working_eptp[altp2m_idx] =
> +mfn_x(INVALID_MFN);
> +}
> +break;

Also the code here lends itself to reduction of indentation
depth:

case HVMOP_altp2m_set_visibility:
{
uint16_t altp2m_idx = a.u.set_visibility.altp2m_idx;

if ( a.u.set_visibility.pad || a.u.set_visibility.pad2 )
rc = -EINVAL;
else if ( !altp2m_active(d) || !hap_enabled(d) )
rc = -EOPNOTSUPP;
else if ( a.u.set_visibility.visible )
d->arch.altp2m_working_eptp[altp2m_idx] =
d->arch.altp2m_eptp[altp2m_idx];
else
d->arch.altp2m_working_eptp[altp2m_idx] =
mfn_x(INVALID_MFN);

break;
}


Also note the altered indentation of the assignments.

> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -488,8 +488,17 @@ int hap_enable(struct domain *d, u32 mode)
>  goto out;
>  }
>  
> +if ( (d->arch.altp2m_working_eptp = alloc_xenheap_page()) == NULL )
> +{
> +rv = -ENOMEM;
> +goto out;
> +}

Isn't there a pre-existing error handling issue here which you
widen, in that later encountered errors don't cause clean up
of what had already succeeded before?

> @@ -2651,6 +2652,8 @@ int p2m_destroy_altp2m_by_id(struct domain *d, unsigned 
> int idx)
>  p2m_reset_altp2m(d, idx, ALTP2M_DEACTIVATE);
>  d->arch.altp2m_eptp[array_index_nospec(idx, MAX_EPTP)] =
>  mfn_x(INVALID_MFN);
> +d->arch.altp2m_working_eptp[array_index_nospec(idx, MAX_EPTP)] =
> +mfn_x(INVALID_MFN);

Like above, and irrespective of you cloning existing code -
indentation of the 2nd line is wrong here.

> --- a/xen/include/public/hvm/hvm_op.h
> +++ b/xen/include/public/hvm/hvm_op.h
> @@ -317,6 +317,13 @@ struct xen_hvm_altp2m_get_vcpu_p2m_idx {
>  uint16_t altp2m_idx;
>  };
>  
> +struct xen_hvm_altp2m_set_visibility {
> +uint16_t altp2m_idx;
> +uint8_t visible;
> +uint8_t pad;
> +uint32_t pad2;
> +};

What is pad2 good/intended for? 32-bit padding fields in some
other structures are needed because one or more uint64_t
fields follow, but this isn't the case here.
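
For illustration only (hypothetical structure, not from the patch): a
trailing 32-bit pad earns its keep when a uint64_t member follows, so
that 32-bit and 64-bit tool stacks agree on its offset:

    struct example_with_u64 {
        uint16_t idx;
        uint8_t  flag;
        uint8_t  pad;
        uint32_t pad2;  /* makes the offset of 'addr' identical for
                           32-bit and 64-bit builds */
        uint64_t addr;
    };

In xen_hvm_altp2m_set_visibility nothing follows pad2, so it plays no
such role.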

Jan


[Xen-devel] [PATCH v3 0/3] Xen on Hyper-V: Implement L0 assisted TLB flush

2020-02-17 Thread Wei Liu
Hi all

This seris is based on Roger's L0 assisted flush series.

I have done some testing against a Linux on Hyper-V in a 32-vcpu VM.
All builds were done with -j32.



Building Xen on Linux:
real0m45.376s
user2m28.156s
sys 0m51.672s

Building Xen on Linux on Xen on Hyper-V, no assisted flush:
real3m8.762s
user10m46.787s
sys 30m14.492s

Building Xen on Linux on Xen on Hyper-V, with assisted flush:
real0m44.369s
user3m16.231s
sys 3m3.330s



Building Linux x86_64_defconfig on Linux:
real0m59.698s
user21m14.014s
sys 2m58.742s

Building Linux x86_64_defconfig on Linux on Xen on Hyper-V, no assisted
flush:
real2m6.284s
user31m18.706s
sys 20m31.106s

Building Linux x86_64_defconfig on Linux on Xen on Hyper-V, with assisted
flush:
real1m38.968s
user28m40.398s
sys 11m20.151s



There are various degrees of improvement depending on the workload. Xen
can perhaps be optimised a bit more because it currently doesn't pass the
address space id (cr3) to Hyper-V, but that requires reworking TLB flush
APIs within Xen.

Wei.

Cc: Jan Beulich 
Cc: Andrew Cooper 
Cc: Wei Liu 
Cc: Roger Pau Monné 
Cc: Michael Kelley 
Cc: Paul Durrant 

Wei Liu (3):
  x86/hypervisor: pass flags to hypervisor_flush_tlb
  x86/hyperv: skeleton for L0 assisted TLB flush
  x86/hyperv: L0 assisted TLB flush

 xen/arch/x86/guest/hyperv/Makefile |   2 +
 xen/arch/x86/guest/hyperv/hyperv.c |  17 ++
 xen/arch/x86/guest/hyperv/private.h|  13 ++
 xen/arch/x86/guest/hyperv/tlb.c| 212 +
 xen/arch/x86/guest/hyperv/util.c   |  74 +
 xen/arch/x86/guest/hypervisor.c|   7 +-
 xen/arch/x86/guest/xen/xen.c   |   2 +-
 xen/arch/x86/smp.c |   5 +-
 xen/include/asm-x86/flushtlb.h |   3 +
 xen/include/asm-x86/guest/hypervisor.h |  10 +-
 10 files changed, 334 insertions(+), 11 deletions(-)
 create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
 create mode 100644 xen/arch/x86/guest/hyperv/util.c

-- 
2.20.1



[Xen-devel] [PATCH v3 3/3] x86/hyperv: L0 assisted TLB flush

2020-02-17 Thread Wei Liu
Implement L0 assisted TLB flush for Xen on Hyper-V. It takes advantage
of several hypercalls:

 * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST
 * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX
 * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE
 * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX

Pick the most efficient hypercalls available.

Signed-off-by: Wei Liu 
---
v3:
1. Address more comments.
2. Fix usage of max_vp_index.
3. Use the fill_gva_list algorithm from Linux.

v2:
1. Address Roger and Jan's comments re types etc.
2. Fix pointer arithmetic.
3. Misc improvement to code.
---
 xen/arch/x86/guest/hyperv/Makefile  |   1 +
 xen/arch/x86/guest/hyperv/private.h |   9 ++
 xen/arch/x86/guest/hyperv/tlb.c | 173 +++-
 xen/arch/x86/guest/hyperv/util.c|  74 
 4 files changed, 256 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/x86/guest/hyperv/util.c

diff --git a/xen/arch/x86/guest/hyperv/Makefile 
b/xen/arch/x86/guest/hyperv/Makefile
index 18902c33e9..0e39410968 100644
--- a/xen/arch/x86/guest/hyperv/Makefile
+++ b/xen/arch/x86/guest/hyperv/Makefile
@@ -1,2 +1,3 @@
 obj-y += hyperv.o
 obj-y += tlb.o
+obj-y += util.o
diff --git a/xen/arch/x86/guest/hyperv/private.h 
b/xen/arch/x86/guest/hyperv/private.h
index 509bedaafa..79a77930a0 100644
--- a/xen/arch/x86/guest/hyperv/private.h
+++ b/xen/arch/x86/guest/hyperv/private.h
@@ -24,12 +24,21 @@
 
 #include 
 #include 
+#include 
 
 DECLARE_PER_CPU(void *, hv_input_page);
 DECLARE_PER_CPU(void *, hv_vp_assist);
 DECLARE_PER_CPU(unsigned int, hv_vp_index);
 
+static inline unsigned int hv_vp_index(unsigned int cpu)
+{
+return per_cpu(hv_vp_index, cpu);
+}
+
 int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
  unsigned int flags);
 
+/* Returns number of banks, -ev if error */
+int cpumask_to_vpset(struct hv_vpset *vpset, const cpumask_t *mask);
+
 #endif /* __XEN_HYPERV_PRIVIATE_H__  */
diff --git a/xen/arch/x86/guest/hyperv/tlb.c b/xen/arch/x86/guest/hyperv/tlb.c
index 48f527229e..8cd1c6f0ed 100644
--- a/xen/arch/x86/guest/hyperv/tlb.c
+++ b/xen/arch/x86/guest/hyperv/tlb.c
@@ -19,17 +19,188 @@
  * Copyright (c) 2020 Microsoft.
  */
 
+#include 
 #include 
 #include 
 
+#include 
+#include 
+#include 
+
 #include "private.h"
 
+/*
+ * It is possible to encode up to 4096 pages using the lower 12 bits
+ * in an element of gva_list
+ */
+#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)
+
+static unsigned int fill_gva_list(uint64_t *gva_list, const void *va,
+  unsigned int order)
+{
+unsigned long cur = (unsigned long)va;
+/* end is 1 past the range to be flushed */
+unsigned long end = cur + (PAGE_SIZE << order);
+unsigned int n = 0;
+
+do {
+unsigned long diff = end - cur;
+
+gva_list[n] = cur & PAGE_MASK;
+
+/*
+ * Use lower 12 bits to encode the number of additional pages
+ * to flush
+ */
+if ( diff >= HV_TLB_FLUSH_UNIT )
+{
+gva_list[n] |= ~PAGE_MASK;
+cur += HV_TLB_FLUSH_UNIT;
+}
+else
+{
+gva_list[n] |= (diff - 1) >> PAGE_SHIFT;
+cur = end;
+}
+
+n++;
+} while ( cur < end );
+
+return n;
+}
+
+static uint64_t flush_tlb_ex(const cpumask_t *mask, const void *va,
+ unsigned int flags)
+{
+struct hv_tlb_flush_ex *flush = this_cpu(hv_input_page);
+int nr_banks;
+unsigned int max_gvas, order = flags & FLUSH_ORDER_MASK;
+uint64_t *gva_list;
+
+if ( !flush || local_irq_is_enabled() )
+{
+ASSERT_UNREACHABLE();
+return ~0ULL;
+}
+
+if ( !(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED) )
+return ~0ULL;
+
+flush->address_space = 0;
+flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+if ( !(flags & FLUSH_TLB_GLOBAL) )
+flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
+
+nr_banks = cpumask_to_vpset(>hv_vp_set, mask);
+if ( nr_banks < 0 )
+return ~0ULL;
+
+max_gvas =
+(PAGE_SIZE - sizeof(*flush) - nr_banks *
+ sizeof(flush->hv_vp_set.bank_contents[0])) /
+sizeof(uint64_t);   /* gva is represented as uint64_t */
+
+/*
+ * Flush the entire address space if va is NULL or if there is not
+ * enough space for gva_list.
+ */
+if ( !va || (PAGE_SIZE << order) / HV_TLB_FLUSH_UNIT > max_gvas )
+return hv_do_rep_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX, 0,
+   nr_banks, virt_to_maddr(flush), 0);
+
+/*
+ * The calculation of gva_list address requires the structure to
+ * be 64 bits aligned.
+ */
+BUILD_BUG_ON(sizeof(*flush) % sizeof(uint64_t));
+gva_list = (uint64_t *)flush + sizeof(*flush) / sizeof(uint64_t) + 
nr_banks;
+
+return hv_do_rep_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX,
+   fill_gva_list(gva_list, va, order),
+   

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Jürgen Groß

On 17.02.20 14:47, Roger Pau Monné wrote:

On Mon, Feb 17, 2020 at 02:17:23PM +0100, Jürgen Groß wrote:

On 17.02.20 13:49, Roger Pau Monné wrote:

On Mon, Feb 17, 2020 at 01:32:59PM +0100, Jürgen Groß wrote:

On 17.02.20 13:17, Roger Pau Monné wrote:

On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:

On 17.02.20 12:49, Julien Grall wrote:

Hi Juergen,

On 17/02/2020 07:20, Juergen Gross wrote:

+void rcu_barrier(void)
     {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, _count, NR_CPUS);
+    if ( !atomic_cmpxchg(_count, 0, num_online_cpus()) )


What prevents the cpu_online_map from changing under your feet?
Shouldn't you grab the lock via get_cpu_maps()?


Oh, indeed.

This in turn will require a modification of the logic to detect parallel
calls on multiple cpus.


If you pick my patch to turn that into a rw lock you shouldn't worry
about parallel calls I think, but the lock acquisition can still fail
if there's a CPU plug/unplug going on:

https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html


Thanks, but letting rcu_barrier() fail is a no-go, so I still need to
handle that case (I mean the case of failing to get the lock). And handling
of parallel calls is not needed for functional correctness, but to avoid
unnecessary cpu synchronization (each parallel call detected can just
wait until the master has finished and then return).

BTW - The recursive spinlock today would allow for e.g. rcu_barrier() to
be called inside a CPU plug/unplug section. Your rwlock is removing that
possibility. Any chance that could be handled?


While this might be interesting for the rcu stuff, it certainly isn't
for other pieces also relying on the cpu maps lock.

Ie: get_cpu_maps must fail when called by send_IPI_mask if there's a
CPU plug/unplug operation going on, even if it's on the same pCPU
that's holding the lock in write mode.


Sure? How is cpu_down() working then?


send_IPI_mask failing to acquire the cpu maps lock prevents it from
using the APIC shorthand, which is what we want in that case.


It is calling stop_machine_run()
which is using send_IPI_mask()...


Xen should avoid using the APIC shorthand in that case, which I don't
think it's happening now, as the lock is recursive.


In fact the code area where this is true is much smaller than that
protected by the lock.

Basically only __cpu_disable() and __cpu_up() (on x86) are critical in
this regard.


Juergen
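As a rough illustration of the "parallel calls just wait for the master" scheme described above (an editor's sketch, not the posted patch; cpu_count is assumed to be the file-scope atomic_t from the patch and the waiting loop is deliberately simplified):

static atomic_t cpu_count = ATOMIC_INIT(0);

void rcu_barrier_sketch(void)
{
    /* The first caller claims cpu_count and becomes the master. */
    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
    {
        /* Parallel caller: wait until the master has finished, then return. */
        while ( atomic_read(&cpu_count) )
            cpu_relax();
        return;
    }

    /* Master: raise the RCU softirq on all online cpus and wait for each
     * of them to decrement cpu_count (omitted here). */
}

The open question in the thread is how the master samples the set of online cpus safely, i.e. the get_cpu_maps() interaction discussed above.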

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 1/3] x86/hypervisor: pass flags to hypervisor_flush_tlb

2020-02-17 Thread Wei Liu
Hyper-V's L0 assisted flush has fine-grained control over what gets
flushed. We need all the flags available to make the best decisions
possible.

No functional change because Xen's implementation doesn't care about
what is passed to it.

Signed-off-by: Wei Liu 
Reviewed-by: Roger Pau Monné 
Reviewed-by: Paul Durrant 
---
v2:
1. Introduce FLUSH_TLB_FLAGS_MASK
---
 xen/arch/x86/guest/hypervisor.c|  7 +--
 xen/arch/x86/guest/xen/xen.c   |  2 +-
 xen/arch/x86/smp.c |  5 ++---
 xen/include/asm-x86/flushtlb.h |  3 +++
 xen/include/asm-x86/guest/hypervisor.h | 10 +-
 5 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/guest/hypervisor.c b/xen/arch/x86/guest/hypervisor.c
index 47e938e287..6ee28c9df1 100644
--- a/xen/arch/x86/guest/hypervisor.c
+++ b/xen/arch/x86/guest/hypervisor.c
@@ -75,10 +75,13 @@ void __init hypervisor_e820_fixup(struct e820map *e820)
 }
 
 int hypervisor_flush_tlb(const cpumask_t *mask, const void *va,
- unsigned int order)
+ unsigned int flags)
 {
+if ( flags & ~FLUSH_TLB_FLAGS_MASK )
+return -EINVAL;
+
 if ( ops.flush_tlb )
-return alternative_call(ops.flush_tlb, mask, va, order);
+return alternative_call(ops.flush_tlb, mask, va, flags);
 
 return -ENOSYS;
 }
diff --git a/xen/arch/x86/guest/xen/xen.c b/xen/arch/x86/guest/xen/xen.c
index 5d3427a713..0eb1115c4d 100644
--- a/xen/arch/x86/guest/xen/xen.c
+++ b/xen/arch/x86/guest/xen/xen.c
@@ -324,7 +324,7 @@ static void __init e820_fixup(struct e820map *e820)
 pv_shim_fixup_e820(e820);
 }
 
-static int flush_tlb(const cpumask_t *mask, const void *va, unsigned int order)
+static int flush_tlb(const cpumask_t *mask, const void *va, unsigned int flags)
 {
 return xen_hypercall_hvm_op(HVMOP_flush_tlbs, NULL);
 }
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index c7caf5bc26..4dab74c0d5 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -258,9 +258,8 @@ void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags)
  !cpumask_subset(mask, cpumask_of(cpu)) )
 {
 if ( cpu_has_hypervisor &&
- !(flags & ~(FLUSH_TLB | FLUSH_TLB_GLOBAL | FLUSH_VA_VALID |
- FLUSH_ORDER_MASK)) &&
- !hypervisor_flush_tlb(mask, va, flags & FLUSH_ORDER_MASK) )
+ !(flags & ~FLUSH_TLB_FLAGS_MASK) &&
+ !hypervisor_flush_tlb(mask, va, flags) )
 {
 if ( tlb_clk_enabled )
 tlb_clk_enabled = false;
diff --git a/xen/include/asm-x86/flushtlb.h b/xen/include/asm-x86/flushtlb.h
index 9773014320..a4de317452 100644
--- a/xen/include/asm-x86/flushtlb.h
+++ b/xen/include/asm-x86/flushtlb.h
@@ -123,6 +123,9 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4);
  /* Flush all HVM guests linear TLB (using ASID/VPID) */
 #define FLUSH_GUESTS_TLB 0x4000
 
+#define FLUSH_TLB_FLAGS_MASK (FLUSH_TLB | FLUSH_TLB_GLOBAL | FLUSH_VA_VALID | \
+  FLUSH_ORDER_MASK)
+
 /* Flush local TLBs/caches. */
 unsigned int flush_area_local(const void *va, unsigned int flags);
 #define flush_local(flags) flush_area_local(NULL, flags)
diff --git a/xen/include/asm-x86/guest/hypervisor.h b/xen/include/asm-x86/guest/hypervisor.h
index 432e57c2a0..48d54735d2 100644
--- a/xen/include/asm-x86/guest/hypervisor.h
+++ b/xen/include/asm-x86/guest/hypervisor.h
@@ -35,7 +35,7 @@ struct hypervisor_ops {
 /* Fix up e820 map */
 void (*e820_fixup)(struct e820map *e820);
 /* L0 assisted TLB flush */
-int (*flush_tlb)(const cpumask_t *mask, const void *va, unsigned int order);
+int (*flush_tlb)(const cpumask_t *mask, const void *va, unsigned int flags);
 };
 
 #ifdef CONFIG_GUEST
@@ -48,11 +48,11 @@ void hypervisor_e820_fixup(struct e820map *e820);
 /*
  * L0 assisted TLB flush.
  * mask: cpumask of the dirty vCPUs that should be flushed.
- * va: linear address to flush, or NULL for global flushes.
- * order: order of the linear address pointed by va.
+ * va: linear address to flush, or NULL for entire address space.
+ * flags: flags for flushing, including the order of va.
  */
 int hypervisor_flush_tlb(const cpumask_t *mask, const void *va,
- unsigned int order);
+ unsigned int flags);
 
 #else
 
@@ -65,7 +65,7 @@ static inline int hypervisor_ap_setup(void) { return 0; }
 static inline void hypervisor_resume(void) { ASSERT_UNREACHABLE(); }
 static inline void hypervisor_e820_fixup(struct e820map *e820) {}
 static inline int hypervisor_flush_tlb(const cpumask_t *mask, const void *va,
-   unsigned int order)
+   unsigned int flags)
 {
 return -ENOSYS;
 }
-- 
2.20.1
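For illustration, a hedged sketch of how a caller uses the widened interface (not code from this series; do_native_flush() is a made-up placeholder for the existing IPI-based path):

/* Hypothetical fallback helper standing in for the native flush code. */
void do_native_flush(const cpumask_t *mask, const void *va, unsigned int flags);

static void flush_one_area(const cpumask_t *mask, const void *va,
                           unsigned int flags)
{
    /* Hand the request to the hypervisor only if every set flag is
     * covered by FLUSH_TLB_FLAGS_MASK; any non-zero return falls back. */
    if ( cpu_has_hypervisor &&
         !(flags & ~FLUSH_TLB_FLAGS_MASK) &&
         !hypervisor_flush_tlb(mask, va, flags) )
        return;

    do_native_flush(mask, va, flags);
}

This mirrors the flush_area_mask() hunk above, just pulled out into a standalone shape.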


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org

[Xen-devel] [libvirt test] 147141: regressions - FAIL

2020-02-17 Thread osstest service owner
flight 147141 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147141/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 146182
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 146182
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 146182
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 146182

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  b18328256b565806c04c153ce49fc3641412b35b
baseline version:
 libvirt  a1cd25b919509be2645dbe6f952d5263e0d4e4e5

Last test of basis   146182  2020-01-17 06:00:23 Z   31 days
Failing since        146211  2020-01-18 04:18:52 Z   30 days   30 attempts
Testing same since   147084  2020-02-15 10:49:39 Z2 days2 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Arnaud Patard 
  Boris Fiuczynski 
  Christian Ehrhardt 
  Daniel Henrique Barboza 
  Daniel P. Berrangé 
  Dario Faggioli 
  Erik Skultety 
  Han Han 
  Jim Fehlig 
  Jiri Denemark 
  Jonathon Jongsma 
  Julio Faracco 
  Ján Tomko 
  Laine Stump 
  Marek Marczykowski-Górecki 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Pavel Hrdina 
  Peter Krempa 
  Richard W.M. Jones 
  Sahid Orentino Ferdjaoui 
  Stefan Berger 
  Stefan Berger 
  Thomas Huth 
  Your Name 
  zhenwei pi 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-arm64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked 
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked 
 test-amd64-amd64-libvirt-xsm blocked 
 test-arm64-arm64-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  blocked 
 test-amd64-amd64-libvirt blocked 
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt blocked 
 test-amd64-i386-libvirt  blocked 
 test-amd64-amd64-libvirt-pairblocked 
 test-amd64-i386-libvirt-pair blocked 
 test-arm64-arm64-libvirt-qcow2   blocked 
 test-armhf-armhf-libvirt-raw blocked 
 test-amd64-amd64-libvirt-vhd blocked 



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in 

[Xen-devel] [PATCH v3 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Wei Liu
Implement a basic hook for L0 assisted TLB flush. The hook needs to
check if prerequisites are met. If they are not met, it returns an error
number to fall back to native flushes.

Introduce a new variable to indicate if hypercall page is ready.

Signed-off-by: Wei Liu 
Reviewed-by: Roger Pau Monné 
---
v3:
1. Change hv_hcall_page_ready to hcall_page_ready
---
 xen/arch/x86/guest/hyperv/Makefile  |  1 +
 xen/arch/x86/guest/hyperv/hyperv.c  | 17 
 xen/arch/x86/guest/hyperv/private.h |  4 +++
 xen/arch/x86/guest/hyperv/tlb.c | 41 +
 4 files changed, 63 insertions(+)
 create mode 100644 xen/arch/x86/guest/hyperv/tlb.c

diff --git a/xen/arch/x86/guest/hyperv/Makefile b/xen/arch/x86/guest/hyperv/Makefile
index 68170109a9..18902c33e9 100644
--- a/xen/arch/x86/guest/hyperv/Makefile
+++ b/xen/arch/x86/guest/hyperv/Makefile
@@ -1 +1,2 @@
 obj-y += hyperv.o
+obj-y += tlb.o
diff --git a/xen/arch/x86/guest/hyperv/hyperv.c b/xen/arch/x86/guest/hyperv/hyperv.c
index 70f4cd5ae0..f1b3073712 100644
--- a/xen/arch/x86/guest/hyperv/hyperv.c
+++ b/xen/arch/x86/guest/hyperv/hyperv.c
@@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
 DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
 DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
 
+static bool __read_mostly hcall_page_ready;
+
 static uint64_t generate_guest_id(void)
 {
 union hv_guest_os_id id = {};
@@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
 BUG_ON(!hypercall_msr.enable);
 
 set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
+
+hcall_page_ready = true;
 }
 
 static int setup_hypercall_pcpu_arg(void)
@@ -199,11 +203,24 @@ static void __init e820_fixup(struct e820map *e820)
 panic("Unable to reserve Hyper-V hypercall range\n");
 }
 
+static int flush_tlb(const cpumask_t *mask, const void *va,
+ unsigned int flags)
+{
+if ( !(ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED) )
+return -EOPNOTSUPP;
+
+if ( !hcall_page_ready || !this_cpu(hv_input_page) )
+return -ENXIO;
+
+return hyperv_flush_tlb(mask, va, flags);
+}
+
 static const struct hypervisor_ops __initdata ops = {
 .name = "Hyper-V",
 .setup = setup,
 .ap_setup = ap_setup,
 .e820_fixup = e820_fixup,
+.flush_tlb = flush_tlb,
 };
 
 /*
diff --git a/xen/arch/x86/guest/hyperv/private.h b/xen/arch/x86/guest/hyperv/private.h
index 956eff831f..509bedaafa 100644
--- a/xen/arch/x86/guest/hyperv/private.h
+++ b/xen/arch/x86/guest/hyperv/private.h
@@ -22,10 +22,14 @@
 #ifndef __XEN_HYPERV_PRIVIATE_H__
 #define __XEN_HYPERV_PRIVIATE_H__
 
+#include 
 #include 
 
 DECLARE_PER_CPU(void *, hv_input_page);
 DECLARE_PER_CPU(void *, hv_vp_assist);
 DECLARE_PER_CPU(unsigned int, hv_vp_index);
 
+int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
+ unsigned int flags);
+
 #endif /* __XEN_HYPERV_PRIVIATE_H__  */
diff --git a/xen/arch/x86/guest/hyperv/tlb.c b/xen/arch/x86/guest/hyperv/tlb.c
new file mode 100644
index 00..48f527229e
--- /dev/null
+++ b/xen/arch/x86/guest/hyperv/tlb.c
@@ -0,0 +1,41 @@
+/**
+ * arch/x86/guest/hyperv/tlb.c
+ *
+ * Support for TLB management using hypercalls
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (c) 2020 Microsoft.
+ */
+
+#include 
+#include 
+
+#include "private.h"
+
+int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
+ unsigned int flags)
+{
+return -EOPNOTSUPP;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.20.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v3] ns16550: Add ACPI support for ARM only

2020-02-17 Thread Jan Beulich
On 03.02.2020 12:21, Wei Xu wrote:
> Parse the ACPI SPCR table and initialize the 16550 compatible serial port
> for ARM only. Currently we only support one UART on ARM. Some fields
> which we do not care yet on ARM are ignored.
> 
> Signed-off-by: Wei Xu 
> 
> ---
> Changes in v3:
> - address the code style comments from Jan
> - use container_of to do cast
> - list all fields we ignored
> - check the console redirection is disabled or not before init the uart
> - init the uart io_size and width via spcr->serial_port
> 
> Changes in v2:
> - improve commit message
> - remove the spcr initialization
> - add comments for the uart initialization and configuration
> - adjust the code style issue
> - limit the code only built on ACPI and ARM
> ---
>  xen/drivers/char/ns16550.c | 75 
> ++
>  1 file changed, 75 insertions(+)
> 
> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
> index aa87c57..741b510 100644
> --- a/xen/drivers/char/ns16550.c
> +++ b/xen/drivers/char/ns16550.c
> @@ -1620,6 +1620,81 @@ DT_DEVICE_START(ns16550, "NS16550 UART", DEVICE_SERIAL)
>  DT_DEVICE_END
> 
>  #endif /* HAS_DEVICE_TREE */
> +
> +#if defined(CONFIG_ACPI) && defined(CONFIG_ARM)
> +#include 
> +
> +static int __init ns16550_acpi_uart_init(const void *data)
> +{
> +struct acpi_table_header *table;
> +struct acpi_table_spcr *spcr;
> +acpi_status status;
> +/*
> + * Same as the DT part.
> + * Only support one UART on ARM which happen to be ns16550_com[0].
> + */
> +struct ns16550 *uart = &ns16550_com[0];
> +
> +status = acpi_get_table(ACPI_SIG_SPCR, 0, &table);
> +if ( ACPI_FAILURE(status) )
> +{
> +printk("ns16550: Failed to get SPCR table\n");
> +return -EINVAL;
> +}
> +
> +spcr = container_of(table, struct acpi_table_spcr, header);
> +
> +/*
> + * The serial port address may be 0 for example
> + * if the console redirection is disabled.
> + */
> +if ( unlikely(!spcr->serial_port.address) )
> +{
> +printk("ns16550: the serial port address is invalid\n");

Is zero really an invalid address, or is it rather a proper
indicator of there not being any device?

> +return -EINVAL;
> +}
> +
> +ns16550_init_common(uart);
> +
> +/*
> + * The baud rate is pre-configured by the firmware.

But this isn't the same as BAUD_AUTO, is it? If firmware pre-configures
the baud rate, isn't it this structure which it would use to communicate
the information?

> + * And currently the ACPI part is only targeting ARM so the following
> + * fields pc_interrupt, pci_device_id, pci_vendor_id, pci_bus, pci_device,
> + * pci_function, pci_flags, pci_segment and flow_control which we do not
> + * care yet are ignored.

How come flow control is of no interest?

I'd also group all the pci_* fields into a simple "and all PCI related
ones".

> + */
> +uart->baud = BAUD_AUTO;
> +uart->data_bits = 8;
> +uart->parity = spcr->parity;
> +uart->stop_bits = spcr->stop_bits;
> +uart->io_base = spcr->serial_port.address;
> +uart->io_size = spcr->serial_port.bit_width;

Once again: You should not ignore the GAS address space indicator.

Jan
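For what honouring the GAS address space indicator (and the SPCR baud_rate field) could look like, a hedged editor's sketch follows. It is not the posted patch; it assumes the ACPICA layout of struct acpi_generic_address (space_id, address, bit_width) and the SPCR baud_rate encodings (3 = 9600, 4 = 19200, 6 = 57600, 7 = 115200), and would slot into ns16550_acpi_uart_init() above:

    switch ( spcr->serial_port.space_id )
    {
    case ACPI_ADR_SPACE_SYSTEM_MEMORY:
        uart->io_base = spcr->serial_port.address;
        uart->io_size = spcr->serial_port.bit_width;
        break;
    default:
        /* I/O port or other spaces: not expected in an ARM SPCR table. */
        return -EOPNOTSUPP;
    }

    switch ( spcr->baud_rate )
    {
    case 3:  uart->baud = 9600;   break;
    case 4:  uart->baud = 19200;  break;
    case 6:  uart->baud = 57600;  break;
    case 7:  uart->baud = 115200; break;
    default: uart->baud = BAUD_AUTO; break;
    }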

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 02:17:23PM +0100, Jürgen Groß wrote:
> On 17.02.20 13:49, Roger Pau Monné wrote:
> > On Mon, Feb 17, 2020 at 01:32:59PM +0100, Jürgen Groß wrote:
> > > On 17.02.20 13:17, Roger Pau Monné wrote:
> > > > On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:
> > > > > On 17.02.20 12:49, Julien Grall wrote:
> > > > > > Hi Juergen,
> > > > > > 
> > > > > > On 17/02/2020 07:20, Juergen Gross wrote:
> > > > > > > +void rcu_barrier(void)
> > > > > > >     {
> > > > > > > -    atomic_t cpu_count = ATOMIC_INIT(0);
> > > > > > > -    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
> > > > > > > +    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
> > > > > > 
> > > > > > What does prevent the cpu_online_map to change under your feet?
> > > > > > Shouldn't you grab the lock via get_cpu_maps()?
> > > > > 
> > > > > Oh, indeed.
> > > > > 
> > > > > This in turn will require a modification of the logic to detect 
> > > > > parallel
> > > > > calls on multiple cpus.
> > > > 
> > > > If you pick my patch to turn that into a rw lock you shouldn't worry
> > > > about parallel calls I think, but the lock acquisition can still fail
> > > > if there's a CPU plug/unplug going on:
> > > > 
> > > > https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html
> > > 
> > > Thanks, but letting rcu_barrier() fail is a no go, so I still need to
> > > handle that case (I mean the case failing to get the lock). And handling
> > > of parallel calls is not needed to be functional correct, but to avoid
> > > not necessary cpu synchronization (each parallel call detected can just
> > > wait until the master has finished and then return).
> > > 
> > > BTW - The recursive spinlock today would allow for e.g. rcu_barrier() to
> > > be called inside a CPU plug/unplug section. Your rwlock is removing that
> > > possibility. Any chance that could be handled?
> > 
> > While this might be interesting for the rcu stuff, it certainly isn't
> > for other pieces also relying on the cpu maps lock.
> > 
> > Ie: get_cpu_maps must fail when called by send_IPI_mask if there's a
> > CPU plug/unplug operation going on, even if it's on the same pCPU
> > that's holding the lock in write mode.
> 
> Sure? How is cpu_down() working then?

send_IPI_mask failing to acquire the cpu maps lock prevents it from
using the APIC shorthand, which is what we want in that case.

> It is calling stop_machine_run()
> which is using send_IPI_mask()...

Xen should avoid using the APIC shorthand in that case, which I don't
think it's happening now, as the lock is recursive.

Thanks, Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Jürgen Groß

On 17.02.20 13:49, Roger Pau Monné wrote:

On Mon, Feb 17, 2020 at 01:32:59PM +0100, Jürgen Groß wrote:

On 17.02.20 13:17, Roger Pau Monné wrote:

On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:

On 17.02.20 12:49, Julien Grall wrote:

Hi Juergen,

On 17/02/2020 07:20, Juergen Gross wrote:

+void rcu_barrier(void)
    {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )


What does prevent the cpu_online_map to change under your feet?
Shouldn't you grab the lock via get_cpu_maps()?


Oh, indeed.

This in turn will require a modification of the logic to detect parallel
calls on multiple cpus.


If you pick my patch to turn that into a rw lock you shouldn't worry
about parallel calls I think, but the lock acquisition can still fail
if there's a CPU plug/unplug going on:

https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html


Thanks, but letting rcu_barrier() fail is a no go, so I still need to
handle that case (I mean the case failing to get the lock). And handling
of parallel calls is not needed to be functional correct, but to avoid
not necessary cpu synchronization (each parallel call detected can just
wait until the master has finished and then return).

BTW - The recursive spinlock today would allow for e.g. rcu_barrier() to
be called inside a CPU plug/unplug section. Your rwlock is removing that
possibility. Any chance that could be handled?


While this might be interesting for the rcu stuff, it certainly isn't
for other pieces also relying on the cpu maps lock.

Ie: get_cpu_maps must fail when called by send_IPI_mask if there's a
CPU plug/unplug operation going on, even if it's on the same pCPU
that's holding the lock in write mode.


Sure? How is cpu_down() working then? It is calling stop_machine_run()
which is using send_IPI_mask()...


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Durrant, Paul
> -Original Message-
> From: Wei Liu 
> Sent: 17 February 2020 12:48
> To: Durrant, Paul 
> Cc: Roger Pau Monné ; Wei Liu ; Xen
> Development List ; Michael Kelley
> ; Wei Liu ; Jan Beulich
> ; Andrew Cooper 
> Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> 
> On Mon, Feb 17, 2020 at 12:21:09PM +, Durrant, Paul wrote:
> > > -Original Message-
> > > From: Roger Pau Monné 
> > > Sent: 17 February 2020 12:08
> > > To: Durrant, Paul 
> > > Cc: Wei Liu ; Xen Development List  > > de...@lists.xenproject.org>; Michael Kelley ;
> Wei
> > > Liu ; Jan Beulich ; Andrew
> Cooper
> > > 
> > > Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB
> flush
> > >
> > > On Mon, Feb 17, 2020 at 12:01:23PM +, Durrant, Paul wrote:
> > > > > -Original Message-
> > > > > From: Roger Pau Monné 
> > > > > Sent: 17 February 2020 11:41
> > > > > To: Wei Liu 
> > > > > Cc: Durrant, Paul ; Xen Development List
>  > > > > de...@lists.xenproject.org>; Michael Kelley
> ;
> > > Wei
> > > > > Liu ; Jan Beulich ; Andrew
> > > Cooper
> > > > > 
> > > > > Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted
> TLB
> > > flush
> > > > >
> > > > > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > > > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > > > > -Original Message-
> > > > > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > > > > Sent: 14 February 2020 13:34
> > > > > > > > To: Xen Development List 
> > > > > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > > > > ; Wei Liu ; Wei
> Liu
> > > > > > > > ; Jan Beulich ; Andrew Cooper
> > > > > > > > ; Roger Pau Monné
> > > 
> > > > > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted
> TLB
> > > > > flush
> > > > > > > >
> > > > > > > > Implement a basic hook for L0 assisted TLB flush. The hook
> needs
> > > to
> > > > > > > > check if prerequisites are met. If they are not met, it
> returns
> > > an
> > > > > error
> > > > > > > > number to fall back to native flushes.
> > > > > > > >
> > > > > > > > Introduce a new variable to indicate if hypercall page is
> ready.
> > > > > > > >
> > > > > > > > Signed-off-by: Wei Liu 
> > > > > > > > ---
> > > > > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41
> > > > > +
> > > > > > > >  4 files changed, 63 insertions(+)
> > > > > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > > > > >
> > > > > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > index 68170109a9..18902c33e9 100644
> > > > > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > @@ -1 +1,2 @@
> > > > > > > >  obj-y += hyperv.o
> > > > > > > > +obj-y += tlb.o
> > > > > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *,
> > > hv_input_page);
> > > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > > > > >
> > > > > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > > > > +
> > > > > > > >  static uint64_t generate_guest_id(void)
> > > > > > > >  {
> > > > > > > >  union hv_guest_os_id id = {};
> > > > > > > > @@ -119,6 +121,8 @@ static void __init
> > > setup_hypercall_page(void)
> > > > > > > >  BUG_ON(!hypercall_msr.enable);
> > > > > > > >
> > > > > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > > > > >
> > > > > > > Shouldn't this have at least a compiler barrier here?
> > > > > > >
> > > > > >
> > > > > > OK. I will add a write barrier here.
> > > > >
> > > > > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > > > >
> > > >
> > > > Not really, for the purpose I had in mind. The hv_hcall_page_ready
> > > global is specific to this code and we need to make sure the page is
> > > actually ready before the code says it is.
> > >
> > > But anything that modifies the page tables should already have a
> > > barrier if required in order to prevent accesses from being moved
> > > ahead of it, or else things would certainly go wrong in many other
> > > places?
> >
> > Oh. I'm not saying that we don't need a barrier there too (and more
> > than a compiler one in that case).
> >
> 
> The argument Roger has is that set_fixmap_x also contains strong enough
> barriers to prevent hcall_page_ready to be set before page table is
> correctly set up.

Re: [Xen-devel] [PATCH 3/3] AMD/IOMMU: replace a few literal numbers

2020-02-17 Thread Jan Beulich
On 10.02.2020 15:28, Andrew Cooper wrote:
> On 05/02/2020 09:43, Jan Beulich wrote:
>> Introduce IOMMU_PDE_NEXT_LEVEL_{MIN,MAX} to replace literal 1, 6, and 7
>> instances. While doing so replace two uses of memset() by initializers.
>>
>> Signed-off-by: Jan Beulich 
> 
> This does not look to be an improvement.  IOMMU_PDE_NEXT_LEVEL_MIN is
> definitely bogus, and in all cases, a literal 1 is better, because that
> is how we describe pagetable levels.

I disagree. The device table entry's mode field is bounded by 1
(min) and 6 (max) for the legitimate values to put there.

> Something to replace literal 6/7 probably is ok, but doesn't want to be
> done like this.
> 
> The majority of the problems here as caused by iommu_pde_from_dfn()'s
> silly ABI.  The pt_mfn[] array is problematic (because it is used as a
> 1-based array, not 0-based) and useless because both callers only want
> the 4k-equivelent mfn.  Fixing the ABI gets rid of quite a lot of wasted
> stack space, every use of '1', and every upper bound other than the
> BUG_ON() and amd_iommu_get_paging_mode().

I didn't mean to alter that function's behavior, at the very least
not until being certain there wasn't a reason it was coded with this
array approach. IOW the alternative to going with this patch
(subject to corrections of course) is for me to drop it altogether,
keeping the hard-coded numbers in place. Just let me know.

>> ---
>> TBD: We should really honor the hats field of union
>>  amd_iommu_ext_features, but the specification (or at least the
>>  parts I did look at in the course of putting together this patch)
>>  is unclear about the maximum valid value in case EFRSup is clear.
> 
> It is available from PCI config space (Misc0 register, cap+0x10) even on
> first gen IOMMUs,

I don't think any of the address size fields there matches what
HATS is about (limiting of the values valid to put in a DTE's
mode field). In fact I'm having some difficulty bringing the
two in (sensible) sync.

> and the IVRS table in Type 10.

Which may in turn be absent, i.e. the question of what to use as
a default merely gets shifted.

Jan
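For reference, the bounds under discussion written out as a hedged sketch (constant names follow the patch description; the numeric values follow the AMD IOMMU specification's DTE Mode field, where 0 means translation disabled and 7 is reserved):

#define IOMMU_PDE_NEXT_LEVEL_MIN 1
#define IOMMU_PDE_NEXT_LEVEL_MAX 6

/* Sketch: is a given DTE mode value one of the legitimate paging modes? */
static bool dte_mode_valid(unsigned int mode)
{
    return mode >= IOMMU_PDE_NEXT_LEVEL_MIN &&
           mode <= IOMMU_PDE_NEXT_LEVEL_MAX;
}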

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 01:32:59PM +0100, Jürgen Groß wrote:
> On 17.02.20 13:17, Roger Pau Monné wrote:
> > On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:
> > > On 17.02.20 12:49, Julien Grall wrote:
> > > > Hi Juergen,
> > > > 
> > > > On 17/02/2020 07:20, Juergen Gross wrote:
> > > > > +void rcu_barrier(void)
> > > > >    {
> > > > > -    atomic_t cpu_count = ATOMIC_INIT(0);
> > > > > -    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
> > > > > +    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
> > > > 
> > > > What does prevent the cpu_online_map to change under your feet?
> > > > Shouldn't you grab the lock via get_cpu_maps()?
> > > 
> > > Oh, indeed.
> > > 
> > > This in turn will require a modification of the logic to detect parallel
> > > calls on multiple cpus.
> > 
> > If you pick my patch to turn that into a rw lock you shouldn't worry
> > about parallel calls I think, but the lock acquisition can still fail
> > if there's a CPU plug/unplug going on:
> > 
> > https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html
> 
> Thanks, but letting rcu_barrier() fail is a no go, so I still need to
> handle that case (I mean the case failing to get the lock). And handling
> of parallel calls is not needed to be functional correct, but to avoid
> not necessary cpu synchronization (each parallel call detected can just
> wait until the master has finished and then return).
>
> BTW - The recursive spinlock today would allow for e.g. rcu_barrier() to
> be called inside a CPU plug/unplug section. Your rwlock is removing that
> possibility. Any chance that could be handled?

While this might be interesting for the rcu stuff, it certainly isn't
for other pieces also relying on the cpu maps lock.

Ie: get_cpu_maps must fail when called by send_IPI_mask if there's a
CPU plug/unplug operation going on, even if it's on the same pCPU
that's holding the lock in write mode.

I guess you could add a pCPU variable to record whether the current
pCPU is in the middle of a CPU plug/unplug operation (and hence has
the maps locked in write mode) and avoid taking the lock in that case?

Roger.
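A rough sketch of the per-pCPU idea suggested above (editor's illustration only; in_cpu_hotplug and the wrapper name are invented, while DEFINE_PER_CPU, this_cpu() and get_cpu_maps() are the existing Xen primitives; cpu_up()/cpu_down() would set the flag while holding the maps lock in write mode):

static DEFINE_PER_CPU(bool, in_cpu_hotplug);

static bool get_cpu_maps_or_recurse(void)
{
    if ( this_cpu(in_cpu_hotplug) )
        return true;    /* this pCPU already holds the lock in write mode */

    return get_cpu_maps();  /* may still fail if another pCPU is plugging */
}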

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Wei Liu
On Mon, Feb 17, 2020 at 12:21:09PM +, Durrant, Paul wrote:
> > -Original Message-
> > From: Roger Pau Monné 
> > Sent: 17 February 2020 12:08
> > To: Durrant, Paul 
> > Cc: Wei Liu ; Xen Development List  > de...@lists.xenproject.org>; Michael Kelley ; Wei
> > Liu ; Jan Beulich ; Andrew Cooper
> > 
> > Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> > 
> > On Mon, Feb 17, 2020 at 12:01:23PM +, Durrant, Paul wrote:
> > > > -Original Message-
> > > > From: Roger Pau Monné 
> > > > Sent: 17 February 2020 11:41
> > > > To: Wei Liu 
> > > > Cc: Durrant, Paul ; Xen Development List  > > > de...@lists.xenproject.org>; Michael Kelley ;
> > Wei
> > > > Liu ; Jan Beulich ; Andrew
> > Cooper
> > > > 
> > > > Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB
> > flush
> > > >
> > > > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > > > -Original Message-
> > > > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > > > Sent: 14 February 2020 13:34
> > > > > > > To: Xen Development List 
> > > > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > > > ; Wei Liu ; Wei Liu
> > > > > > > ; Jan Beulich ; Andrew Cooper
> > > > > > > ; Roger Pau Monné
> > 
> > > > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB
> > > > flush
> > > > > > >
> > > > > > > Implement a basic hook for L0 assisted TLB flush. The hook needs
> > to
> > > > > > > check if prerequisites are met. If they are not met, it returns
> > an
> > > > error
> > > > > > > number to fall back to native flushes.
> > > > > > >
> > > > > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > > > >
> > > > > > > Signed-off-by: Wei Liu 
> > > > > > > ---
> > > > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41
> > > > +
> > > > > > >  4 files changed, 63 insertions(+)
> > > > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > > > >
> > > > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > index 68170109a9..18902c33e9 100644
> > > > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > @@ -1 +1,2 @@
> > > > > > >  obj-y += hyperv.o
> > > > > > > +obj-y += tlb.o
> > > > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *,
> > hv_input_page);
> > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > > > >
> > > > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > > > +
> > > > > > >  static uint64_t generate_guest_id(void)
> > > > > > >  {
> > > > > > >  union hv_guest_os_id id = {};
> > > > > > > @@ -119,6 +121,8 @@ static void __init
> > setup_hypercall_page(void)
> > > > > > >  BUG_ON(!hypercall_msr.enable);
> > > > > > >
> > > > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > > > >
> > > > > > Shouldn't this have at least a compiler barrier here?
> > > > > >
> > > > >
> > > > > OK. I will add a write barrier here.
> > > >
> > > > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > > >
> > >
> > > Not really, for the purpose I had in mind. The hv_hcall_page_ready
> > global is specific to this code and we need to make sure the page is
> > actually ready before the code says it is.
> > 
> > But anything that modifies the page tables should already have a
> > barrier if required in order to prevent accesses from being moved
> > ahead of it, or else things would certainly go wrong in many other
> > places?
> 
> Oh. I'm not saying that we don't need a barrier there too (and more
> than a compiler one in that case).
> 

The argument Roger has is that set_fixmap_x also contains strong enough
barriers to prevent hcall_page_ready to be set before page table is
correctly set up.

Since you asked for it, there must be something on your mind that
prompted this (maybe it is simply because you were bitten by similar
things and want to be extra sure, maybe you think it is harder to grasp
the side effect of set_fixmap_x, maybe something else).

Code is written to be read by humans after all. I would rather be more
explicit / redundant to make humans happy than to save a potential
barrier / some typing in a code path.

Wei.
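For concreteness, the ordering being debated, written as a hedged fragment (smp_wmb()/smp_rmb() are the existing Xen barriers; whether set_fixmap_x() already implies the write side is exactly the open question above):

    /* Producer side (setup_hypercall_page): */
    set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
    smp_wmb();                  /* mapping visible before the flag */
    hcall_page_ready = true;

    /* Consumer side (flush_tlb hook): */
    if ( hcall_page_ready )
    {
        smp_rmb();              /* flag observed before using the page */
        /* ... safe to call through the hypercall page ... */
    }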

___
Xen-devel mailing list

Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Wei Liu
On Mon, Feb 17, 2020 at 01:13:28PM +0100, Roger Pau Monné wrote:
> On Mon, Feb 17, 2020 at 12:08:01PM +, Wei Liu wrote:
> > On Mon, Feb 17, 2020 at 01:00:54PM +0100, Roger Pau Monné wrote:
> > > On Mon, Feb 17, 2020 at 11:45:38AM +, Wei Liu wrote:
> > > > On Mon, Feb 17, 2020 at 12:40:31PM +0100, Roger Pau Monné wrote:
> > > > > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > > > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > > > > -Original Message-
> > > > > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > > > > Sent: 14 February 2020 13:34
> > > > > > > > To: Xen Development List 
> > > > > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > > > > ; Wei Liu ; Wei Liu
> > > > > > > > ; Jan Beulich ; Andrew Cooper
> > > > > > > > ; Roger Pau Monné 
> > > > > > > > 
> > > > > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted 
> > > > > > > > TLB flush
> > > > > > > > 
> > > > > > > > Implement a basic hook for L0 assisted TLB flush. The hook 
> > > > > > > > needs to
> > > > > > > > check if prerequisites are met. If they are not met, it returns 
> > > > > > > > an error
> > > > > > > > number to fall back to native flushes.
> > > > > > > > 
> > > > > > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > > > > > 
> > > > > > > > Signed-off-by: Wei Liu 
> > > > > > > > ---
> > > > > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41 
> > > > > > > > +
> > > > > > > >  4 files changed, 63 insertions(+)
> > > > > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > > > > > 
> > > > > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > index 68170109a9..18902c33e9 100644
> > > > > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > > @@ -1 +1,2 @@
> > > > > > > >  obj-y += hyperv.o
> > > > > > > > +obj-y += tlb.o
> > > > > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, 
> > > > > > > > hv_input_page);
> > > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > > > > > 
> > > > > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > > > > +
> > > > > > > >  static uint64_t generate_guest_id(void)
> > > > > > > >  {
> > > > > > > >  union hv_guest_os_id id = {};
> > > > > > > > @@ -119,6 +121,8 @@ static void __init 
> > > > > > > > setup_hypercall_page(void)
> > > > > > > >  BUG_ON(!hypercall_msr.enable);
> > > > > > > > 
> > > > > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > > > > > 
> > > > > > > Shouldn't this have at least a compiler barrier here?
> > > > > > > 
> > > > > > 
> > > > > > OK. I will add a write barrier here.
> > > > > 
> > > > > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > > > > 
> > > > > Note that map_pages_to_xen already performs atomic writes.
> > > > 
> > > > I don't mind making things more explicit though. However unlikely, I
> > > > may end up putting something in between set_fixmap_x and setting
> > > > hcall_page_ready, I will need the barrier by then, I may as well put it
> > > > in now.
> > > 
> > > IMO set_fixmap_x should have the necessary barriers (or other
> > > synchronization methods) so that on return the address is correctly
> > > mapped across all processors, and that it prevents the compiler from
> > > moving accesses past it. I would consider a bug of set_fixmap_x
> > > not having this behavior and requiring callers to do extra work in
> > > order to ensure this.
> > > 
> > > Ie: something like the snipped below should not require an extra
> > > barrier IMO:
> > > 
> > > set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > *((unsigned int *)fix_x_to_virt(FIX_X_HYPERV_HCALL)) = 0;
> > 
> > That's different though. Compiler can't make the connection between
> > hcall_page_ready and the address returned by set_fixmap_x.
> 
> I'm not sure the compiler can make a connection between set_fixmap_x
> and fix_x_to_virt either (as fix_x_to_virt is a simple mathematical
> operation and FIX_X_HYPERV_HCALL is a constant known at build time).

Oh, I misread your example, sorry.

Wei.

> 
> Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Durrant, Paul
> -Original Message-
> From: Roger Pau Monné 
> Sent: 17 February 2020 12:08
> To: Durrant, Paul 
> Cc: Wei Liu ; Xen Development List  de...@lists.xenproject.org>; Michael Kelley ; Wei
> Liu ; Jan Beulich ; Andrew Cooper
> 
> Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> 
> On Mon, Feb 17, 2020 at 12:01:23PM +, Durrant, Paul wrote:
> > > -Original Message-
> > > From: Roger Pau Monné 
> > > Sent: 17 February 2020 11:41
> > > To: Wei Liu 
> > > Cc: Durrant, Paul ; Xen Development List  > > de...@lists.xenproject.org>; Michael Kelley ;
> Wei
> > > Liu ; Jan Beulich ; Andrew
> Cooper
> > > 
> > > Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB
> flush
> > >
> > > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > > -Original Message-
> > > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > > Sent: 14 February 2020 13:34
> > > > > > To: Xen Development List 
> > > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > > ; Wei Liu ; Wei Liu
> > > > > > ; Jan Beulich ; Andrew Cooper
> > > > > > ; Roger Pau Monné
> 
> > > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB
> > > flush
> > > > > >
> > > > > > Implement a basic hook for L0 assisted TLB flush. The hook needs
> to
> > > > > > check if prerequisites are met. If they are not met, it returns
> an
> > > error
> > > > > > number to fall back to native flushes.
> > > > > >
> > > > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > > >
> > > > > > Signed-off-by: Wei Liu 
> > > > > > ---
> > > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41
> > > +
> > > > > >  4 files changed, 63 insertions(+)
> > > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > > >
> > > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > index 68170109a9..18902c33e9 100644
> > > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > @@ -1 +1,2 @@
> > > > > >  obj-y += hyperv.o
> > > > > > +obj-y += tlb.o
> > > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *,
> hv_input_page);
> > > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > > >
> > > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > > +
> > > > > >  static uint64_t generate_guest_id(void)
> > > > > >  {
> > > > > >  union hv_guest_os_id id = {};
> > > > > > @@ -119,6 +121,8 @@ static void __init
> setup_hypercall_page(void)
> > > > > >  BUG_ON(!hypercall_msr.enable);
> > > > > >
> > > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > > >
> > > > > Shouldn't this have at least a compiler barrier here?
> > > > >
> > > >
> > > > OK. I will add a write barrier here.
> > >
> > > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > >
> >
> > Not really, for the purpose I had in mind. The hv_hcall_page_ready
> global is specific to this code and we need to make sure the page is
> actually ready before the code says it is.
> 
> But anything that modifies the page tables should already have a
> barrier if required in order to prevent accesses from being moved
> ahead of it, or else things would certainly go wrong in many other
> places?

Oh. I'm not saying that we don't need a barrier there too (and more than a 
compiler one in that case).

  Paul

> 
> Roger.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Jürgen Groß

On 17.02.20 13:17, Roger Pau Monné wrote:

On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:

On 17.02.20 12:49, Julien Grall wrote:

Hi Juergen,

On 17/02/2020 07:20, Juergen Gross wrote:

+void rcu_barrier(void)
   {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )


What does prevent the cpu_online_map to change under your feet?
Shouldn't you grab the lock via get_cpu_maps()?


Oh, indeed.

This in turn will require a modification of the logic to detect parallel
calls on multiple cpus.


If you pick my patch to turn that into a rw lock you shouldn't worry
about parallel calls I think, but the lock acquisition can still fail
if there's a CPU plug/unplug going on:

https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html


Thanks, but letting rcu_barrier() fail is a no go, so I still need to
handle that case (I mean the case failing to get the lock). And handling
of parallel calls is not needed to be functional correct, but to avoid
not necessary cpu synchronization (each parallel call detected can just
wait until the master has finished and then return).

BTW - The recursive spinlock today would allow for e.g. rcu_barrier() to
be called inside a CPU plug/unplug section. Your rwlock is removing that
possibility. Any chance that could be handled?


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Igor Druzhinin
On 17/02/2020 12:28, Jürgen Groß wrote:
> On 17.02.20 13:26, Igor Druzhinin wrote:
>> On 17/02/2020 07:20, Juergen Gross wrote:
>>> Today rcu_barrier() is calling stop_machine_run() to synchronize all
>>> physical cpus in order to ensure all pending rcu calls have finished
>>> when returning.
>>>
>>> As stop_machine_run() is using tasklets this requires scheduling of
>>> idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
>>> cpus only in case of core scheduling being active, as otherwise a
>>> scheduling deadlock would occur.
>>>
>>> There is no need at all to do the syncing of the cpus in tasklets, as
>>> rcu activity is started in __do_softirq() called whenever softirq
>>> activity is allowed. So rcu_barrier() can easily be modified to use
>>> softirq for synchronization of the cpus no longer requiring any
>>> scheduling activity.
>>>
>>> As there already is a rcu softirq reuse that for the synchronization.
>>>
>>> Finally switch rcu_barrier() to return void as it now can never fail.
>>>
>>
>> Would this implementation guarantee progress as previous implementation
>> guaranteed?
> 
> Yes.

Thanks, I'll put it to test today to see if it solves our use case.

Igor

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Jürgen Groß

On 17.02.20 13:26, Igor Druzhinin wrote:

On 17/02/2020 07:20, Juergen Gross wrote:

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets this requires scheduling of
idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq() called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus no longer requiring any
scheduling activity.

As there already is a rcu softirq reuse that for the synchronization.

Finally switch rcu_barrier() to return void as it now can never fail.



Would this implementation guarantee progress as previous implementation
guaranteed?


Yes.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Igor Druzhinin
On 17/02/2020 07:20, Juergen Gross wrote:
> Today rcu_barrier() is calling stop_machine_run() to synchronize all
> physical cpus in order to ensure all pending rcu calls have finished
> when returning.
> 
> As stop_machine_run() is using tasklets this requires scheduling of
> idle vcpus on all cpus imposing the need to call rcu_barrier() on idle
> cpus only in case of core scheduling being active, as otherwise a
> scheduling deadlock would occur.
> 
> There is no need at all to do the syncing of the cpus in tasklets, as
> rcu activity is started in __do_softirq() called whenever softirq
> activity is allowed. So rcu_barrier() can easily be modified to use
> softirq for synchronization of the cpus no longer requiring any
> scheduling activity.
> 
> As there already is a rcu softirq reuse that for the synchronization.
> 
> Finally switch rcu_barrier() to return void as it now can never fail.
> 

Would this implementation guarantee progress as previous implementation
guaranteed?

Igor


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Durrant, Paul
> -Original Message-
> From: Roger Pau Monné 
> Sent: 17 February 2020 11:41
> To: Wei Liu 
> Cc: Durrant, Paul ; Xen Development List  de...@lists.xenproject.org>; Michael Kelley ; Wei
> Liu ; Jan Beulich ; Andrew Cooper
> 
> Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> 
> On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > -Original Message-
> > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > Sent: 14 February 2020 13:34
> > > > To: Xen Development List 
> > > > Cc: Michael Kelley ; Durrant, Paul
> > > > ; Wei Liu ; Wei Liu
> > > > ; Jan Beulich ; Andrew Cooper
> > > > ; Roger Pau Monné 
> > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB
> flush
> > > >
> > > > Implement a basic hook for L0 assisted TLB flush. The hook needs to
> > > > check if prerequisites are met. If they are not met, it returns an
> error
> > > > number to fall back to native flushes.
> > > >
> > > > Introduce a new variable to indicate if hypercall page is ready.
> > > >
> > > > Signed-off-by: Wei Liu 
> > > > ---
> > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > >  xen/arch/x86/guest/hyperv/tlb.c | 41
> +
> > > >  4 files changed, 63 insertions(+)
> > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > >
> > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > index 68170109a9..18902c33e9 100644
> > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > @@ -1 +1,2 @@
> > > >  obj-y += hyperv.o
> > > > +obj-y += tlb.o
> > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
> > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > >
> > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > +
> > > >  static uint64_t generate_guest_id(void)
> > > >  {
> > > >  union hv_guest_os_id id = {};
> > > > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> > > >  BUG_ON(!hypercall_msr.enable);
> > > >
> > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > >
> > > Shouldn't this have at least a compiler barrier here?
> > >
> >
> > OK. I will add a write barrier here.
> 
> Hm, shouldn't such barrier be part of set_fixmap_x itself?
> 

Not really, for the purpose I had in mind. The hv_hcall_page_ready global is 
specific to this code and we need to make sure the page is actually ready 
before the code says it is.

  Paul

> Note that map_pages_to_xen already performs atomic writes.
> 
> Roger.
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 01:11:59PM +0100, Jürgen Groß wrote:
> On 17.02.20 12:49, Julien Grall wrote:
> > Hi Juergen,
> > 
> > On 17/02/2020 07:20, Juergen Gross wrote:
> > > +void rcu_barrier(void)
> > >   {
> > > -    atomic_t cpu_count = ATOMIC_INIT(0);
> > > -    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
> > > +    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )
> > 
> > What does prevent the cpu_online_map to change under your feet?
> > Shouldn't you grab the lock via get_cpu_maps()?
> 
> Oh, indeed.
> 
> This in turn will require a modification of the logic to detect parallel
> calls on multiple cpus.

If you pick my patch to turn that into a rw lock you shouldn't worry
about parallel calls I think, but the lock acquisition can still fail
if there's a CPU plug/unplug going on:

https://lists.xenproject.org/archives/html/xen-devel/2020-02/msg00940.html

Roger.
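For readers who do not follow the link, the rw-lock variant referenced above boils down to something like the sketch below (editor's illustration under the assumption that the linked patch converts the cpu maps lock into a rwlock; names are illustrative): readers such as send_IPI_mask try-lock for read and simply skip the shorthand optimisation on failure, while cpu_up()/cpu_down() take it for write.

static DEFINE_RWLOCK(cpu_add_remove_lock);

static bool get_cpu_maps_sketch(void)
{
    return read_trylock(&cpu_add_remove_lock);
}

static void put_cpu_maps_sketch(void)
{
    read_unlock(&cpu_add_remove_lock);
}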

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 12:08:01PM +, Wei Liu wrote:
> On Mon, Feb 17, 2020 at 01:00:54PM +0100, Roger Pau Monné wrote:
> > On Mon, Feb 17, 2020 at 11:45:38AM +, Wei Liu wrote:
> > > On Mon, Feb 17, 2020 at 12:40:31PM +0100, Roger Pau Monné wrote:
> > > > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > > > -Original Message-
> > > > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > > > Sent: 14 February 2020 13:34
> > > > > > > To: Xen Development List 
> > > > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > > > ; Wei Liu ; Wei Liu
> > > > > > > ; Jan Beulich ; Andrew Cooper
> > > > > > > ; Roger Pau Monné 
> > > > > > > 
> > > > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB 
> > > > > > > flush
> > > > > > > 
> > > > > > > Implement a basic hook for L0 assisted TLB flush. The hook needs 
> > > > > > > to
> > > > > > > check if prerequisites are met. If they are not met, it returns 
> > > > > > > an error
> > > > > > > number to fall back to native flushes.
> > > > > > > 
> > > > > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > > > > 
> > > > > > > Signed-off-by: Wei Liu 
> > > > > > > ---
> > > > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41 
> > > > > > > +
> > > > > > >  4 files changed, 63 insertions(+)
> > > > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > > > > 
> > > > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > index 68170109a9..18902c33e9 100644
> > > > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > > @@ -1 +1,2 @@
> > > > > > >  obj-y += hyperv.o
> > > > > > > +obj-y += tlb.o
> > > > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, 
> > > > > > > hv_input_page);
> > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > > > > 
> > > > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > > > +
> > > > > > >  static uint64_t generate_guest_id(void)
> > > > > > >  {
> > > > > > >  union hv_guest_os_id id = {};
> > > > > > > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> > > > > > >  BUG_ON(!hypercall_msr.enable);
> > > > > > > 
> > > > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > > > > 
> > > > > > Shouldn't this have at least a compiler barrier here?
> > > > > > 
> > > > > 
> > > > > OK. I will add a write barrier here.
> > > > 
> > > > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > > > 
> > > > Note that map_pages_to_xen already performs atomic writes.
> > > 
> > > I don't mind making things more explicit though. However unlikely, I
> > > may end up putting something in between set_fixmap_x and setting
> > > hcall_page_ready, I will need the barrier by then, I may as well put it
> > > in now.
> > 
> > IMO set_fixmap_x should have the necessary barriers (or other
> > synchronization methods) so that on return the address is correctly
> > mapped across all processors, and that it prevents the compiler from
> > moving accesses past it. I would consider a bug of set_fixmap_x
> > not having this behavior and requiring callers to do extra work in
> > order to ensure this.
> > 
> > Ie: something like the snipped below should not require an extra
> > barrier IMO:
> > 
> > set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > *((unsigned int *)fix_x_to_virt(FIX_X_HYPERV_HCALL)) = 0;
> 
> That's different though. Compiler can't make the connection between
> hcall_page_ready and the address returned by set_fixmap_x.

I'm not sure the compiler can make a connection between set_fixmap_x
and fix_x_to_virt either (as fix_x_to_virt is a simple mathematical
operation and FIX_X_HYPERV_HCALL is a constant known at build time).

Roger.


Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Jürgen Groß

On 17.02.20 12:49, Julien Grall wrote:

Hi Juergen,

On 17/02/2020 07:20, Juergen Gross wrote:

+void rcu_barrier(void)
  {
-    atomic_t cpu_count = ATOMIC_INIT(0);
-    return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+    if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )


What does prevent the cpu_online_map to change under your feet? 
Shouldn't you grab the lock via get_cpu_maps()?


Oh, indeed.

This in turn will require a modification of the logic to detect parallel
calls on multiple cpus.


Juergen


Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 12:01:23PM +, Durrant, Paul wrote:
> > -Original Message-
> > From: Roger Pau Monné 
> > Sent: 17 February 2020 11:41
> > To: Wei Liu 
> > Cc: Durrant, Paul ; Xen Development List  > de...@lists.xenproject.org>; Michael Kelley ; Wei
> > Liu ; Jan Beulich ; Andrew Cooper
> > 
> > Subject: Re: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> > 
> > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > -Original Message-
> > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > Sent: 14 February 2020 13:34
> > > > > To: Xen Development List 
> > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > ; Wei Liu ; Wei Liu
> > > > > ; Jan Beulich ; Andrew Cooper
> > > > > ; Roger Pau Monné 
> > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB
> > flush
> > > > >
> > > > > Implement a basic hook for L0 assisted TLB flush. The hook needs to
> > > > > check if prerequisites are met. If they are not met, it returns an
> > error
> > > > > number to fall back to native flushes.
> > > > >
> > > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > >
> > > > > Signed-off-by: Wei Liu 
> > > > > ---
> > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41
> > +
> > > > >  4 files changed, 63 insertions(+)
> > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > >
> > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > index 68170109a9..18902c33e9 100644
> > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > @@ -1 +1,2 @@
> > > > >  obj-y += hyperv.o
> > > > > +obj-y += tlb.o
> > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
> > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > >
> > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > +
> > > > >  static uint64_t generate_guest_id(void)
> > > > >  {
> > > > >  union hv_guest_os_id id = {};
> > > > > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> > > > >  BUG_ON(!hypercall_msr.enable);
> > > > >
> > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > >
> > > > Shouldn't this have at least a compiler barrier here?
> > > >
> > >
> > > OK. I will add a write barrier here.
> > 
> > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > 
> 
> Not really, for the purpose I had in mind. The hv_hcall_page_ready global is 
> specific to this code and we need to make sure the page is actually ready 
> before the code says it is.

But anything that modifies the page tables should already have a
barrier if required in order to prevent accesses from being moved
ahead of it, or else things would certainly go wrong in many other
places?

Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Wei Liu
On Mon, Feb 17, 2020 at 01:00:54PM +0100, Roger Pau Monné wrote:
> On Mon, Feb 17, 2020 at 11:45:38AM +, Wei Liu wrote:
> > On Mon, Feb 17, 2020 at 12:40:31PM +0100, Roger Pau Monné wrote:
> > > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > > -Original Message-
> > > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > > Sent: 14 February 2020 13:34
> > > > > > To: Xen Development List 
> > > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > > ; Wei Liu ; Wei Liu
> > > > > > ; Jan Beulich ; Andrew Cooper
> > > > > > ; Roger Pau Monné 
> > > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB 
> > > > > > flush
> > > > > > 
> > > > > > Implement a basic hook for L0 assisted TLB flush. The hook needs to
> > > > > > check if prerequisites are met. If they are not met, it returns an 
> > > > > > error
> > > > > > number to fall back to native flushes.
> > > > > > 
> > > > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > > > 
> > > > > > Signed-off-by: Wei Liu 
> > > > > > ---
> > > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41 
> > > > > > +
> > > > > >  4 files changed, 63 insertions(+)
> > > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > > > 
> > > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > index 68170109a9..18902c33e9 100644
> > > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > > @@ -1 +1,2 @@
> > > > > >  obj-y += hyperv.o
> > > > > > +obj-y += tlb.o
> > > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
> > > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > > > 
> > > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > > +
> > > > > >  static uint64_t generate_guest_id(void)
> > > > > >  {
> > > > > >  union hv_guest_os_id id = {};
> > > > > > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> > > > > >  BUG_ON(!hypercall_msr.enable);
> > > > > > 
> > > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > > > 
> > > > > Shouldn't this have at least a compiler barrier here?
> > > > > 
> > > > 
> > > > OK. I will add a write barrier here.
> > > 
> > > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > > 
> > > Note that map_pages_to_xen already performs atomic writes.
> > 
> > I don't mind making things more explicit though. However unlikely, I
> > may end up putting something in between set_fixmap_x and setting
> > hcall_page_ready, I will need the barrier by then, I may as well put it
> > in now.
> 
> IMO set_fixmap_x should have the necessary barriers (or other
> synchronization methods) so that on return the address is correctly
> mapped across all processors, and that it prevents the compiler from
> moving accesses past it. I would consider a bug of set_fixmap_x
> not having this behavior and requiring callers to do extra work in
> order to ensure this.
> 
> Ie: something like the snipped below should not require an extra
> barrier IMO:
> 
> set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> *((unsigned int *)fix_x_to_virt(FIX_X_HYPERV_HCALL)) = 0;

That's different though. Compiler can't make the connection between
hcall_page_ready and the address returned by set_fixmap_x.

Wei.

> 
> Roger.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2 3/3] x86/hyperv: L0 assisted TLB flush

2020-02-17 Thread Wei Liu
On Fri, Feb 14, 2020 at 04:42:47PM +, Michael Kelley wrote:
> From: Wei Liu  On Behalf Of Wei Liu Sent: Friday, 
> February 14, 2020 4:35 AM
> > 
> > Implement L0 assisted TLB flush for Xen on Hyper-V. It takes advantage
> > of several hypercalls:
> > 
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE
> >  * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX
> > 
> > Pick the most efficient hypercalls available.
> > 
> > Signed-off-by: Wei Liu 
> > ---
> > v2:
> > 1. Address Roger and Jan's comments re types etc.
> > 2. Fix pointer arithmetic.
> > 3. Misc improvement to code.
> > ---
> >  xen/arch/x86/guest/hyperv/Makefile  |   1 +
> >  xen/arch/x86/guest/hyperv/private.h |   9 ++
> >  xen/arch/x86/guest/hyperv/tlb.c | 172 +++-
> >  xen/arch/x86/guest/hyperv/util.c|  74 
> >  4 files changed, 255 insertions(+), 1 deletion(-)
> >  create mode 100644 xen/arch/x86/guest/hyperv/util.c
> > 
> > diff --git a/xen/arch/x86/guest/hyperv/Makefile 
> > b/xen/arch/x86/guest/hyperv/Makefile
> > index 18902c33e9..0e39410968 100644
> > --- a/xen/arch/x86/guest/hyperv/Makefile
> > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > @@ -1,2 +1,3 @@
> >  obj-y += hyperv.o
> >  obj-y += tlb.o
> > +obj-y += util.o
> > diff --git a/xen/arch/x86/guest/hyperv/private.h 
> > b/xen/arch/x86/guest/hyperv/private.h
> > index 509bedaafa..79a77930a0 100644
> > --- a/xen/arch/x86/guest/hyperv/private.h
> > +++ b/xen/arch/x86/guest/hyperv/private.h
> > @@ -24,12 +24,21 @@
> > 
> >  #include 
> >  #include 
> > +#include 
> > 
> >  DECLARE_PER_CPU(void *, hv_input_page);
> >  DECLARE_PER_CPU(void *, hv_vp_assist);
> >  DECLARE_PER_CPU(unsigned int, hv_vp_index);
> > 
> > +static inline unsigned int hv_vp_index(unsigned int cpu)
> > +{
> > +return per_cpu(hv_vp_index, cpu);
> > +}
> > +
> >  int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
> >   unsigned int flags);
> > 
> > +/* Returns number of banks, -ev if error */
> > +int cpumask_to_vpset(struct hv_vpset *vpset, const cpumask_t *mask);
> > +
> >  #endif /* __XEN_HYPERV_PRIVIATE_H__  */
> > diff --git a/xen/arch/x86/guest/hyperv/tlb.c 
> > b/xen/arch/x86/guest/hyperv/tlb.c
> > index 48f527229e..f68e14f151 100644
> > --- a/xen/arch/x86/guest/hyperv/tlb.c
> > +++ b/xen/arch/x86/guest/hyperv/tlb.c
> > @@ -19,15 +19,185 @@
> >   * Copyright (c) 2020 Microsoft.
> >   */
> > 
> > +#include 
> >  #include 
> >  #include 
> > 
> > +#include 
> > +#include 
> > +#include 
> > +
> >  #include "private.h"
> > 
> > +/*
> > + * It is possible to encode up to 4096 pages using the lower 12 bits
> > + * in an element of gva_list
> > + */
> > +#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)
> > +
> > +static unsigned int fill_gva_list(uint64_t *gva_list, const void *va,
> > +  unsigned int order)
> > +{
> > +unsigned long start = (unsigned long)va;
> > +unsigned long end = start + (PAGE_SIZE << order) - 1;
> > +unsigned int n = 0;
> > +
> > +do {
> > +unsigned long remain = end - start;
> 
> The calculated value here isn't actually the remaining bytes in the
> range to flush -- it's one less than the remaining bytes in the range
> to flush because of the -1 in the calculation of 'end'.   That difference
> will mess up the comparison below against HV_TLB_FLUSH_UNIT
> in the case that there are exactly 4096 pages remaining to be
> flushed.  It should take the "=" case, but won't.  Also, the
> '-1' in 'remain - 1' in the else clause becomes unneeded, and
> the 'start = end' assignment then propagates the error.
> 
> In the parallel code in Linux, if you follow the call sequence to get to
> fill_gva_list(), the 'end' argument is really the address of the first byte
> of the first page that isn't in the flush range (i.e., one beyond the true
> 'end') and so is a bit misnamed.
> 
> I think the calculation of 'end' should drop the -1, and perhaps 'end'
> should be renamed.

Thanks for the detailed review. Let me fix this.
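For reference, a sketch of the loop with an exclusive 'end' (this mirrors the
Linux fill_gva_list() logic rather than the final Xen patch, so treat names
and details as illustrative):

    static unsigned int fill_gva_list(uint64_t *gva_list, const void *va,
                                      unsigned int order)
    {
        unsigned long cur = (unsigned long)va;
        /* 'end' is exclusive: first byte past the range to flush. */
        unsigned long end = cur + (PAGE_SIZE << order);
        unsigned int n = 0;

        do {
            unsigned long remain = end - cur;

            gva_list[n] = cur & PAGE_MASK;

            /* Lower 12 bits: number of additional pages to flush. */
            if ( remain >= HV_TLB_FLUSH_UNIT )
            {
                gva_list[n] |= ~PAGE_MASK;
                cur += HV_TLB_FLUSH_UNIT;
            }
            else
            {
                gva_list[n] |= (remain - 1) >> PAGE_SHIFT;
                cur = end;
            }

            n++;
        } while ( cur < end );

        return n;
    }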

Wei.

> 
> Michael
> 


Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 11:45:38AM +, Wei Liu wrote:
> On Mon, Feb 17, 2020 at 12:40:31PM +0100, Roger Pau Monné wrote:
> > On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > > -Original Message-
> > > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > > Sent: 14 February 2020 13:34
> > > > > To: Xen Development List 
> > > > > Cc: Michael Kelley ; Durrant, Paul
> > > > > ; Wei Liu ; Wei Liu
> > > > > ; Jan Beulich ; Andrew Cooper
> > > > > ; Roger Pau Monné 
> > > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> > > > > 
> > > > > Implement a basic hook for L0 assisted TLB flush. The hook needs to
> > > > > check if prerequisites are met. If they are not met, it returns an 
> > > > > error
> > > > > number to fall back to native flushes.
> > > > > 
> > > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > > 
> > > > > Signed-off-by: Wei Liu 
> > > > > ---
> > > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > > >  xen/arch/x86/guest/hyperv/tlb.c | 41 
> > > > > +
> > > > >  4 files changed, 63 insertions(+)
> > > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > > 
> > > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > > index 68170109a9..18902c33e9 100644
> > > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > > @@ -1 +1,2 @@
> > > > >  obj-y += hyperv.o
> > > > > +obj-y += tlb.o
> > > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
> > > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > > 
> > > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > > +
> > > > >  static uint64_t generate_guest_id(void)
> > > > >  {
> > > > >  union hv_guest_os_id id = {};
> > > > > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> > > > >  BUG_ON(!hypercall_msr.enable);
> > > > > 
> > > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > > 
> > > > Shouldn't this have at least a compiler barrier here?
> > > > 
> > > 
> > > OK. I will add a write barrier here.
> > 
> > Hm, shouldn't such barrier be part of set_fixmap_x itself?
> > 
> > Note that map_pages_to_xen already performs atomic writes.
> 
> I don't mind making things more explicit though. However unlikely, I
> may end up putting something in between set_fixmap_x and setting
> hcall_page_ready, I will need the barrier by then, I may as well put it
> in now.

IMO set_fixmap_x should have the necessary barriers (or other
synchronization methods) so that on return the address is correctly
mapped across all processors, and that it prevents the compiler from
moving accesses past it. I would consider a bug of set_fixmap_x
not having this behavior and requiring callers to do extra work in
order to ensure this.

Ie: something like the snipped below should not require an extra
barrier IMO:

set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
*((unsigned int *)fix_x_to_virt(FIX_X_HYPERV_HCALL)) = 0;

Roger.


[Xen-devel] Ping: [PATCH V2] x86/altp2m: Hypercall to set altp2m view visibility

2020-02-17 Thread Alexandru Stefan ISAILA
Hi all,

Any ideas on this patch appreciated.

Regards,
Alex

On 30.01.2020 15:07, Alexandru Stefan ISAILA wrote:
> At this moment a guest can call vmfunc to change the altp2m view. This
> should be limited in order to avoid any unwanted view switch.
> 
> The new xc_altp2m_set_visibility() solves this by making views invisible
> to vmfunc.
> This is done by having a separate arch.altp2m_working_eptp that is
> populated and made invalid in the same places as altp2m_eptp. This is
> written to EPTP_LIST_ADDR.
> The views are made in/visible by marking them with INVALID_MFN or
> copying them back from altp2m_eptp.
> To have consistency the visibility also applies to
> p2m_switch_domain_altp2m_by_id().
> 
> Signed-off-by: Alexandru Isaila 
> ---
> CC: Ian Jackson 
> CC: Wei Liu 
> CC: Andrew Cooper 
> CC: George Dunlap 
> CC: Jan Beulich 
> CC: Julien Grall 
> CC: Konrad Rzeszutek Wilk 
> CC: Stefano Stabellini 
> CC: "Roger Pau Monné" 
> CC: Jun Nakajima 
> CC: Kevin Tian 
> CC: George Dunlap 
> ---
> Changes since V1:
>   - Drop double view from title.
> ---
>   tools/libxc/include/xenctrl.h   |  2 ++
>   tools/libxc/xc_altp2m.c | 24 
>   xen/arch/x86/hvm/hvm.c  | 25 +
>   xen/arch/x86/hvm/vmx/vmx.c  |  2 +-
>   xen/arch/x86/mm/hap/hap.c   | 15 +++
>   xen/arch/x86/mm/p2m-ept.c   |  1 +
>   xen/arch/x86/mm/p2m.c   |  5 -
>   xen/include/asm-x86/domain.h|  1 +
>   xen/include/public/hvm/hvm_op.h | 10 ++
>   9 files changed, 83 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index cc4eb1e3d3..dbea7861e7 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1943,6 +1943,8 @@ int xc_altp2m_change_gfn(xc_interface *handle, uint32_t 
> domid,
>xen_pfn_t new_gfn);
>   int xc_altp2m_get_vcpu_p2m_idx(xc_interface *handle, uint32_t domid,
>  uint32_t vcpuid, uint16_t *p2midx);
> +int xc_altp2m_set_visibility(xc_interface *handle, uint32_t domid,
> + uint16_t view_id, bool visible);
>   
>   /**
>* Mem paging operations.
> diff --git a/tools/libxc/xc_altp2m.c b/tools/libxc/xc_altp2m.c
> index 46fb725806..6987c9541f 100644
> --- a/tools/libxc/xc_altp2m.c
> +++ b/tools/libxc/xc_altp2m.c
> @@ -410,3 +410,27 @@ int xc_altp2m_get_vcpu_p2m_idx(xc_interface *handle, 
> uint32_t domid,
>   xc_hypercall_buffer_free(handle, arg);
>   return rc;
>   }
> +
> +int xc_altp2m_set_visibility(xc_interface *handle, uint32_t domid,
> + uint16_t view_id, bool visible)
> +{
> +int rc;
> +
> +DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
> +
> +arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
> +if ( arg == NULL )
> +return -1;
> +
> +arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
> +arg->cmd = HVMOP_altp2m_set_visibility;
> +arg->domain = domid;
> +arg->u.set_visibility.altp2m_idx = view_id;
> +arg->u.set_visibility.visible = visible;
> +
> +rc = xencall2(handle->xcall, __HYPERVISOR_hvm_op, HVMOP_altp2m,
> +  HYPERCALL_BUFFER_AS_ARG(arg));
> +
> +xc_hypercall_buffer_free(handle, arg);
> +return rc;
> +}
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 0b93609a82..a41e9b6356 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4537,6 +4537,7 @@ static int do_altp2m_op(
>   case HVMOP_altp2m_get_mem_access:
>   case HVMOP_altp2m_change_gfn:
>   case HVMOP_altp2m_get_p2m_idx:
> +case HVMOP_altp2m_set_visibility:
>   break;
>   
>   default:
> @@ -4814,6 +4815,30 @@ static int do_altp2m_op(
>   break;
>   }
>   
> +case HVMOP_altp2m_set_visibility:
> +{
> +uint16_t altp2m_idx = a.u.set_visibility.altp2m_idx;
> +
> +if ( a.u.set_visibility.pad || a.u.set_visibility.pad2 )
> +rc = -EINVAL;
> +else
> +{
> +if ( !altp2m_active(d) || !hap_enabled(d) )
> +{
> +rc = -EOPNOTSUPP;
> +break;
> +}
> +
> +if ( a.u.set_visibility.visible )
> +d->arch.altp2m_working_eptp[altp2m_idx] =
> +d->arch.altp2m_eptp[altp2m_idx];
> +else
> +d->arch.altp2m_working_eptp[altp2m_idx] =
> +mfn_x(INVALID_MFN);
> +}
> +break;
> +}
> +
>   default:
>   ASSERT_UNREACHABLE();
>   }
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index b262d38a7c..65fe75383f 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2139,7 +2139,7 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
>   {
>   v->arch.hvm.vmx.secondary_exec_control |= mask;
>   

Re: [Xen-devel] [PATCH 2/2] xen/rcu: don't use stop_machine_run() for rcu_barrier()

2020-02-17 Thread Julien Grall

Hi Juergen,

On 17/02/2020 07:20, Juergen Gross wrote:

Today rcu_barrier() is calling stop_machine_run() to synchronize all
physical cpus in order to ensure all pending rcu calls have finished
when returning.

As stop_machine_run() is using tasklets, this requires scheduling of
idle vcpus on all cpus, imposing the need to call rcu_barrier() on idle
cpus only in case of core scheduling being active, as otherwise a
scheduling deadlock would occur.

There is no need at all to do the syncing of the cpus in tasklets, as
rcu activity is started in __do_softirq() called whenever softirq
activity is allowed. So rcu_barrier() can easily be modified to use
softirq for synchronization of the cpus no longer requiring any
scheduling activity.

As there already is an rcu softirq, reuse that for the synchronization.

Finally switch rcu_barrier() to return void as it now can never fail.

Signed-off-by: Juergen Gross 
---
  xen/common/rcupdate.c  | 49 ++
  xen/include/xen/rcupdate.h |  2 +-
  2 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
index 079ea9d8a1..1f02a804e3 100644
--- a/xen/common/rcupdate.c
+++ b/xen/common/rcupdate.c
@@ -143,47 +143,51 @@ static int qhimark = 1;
  static int qlowmark = 100;
  static int rsinterval = 1000;
  
-struct rcu_barrier_data {

-struct rcu_head head;
-atomic_t *cpu_count;
-};
+/*
+ * rcu_barrier() handling:
+ * cpu_count holds the number of cpu required to finish barrier handling.
+ * Cpus are synchronized via softirq mechanism. rcu_barrier() is regarded to
+ * be active if cpu_count is not zero. In case rcu_barrier() is called on
+ * multiple cpus it is enough to check for cpu_count being not zero on entry
+ * and to call process_pending_softirqs() in a loop until cpu_count drops to
+ * zero, as syncing has been requested already and we don't need to sync
+ * multiple times.
+ */
+static atomic_t cpu_count = ATOMIC_INIT(0);
  
  static void rcu_barrier_callback(struct rcu_head *head)

  {
-struct rcu_barrier_data *data = container_of(
-head, struct rcu_barrier_data, head);
-atomic_inc(data->cpu_count);
+atomic_dec(&cpu_count);
  }
  
-static int rcu_barrier_action(void *_cpu_count)

+static void rcu_barrier_action(void)
  {
-struct rcu_barrier_data data = { .cpu_count = _cpu_count };
-
-ASSERT(!local_irq_is_enabled());
-local_irq_enable();
+struct rcu_head head;
  
  /*

   * When callback is executed, all previously-queued RCU work on this CPU
   * is completed. When all CPUs have executed their callback, 
data.cpu_count
   * will have been incremented to include every online CPU.
   */
-call_rcu(, rcu_barrier_callback);
+call_rcu(, rcu_barrier_callback);
  
-while ( atomic_read(data.cpu_count) != num_online_cpus() )

+while ( atomic_read(&cpu_count) )
  {
  process_pending_softirqs();
  cpu_relax();
  }
-
-local_irq_disable();
-
-return 0;
  }
  
-int rcu_barrier(void)

+void rcu_barrier(void)
  {
-atomic_t cpu_count = ATOMIC_INIT(0);
-return stop_machine_run(rcu_barrier_action, &cpu_count, NR_CPUS);
+if ( !atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) )


What does prevent the cpu_online_map to change under your feet? 
Shouldn't you grab the lock via get_cpu_maps()?



+cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
+
+while ( atomic_read(&cpu_count) )
+{
+process_pending_softirqs();
+cpu_relax();
+}
  }
  
  /* Is batch a before batch b ? */

@@ -422,6 +426,9 @@ static void rcu_process_callbacks(void)
  rdp->process_callbacks = false;
  __rcu_process_callbacks(&rcu_ctrlblk, rdp);
  }
+
+if ( atomic_read(&cpu_count) )
+rcu_barrier_action();
  }
  
  static int __rcu_pending(struct rcu_ctrlblk *rcp, struct rcu_data *rdp)

diff --git a/xen/include/xen/rcupdate.h b/xen/include/xen/rcupdate.h
index 174d058113..87f35b7704 100644
--- a/xen/include/xen/rcupdate.h
+++ b/xen/include/xen/rcupdate.h
@@ -143,7 +143,7 @@ void rcu_check_callbacks(int cpu);
  void call_rcu(struct rcu_head *head,
void (*func)(struct rcu_head *head));
  
-int rcu_barrier(void);

+void rcu_barrier(void);
  
  void rcu_idle_enter(unsigned int cpu);

  void rcu_idle_exit(unsigned int cpu);



Cheers,

--
Julien Grall


[Xen-devel] [PATCH v5 4/4] nvmx: always trap accesses to x2APIC MSRs

2020-02-17 Thread Roger Pau Monne
Nested VMX doesn't expose support for
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE,
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY or
SECONDARY_EXEC_APIC_REGISTER_VIRT, and hence the x2APIC MSRs should
always be trapped in the nested guest MSR bitmap, or else a nested
guest could access the hardware x2APIC MSRs given certain conditions.

Accessing the hardware MSRs could be achieved by forcing the L0 Xen to
use SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE and
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY or
SECONDARY_EXEC_APIC_REGISTER_VIRT (if supported), and then creating a
L2 guest with a MSR bitmap that doesn't trap accesses to the x2APIC
MSR range. Then OR'ing both L0 and L1 MSR bitmaps would result in a
bitmap that doesn't trap certain x2APIC MSRs and a VMCS that doesn't
have SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE and
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY or
SECONDARY_EXEC_APIC_REGISTER_VIRT set either.

Fix this by making sure x2APIC MSRs are always trapped in the nested
MSR bitmap.

Signed-off-by: Roger Pau Monné 
Reviewed-by: Kevin Tian 
---
Changes since v4:
 - Fix size of x2APIC region to use 0x100.

Changes since v3:
 - Use bitmap_set.

Changes since v1:
 - New in this version (split from #1 patch).
 - Use non-locked set_bit.
---
 xen/arch/x86/hvm/vmx/vvmx.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 3337260d4b..926a11c15f 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -596,6 +596,13 @@ static void update_msrbitmap(struct vcpu *v, uint32_t 
shadow_ctrl)
   v->arch.hvm.vmx.msr_bitmap->write_high,
   sizeof(msr_bitmap->write_high) * 8);
 
+/*
+ * Nested VMX doesn't support any x2APIC hardware virtualization, so
+ * make sure all the x2APIC MSRs are trapped.
+ */
+bitmap_set(msr_bitmap->read_low, MSR_X2APIC_FIRST, 0x100);
+bitmap_set(msr_bitmap->write_low, MSR_X2APIC_FIRST, 0x100);
+
 unmap_domain_page(msr_bitmap);
 
 __vmwrite(MSR_BITMAP, page_to_maddr(nvmx->msr_merged));
-- 
2.25.0



[Xen-devel] [PATCH v5 2/4] arm: rename BIT_WORD to BITOP_WORD

2020-02-17 Thread Roger Pau Monne
So BIT_WORD can be imported from Linux. The difference is that the current
Linux implementation of BIT_WORD uses a long integer as the word unit, while
the Xen one is hardcoded to 32 bits.

Current users of BITOP_WORD on Arm (which considers a word a long
integer) are switched to use the generic BIT_WORD which also operates
on long integers.

No functional change intended.

Suggested-by: Julien Grall 
Suggested-by: Jan Beulich 
Signed-off-by: Roger Pau Monné 
---
Changes since v4:
 - New in this version.
---
 xen/arch/arm/arm32/lib/bitops.c|  4 ++--
 xen/arch/arm/arm64/lib/bitops.c|  4 ++--
 xen/arch/arm/arm64/lib/find_next_bit.c | 10 --
 xen/include/asm-arm/bitops.h   | 10 +-
 xen/include/xen/bitops.h   |  2 ++
 5 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/xen/arch/arm/arm32/lib/bitops.c b/xen/arch/arm/arm32/lib/bitops.c
index 3dca769bf0..82d935ce33 100644
--- a/xen/arch/arm/arm32/lib/bitops.c
+++ b/xen/arch/arm/arm32/lib/bitops.c
@@ -33,7 +33,7 @@
 static always_inline bool int_##name(int nr, volatile void *p, bool timeout,\
  unsigned int max_try)  \
 {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
 const uint32_t mask = BIT_MASK((unsigned int)nr);   \
 unsigned long res, tmp; \
 \
@@ -71,7 +71,7 @@ bool name##_timeout(int nr, volatile void *p, unsigned int 
max_try) \
 static always_inline bool int_##name(int nr, volatile void *p, int *oldbit, \
  bool timeout, unsigned int max_try)\
 {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
 unsigned int bit = (unsigned int)nr % BITS_PER_WORD;\
 const uint32_t mask = BIT_MASK(bit);\
 unsigned long res, tmp; \
diff --git a/xen/arch/arm/arm64/lib/bitops.c b/xen/arch/arm/arm64/lib/bitops.c
index 27688e5418..f5128c58f5 100644
--- a/xen/arch/arm/arm64/lib/bitops.c
+++ b/xen/arch/arm/arm64/lib/bitops.c
@@ -32,7 +32,7 @@
 static always_inline bool int_##name(int nr, volatile void *p, bool timeout,\
  unsigned int max_try)  \
 {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
 const uint32_t mask = BIT_MASK((unsigned int)nr);   \
 unsigned long res, tmp; \
 \
@@ -67,7 +67,7 @@ bool name##_timeout(int nr, volatile void *p, unsigned int 
max_try) \
 static always_inline bool int_##name(int nr, volatile void *p, int *oldbit, \
  bool timeout, unsigned int max_try)\
 {   \
-volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr);\
+volatile uint32_t *ptr = (uint32_t *)p + BITOP_WORD((unsigned int)nr);  \
 unsigned int bit = (unsigned int)nr % BITS_PER_WORD;\
 const uint32_t mask = BIT_MASK(bit);\
 unsigned long res, tmp; \
diff --git a/xen/arch/arm/arm64/lib/find_next_bit.c 
b/xen/arch/arm/arm64/lib/find_next_bit.c
index 17cb176266..8ebf8bfe97 100644
--- a/xen/arch/arm/arm64/lib/find_next_bit.c
+++ b/xen/arch/arm/arm64/lib/find_next_bit.c
@@ -12,8 +12,6 @@
 #include 
 #include 
 
-#define BITOP_WORD(nr) ((nr) / BITS_PER_LONG)
-
 #ifndef find_next_bit
 /*
  * Find the next set bit in a memory region.
@@ -21,7 +19,7 @@
 unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
unsigned long offset)
 {
-   const unsigned long *p = addr + BITOP_WORD(offset);
+   const unsigned long *p = addr + BIT_WORD(offset);
unsigned long result = offset & ~(BITS_PER_LONG-1);
unsigned long tmp;
 
@@ -67,7 +65,7 @@ EXPORT_SYMBOL(find_next_bit);
 unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
 unsigned long offset)
 {
-   const unsigned long *p = addr + BITOP_WORD(offset);
+   const unsigned long 

[Xen-devel] [PATCH v5 1/4] nvmx: implement support for MSR bitmaps

2020-02-17 Thread Roger Pau Monne
Current implementation of nested VMX has a half baked handling of MSR
bitmaps for the L1 VMM: it maps the L1 VMM provided MSR bitmap, but
doesn't actually load it into the nested vmcs, and thus the nested
guest vmcs ends up using the same MSR bitmap as the L1 VMM.

This is wrong as there's no assurance that the set of features enabled
for the L1 vmcs are the same that L1 itself is going to use in the
nested vmcs, and thus can lead to misconfigurations.

For example L1 vmcs can use x2APIC virtualization and virtual
interrupt delivery, and thus some x2APIC MSRs won't be trapped so that
they can be handled directly by the hardware using virtualization
extensions. On the other hand, the nested vmcs created by L1 VMM might
not use any of such features, so using a MSR bitmap that doesn't trap
accesses to the x2APIC MSRs will be leaking them to the underlying
hardware.

Fix this by crafting a merged MSR bitmap between the one used by L1
and the nested guest.

Signed-off-by: Roger Pau Monné 
---
Changes since v4:
 - Add static to vcpu_relinquish_resources.

Changes since v3:
 - Free the merged MSR bitmap page in nvmx_purge_vvmcs.

Changes since v2:
 - Pass shadow_ctrl into update_msrbitmap, and check there if
   CPU_BASED_ACTIVATE_MSR_BITMAP is set.
 - Do not enable MSR bitmap unless it's enabled in both L1 and L2.
 - Rename L1 guest to L2 in nestedvmx struct comment.

Changes since v1:
 - Split the x2APIC MSR fix into a separate patch.
 - Move setting MSR_BITMAP vmcs field into load_vvmcs_host_state for
   virtual vmexit.
 - Allocate memory with MEMF_no_owner.
 - Use tabs to align comment of the nestedvmx struct field.
---
 xen/arch/x86/hvm/vmx/vvmx.c| 73 --
 xen/include/asm-x86/hvm/vmx/vvmx.h |  3 +-
 2 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 47eee1e5b9..3337260d4b 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -128,6 +128,16 @@ int nvmx_vcpu_initialise(struct vcpu *v)
 unmap_domain_page(vw);
 }
 
+if ( cpu_has_vmx_msr_bitmap )
+{
+nvmx->msr_merged = alloc_domheap_page(d, MEMF_no_owner);
+if ( !nvmx->msr_merged )
+{
+gdprintk(XENLOG_ERR, "nest: allocation for MSR bitmap failed\n");
+return -ENOMEM;
+}
+}
+
 nvmx->ept.enabled = 0;
 nvmx->guest_vpid = 0;
 nvmx->vmxon_region_pa = INVALID_PADDR;
@@ -183,13 +193,27 @@ void nvmx_vcpu_destroy(struct vcpu *v)
 v->arch.hvm.vmx.vmwrite_bitmap = NULL;
 }
 }
- 
+
+static void vcpu_relinquish_resources(struct vcpu *v)
+{
+struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+
+if ( nvmx->msr_merged )
+{
+free_domheap_page(nvmx->msr_merged);
+nvmx->msr_merged = NULL;
+}
+}
+
 void nvmx_domain_relinquish_resources(struct domain *d)
 {
 struct vcpu *v;
 
 for_each_vcpu ( d, v )
+{
 nvmx_purge_vvmcs(v);
+vcpu_relinquish_resources(v);
+}
 }
 
 int nvmx_vcpu_reset(struct vcpu *v)
@@ -548,6 +572,35 @@ unsigned long *_shadow_io_bitmap(struct vcpu *v)
 return nestedhvm_vcpu_iomap_get(port80, portED);
 }
 
+static void update_msrbitmap(struct vcpu *v, uint32_t shadow_ctrl)
+{
+struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
+struct vmx_msr_bitmap *msr_bitmap;
+
+if ( !(shadow_ctrl & CPU_BASED_ACTIVATE_MSR_BITMAP) ||
+ !nvmx->msrbitmap )
+   return;
+
+msr_bitmap = __map_domain_page(nvmx->msr_merged);
+
+bitmap_or(msr_bitmap->read_low, nvmx->msrbitmap->read_low,
+  v->arch.hvm.vmx.msr_bitmap->read_low,
+  sizeof(msr_bitmap->read_low) * 8);
+bitmap_or(msr_bitmap->read_high, nvmx->msrbitmap->read_high,
+  v->arch.hvm.vmx.msr_bitmap->read_high,
+  sizeof(msr_bitmap->read_high) * 8);
+bitmap_or(msr_bitmap->write_low, nvmx->msrbitmap->write_low,
+  v->arch.hvm.vmx.msr_bitmap->write_low,
+  sizeof(msr_bitmap->write_low) * 8);
+bitmap_or(msr_bitmap->write_high, nvmx->msrbitmap->write_high,
+  v->arch.hvm.vmx.msr_bitmap->write_high,
+  sizeof(msr_bitmap->write_high) * 8);
+
+unmap_domain_page(msr_bitmap);
+
+__vmwrite(MSR_BITMAP, page_to_maddr(nvmx->msr_merged));
+}
+
 void nvmx_update_exec_control(struct vcpu *v, u32 host_cntrl)
 {
 u32 pio_cntrl = (CPU_BASED_ACTIVATE_IO_BITMAP
@@ -558,10 +611,17 @@ void nvmx_update_exec_control(struct vcpu *v, u32 
host_cntrl)
 shadow_cntrl = __n2_exec_control(v);
 pio_cntrl &= shadow_cntrl;
 /* Enforce the removed features */
-shadow_cntrl &= ~(CPU_BASED_ACTIVATE_MSR_BITMAP
-  | CPU_BASED_ACTIVATE_IO_BITMAP
+shadow_cntrl &= ~(CPU_BASED_ACTIVATE_IO_BITMAP
   | CPU_BASED_UNCOND_IO_EXITING);
-shadow_cntrl |= host_cntrl;
+/*
+ * Do NOT enforce the MSR bitmap currently used by L1, as certain hardware
+ * virtualization features require specific 

[Xen-devel] [PATCH v5 3/4] bitmap: import bitmap_{set/clear} from Linux 5.5

2020-02-17 Thread Roger Pau Monne
Import the functions and their dependencies. Based on Linux 5.5, commit
id d5226fa6dbae0569ee43ecfc08bdcd6770fc4755.

Signed-off-by: Roger Pau Monné 
---
Changes since v4:
 - Introduce BIT_WORD in generic header bitops.h (instead of the x86
   one).
 - Include byteorder.h for __LITTLE_ENDIAN
 - Remove EXPORT_SYMBOL.
---
 xen/common/bitmap.c  | 39 +++
 xen/include/xen/bitmap.h | 40 
 2 files changed, 79 insertions(+)

diff --git a/xen/common/bitmap.c b/xen/common/bitmap.c
index fd070bee97..88768bf8bc 100644
--- a/xen/common/bitmap.c
+++ b/xen/common/bitmap.c
@@ -212,6 +212,45 @@ int __bitmap_weight(const unsigned long *bitmap, int bits)
 #endif
 EXPORT_SYMBOL(__bitmap_weight);
 
+void __bitmap_set(unsigned long *map, unsigned int start, int len)
+{
+   unsigned long *p = map + BIT_WORD(start);
+   const unsigned int size = start + len;
+   int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+   unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+   while (len - bits_to_set >= 0) {
+   *p |= mask_to_set;
+   len -= bits_to_set;
+   bits_to_set = BITS_PER_LONG;
+   mask_to_set = ~0UL;
+   p++;
+   }
+   if (len) {
+   mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+   *p |= mask_to_set;
+   }
+}
+
+void __bitmap_clear(unsigned long *map, unsigned int start, int len)
+{
+   unsigned long *p = map + BIT_WORD(start);
+   const unsigned int size = start + len;
+   int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+   unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+
+   while (len - bits_to_clear >= 0) {
+   *p &= ~mask_to_clear;
+   len -= bits_to_clear;
+   bits_to_clear = BITS_PER_LONG;
+   mask_to_clear = ~0UL;
+   p++;
+   }
+   if (len) {
+   mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+   *p &= ~mask_to_clear;
+   }
+}
 
 /**
  * bitmap_find_free_region - find a contiguous aligned mem region
diff --git a/xen/include/xen/bitmap.h b/xen/include/xen/bitmap.h
index 4e1e690af1..c44e009f8c 100644
--- a/xen/include/xen/bitmap.h
+++ b/xen/include/xen/bitmap.h
@@ -85,6 +85,8 @@ extern int __bitmap_intersects(const unsigned long *bitmap1,
 extern int __bitmap_subset(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
 extern int __bitmap_weight(const unsigned long *bitmap, int bits);
+extern void __bitmap_set(unsigned long *map, unsigned int start, int len);
+extern void __bitmap_clear(unsigned long *map, unsigned int start, int len);
 
 extern int bitmap_find_free_region(unsigned long *bitmap, int bits, int order);
 extern void bitmap_release_region(unsigned long *bitmap, int pos, int order);
@@ -227,6 +229,44 @@ static inline int bitmap_weight(const unsigned long *src, 
int nbits)
return __bitmap_weight(src, nbits);
 }
 
+#include 
+
+#ifdef __LITTLE_ENDIAN
+#define BITMAP_MEM_ALIGNMENT 8
+#else
+#define BITMAP_MEM_ALIGNMENT (8 * sizeof(unsigned long))
+#endif
+#define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1)
+#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
+
+static inline void bitmap_set(unsigned long *map, unsigned int start,
+   unsigned int nbits)
+{
+   if (__builtin_constant_p(nbits) && nbits == 1)
+   __set_bit(start, map);
+   else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
+IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
+__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
+IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
+   memset((char *)map + start / 8, 0xff, nbits / 8);
+   else
+   __bitmap_set(map, start, nbits);
+}
+
+static inline void bitmap_clear(unsigned long *map, unsigned int start,
+   unsigned int nbits)
+{
+   if (__builtin_constant_p(nbits) && nbits == 1)
+   __clear_bit(start, map);
+   else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
+IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
+__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
+IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
+   memset((char *)map + start / 8, 0, nbits / 8);
+   else
+   __bitmap_clear(map, start, nbits);
+}
+
 #undef bitmap_switch
 #undef bitmap_bytes
 
-- 
2.25.0



Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Wei Liu
On Mon, Feb 17, 2020 at 12:40:31PM +0100, Roger Pau Monné wrote:
> On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> > On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > > -Original Message-
> > > > From: Wei Liu  On Behalf Of Wei Liu
> > > > Sent: 14 February 2020 13:34
> > > > To: Xen Development List 
> > > > Cc: Michael Kelley ; Durrant, Paul
> > > > ; Wei Liu ; Wei Liu
> > > > ; Jan Beulich ; Andrew Cooper
> > > > ; Roger Pau Monné 
> > > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> > > > 
> > > > Implement a basic hook for L0 assisted TLB flush. The hook needs to
> > > > check if prerequisites are met. If they are not met, it returns an error
> > > > number to fall back to native flushes.
> > > > 
> > > > Introduce a new variable to indicate if hypercall page is ready.
> > > > 
> > > > Signed-off-by: Wei Liu 
> > > > ---
> > > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > > >  xen/arch/x86/guest/hyperv/tlb.c | 41 +
> > > >  4 files changed, 63 insertions(+)
> > > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > > 
> > > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > > b/xen/arch/x86/guest/hyperv/Makefile
> > > > index 68170109a9..18902c33e9 100644
> > > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > > @@ -1 +1,2 @@
> > > >  obj-y += hyperv.o
> > > > +obj-y += tlb.o
> > > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
> > > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > > 
> > > > +static bool __read_mostly hv_hcall_page_ready;
> > > > +
> > > >  static uint64_t generate_guest_id(void)
> > > >  {
> > > >  union hv_guest_os_id id = {};
> > > > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> > > >  BUG_ON(!hypercall_msr.enable);
> > > > 
> > > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > > 
> > > Shouldn't this have at least a compiler barrier here?
> > > 
> > 
> > OK. I will add a write barrier here.
> 
> Hm, shouldn't such barrier be part of set_fixmap_x itself?
> 
> Note that map_pages_to_xen already performs atomic writes.

I don't mind making things more explicit though. However unlikely, I
may end up putting something in between set_fixmap_x and setting
hcall_page_ready, I will need the barrier by then, I may as well put it
in now.

Wei.

> 
> Roger.


[Xen-devel] [PATCH v5 0/4] nvmx: implement support for MSR bitmaps

2020-02-17 Thread Roger Pau Monne
Hello,

Current nested VMX code advertises support for the MSR bitmap feature,
yet the implementation isn't done. Previous to this series Xen just maps
the nested guest MSR bitmap (as set by L1) and that's it, the L2 guest
ends up using the L1 MSR bitmap.

This series adds handling of the L2 MSR bitmap and merging with the L1
MSR bitmap and loading it into the nested guest VMCS.

Patch #4 makes sure the x2APIC MSR range is always trapped, or else a
guest with nested virtualization enabled could manage to access some of
the x2APIC MSR registers from the host.

Thanks, Roger.

Roger Pau Monne (4):
  nvmx: implement support for MSR bitmaps
  arm: rename BIT_WORD to BITOP_WORD
  bitmap: import bitmap_{set/clear} from Linux 5.5
  nvmx: always trap accesses to x2APIC MSRs

 xen/arch/arm/arm32/lib/bitops.c|  4 +-
 xen/arch/arm/arm64/lib/bitops.c|  4 +-
 xen/arch/arm/arm64/lib/find_next_bit.c | 10 ++--
 xen/arch/x86/hvm/vmx/vvmx.c| 80 --
 xen/common/bitmap.c| 39 +
 xen/include/asm-arm/bitops.h   | 10 ++--
 xen/include/asm-x86/hvm/vmx/vvmx.h |  3 +-
 xen/include/xen/bitmap.h   | 40 +
 xen/include/xen/bitops.h   |  2 +
 9 files changed, 172 insertions(+), 20 deletions(-)

-- 
2.25.0



Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Roger Pau Monné
On Mon, Feb 17, 2020 at 11:34:41AM +, Wei Liu wrote:
> On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > > -Original Message-
> > > From: Wei Liu  On Behalf Of Wei Liu
> > > Sent: 14 February 2020 13:34
> > > To: Xen Development List 
> > > Cc: Michael Kelley ; Durrant, Paul
> > > ; Wei Liu ; Wei Liu
> > > ; Jan Beulich ; Andrew Cooper
> > > ; Roger Pau Monné 
> > > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> > > 
> > > Implement a basic hook for L0 assisted TLB flush. The hook needs to
> > > check if prerequisites are met. If they are not met, it returns an error
> > > number to fall back to native flushes.
> > > 
> > > Introduce a new variable to indicate if hypercall page is ready.
> > > 
> > > Signed-off-by: Wei Liu 
> > > ---
> > >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> > >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> > >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> > >  xen/arch/x86/guest/hyperv/tlb.c | 41 +
> > >  4 files changed, 63 insertions(+)
> > >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > > 
> > > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > > b/xen/arch/x86/guest/hyperv/Makefile
> > > index 68170109a9..18902c33e9 100644
> > > --- a/xen/arch/x86/guest/hyperv/Makefile
> > > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > > @@ -1 +1,2 @@
> > >  obj-y += hyperv.o
> > > +obj-y += tlb.o
> > > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > > b/xen/arch/x86/guest/hyperv/hyperv.c
> > > index 70f4cd5ae0..f9d1f11ae3 100644
> > > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
> > >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> > >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > > 
> > > +static bool __read_mostly hv_hcall_page_ready;
> > > +
> > >  static uint64_t generate_guest_id(void)
> > >  {
> > >  union hv_guest_os_id id = {};
> > > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> > >  BUG_ON(!hypercall_msr.enable);
> > > 
> > >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> > 
> > Shouldn't this have at least a compiler barrier here?
> > 
> 
> OK. I will add a write barrier here.

Hm, shouldn't such barrier be part of set_fixmap_x itself?

Note that map_pages_to_xen already performs atomic writes.

Roger.


Re: [Xen-devel] [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush

2020-02-17 Thread Wei Liu
On Fri, Feb 14, 2020 at 04:55:44PM +, Durrant, Paul wrote:
> > -Original Message-
> > From: Wei Liu  On Behalf Of Wei Liu
> > Sent: 14 February 2020 13:34
> > To: Xen Development List 
> > Cc: Michael Kelley ; Durrant, Paul
> > ; Wei Liu ; Wei Liu
> > ; Jan Beulich ; Andrew Cooper
> > ; Roger Pau Monné 
> > Subject: [PATCH v2 2/3] x86/hyperv: skeleton for L0 assisted TLB flush
> > 
> > Implement a basic hook for L0 assisted TLB flush. The hook needs to
> > check if prerequisites are met. If they are not met, it returns an error
> > number to fall back to native flushes.
> > 
> > Introduce a new variable to indicate if hypercall page is ready.
> > 
> > Signed-off-by: Wei Liu 
> > ---
> >  xen/arch/x86/guest/hyperv/Makefile  |  1 +
> >  xen/arch/x86/guest/hyperv/hyperv.c  | 17 
> >  xen/arch/x86/guest/hyperv/private.h |  4 +++
> >  xen/arch/x86/guest/hyperv/tlb.c | 41 +
> >  4 files changed, 63 insertions(+)
> >  create mode 100644 xen/arch/x86/guest/hyperv/tlb.c
> > 
> > diff --git a/xen/arch/x86/guest/hyperv/Makefile
> > b/xen/arch/x86/guest/hyperv/Makefile
> > index 68170109a9..18902c33e9 100644
> > --- a/xen/arch/x86/guest/hyperv/Makefile
> > +++ b/xen/arch/x86/guest/hyperv/Makefile
> > @@ -1 +1,2 @@
> >  obj-y += hyperv.o
> > +obj-y += tlb.o
> > diff --git a/xen/arch/x86/guest/hyperv/hyperv.c
> > b/xen/arch/x86/guest/hyperv/hyperv.c
> > index 70f4cd5ae0..f9d1f11ae3 100644
> > --- a/xen/arch/x86/guest/hyperv/hyperv.c
> > +++ b/xen/arch/x86/guest/hyperv/hyperv.c
> > @@ -33,6 +33,8 @@ DEFINE_PER_CPU_READ_MOSTLY(void *, hv_input_page);
> >  DEFINE_PER_CPU_READ_MOSTLY(void *, hv_vp_assist);
> >  DEFINE_PER_CPU_READ_MOSTLY(unsigned int, hv_vp_index);
> > 
> > +static bool __read_mostly hv_hcall_page_ready;
> > +
> >  static uint64_t generate_guest_id(void)
> >  {
> >  union hv_guest_os_id id = {};
> > @@ -119,6 +121,8 @@ static void __init setup_hypercall_page(void)
> >  BUG_ON(!hypercall_msr.enable);
> > 
> >  set_fixmap_x(FIX_X_HYPERV_HCALL, mfn << PAGE_SHIFT);
> 
> Shouldn't this have at least a compiler barrier here?
> 

OK. I will add a write barrier here.

Wei.


[Xen-devel] [PATCH 3/3] xen/x86: Rename and simplify async_event_* infrastructure

2020-02-17 Thread Andrew Cooper
The name async_exception isn't appropriate.  NMI isn't an exception at all,
and while MCE is classified as an exception (i.e. can occur at any point), the
mechanics of injecting it behave like other external interrupts.  Rename to
async_event_* which is a little shorter.

Drop VCPU_TRAP_NONE and renumber VCPU_TRAP_* to be 0-based, rather than
1-based, and remove async_exception_state() which hides the fixup internally.
This shifts the bits used in async_event_mask along by one, but doesn't alter
the overall logic.

Drop the {nmi,mce}_{state,pending} defines which obfuscate the data layout.
Instead, use an anonymous union to overlay names on the async_event[] array,
to retain the easy-to-follow v->arch.{nmi,mce}_pending logic.

No functional change.
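For illustration only (the domain.h hunk is not shown above, so the exact
field types and ordering here are assumptions), the overlay being described
might look roughly like:

    union {
        struct {
            bool    pending;
            uint8_t old_mask;
        } async_event[VCPU_TRAP_LAST + 1];
        struct {
            struct {
                bool    nmi_pending;
                uint8_t nmi_old_mask;
            };
            struct {
                bool    mce_pending;
                uint8_t mce_old_mask;
            };
        };
    };
    uint8_t async_event_mask;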

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
---
 xen/arch/x86/domain.c  |  5 ++---
 xen/arch/x86/nmi.c |  2 +-
 xen/arch/x86/pv/iret.c | 15 +++
 xen/arch/x86/x86_64/asm-offsets.c  |  6 +++---
 xen/arch/x86/x86_64/compat/entry.S | 12 ++--
 xen/arch/x86/x86_64/entry.S| 12 ++--
 xen/include/asm-x86/domain.h   | 33 -
 7 files changed, 41 insertions(+), 44 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index fe63c23676..7ee6853522 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1246,9 +1246,8 @@ int arch_initialise_vcpu(struct vcpu *v, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 
 int arch_vcpu_reset(struct vcpu *v)
 {
-v->arch.async_exception_mask = 0;
-memset(v->arch.async_exception_state, 0,
-   sizeof(v->arch.async_exception_state));
+v->arch.async_event_mask = 0;
+memset(v->arch.async_event, 0, sizeof(v->arch.async_event));
 
 if ( is_pv_vcpu(v) )
 {
diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index 0390d9b0b4..44507cd86b 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -600,7 +600,7 @@ static void do_nmi_stats(unsigned char key)
 return;
 
 pend = v->arch.nmi_pending;
-mask = v->arch.async_exception_mask & (1 << VCPU_TRAP_NMI);
+mask = v->arch.async_event_mask & (1 << VCPU_TRAP_NMI);
 if ( pend || mask )
 printk("%pv: NMI%s%s\n",
v, pend ? " pending" : "", mask ? " masked" : "");
diff --git a/xen/arch/x86/pv/iret.c b/xen/arch/x86/pv/iret.c
index 9e34b616f9..27bb39f162 100644
--- a/xen/arch/x86/pv/iret.c
+++ b/xen/arch/x86/pv/iret.c
@@ -27,15 +27,15 @@ static void async_exception_cleanup(struct vcpu *curr)
 {
 unsigned int trap;
 
-if ( !curr->arch.async_exception_mask )
+if ( !curr->arch.async_event_mask )
 return;
 
-if ( !(curr->arch.async_exception_mask & (curr->arch.async_exception_mask 
- 1)) )
-trap = __scanbit(curr->arch.async_exception_mask, VCPU_TRAP_NONE);
+if ( !(curr->arch.async_event_mask & (curr->arch.async_event_mask - 1)) )
+trap = __scanbit(curr->arch.async_event_mask, 0);
 else
-for ( trap = VCPU_TRAP_NONE + 1; trap <= VCPU_TRAP_LAST; ++trap )
-if ( (curr->arch.async_exception_mask ^
-  curr->arch.async_exception_state(trap).old_mask) == (1u << 
trap) )
+for ( trap = 0; trap <= VCPU_TRAP_LAST; ++trap )
+if ( (curr->arch.async_event_mask ^
+  curr->arch.async_event[trap].old_mask) == (1u << trap) )
 break;
 if ( unlikely(trap > VCPU_TRAP_LAST) )
 {
@@ -44,8 +44,7 @@ static void async_exception_cleanup(struct vcpu *curr)
 }
 
 /* Restore previous asynchronous exception mask. */
-curr->arch.async_exception_mask =
-curr->arch.async_exception_state(trap).old_mask;
+curr->arch.async_event_mask = curr->arch.async_event[trap].old_mask;
 }
 
 unsigned long do_iret(void)
diff --git a/xen/arch/x86/x86_64/asm-offsets.c 
b/xen/arch/x86/x86_64/asm-offsets.c
index b8e8510439..59b62649e2 100644
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -74,9 +74,9 @@ void __dummy__(void)
 OFFSET(VCPU_arch_msrs, struct vcpu, arch.msrs);
 OFFSET(VCPU_nmi_pending, struct vcpu, arch.nmi_pending);
 OFFSET(VCPU_mce_pending, struct vcpu, arch.mce_pending);
-OFFSET(VCPU_nmi_old_mask, struct vcpu, arch.nmi_state.old_mask);
-OFFSET(VCPU_mce_old_mask, struct vcpu, arch.mce_state.old_mask);
-OFFSET(VCPU_async_exception_mask, struct vcpu, arch.async_exception_mask);
+OFFSET(VCPU_nmi_old_mask, struct vcpu, arch.nmi_old_mask);
+OFFSET(VCPU_mce_old_mask, struct vcpu, arch.mce_old_mask);
+OFFSET(VCPU_async_event_mask, struct vcpu, arch.async_event_mask);
 DEFINE(VCPU_TRAP_NMI, VCPU_TRAP_NMI);
 DEFINE(VCPU_TRAP_MCE, VCPU_TRAP_MCE);
 DEFINE(_VGCF_syscall_disables_events,  _VGCF_syscall_disables_events);
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 3cd375bd48..17b1153f79 100644
--- 

[Xen-devel] [PATCH 1/3] x86/nmi: Corrections and improvements to do_nmi_stats()

2020-02-17 Thread Andrew Cooper
The hardware domain doesn't necessarily have the domid 0.  Render the vcpu
itself (via %pv) instead, adjusting the strings to avoid printing trailing
whitespace.

Rename i to cpu, and use separate booleans for pending/masked.  Drop the
unnecessary domain local variable.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
---
 xen/arch/x86/nmi.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index a5c6bdd0ce..638677a5fe 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -587,25 +587,25 @@ static void do_nmi_trigger(unsigned char key)
 
 static void do_nmi_stats(unsigned char key)
 {
-int i;
-struct domain *d;
-struct vcpu *v;
+const struct vcpu *v;
+unsigned int cpu;
+bool pend, mask;
 
 printk("CPU\tNMI\n");
-for_each_online_cpu ( i )
-printk("%3d\t%3d\n", i, nmi_count(i));
+for_each_online_cpu ( cpu )
+printk("%3d\t%3d\n", cpu, nmi_count(cpu));
 
-if ( ((d = hardware_domain) == NULL) || (d->vcpu == NULL) ||
- ((v = d->vcpu[0]) == NULL) )
+if ( !hardware_domain || !hardware_domain->vcpu ||
+ !(v = hardware_domain->vcpu[0]) )
 return;
 
-i = v->async_exception_mask & (1 << VCPU_TRAP_NMI);
-if ( v->nmi_pending || i )
-printk("dom0 vpu0: NMI %s%s\n",
-   v->nmi_pending ? "pending " : "",
-   i ? "masked " : "");
+pend = v->nmi_pending;
+mask = v->async_exception_mask & (1 << VCPU_TRAP_NMI);
+if ( pend || mask )
+printk("%pv: NMI%s%s\n",
+   v, pend ? " pending" : "", mask ? " masked" : "");
 else
-printk("dom0 vcpu0: NMI neither pending nor masked\n");
+printk("%pv: NMI neither pending nor masked\n", v);
 }
 
 static __init int register_nmi_trigger(void)
-- 
2.11.0



[Xen-devel] [PATCH 2/3] xen: Move async_exception_* infrastructure into x86

2020-02-17 Thread Andrew Cooper
The async_exception_{state,mask} infrastructure is implemented in common code,
but is limited to x86 because of the VCPU_TRAP_LAST ifdef-ary.

The internals are very x86 specific (and even then, in need of correction),
and won't be of interest to other architectures.  Move it all into x86
specific code.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Wei Liu 
CC: Roger Pau Monné 
CC: Stefano Stabellini 
CC: Julien Grall 
CC: Volodymyr Babchuk 
---
 xen/arch/x86/cpu/mcheck/vmce.c|  2 +-
 xen/arch/x86/cpu/vpmu.c   |  2 +-
 xen/arch/x86/domain.c | 12 
 xen/arch/x86/domctl.c |  2 +-
 xen/arch/x86/hvm/irq.c|  8 
 xen/arch/x86/hvm/vioapic.c|  2 +-
 xen/arch/x86/hvm/vlapic.c |  2 +-
 xen/arch/x86/nmi.c|  4 ++--
 xen/arch/x86/oprofile/nmi_int.c   |  2 +-
 xen/arch/x86/pv/callback.c|  2 +-
 xen/arch/x86/pv/iret.c| 13 +++--
 xen/arch/x86/pv/traps.c   |  2 +-
 xen/arch/x86/x86_64/asm-offsets.c | 10 +-
 xen/common/domain.c   | 15 ---
 xen/include/asm-x86/domain.h  |  8 
 xen/include/xen/sched.h   | 11 ---
 16 files changed, 46 insertions(+), 51 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 4f5de07e01..816ef61ad4 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -412,7 +412,7 @@ int inject_vmce(struct domain *d, int vcpu)
 
 if ( (is_hvm_domain(d) ||
   pv_trap_callback_registered(v, TRAP_machine_check)) &&
- !test_and_set_bool(v->mce_pending) )
+ !test_and_set_bool(v->arch.mce_pending) )
 {
 mce_printk(MCE_VERBOSE, "MCE: inject vMCE to %pv\n", v);
 vcpu_kick(v);
diff --git a/xen/arch/x86/cpu/vpmu.c b/xen/arch/x86/cpu/vpmu.c
index 3c778450ac..e50d478d23 100644
--- a/xen/arch/x86/cpu/vpmu.c
+++ b/xen/arch/x86/cpu/vpmu.c
@@ -329,7 +329,7 @@ void vpmu_do_interrupt(struct cpu_user_regs *regs)
 vlapic_set_irq(vlapic, vlapic_lvtpc & APIC_VECTOR_MASK, 0);
 break;
 case APIC_MODE_NMI:
-sampling->nmi_pending = 1;
+sampling->arch.nmi_pending = true;
 break;
 }
 #endif
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 66150abf4c..fe63c23676 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1246,6 +1246,10 @@ int arch_initialise_vcpu(struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
 
 int arch_vcpu_reset(struct vcpu *v)
 {
+v->arch.async_exception_mask = 0;
+memset(v->arch.async_exception_state, 0,
+   sizeof(v->arch.async_exception_state));
+
 if ( is_pv_vcpu(v) )
 {
 pv_destroy_gdt(v);
@@ -1264,6 +1268,14 @@ arch_do_vcpu_op(
 
 switch ( cmd )
 {
+case VCPUOP_send_nmi:
+if ( !guest_handle_is_null(arg) )
+return -EINVAL;
+
+if ( !test_and_set_bool(v->arch.nmi_pending) )
+vcpu_kick(v);
+break;
+
 case VCPUOP_register_vcpu_time_memory_area:
 {
 struct vcpu_register_time_memory_area area;
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index ce76d6d776..ed86762fa6 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -614,7 +614,7 @@ long arch_do_domctl(
 {
 case XEN_DOMCTL_SENDTRIGGER_NMI:
 ret = 0;
-if ( !test_and_set_bool(v->nmi_pending) )
+if ( !test_and_set_bool(v->arch.nmi_pending) )
 vcpu_kick(v);
 break;
 
diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index c684422b24..dd202aab5a 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -526,10 +526,10 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
  */
 vlapic_sync_pir_to_irr(v);
 
-if ( unlikely(v->nmi_pending) )
+if ( unlikely(v->arch.nmi_pending) )
 return hvm_intack_nmi;
 
-if ( unlikely(v->mce_pending) )
+if ( unlikely(v->arch.mce_pending) )
 return hvm_intack_mce;
 
 if ( (plat->irq->callback_via_type == HVMIRQ_callback_vector)
@@ -554,11 +554,11 @@ struct hvm_intack hvm_vcpu_ack_pending_irq(
 switch ( intack.source )
 {
 case hvm_intsrc_nmi:
-if ( !test_and_clear_bool(v->nmi_pending) )
+if ( !test_and_clear_bool(v->arch.nmi_pending) )
 intack = hvm_intack_none;
 break;
 case hvm_intsrc_mce:
-if ( !test_and_clear_bool(v->mce_pending) )
+if ( !test_and_clear_bool(v->arch.mce_pending) )
 intack = hvm_intack_none;
 break;
 case hvm_intsrc_pic:
diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
index 9aeef32a14..b87facb0e0 100644
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -469,7 +469,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
 for_each_vcpu ( 

[Xen-devel] [PATCH 0/3] xen: async_exception_* cleanup

2020-02-17 Thread Andrew Cooper
This infrastructure is only compiled for x86, very x86 specific (so of no
interest to other architectures), and very broken.

Amongst other things, MCEs have a higher priority than NMIs, and there is no
such thing as masking an MCE.  In order to address these bugs (which will
completely change the infrastructure), start by moving it all out of common
code.

Andrew Cooper (3):
  x86/nmi: Corrections and improvements to do_nmi_stats()
  xen: Move async_exception_* infrastructure into x86
  xen/x86: Rename and simplify async_event_* infrastructure

 xen/arch/x86/cpu/mcheck/vmce.c |  2 +-
 xen/arch/x86/cpu/vpmu.c|  2 +-
 xen/arch/x86/domain.c  | 11 +++
 xen/arch/x86/domctl.c  |  2 +-
 xen/arch/x86/hvm/irq.c |  8 
 xen/arch/x86/hvm/vioapic.c |  2 +-
 xen/arch/x86/hvm/vlapic.c  |  2 +-
 xen/arch/x86/nmi.c | 26 +-
 xen/arch/x86/oprofile/nmi_int.c|  2 +-
 xen/arch/x86/pv/callback.c |  2 +-
 xen/arch/x86/pv/iret.c | 14 +++---
 xen/arch/x86/pv/traps.c|  2 +-
 xen/arch/x86/x86_64/asm-offsets.c  | 10 +-
 xen/arch/x86/x86_64/compat/entry.S | 12 ++--
 xen/arch/x86/x86_64/entry.S| 12 ++--
 xen/common/domain.c| 15 ---
 xen/include/asm-x86/domain.h   | 27 +--
 xen/include/xen/sched.h| 11 ---
 18 files changed, 77 insertions(+), 85 deletions(-)

-- 
2.11.0



[Xen-devel] [linux-4.4 test] 147111: regressions - FAIL

2020-02-17 Thread osstest service owner
flight 147111 linux-4.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/147111/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-qemuu-nested-amd 14 xen-boot/l1 fail REGR. vs. 139698
 test-armhf-armhf-xl-credit1   7 xen-boot fail REGR. vs. 139698
 test-amd64-amd64-qemuu-nested-intel 17 debian-hvm-install/l1/l2 fail REGR. vs. 139698
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 debian-hvm-install fail REGR. vs. 139698
 test-amd64-i386-xl-xsm   23 leak-check/check fail REGR. vs. 139698
 test-armhf-armhf-libvirt 19 leak-check/check fail REGR. vs. 139698

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail REGR. vs. 139698

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-amd64-xl-pvhv2-intel 12 guest-start fail never pass
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 10 debian-hvm-install fail never pass
 test-amd64-amd64-xl-pvhv2-amd 12 guest-start  fail  never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 linux            76e5c6fd6d163f1aa63969cc982e79be1fee87a7
baseline version:
 linux            dc16a7e5f36d65b25a1b66ade14356773ed52875

Last test of basis   139698  2019-08-04 07:48:30 Z  197 days
Failing since        139773  2019-08-06 16:40:26 Z  194 days  108 attempts
Testing same since   147111  2020-02-16 03:37:56 Z    1 days    1 attempts


1097 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 

Re: [Xen-devel] [RFC PATCH v3 06/12] xen-blkfront: add callbacks for PM suspend and hibernation

2020-02-17 Thread Roger Pau Monné
On Fri, Feb 14, 2020 at 11:25:34PM +, Anchal Agarwal wrote:
> From: Munehisa Kamata  
> Add freeze, thaw and restore callbacks for PM suspend and hibernation
> support. All frontend drivers that need to use PM_HIBERNATION/PM_SUSPEND
> events need to implement these xenbus_driver callbacks.
> The freeze handler stops the block-layer queue and disconnects the
> frontend from the backend while freeing ring_info and associated resources.
> The restore handler re-allocates ring_info and re-connects to the
> backend, so the rest of the kernel can continue to use the block device
> transparently. Also, the handlers are used for both PM suspend and
> hibernation so that we can keep the existing suspend/resume callbacks for
> Xen suspend without modification. Before disconnecting from the backend,
> we need to prevent any new IO from being queued and wait for existing
> IO to complete.
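
For orientation, a minimal standalone mock-up of the callback wiring the
description above refers to.  It assumes the series extends struct
xenbus_driver with freeze/thaw/restore hooks as stated; the structure,
handler names and bodies below are invented placeholders, not lines from
the patch:

    #include <stdio.h>

    struct xenbus_device;                      /* opaque in this sketch */

    struct xenbus_driver_sketch {              /* stand-in for xenbus_driver */
        int (*freeze)(struct xenbus_device *dev);
        int (*thaw)(struct xenbus_device *dev);
        int (*restore)(struct xenbus_device *dev);
    };

    static int blkfront_freeze_stub(struct xenbus_device *dev)
    {
        (void)dev;
        /* Stop the queues, wait for in-flight IO, free the rings. */
        printf("freeze\n");
        return 0;
    }

    static int blkfront_restore_stub(struct xenbus_device *dev)
    {
        (void)dev;
        /* Re-allocate the rings and reconnect to the backend. */
        printf("restore\n");
        return 0;
    }

    static const struct xenbus_driver_sketch blkfront_pm_sketch = {
        .freeze  = blkfront_freeze_stub,
        .thaw    = blkfront_restore_stub,  /* thaw may reuse the restore path */
        .restore = blkfront_restore_stub,
    };

    int main(void)
    {
        blkfront_pm_sketch.freeze(NULL);
        blkfront_pm_sketch.restore(NULL);
        return 0;
    }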

This is different from Xen (xenstore) initiated suspension, as in that
case Linux doesn't flush the rings or disconnect from the backend.

This is done so that, in case suspension fails, recovery doesn't need
to reconnect the PV devices, and in order to speed up suspension time:
waiting for all queues to be flushed can take time, as Linux supports
multiqueue, multipage rings and indirect descriptors, and the backend
could be contended if there's a lot of IO pressure from guests.

Linux already keeps a shadow of the ring contents, so in-flight
requests can be re-issued after the frontend has reconnected during
resume.
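
As a toy illustration of that resume path (invented names, nothing here is
taken from the blkfront code): the shadow keeps enough state to re-submit
whatever was still in flight once the new ring is connected.

    #include <stdbool.h>
    #include <stdio.h>

    #define RING_SIZE 32

    struct shadow_req {
        bool in_flight;  /* submitted to the backend, completion not yet seen */
        int  id;
    };

    static struct shadow_req shadow[RING_SIZE];

    static void submit_to_ring(int id)
    {
        printf("re-issuing request %d on the reconnected ring\n", id);
    }

    /* Called after the frontend has reconnected to the backend on resume. */
    static void requeue_in_flight(void)
    {
        for (int i = 0; i < RING_SIZE; i++)
            if (shadow[i].in_flight)
                submit_to_ring(shadow[i].id);
    }

    int main(void)
    {
        shadow[3] = (struct shadow_req){ .in_flight = true, .id = 3 };
        shadow[7] = (struct shadow_req){ .in_flight = true, .id = 7 };
        requeue_in_flight();
        return 0;
    }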

> Freeze/unfreeze of the queues will guarantee that there
> are no requests in use on the shared ring.
> 
> Note: For older backends, if a backend doesn't have commit '12ea729645ace'
> ("xen/blkback: unmap all persistent grants when frontend gets disconnected"),
> the frontend may see a massive amount of grant table warnings when freeing
> resources:
> [   36.852659] deferring g.e. 0xf9 (pfn 0x)
> [   36.855089] xen:grant_table: WARNING: g.e. 0x112 still in use!
> 
> In this case, persistent grants would need to be disabled.
> 
> [Anchal Changelog: Removed timeout/request during blkfront freeze.
> Fixed major part of the code to work with blk-mq]
> Signed-off-by: Anchal Agarwal 
> Signed-off-by: Munehisa Kamata 
> ---
>  drivers/block/xen-blkfront.c | 119 ---
>  1 file changed, 112 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> index 478120233750..d715ed3cb69a 100644
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -47,6 +47,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include 
>  #include 
> @@ -79,6 +81,8 @@ enum blkif_state {
>   BLKIF_STATE_DISCONNECTED,
>   BLKIF_STATE_CONNECTED,
>   BLKIF_STATE_SUSPENDED,
> + BLKIF_STATE_FREEZING,
> + BLKIF_STATE_FROZEN
>  };
>  
>  struct grant {
> @@ -220,6 +224,7 @@ struct blkfront_info
>   struct list_head requests;
>   struct bio_list bio_list;
>   struct list_head info_list;
> + struct completion wait_backend_disconnected;
>  };
>  
>  static unsigned int nr_minors;
> @@ -261,6 +266,7 @@ static DEFINE_SPINLOCK(minor_lock);
>  static int blkfront_setup_indirect(struct blkfront_ring_info *rinfo);
>  static void blkfront_gather_backend_features(struct blkfront_info *info);
>  static int negotiate_mq(struct blkfront_info *info);
> +static void __blkif_free(struct blkfront_info *info);
>  
>  static int get_id_from_freelist(struct blkfront_ring_info *rinfo)
>  {
> @@ -995,6 +1001,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size,
>   info->sector_size = sector_size;
>   info->physical_sector_size = physical_sector_size;
>   blkif_set_queue_limits(info);
> + init_completion(&info->wait_backend_disconnected);
>  
>   return 0;
>  }
> @@ -1218,6 +1225,8 @@ static void xlvbd_release_gendisk(struct blkfront_info *info)
>  /* Already hold rinfo->ring_lock. */
>  static inline void kick_pending_request_queues_locked(struct blkfront_ring_info *rinfo)
>  {
> + if (unlikely(rinfo->dev_info->connected == BLKIF_STATE_FREEZING))
> + return;
>   if (!RING_FULL(&rinfo->ring))
>   blk_mq_start_stopped_hw_queues(rinfo->dev_info->rq, true);
>  }
> @@ -1341,8 +1350,6 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo)
>  
>  static void blkif_free(struct blkfront_info *info, int suspend)
>  {
> - unsigned int i;
> -
>   /* Prevent new requests being issued until we fix things up. */
>   info->connected = suspend ?
>   BLKIF_STATE_SUSPENDED : BLKIF_STATE_DISCONNECTED;
> @@ -1350,6 +1357,13 @@ static void blkif_free(struct blkfront_info *info, int suspend)
>   if (info->rq)
>   blk_mq_stop_hw_queues(info->rq);
>  
> + __blkif_free(info);
> +}
> +
> +static void __blkif_free(struct blkfront_info *info)
> +{
> + unsigned int i;
> +
>   for (i = 0; i < 
