Re: [PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64

2024-02-26 Thread Jan Beulich
On 26.02.2024 18:39, Oleksii Kurochko wrote:
> This patch doesn't represent a strict lower bound for GCC and
> GNU Binutils; rather, these versions are specifically employed by
> the Xen RISC-V container and are anticipated to undergo continuous
> testing.

Up and until that container would be updated to a newer gcc. I'm
afraid I view this as too weak a criterion, but I'm also not meaning to
stand in the way if somebody else wants to ack this patch in this form;
my bare minimum requirement is now met.

> --- a/README
> +++ b/README
> @@ -48,6 +48,15 @@ provided by your OS distributor:
>- For ARM 64-bit:
>  - GCC 5.1 or later
>  - GNU Binutils 2.24 or later
> +  - For RISC-V 64-bit:
> +- GCC 12.2 or later
> +- GNU Binutils 2.39 or later
> +This doesn't represent a strict lower bound for GCC and GNU Binutils;
> +rather, these versions are specifically employed by the Xen RISC-V
> +container and are anticipated to undergo continuous testing.

As per above, I think here it really needs saying "at the time of writing"
or recording a concrete date. Furthermore I expect "these versions" relates
to the specifically named versions and particularly _not_ to "or later":
With the criteria you apply, using later versions (or in fact any version
other than the very specific ones used in the container) would be similarly
untested. Much like x86 and Arm don't have the full range of permitted
tool chain versions continuously tested. Plus don't forget that distros may
apply their own selection of patches on top of what they take from upstream
(and they may also take random snapshots rather than released versions).

IOW it is hard for me to see why RISC-V needs stronger restrictions here
than other architectures. It ought to be possible to determine a baseline
version. Even if taking the desire to have "pause" available as a
requirement, gas (and presumably gld) 2.36.1 would already suffice.

Jan



Re: [PATCH v5 03/23] xen/riscv: introduce nospec.h

2024-02-26 Thread Jan Beulich
On 26.02.2024 18:38, Oleksii Kurochko wrote:
> From the unprivileged doc:
>   No standard hints are presently defined.
>   We anticipate standard hints to eventually include memory-system spatial
>   and temporal locality hints, branch prediction hints, thread-scheduling
>   hints, security tags, and instrumentation flags for simulation/emulation.
> 
> Also, there are no speculation execution barriers.
> 
> Therefore, functions evaluate_nospec() and block_speculation() should
> remain empty until a specific platform has an extension to deal with
> speculation execution.

What about array_index_mask_nospec(), though? No custom implementation,
meaning the generic one will be used there? If that's the intention,
then ...

> Signed-off-by: Oleksii Kurochko 

Acked-by: Jan Beulich 

Jan




Re: [PATCH v5 02/23] xen/riscv: use some asm-generic headers

2024-02-26 Thread Jan Beulich
On 26.02.2024 18:38, Oleksii Kurochko wrote:
> The following headers end up the same as asm-generic's version:
> * altp2m.h
> * device.h
> * div64.h
> * hardirq.h
> * hypercall.h
> * iocap.h
> * paging.h
> * percpu.h
> * random.h
> * softirq.h
> * vm_event.h
> 
> RISC-V should utilize the asm-generic's version of the mentioned
> headers instead of introducing them in the arch-specific folder.
> 
> Signed-off-by: Oleksii Kurochko 
> Acked-by: Jan Beulich 
> ---
> Changes in V5:
>  - Nothing changed. Only rebase.
>  - update the commit message.
>  - drop the message above revision log as there is no dependency for this patch
>from other patch series.

Please can you make sure you submit patches against a sufficiently up-to-
date version of the staging branch? I committed v4 of this patch a couple
of days ago already, upon your own confirmation that it was okay to go in
ahead of others.

Jan



Re: [XEN PATCH 2/2] xen/cpu: address MISRA C Rule 17.7

2024-02-26 Thread Jan Beulich
On 27.02.2024 01:26, Stefano Stabellini wrote:
> On Mon, 26 Feb 2024, Jan Beulich wrote:
>> On 23.02.2024 10:35, Nicola Vetrini wrote:
>>> Refactor cpu_notifier_call_chain into two functions:
>>> - the variant that is allowed to fail loses the nofail flag
>>> - the variant that shouldn't fail is encapsulated in a call
>>>   to the failing variant, with an additional check.
>>>
>>> This prevents uses of the function that are not supposed to
>>> fail from ignoring the return value, thus violating Rule 17.7:
>>> "The value returned by a function having non-void return type shall
>>> be used".
>>>
>>> No functional change.
>>>
>>> Signed-off-by: Nicola Vetrini 
>>
>> I'm afraid I disagree with this kind of bifurcation. No matter what
>> Misra thinks or says, it is normal for return values of functions to
>> not always be relevant to check.
> 
> Hi Jan, I disagree.
> 
> Regardless of MISRA, I really think return values need to be checked.
> Moreover, we decided as a group to honor MISRA Rule 17.7, which requires
> return values to be checked. This patch is a good step forward.

Yet splitting functions isn't the only way to deal with Misra's
requirements, I suppose. After all there are functions where the
return value is purely courtesy for perhaps just one of its callers.

Splitting simply doesn't scale very well, imo.

>> To deal with the Misra rule imo requires to first have an abstract
>> plan of how to handle such globally in the code base. Imo such a plan
>> can't be to introduce perhaps dozens of new wrapper functions like is
>> done here.
> 
> This patch is following the right pattern, one we already follow with
> the _locked suffix.

Right, and - just to mention it - one which I similarly dislike, albeit
to a lesser degree.

Jan



Re: [XEN PATCH 3/4] xen/include: add pure and const attributes

2024-02-26 Thread Jan Beulich
On 27.02.2024 00:48, Stefano Stabellini wrote:
> On Mon, 26 Feb 2024, Jan Beulich wrote:
>> On 23.02.2024 23:36, Stefano Stabellini wrote:
>>> On Fri, 23 Feb 2024, Jan Beulich wrote:
 On 23.02.2024 02:32, Stefano Stabellini wrote:
> On Tue, 24 Oct 2023, Jan Beulich wrote:
>> On 24.10.2023 00:05, Stefano Stabellini wrote:
>>> On Mon, 23 Oct 2023, Jan Beulich wrote:
 On 23.10.2023 17:23, Simone Ballarin wrote:
> On 23/10/23 15:34, Jan Beulich wrote:
>> On 18.10.2023 16:18, Simone Ballarin wrote:
>>> --- a/xen/include/xen/pdx.h
>>> +++ b/xen/include/xen/pdx.h
>>> @@ -164,7 +164,7 @@ static inline unsigned long pfn_to_pdx(unsigned long pfn)
>>>* @param pdx Page index
>>>* @return Obtained pfn after decompressing the pdx
>>>*/
>>> -static inline unsigned long pdx_to_pfn(unsigned long pdx)
>>> +static inline __attribute_pure__ unsigned long pdx_to_pfn(unsigned long pdx)
>>>   {
>>>   return (pdx & pfn_pdx_bottom_mask) |
>>>  ((pdx << pfn_pdx_hole_shift) & pfn_top_mask);
>>
>> Taking this as an example for what I've said above: The compiler can't
>> know that the globals used by the functions won't change value. Even
>> within Xen it is only by convention that these variables are assigned
>> their values during boot, and then aren't changed anymore. Which makes
>> me wonder: Did you check carefully that around the time the variables
>> have their values established, no calls to the functions exist (which
>> might then be subject to folding)?
>
> There is no need to check that, the GCC documentation explicitly says:
>
> However, functions declared with the pure attribute *can safely read
> any non-volatile objects*, and modify the value of objects in a way
> that does not affect their return value or the observable state of
> the program.

 I did quote this same text in response to what Andrew has said, but I
 also did note there that this needs to be taken with a grain of salt:
 The compiler generally assumes a single-threaded environment, i.e. no
 changes to globals behind the back of the code it is processing.
>>>
>>> Let's start from the beginning. The reason for Simone to add
>>> __attribute_pure__ to pdx_to_pfn and other functions is for
>>> documentation purposes. It is OK if it doesn't serve any purpose other
>>> than documentation.
>>>
>>> Andrew, for sure we do not want to lie to the compiler and introduce
>>> undefined behavior. If we think there is a risk of it, we should not do
>>> it.
>>>
>>> So, what do we want to document? We want to document that the function
>>> does not have side effects according to MISRA's definition of it, which
>>> might subtly differ from GCC's definition.
>>>
>>> Looking at GCC's definition of __attribute_pure__, with the
>>> clarification statement copy/pasted above by both Simone and Jan, it
>>> seems that __attribute_pure__ matches MISRA's definition of a function
>>> without side effects. It also seems that pdx_to_pfn abides to that
>>> definition.
>>>
>>> Jan has a point that GCC might be making other assumptions
>>> (single-thread execution) that might not hold true in our case. Given
>>> the way the GCC statement is written I think this is low risk. But maybe
>>> not all GCC versions we want to support in the project might have the
>>> same definition of __attribute_pure__. So we could end up using
>>> __attribute_pure__ correctly for the GCC version used for safety (GCC
>>> 12.1, see docs/misra/C-language-toolchain.rst) but it might actually
>>> break an older GCC version.
>>>
>>>
>>> So Option#1 is to use __attribute_pure__ taking the risk that a GCC or
>>> Clang version might interpret __attribute_pure__ differently and
>>> potentially misbehave.
>>>
>>> Option#2 is to avoid this risk, by not using __attribute_pure__.
>>> Instead, we can use SAF-xx-safe or deviations.rst to document that
>>> pdx_to_pfn and other functions like it are without side effects
>>> according to MISRA's definition.
>>>
>>>
>>> Both options have pros and cons. To me the most important factor is how
>>> many GCC versions come with the statement "pure attribute can safely
>>> read any non-volatile objects, and modify the value of objects in a way
>>> that does not affect their return value or the observable state of the
>>> program".
>>>
>>> I checked and these are the results:
>>> - gcc 4.0.2: no statement
>>> - gcc 5.1.0: no statement
>>> - gcc 6.1.0: no statement
>>> - gcc 7.1.0: no statement
>>

Re: [XEN PATCH 1/2] xen/console: drop return value from consoled_guest_rx/tx

2024-02-26 Thread Jan Beulich
On 26.02.2024 23:49, Stefano Stabellini wrote:
> On Mon, 26 Feb 2024, Jan Beulich wrote:
>> On 26.02.2024 09:23, Nicola Vetrini wrote:
>>> On 2024-02-26 09:00, Jan Beulich wrote:
 On 23.02.2024 23:56, Stefano Stabellini wrote:
> On Fri, 23 Feb 2024, Nicola Vetrini wrote:
>> These functions never saw a usage of their return value since
>> they were introduced, so it can be dropped since their usages
>> violate MISRA C Rule 17.7:
>> "The value returned by a function having non-void return type shall 
>> be used".
>>
>> No functional change.
>>
>> Signed-off-by: Nicola Vetrini 
>
> Reviewed-by: Stefano Stabellini 

 The cleanup is certainly okay, but would one of you mind clarifying in
 how far this code is relevant for certification? I don't expect there
 are plans to run shim Xen in any projected production uses for which
 certification is relevant? (The subject prefix is also unnecessarily
 wide here, when it's only daemon code which is affected, not console
 code in general.)

>>>
>>> I agree on the subject prefix being too wide. The configuration that
>>> uses consoled_guest_tx is #ifdef-ed for x86, so this is relevant even
>>> in configurations that may never reach this condition, unless its
>>> #ifdef is restricted to cases where the call may actually be reachable.
>>
>> Hmm, I see. There are contradicting goals here then: It being just X86 is
>> to reduce the risk of someone overlooking a build breakage they may
>> introduce. Whereas for certification it's quite the other way around: We'd
>> like to "hide" as much code as possible.
>>
>> Really I would have been inclined to suggest to drop the #ifdef, if
>> possible even without replacing by IS_ENABLED(), but instead leveraging
>> that pv_shim ought to be compile-time false whenever CONFIG_PV_SHIM=n.
> 
> This is OK
> 
> 
>> After all that's a pattern we've been trying to follow. But with your
>> observation it becomes questionable whether extending use of IS_ENABLED()
>> is actually going to be helpful. Stefano - perhaps something to discuss
>> on one of the next meetings?
> 
> Yes. I checked with the safety manager and his opinion is that
> IS_ENABLED() is OK to use as a way to disable code from a safety
> perspective.

Yet unlike when #ifdef is used, such code would remain visible to e.g.
Eclair even after the preprocessing step. Note the context in which
I'm bringing this up - if IS_ENABLED() was properly used here (and as
tightly as possible), the tool would still have complained, aiui.

Jan



[ovmf test] 184777: all pass - PUSHED

2024-02-26 Thread osstest service owner
flight 184777 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184777/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 33c81c25bbd55141ad395af89160b4ed4b703cdf
baseline version:
 ovmf d25421d0d8cd2493b30215ef80d2424ecb19c870

Last test of basis   184775  2024-02-26 22:43:02 Z    0 days
Testing same since   184777  2024-02-27 01:11:23 Z    0 days    1 attempts


People who touched revisions under test:
  Michael Kubacki 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   d25421d0d8..33c81c25bb  33c81c25bbd55141ad395af89160b4ed4b703cdf -> xen-tested-master



[linux-linus test] 184771: tolerable FAIL - PUSHED

2024-02-26 Thread osstest service owner
flight 184771 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184771/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 184762
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 184762
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 184762
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check fail like 184762
 test-armhf-armhf-libvirt 16 saverestore-support-check fail like 184762
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 184762
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 184762
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 184762
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-xl-vhd 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-vhd 15 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-armhf-armhf-xl-credit1 15 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit1 16 saverestore-support-check fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check fail never pass

version targeted for testing:
 linux 45ec2f5f6ed3ec3a79ba1329ad585497cdcbe663
baseline version:
 linux d206a76d7d2726f3b096037f2079ce0bd3ba329b

Last test of basis   184762  2024-02-26 01:13:54 Z    1 days
Testing same since   184771  2024-02-26 19:42:20 Z    0 days    1 attempts


People who touched revisions under test:
  Chris Clayton 
  David Sterba 
  Elad Nachman 
  Filipe Manana 
  Han Xu 
  Johannes Thumshirn 
  Linus Torvalds 
  Mark O'Donovan 
  Miquel Raynal 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  

[ovmf test] 184775: all pass - PUSHED

2024-02-26 Thread osstest service owner
flight 184775 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184775/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf d25421d0d8cd2493b30215ef80d2424ecb19c870
baseline version:
 ovmf 68238d4f948069fc2c6b9cc13863bdced52a84d0

Last test of basis   184772  2024-02-26 19:43:08 Z    0 days
Testing same since   184775  2024-02-26 22:43:02 Z    0 days    1 attempts


People who touched revisions under test:
  Gerd Hoffmann 
  Michael Kubacki 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   68238d4f94..d25421d0d8  d25421d0d8cd2493b30215ef80d2424ecb19c870 -> xen-tested-master



RE: question about virtio-vsock on xen

2024-02-26 Thread Peng Fan
> Subject: Re: question about virtio-vsock on xen
> 
> 
> 
> On 26.02.24 05:09, Peng Fan wrote:
> > Hi Oleksandr,
> 
> Hello Peng
> 
> 
> [snip]
> 
> >>
> >> ... Peng, we have vhost-vsock (and vhost-net) Xen PoC. Although
> >> it is non- upstreamable in its current shape (based on old Linux
> >> version, requires some rework and proper integration, most likely
> >> requires involving Qemu and protocol changes to pass an additional
> >> info to vhost), it works with Linux
> >> v5.10 + patched Qemu v7.0, so you can refer to the Yocto meta layer
> >> which contains kernel patches for the details [1].
> >
> > Thanks for the pointer, I am reading the code.
> >
> >>
> >> In a nutshell, before accessing the guest data the host module needs
> >> to map descriptors in virtio rings which contain either guest grant
> >> based DMA addresses (by using Xen grant mappings) or guest
> >> pseudo-physical addresses (by using Xen foreign mappings). After
> >> accessing the guest data the host module needs to unmap them.
> >
> > Ok, I thought the current xen virtio code already maps everything.
> >
> 
> It does, as you said the virtio-blk-pci worked in your environment. But
> vhost(-vsock) is a special case: unlike virtio-blk-pci, where the whole
> backend resides in Qemu, here we have a split model. As I understand it,
> Qemu performs only the initial setup/configuration and then offloads the
> I/O processing to a separate entity, which is the Linux module in that
> particular case.

Thanks for sharing the information with me. I need to learn more stuff :)

Thanks,
Peng.


[xen-unstable-smoke test] 184774: tolerable all pass - PUSHED

2024-02-26 Thread osstest service owner
flight 184774 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184774/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 15 migrate-support-check fail never pass
 test-armhf-armhf-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass

version targeted for testing:
 xen  37541208f119a9c552c6c6c3246ea61be0d44035
baseline version:
 xen  03fb5f503cb5faa0556b23d594496719ced6a11b

Last test of basis   184769  2024-02-26 16:00:26 Z    0 days
Testing same since   184774  2024-02-26 22:00:26 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   03fb5f503c..37541208f1  37541208f119a9c552c6c6c3246ea61be0d44035 -> smoke



RE: question about virtio-vsock on xen

2024-02-26 Thread Peng Fan
> Subject: RE: question about virtio-vsock on xen
> 
> On Mon, 26 Feb 2024, Peng Fan wrote:
> > Hi Stefano,
> >
> > > Subject: Re: question about virtio-vsock on xen
> > >
> > > Hi Peng,
> > >
> > > We haven't tried to setup virtio-vsock yet.
> > >
> > > In general, I am very supportive of using QEMU for virtio backends.
> > > We use QEMU to provide virtio-net, virtio-block, virtio-console and more.
> >
> > Would you mind share how to setup virtio-console using qemu + xen?
> 
> Vikram (CCed) has been working on it and should be able to share more
> details
> 
> 
> > > However, typically virtio-vsock comes into play for VM-to-VM
> > > communication, which is different. Going via QEMU in Dom0 just to
> > > have 1 VM communicate with another VM is not an ideal design: it
> > > adds latency and uses resources in Dom0 when actually we could do
> without it.
> > >
> > > A better model for VM-to-VM communication would be to have the VM
> > > talk to each other directly via grant table or pre-shared memory
> > > (see the static shared memory feature) or via Xen hypercalls (see
> > > Argo.)
> >
> > The goal is to make the Android Trout VM run with Xen + i.MX95, so vsock is needed.
> 
> I am not familiar with the details of Android Trout... Where is vsock used?
> Just asking for my own understanding.

Here is the link
https://source.android.com/docs/automotive/virtualization/reference_platform

vsock is used for audio control hal, dumpstate HAL, vehicle HAL.

Regards,
Peng.

> 
> 
> > > For a good Xen design, I think the virtio-vsock backend would need
> > > to be in Xen itself (the hypervisor).
> > >
> > > Of course that is more work and it doesn't help you with the
> > > specific question you had below :-)
> > >
> > > For that, I don't have a pointer to help you but maybe others in CC have.
> > >
> > > Cheers,
> > >
> > > Stefano
> > >
> > >
> > > On Fri, 23 Feb 2024, Peng Fan wrote:
> > > > Hi All,
> > > >
> > > > Has anyone make virtio-vsock on xen work? My dm args as below:
> > > >
> > > > virtio = [
> > > > 'backend=0,type=virtio,device,transport=pci,bdf=05:00.0,backend_type=qemu,grant_usage=true'
> > > > ]
> > > > device_model_args = [
> > > > '-D', '/home/root/qemu_log.txt',
> > > > '-d',
> > > > 'trace:*vsock*,trace:*vhost*,trace:*virtio*,trace:*pci_update*,trace:*pci_route*,trace:*handle_ioreq*,trace:*xen*',
> > > > '-device',
> > > > 'vhost-vsock-pci,iommu_platform=false,id=vhost-vsock-pci0,bus=pcie.0,addr=5.0,guest-cid=3']
> > > >
> > > > During my test, it always return failure in dom0 kernel in below code:
> > > >
> > > > vhost_transport_do_send_pkt {
> > > > ...
> > > >     nbytes = copy_to_iter(hdr, sizeof(*hdr), &iov_iter);
> > > >     if (nbytes != sizeof(*hdr)) {
> > > >         vq_err(vq, "Faulted on copying pkt hdr %x %x %x %px\n",
> > > >                nbytes, sizeof(*hdr),
> > > >                __builtin_object_size(hdr, 0), &iov_iter);
> > > >         kfree_skb(skb);
> > > >         break;
> > > >     }
> > > > }
> > > >
> > > > I checked copy_to_iter: it copies data to a __user addr, but it
> > > > never succeeds; the copy to the __user addr always returns 0 bytes copied.
> > > >
> > > > The asm code "sttr x7, [x6]" will trigger a data abort, and the kernel
> > > > will run into do_page_fault, but lock_mm_and_find_vma reports
> > > > VM_FAULT_BADMAP, which means the __user addr is not mapped: no vma
> > > > has this addr.
> > > >
> > > > I am not sure what may cause this. Appreciate if any comments.
> > > >
> > > > BTW: I tested blk pci, it works, so the virtio pci should work on my 
> > > > setup.
> > > >
> > > > Thanks,
> > > > Peng.
> > > >
> >



Re: [XEN PATCH 2/2] xen/cpu: address MISRA C Rule 17.7

2024-02-26 Thread Stefano Stabellini
On Mon, 26 Feb 2024, Jan Beulich wrote:
> On 23.02.2024 10:35, Nicola Vetrini wrote:
> > Refactor cpu_notifier_call_chain into two functions:
> > - the variant that is allowed to fail loses the nofail flag
> > - the variant that shouldn't fail is encapsulated in a call
> >   to the failing variant, with an additional check.
> > 
> > This prevents uses of the function that are not supposed to
> > fail from ignoring the return value, thus violating Rule 17.7:
> > "The value returned by a function having non-void return type shall
> > be used".
> > 
> > No functional change.
> > 
> > Signed-off-by: Nicola Vetrini 
> 
> I'm afraid I disagree with this kind of bifurcation. No matter what
> Misra thinks or says, it is normal for return values of functions to
> not always be relevant to check.

Hi Jan, I disagree.

Regardless of MISRA, I really think return values need to be checked.
Moreover, we decided as a group to honor MISRA Rule 17.7, which requires
return values to be checked. This patch is a good step forward.


> To deal with the Misra rule imo requires to first have an abstract
> plan of how to handle such globally in the code base. Imo such a plan
> can't be to introduce perhaps dozens of new wrapper functions like is
> done here.

This patch is following the right pattern, one we already follow with
the _locked suffix.



Re: [XEN PATCH] xen: cache clearing and invalidation helpers refactoring

2024-02-26 Thread Stefano Stabellini
On Sat, 24 Feb 2024, Nicola Vetrini wrote:
> Hi Stefano,
> 
> On 2024-02-24 00:05, Stefano Stabellini wrote:
> > On Fri, 23 Feb 2024, Nicola Vetrini wrote:
> > > On 2024-02-19 16:14, Nicola Vetrini wrote:
> > > > The cache clearing and invalidation helpers in x86 and Arm didn't
> > > > comply with MISRA C Rule 17.7: "The value returned by a function
> > > > having non-void return type shall be used". On Arm they
> > > > were always returning 0, while some in x86 returned -EOPNOTSUPP
> > > > and in common/grant_table the return value is saved.
> > > >
> > > > As a consequence, a common helper arch_grant_cache_flush that returns
> > > > an integer is introduced, so that each architecture can choose whether
> > > > to return an error value on certain conditions, and the helpers have
> > > > either been changed to return void (on Arm) or deleted entirely (on x86).
> > > >
> > > > Signed-off-by: Julien Grall 
> > > > Signed-off-by: Nicola Vetrini 
> > > > ---
> > > > The original refactor idea came from Julien Grall in [1]; I edited that
> > > > proposal
> > > > to fix build errors.
> > > >
> > > > I did introduce a cast to void for the call to flush_area_local on
> > > > x86, because even before this patch the return value of that function
> > > > wasn't checked in all but one use in x86/smp.c, and in this context
> > > > the helper (perhaps incidentally) ignored the return value of
> > > > flush_area_local.
> > > >
> > > > [1]
> > > >
> > > https://lore.kernel.org/xen-devel/09589e8f-77b6-47f7-b5bd-cf485e4b6...@xen.org/
> > > > ---
> > > >  xen/arch/arm/include/asm/page.h | 33 ++---
> > > >  xen/arch/x86/include/asm/flushtlb.h | 23 ++--
> > > >  xen/common/grant_table.c|  9 +---
> > > >  3 files changed, 34 insertions(+), 31 deletions(-)
> > > >
> > > 
> > > I'll put this patch in the backlog at the moment: too many intricacies
> > > while trying to untangle xen/flushtlb from xen/mm.h, and there are
> > > easier cases that can be done faster. If someone is interested I can
> > > post the partial work I've done so far, even though it doesn't build
> > > on x86.
> > 
> > I understand that the blocker is:
> > 
> > diff --git a/xen/arch/arm/include/asm/page.h
> > b/xen/arch/arm/include/asm/page.h
> > index 69f817d1e6..e90c9de361 100644
> > --- a/xen/arch/arm/include/asm/page.h
> > +++ b/xen/arch/arm/include/asm/page.h
> > @@ -123,6 +123,7 @@
> > 
> >  #ifndef __ASSEMBLY__
> > 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > 
> > 
> > And the headers disentagling required to solve it, right?
> > 
> > 
> > Let me ask a silly question. public/grant_table.h seems needed by
> > arch_grant_cache_flush. Can we move arch_grant_cache_flush somewhere
> > else? It is not like page.h is a perfect fit for it anyway.
> > 
> > For instance, can we move it to
> > 
> > xen/arch/arm/include/asm/grant_table.h
> > 
> > ?
> 
> Yes, this is what was suggested and what I was trying to accomplish.
> Basically my plan is:
> 
> 1. move the arch_grant_cache_flush helper to asm/grant_table.h for both
> architectures
> 2. pull out of xen/mm.h this hunk (note the inclusion of asm/flushtlb in the
> middle of the file) because there is a build error on tlbflush_current_time()
> induced in some .c file (don't remember which) by the earlier movement

It looks like it would be easier to resolve the build error on
tlbflush_current_time() in another way. What's the build error exactly?
Looking at the implementation of tlbflush_current_time on the various
arches I couldn't find any potential issues.

I just moved the implementation of arch_grant_cache_flush to arch
specific headers and it seemed to have worked for me. See below. Maybe I
am building with a different kconfig.


diff --git a/xen/arch/arm/include/asm/grant_table.h 
b/xen/arch/arm/include/asm/grant_table.h
index d3c518a926..2c5c07e061 100644
--- a/xen/arch/arm/include/asm/grant_table.h
+++ b/xen/arch/arm/include/asm/grant_table.h
@@ -76,6 +76,20 @@ int replace_grant_host_mapping(uint64_t gpaddr, mfn_t frame,
 #define gnttab_need_iommu_mapping(d)\
 (is_domain_direct_mapped(d) && is_iommu_enabled(d))
 
+static inline int arch_grant_cache_flush(unsigned int op, const void *p,
+ unsigned long size)
+{
+if ( (op & GNTTAB_CACHE_INVAL) && (op & GNTTAB_CACHE_CLEAN) )
+clean_and_invalidate_dcache_va_range(p, size);
+else if ( op & GNTTAB_CACHE_INVAL )
+invalidate_dcache_va_range(p, size);
+else if ( op & GNTTAB_CACHE_CLEAN )
+clean_dcache_va_range(p, size);
+
+/* ARM callers assume that dcache_* functions cannot fail. */
+return 0;
+}
+
 #endif /* __ASM_GRANT_TABLE_H__ */
 /*
  * Local variables:
diff --git a/xen/arch/arm/include/asm/page.h b/xen/arch/arm/include/asm/page.h
index 69f817d1e6..aea692a24d 100644
--- a/xen/arch/arm/include/asm/page.h
+++ b/xen/arch/arm/include

Re: [XEN PATCH 3/4] xen/include: add pure and const attributes

2024-02-26 Thread Stefano Stabellini
On Mon, 26 Feb 2024, Jan Beulich wrote:
> On 23.02.2024 23:36, Stefano Stabellini wrote:
> > On Fri, 23 Feb 2024, Jan Beulich wrote:
> >> On 23.02.2024 02:32, Stefano Stabellini wrote:
> >>> On Tue, 24 Oct 2023, Jan Beulich wrote:
>  On 24.10.2023 00:05, Stefano Stabellini wrote:
> > On Mon, 23 Oct 2023, Jan Beulich wrote:
> >> On 23.10.2023 17:23, Simone Ballarin wrote:
> >>> On 23/10/23 15:34, Jan Beulich wrote:
>  On 18.10.2023 16:18, Simone Ballarin wrote:
> > --- a/xen/include/xen/pdx.h
> > +++ b/xen/include/xen/pdx.h
> > @@ -164,7 +164,7 @@ static inline unsigned long pfn_to_pdx(unsigned 
> > long pfn)
> >* @param pdx Page index
> >* @return Obtained pfn after decompressing the pdx
> >*/
> > -static inline unsigned long pdx_to_pfn(unsigned long pdx)
> > +static inline __attribute_pure__ unsigned long pdx_to_pfn(unsigned 
> > long pdx)
> >   {
> >   return (pdx & pfn_pdx_bottom_mask) |
> >  ((pdx << pfn_pdx_hole_shift) & pfn_top_mask);
> 
>  Taking this as an example for what I've said above: The compiler 
>  can't
>  know that the globals used by the functions won't change value. Even
>  within Xen it is only by convention that these variables are assigned
>  their values during boot, and then aren't changed anymore. Which 
>  makes
>  me wonder: Did you check carefully that around the time the variables
>  have their values established, no calls to the functions exist (which
>  might then be subject to folding)?
> >>>
> >>> There is no need to check that, the GCC documentation explicitly says:
> >>>
> >>> However, functions declared with the pure attribute *can safely read 
> >>> any 
> >>> non-volatile objects*, and modify the value of objects in a way that 
> >>> does not affect their return value or the observable state of the 
> >>> program.
> >>
> >> I did quote this same text in response to what Andrew has said, but I 
> >> also
> >> did note there that this needs to be taken with a grain of salt: The
> >> compiler generally assumes a single-threaded environment, i.e. no 
> >> changes
> >> to globals behind the back of the code it is processing.
> >
> > Let's start from the beginning. The reason for Simone to add
> > __attribute_pure__ to pdx_to_pfn and other functions is for
> > documentation purposes. It is OK if it doesn't serve any purpose other
> > than documentation.
> >
> > Andrew, for sure we do not want to lie to the compiler and introduce
> > undefined behavior. If we think there is a risk of it, we should not do
> > it.
> >
> > So, what do we want to document? We want to document that the function
> > does not have side effects according to MISRA's definition of it, which
> > might subtly differ from GCC's definition.
> >
> > Looking at GCC's definition of __attribute_pure__, with the
> > clarification statement copy/pasted above by both Simone and Jan, it
> > seems that __attribute_pure__ matches MISRA's definition of a function
> > without side effects. It also seems that pdx_to_pfn abides to that
> > definition.
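[For reference, a minimal sketch of what `__attribute__((pure))` expresses, modeled on the `pdx_to_pfn()` example discussed above. The mask/shift values and the "set once at boot" convention are hypothetical stand-ins for illustration, not Xen's real setup:]

```c
/* Hypothetical globals: by convention set once during boot and never
 * changed afterwards -- the compiler cannot know this on its own, which
 * is the crux of the single-threaded-assumption concern above. */
static unsigned long pfn_pdx_bottom_mask = ~0UL;
static unsigned long pfn_top_mask = 0UL;
static unsigned int pfn_pdx_hole_shift = 0;

/* pure: the result depends only on the argument and on non-volatile
 * globals, so the compiler may fold repeated calls with equal arguments
 * into a single evaluation. */
static unsigned long __attribute__((pure)) pdx_to_pfn(unsigned long pdx)
{
    return (pdx & pfn_pdx_bottom_mask) |
           ((pdx << pfn_pdx_hole_shift) & pfn_top_mask);
}
```

[With these particular (hypothetical) mask values the mapping degenerates to the identity; the point is only the attribute's contract: no side effects, result a function of arguments plus non-volatile global state.]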
> >
> > Jan has a point that GCC might be making other assumptions
> > (single-thread execution) that might not hold true in our case. Given
> > the way the GCC statement is written I think this is low risk. But maybe
> > not all GCC versions we want to support in the project might have the
> > same definition of __attribute_pure__. So we could end up using
> > __attribute_pure__ correctly for the GCC version used for safety (GCC
> > 12.1, see docs/misra/C-language-toolchain.rst) but it might actually
> > break an older GCC version.
> >
> >
> > So Option#1 is to use __attribute_pure__ taking the risk that a GCC or
> > Clang version might interpret __attribute_pure__ differently and
> > potentially misbehave.
> >
> > Option#2 is to avoid this risk, by not using __attribute_pure__.
> > Instead, we can use SAF-xx-safe or deviations.rst to document that
> > pdx_to_pfn and other functions like it are without side effects
> > according to MISRA's definition.
> >
> >
> > Both options have pros and cons. To me the most important factor is how
> > many GCC versions come with the statement "pure attribute can safely
> > read any non-volatile objects, and modify the value of objects in a way
> > that does not affect their return value or the observable state of the
> > program".
> >
> > I checked and these are the results:
> > - gcc 4.0.2: no statement
> > - gcc 5.1.0: no statement
> > - gcc 6.1.0: no statement
> > - gcc 7.1.0: no statement
> > - gcc 8.1.0: alternative statement "The pure

Re: [PATCH v1] automation: remove bin86/dev86 from tumbleweed image

2024-02-26 Thread Stefano Stabellini
On Sat, 24 Feb 2024, Olaf Hering wrote:
> Fri, 23 Feb 2024 15:22:38 -0800 (PST) Stefano Stabellini 
> :
> 
> > Do you have a successful gitlab pipeline with this patch applied that
> > you can give me as proof of testing and success?
> 
> Yes, all of them since the patch went out.

Great thanks!

Acked-by: Stefano Stabellini 



[xen-unstable test] 184767: tolerable FAIL - PUSHED

2024-02-26 Thread osstest service owner
flight 184767 xen-unstable real [real]
flight 184773 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/184767/
http://logs.test-lab.xenproject.org/osstest/logs/184773/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 12 windows-install fail pass in 
184773-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop  fail in 184773 like 184763
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 184763
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 184763
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 184763
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 184763
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 184763
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 184763
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 184763
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 184763
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 184763
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 184763
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 184763
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  8de3afc0b402bc17f65093a53e5870862707a8c7
baseline version:
 xen  92babc88f67ed0ef3dc575a8b9534040274678ee

Last test of basis   184763  2024-02-26 01:52:19 Z

Re: [XEN PATCH 1/2] xen/console: drop return value from consoled_guest_rx/tx

2024-02-26 Thread Stefano Stabellini
On Mon, 26 Feb 2024, Jan Beulich wrote:
> On 26.02.2024 09:23, Nicola Vetrini wrote:
> > On 2024-02-26 09:00, Jan Beulich wrote:
> >> On 23.02.2024 23:56, Stefano Stabellini wrote:
> >>> On Fri, 23 Feb 2024, Nicola Vetrini wrote:
>  These functions never saw a usage of their return value since
>  they were introduced, so it can be dropped since their usages
>  violate MISRA C Rule 17.7:
>  "The value returned by a function having non-void return type shall 
>  be used".
> 
>  No functional change.
> 
>  Signed-off-by: Nicola Vetrini 
> >>>
> >>> Reviewed-by: Stefano Stabellini 
> >>
> >> The cleanup is certainly okay, but would one of you mind clarifying in 
> >> how
> >> far this code is relevant for certification? I don't expect there are 
> >> plans
> >> to run shim Xen in any projected production uses for which 
> >> certification is
> >> relevant? (The subject prefix is also unnecessarily wide here, when 
> >> it's
> >> only daemon code which is affected, not console code in general.)
> >>
> > 
> > I agree on the subject prefix being too wide. The configuration that 
> > uses consoled_guest_tx is #ifdef-ed for x86, so even in configurations 
> > that may never reach this condition this is relevant, unless its #ifdef 
> > is restricted to cases where the call may actually be reachable.
> 
> Hmm, I see. There are contradicting goals here then: It being just X86 is
> to reduce the risk of someone overlooking a build breakage they may
> introduce. Whereas for certification it's quite the other way around: We'd
> like to "hide" as much code as possible.
> 
> Really I would have been inclined to suggest to drop the #ifdef, if
> possible even without replacing by IS_ENABLED(), but instead leveraging
> that pv_shim ought to be compile-time false whenever CONFIG_PV_SHIM=n.

This is OK


> After all that's a pattern we've been trying to follow. But with your
> observation it becomes questionable whether extending use of IS_ENABLED()
> is actually going to be helpful. Stefano - perhaps something to discuss
> on one of the next meetings?

Yes. I checked with the safety manager and his opinion is that
IS_ENABLED() is OK to use as a way to disable code from a safety
perspective.
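[To make the trade-off concrete, a hedged sketch of the pattern Jan describes; `IS_ENABLED()` and `CONFIG_PV_SHIM` here are simplified stand-ins for the real Kconfig machinery, and `consoled_guest_tx()` is a stub, not the actual daemon code:]

```c
/* Simplified stand-ins for Xen's Kconfig plumbing (hypothetical). */
#define CONFIG_PV_SHIM 0
#define IS_ENABLED(option) (option)

static int consoled_guest_tx(char c)
{
    return (int)c; /* stub standing in for the real daemon code */
}

/*
 * Unlike an #ifdef'ed-out block, the call below is always parsed and
 * type-checked on every build; with CONFIG_PV_SHIM=0 the compiler's
 * dead-code elimination still drops it from the generated object code.
 */
static int maybe_forward(char c)
{
    if ( IS_ENABLED(CONFIG_PV_SHIM) )
        return consoled_guest_tx(c);
    return 0;
}
```

[This is the tension noted above: the compile-time-false predicate keeps the code visible to the compiler (good for catching build breakage), while an #ifdef hides it entirely (preferable when minimizing code in scope for certification).]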



[ovmf test] 184772: all pass - PUSHED

2024-02-26 Thread osstest service owner
flight 184772 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184772/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 68238d4f948069fc2c6b9cc13863bdced52a84d0
baseline version:
 ovmf 44fdc4f3983be77dda709d7a0c3fb8905fbf920b

Last test of basis   184768  2024-02-26 15:43:07 Z0 days
Testing same since   184772  2024-02-26 19:43:08 Z0 days1 attempts


People who touched revisions under test:
  Junfeng Guan 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   44fdc4f398..68238d4f94  68238d4f948069fc2c6b9cc13863bdced52a84d0 -> 
xen-tested-master



RE: question about virtio-vsock on xen

2024-02-26 Thread Stefano Stabellini
On Mon, 26 Feb 2024, Peng Fan wrote:
> Hi Stefano,
> 
> > Subject: Re: question about virtio-vsock on xen
> > 
> > Hi Peng,
> > 
> > We haven't tried to setup virtio-vsock yet.
> > 
> > In general, I am very supportive of using QEMU for virtio backends. We use
> > QEMU to provide virtio-net, virtio-block, virtio-console and more.
> 
> Would you mind share how to setup virtio-console using qemu + xen?

Vikram (CCed) has been working on it and should be able to share more
details

 
> > However, typically virtio-vsock comes into play for VM-to-VM
> > communication, which is different. Going via QEMU in Dom0 just to have 1
> > VM communicate with another VM is not an ideal design: it adds latency and
> > uses resources in Dom0 when actually we could do without it.
> > 
> > A better model for VM-to-VM communication would be to have the VM talk
> > to each other directly via grant table or pre-shared memory (see the static
> > shared memory feature) or via Xen hypercalls (see Argo.)
> 
> The goal is to make android trout VM run with XEN + i.MX95, so need vsock.

I am not familiar with the details of Android Trout... Where is vsock
used? Just asking for my own understanding.


> > For a good Xen design, I think the virtio-vsock backend would need to be in
> > Xen itself (the hypervisor).
> > 
> > Of course that is more work and it doesn't help you with the specific 
> > question
> > you had below :-)
> > 
> > For that, I don't have a pointer to help you but maybe others in CC have.
> > 
> > Cheers,
> > 
> > Stefano
> > 
> > 
> > On Fri, 23 Feb 2024, Peng Fan wrote:
> > > Hi All,
> > >
> > > Has anyone make virtio-vsock on xen work? My dm args as below:
> > >
> > > virtio = [
> > >
> > 'backend=0,type=virtio,device,transport=pci,bdf=05:00.0,backend_type=qem
> > u,grant_usage=true'
> > > ]
> > > device_model_args = [
> > > '-D', '/home/root/qemu_log.txt',
> > > '-d',
> > > 'trace:*vsock*,trace:*vhost*,trace:*virtio*,trace:*pci_update*,trace:*
> > > pci_route*,trace:*handle_ioreq*,trace:*xen*',
> > > '-device',
> > > 'vhost-vsock-pci,iommu_platform=false,id=vhost-vsock-pci0,bus=pcie.0,a
> > > ddr=5.0,guest-cid=3']
> > >
> > > During my test, it always return failure in dom0 kernel in below code:
> > >
> > > vhost_transport_do_send_pkt {
> > > ...
> > >nbytes = copy_to_iter(hdr, sizeof(*hdr), &iov_iter);
> > > if (nbytes != sizeof(*hdr)) {
> > > vq_err(vq, "Faulted on copying pkt hdr %x %x %x 
> > > %px\n",
> > nbytes, sizeof(*hdr),
> > > __builtin_object_size(hdr, 0), &iov_iter);
> > > kfree_skb(skb);
> > > break;
> > > }
> > > }
> > >
> > > I checked copy_to_iter; it copies data to a __user addr, but it never
> > > succeeds: the copy to the __user addr always returns 0 bytes copied.
> > >
> > > The asm code "sttr x7, [x6]" will trigger data abort, the kernel will
> > > run into do_page_fault, but lock_mm_and_find_vma report it is
> > > VM_FAULT_BADMAP, that means the __user addr is not mapped, no vma
> > has this addr.
> > >
> > > I am not sure what may cause this. Appreciate if any comments.
> > >
> > > BTW: I tested blk pci, it works, so the virtio pci should work on my 
> > > setup.
> > >
> > > Thanks,
> > > Peng.
> > >
> 



Re: question about virtio-vsock on xen

2024-02-26 Thread Oleksandr Tyshchenko


On 26.02.24 05:09, Peng Fan wrote:
> Hi Oleksandr,

Hello Peng


[snip]

>>
>> ... Peng, we have vhost-vsock (and vhost-net) Xen PoC. Although it is 
>> non-
>> upstreamable in its current shape (based on old Linux version, requires some
>> rework and proper integration, most likely requires involving Qemu and
>> protocol changes to pass an additional info to vhost), it works with Linux
>> v5.10 + patched Qemu v7.0, so you can refer to the Yocto meta layer which
>> contains kernel patches for the details [1].
> 
> Thanks for the pointer, I am reading the code.
> 
>>
>> In a nutshell, before accessing the guest data the host module needs to map
>> descriptors in virtio rings which contain either guest grant based DMA
>> addresses (by using Xen grant mappings) or guest pseudo-physical addresses
>> (by using Xen foreign mappings). After accessing the guest data the host
>> module needs to unmap them.
> 
> Ok, I thought the current xen virtio code already mapped everything.
> 

It does; as you said, virtio-blk-pci worked in your environment. But 
vhost(-vsock) is a special case: unlike virtio-blk-pci, where the 
whole backend resides in Qemu, here we have a split model. As I 
understand it, Qemu performs only the initial setup/configuration and then 
offloads the I/O processing to a separate entity, which is the Linux 
module in that particular case.

Re: [PATCH v6 1/3] xen: introduce Kconfig function alignment option

2024-02-26 Thread Shawn Anastasio
Hi Roger,

On 2/7/24 8:55 AM, Roger Pau Monne wrote:
> And use it to replace CODE_ALIGN in assembly.  This allows to generalize the
> way the code alignment gets set across all architectures.
> 
> No functional change intended.
> 
> Signed-off-by: Roger Pau Monné 

Acked-by: Shawn Anastasio 

Thanks,
Shawn



Re: [PATCH v2 2/6] xen/ppc: address violations of MISRA C:2012 Rule 11.8.

2024-02-26 Thread Shawn Anastasio
Hi Stefano,

Sorry for the delay on this.

Acked-by: Shawn Anastasio 

Best,
Shawn

On 2/23/24 5:19 PM, Stefano Stabellini wrote:
> Shawn,
> 
> I am thinking of committing this, if you disagree please say something
> in the next couple of days
> 
> 
> On Tue, 19 Dec 2023, Simone Ballarin wrote:
>> From: Maria Celeste Cesario 
>>
>> The xen sources contain violations of MISRA C:2012 Rule 11.8 whose
>> headline states:
>> "A conversion shall not remove any const, volatile or _Atomic qualification
>> from the type pointed to by a pointer".
>>
>> Fix violation by adding missing const qualifier in cast.
>>
>> Signed-off-by: Maria Celeste Cesario  
>> Signed-off-by: Simone Ballarin  
>> Reviewed-by: Stefano Stabellini 
>> ---
>> Adaptation requested by the community to make the code more consistent.
>> ---
>>  xen/arch/ppc/include/asm/atomic.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/ppc/include/asm/atomic.h 
>> b/xen/arch/ppc/include/asm/atomic.h
>> index 64168aa3f1..fe778579fb 100644
>> --- a/xen/arch/ppc/include/asm/atomic.h
>> +++ b/xen/arch/ppc/include/asm/atomic.h
>> @@ -16,7 +16,7 @@
>>  
>>  static inline int atomic_read(const atomic_t *v)
>>  {
>> -return *(volatile int *)&v->counter;
>> +return *(const volatile int *)&v->counter;
>>  }
>>  
>>  static inline int _atomic_read(atomic_t v)
>> -- 
>> 2.40.0
>>



[xen-unstable-smoke test] 184769: tolerable all pass - PUSHED

2024-02-26 Thread osstest service owner
flight 184769 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184769/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  03fb5f503cb5faa0556b23d594496719ced6a11b
baseline version:
 xen  8de3afc0b402bc17f65093a53e5870862707a8c7

Last test of basis   184764  2024-02-26 10:02:04 Z0 days
Testing same since   184769  2024-02-26 16:00:26 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  George Dunlap 
  Jan Beulich 
  Roger Pau Monné 
  Stefano Stabellini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   8de3afc0b4..03fb5f503c  03fb5f503cb5faa0556b23d594496719ced6a11b -> smoke



[PATCH v5 22/23] xen/riscv: enable full Xen build

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
Reviewed-by: Jan Beulich 
---
Changes in V5:
 - Nothing changed. Only rebase.
---
Changes in V4:
 - drop stubs for irq_actor_none() and irq_actor_none() as common/irq.c is 
compiled now.
 - drop definition of max_page in stubs.c as common/page_alloc.c is compiled now.
 - drop printk() related changes in riscv/early_printk.c as common version will 
be used.
---
Changes in V3:
 - Reviewed-by: Jan Beulich 
 - unrelated change dropped in tiny64_defconfig
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/Makefile   |  16 +++-
 xen/arch/riscv/arch.mk|   4 -
 xen/arch/riscv/early_printk.c | 168 --
 xen/arch/riscv/stubs.c|  23 -
 4 files changed, 15 insertions(+), 196 deletions(-)

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 60afbc0ad9..81b77b13d6 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -12,10 +12,24 @@ $(TARGET): $(TARGET)-syms
$(OBJCOPY) -O binary -S $< $@
 
 $(TARGET)-syms: $(objtree)/prelink.o $(obj)/xen.lds
-   $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) -o $@
+   $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< \
+   $(objtree)/common/symbols-dummy.o -o $(dot-target).0
+   $(NM) -pa --format=sysv $(dot-target).0 \
+   | $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+   > $(dot-target).0.S
+   $(MAKE) $(build)=$(@D) $(dot-target).0.o
+   $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< \
+   $(dot-target).0.o -o $(dot-target).1
+   $(NM) -pa --format=sysv $(dot-target).1 \
+   | $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+   > $(dot-target).1.S
+   $(MAKE) $(build)=$(@D) $(dot-target).1.o
+   $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
+   $(dot-target).1.o -o $@
$(NM) -pa --format=sysv $@ \
| $(objtree)/tools/symbols --all-symbols --xensyms --sysv 
--sort \
> $@.map
+   rm -f $(@D)/.$(@F).[0-9]*
 
 $(obj)/xen.lds: $(src)/xen.lds.S FORCE
$(call if_changed_dep,cpp_lds_S)
diff --git a/xen/arch/riscv/arch.mk b/xen/arch/riscv/arch.mk
index fabe323ec5..197d5e1893 100644
--- a/xen/arch/riscv/arch.mk
+++ b/xen/arch/riscv/arch.mk
@@ -19,7 +19,3 @@ riscv-march-$(CONFIG_RISCV_ISA_C)   := $(riscv-march-y)c
 # -mcmodel=medlow would force Xen into the lower half.
 
 CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align 
-mcmodel=medany
-
-# TODO: Drop override when more of the build is working
-override ALL_OBJS-y = arch/$(SRCARCH)/built_in.o
-override ALL_LIBS-y =
diff --git a/xen/arch/riscv/early_printk.c b/xen/arch/riscv/early_printk.c
index 60742a042d..610c814f54 100644
--- a/xen/arch/riscv/early_printk.c
+++ b/xen/arch/riscv/early_printk.c
@@ -40,171 +40,3 @@ void early_printk(const char *str)
 str++;
 }
 }
-
-/*
- * The following #if 1 ... #endif should be removed after printk
- * and related stuff are ready.
- */
-#if 1
-
-#include 
-#include 
-
-/**
- * strlen - Find the length of a string
- * @s: The string to be sized
- */
-size_t (strlen)(const char * s)
-{
-const char *sc;
-
-for (sc = s; *sc != '\0'; ++sc)
-/* nothing */;
-return sc - s;
-}
-
-/**
- * memcpy - Copy one area of memory to another
- * @dest: Where to copy to
- * @src: Where to copy from
- * @count: The size of the area.
- *
- * You should not use this function to access IO space, use memcpy_toio()
- * or memcpy_fromio() instead.
- */
-void *(memcpy)(void *dest, const void *src, size_t count)
-{
-char *tmp = (char *) dest, *s = (char *) src;
-
-while (count--)
-*tmp++ = *s++;
-
-return dest;
-}
-
-int vsnprintf(char* str, size_t size, const char* format, va_list args)
-{
-size_t i = 0; /* Current position in the output string */
-size_t written = 0; /* Total number of characters written */
-char* dest = str;
-
-while ( format[i] != '\0' && written < size - 1 )
-{
-if ( format[i] == '%' )
-{
-i++;
-
-if ( format[i] == '\0' )
-break;
-
-if ( format[i] == '%' )
-{
-if ( written < size - 1 )
-{
-dest[written] = '%';
-written++;
-}
-i++;
-continue;
-}
-
-/*
- * Handle format specifiers.
- * For simplicity, only %s and %d are implemented here.
- */
-
-if ( format[i] == 's' )
-{
-char* arg = va_arg(args, char*);
-size_t arglen = strlen(arg);
-
-size_t remaining = size - written - 1;
-
-if ( arglen > remaining )
-arglen = remaining;
-
-memcpy(dest + written, arg, arglen);
-
-written += arglen;
- 

[PATCH v5 18/23] xen/riscv: add minimal stuff to processor.h to build full Xen

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - Code style fixes.
 - drop introduced TOOLCHAIN_HAS_ZIHINTPAUSE and use as-insn instead.
---
Changes in V4:
 - Change message -> subject in "Changes in V3"
 - Documentation about the system requirement was added. In the future, whether
the extension is supported by the system can be checked via
__riscv_isa_extension_available() ( 
https://gitlab.com/xen-project/people/olkur/xen/-/commit/737998e89ed305eb92059300c374dfa53d2143fa
 )
 - update cpu_relax() function to check if __riscv_zihintpause is supported by 
a toolchain
 - add conditional _zihintpause to -march if it is supported by a toolchain
Changes in V3:
 - update the commit subject
 - rename get_processor_id to smp_processor_id
 - code style fixes
 - update the cpu_relax instruction: use pause instruction instead of div %0, 
%0, zero
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 docs/misc/riscv/booting.txt|  8 
 xen/arch/riscv/arch.mk |  8 +++-
 xen/arch/riscv/include/asm/processor.h | 23 +++
 3 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 docs/misc/riscv/booting.txt

diff --git a/docs/misc/riscv/booting.txt b/docs/misc/riscv/booting.txt
new file mode 100644
index 00..38fad74956
--- /dev/null
+++ b/docs/misc/riscv/booting.txt
@@ -0,0 +1,8 @@
+System requirements
+===
+
+The following extensions are expected to be supported by a system on which
+Xen is run:
+- Zihintpause:
+  On a system that doesn't have this extension, cpu_relax() should be
+  implemented properly. Otherwise, an illegal instruction exception will arise.
diff --git a/xen/arch/riscv/arch.mk b/xen/arch/riscv/arch.mk
index 8403f96b6f..fabe323ec5 100644
--- a/xen/arch/riscv/arch.mk
+++ b/xen/arch/riscv/arch.mk
@@ -5,6 +5,12 @@ $(call cc-options-add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS))
 
 CFLAGS-$(CONFIG_RISCV_64) += -mabi=lp64
 
+ifeq ($(CONFIG_RISCV_64),y)
+has_zihintpause = $(call as-insn,$(CC) -mabi=lp64 -march=rv64i_zihintpause, 
"pause",_zihintpause,)
+else
+has_zihintpause = $(call as-insn,$(CC) -mabi=ilp32 -march=rv32i_zihintpause, 
"pause",_zihintpause,)
+endif
+
 riscv-march-$(CONFIG_RISCV_ISA_RV64G) := rv64g
 riscv-march-$(CONFIG_RISCV_ISA_C)   := $(riscv-march-y)c
 
@@ -12,7 +18,7 @@ riscv-march-$(CONFIG_RISCV_ISA_C)   := $(riscv-march-y)c
 # into the upper half _or_ the lower half of the address space.
 # -mcmodel=medlow would force Xen into the lower half.
 
-CFLAGS += -march=$(riscv-march-y) -mstrict-align -mcmodel=medany
+CFLAGS += -march=$(riscv-march-y)$(has_zihintpause) -mstrict-align 
-mcmodel=medany
 
 # TODO: Drop override when more of the build is working
 override ALL_OBJS-y = arch/$(SRCARCH)/built_in.o
diff --git a/xen/arch/riscv/include/asm/processor.h 
b/xen/arch/riscv/include/asm/processor.h
index 6db681d805..b96af07660 100644
--- a/xen/arch/riscv/include/asm/processor.h
+++ b/xen/arch/riscv/include/asm/processor.h
@@ -12,6 +12,9 @@
 
 #ifndef __ASSEMBLY__
 
+/* TODO: need to be implemeted */
+#define smp_processor_id() 0
+
 /* On stack VCPU state */
 struct cpu_user_regs
 {
@@ -53,6 +56,26 @@ struct cpu_user_regs
 unsigned long pregs;
 };
 
+/* TODO: need to implement */
+#define cpu_to_core(cpu)   (0)
+#define cpu_to_socket(cpu) (0)
+
+static inline void cpu_relax(void)
+{
+#ifdef __riscv_zihintpause
+/*
+ * Reduce instruction retirement.
+ * This assumes the PC changes.
+ */
+__asm__ __volatile__ ( "pause" );
+#else
+/* Encoding of the pause instruction */
+__asm__ __volatile__ ( ".insn 0x10F" );
+#endif
+
+barrier();
+}
+
 static inline void wfi(void)
 {
 __asm__ __volatile__ ("wfi");
-- 
2.43.0




[PATCH v5 14/23] xen/riscv: introduce monitor.h

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
---
 Waiting for dependency to be merged: [PATCH v6 0/9] Introduce generic headers
 
(https://lore.kernel.org/xen-devel/84568b0c24a5ec96244f3f34537e9a148367facf.1707499278.git.oleksii.kuroc...@gmail.com/)
---
Changes in V4/V5:
 - Nothing changed. Only rebase.
---
Changes in V3:
 - new patch.
---
 xen/arch/riscv/include/asm/monitor.h | 26 ++
 1 file changed, 26 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/monitor.h

diff --git a/xen/arch/riscv/include/asm/monitor.h b/xen/arch/riscv/include/asm/monitor.h
new file mode 100644
index 00..f4fe2c0690
--- /dev/null
+++ b/xen/arch/riscv/include/asm/monitor.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_RISCV_MONITOR_H__
+#define __ASM_RISCV_MONITOR_H__
+
+#include 
+
+#include 
+
+struct domain;
+
+static inline uint32_t arch_monitor_get_capabilities(struct domain *d)
+{
+BUG_ON("unimplemented");
+return 0;
+}
+
+#endif /* __ASM_RISCV_MONITOR_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0




[PATCH v5 21/23] xen/riscv: add minimal amount of stubs to build full Xen

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - drop unrelated changes
 - changed assert_failed("unimplemented...") to BUG_ON()
---
Changes in V4:
  - added new stubs which are necessary for compilation after rebase:
    __cpu_up(), __cpu_disable(), __cpu_die() from smpboot.c
  - brought back changes related to printk() in early_printk() as they should
    only be removed in the next patch to avoid a compilation error.
  - update definition of cpu_khz: __read_mostly -> __ro_after_init.
  - drop vm_event_reset_vmtrace(). It is defined in asm-generic/vm_event.h.
  - move vm_event_*() functions from stubs.c to riscv/vm_event.c.
  - s/BUG/BUG_ON("unimplemented") in stubs.c
  - brought back irq_actor_none() as common/irq.c isn't compiled at this
    moment, so these functions are needed to avoid a compilation error.
  - defined max_page to avoid a compilation error; it will be removed as soon
    as common/page_alloc.c is compiled.
---
Changes in V3:
 - code style fixes.
 - update the attribute for frametable_base_pdx and frametable_virt_end to
   __ro_after_init instead of __read_mostly.
 - use BUG() instead of assert_failed/WARN for newly introduced stubs.
 - drop "#include " in stubs.c and use forward declaration 
instead.
 - drop ack_node() and end_node() as they aren't used now.
---
Changes in V2:
 - define udelay stub
 - remove 'select HAS_PDX' from RISC-V Kconfig because of
   
https://lore.kernel.org/xen-devel/20231006144405.1078260-1-andrew.coop...@citrix.com/
---
 xen/arch/riscv/Makefile |   1 +
 xen/arch/riscv/mm.c |  50 +
 xen/arch/riscv/setup.c  |   8 +
 xen/arch/riscv/stubs.c  | 438 
 xen/arch/riscv/traps.c  |  25 +++
 5 files changed, 522 insertions(+)
 create mode 100644 xen/arch/riscv/stubs.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 1ed1a8369b..60afbc0ad9 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -4,6 +4,7 @@ obj-y += mm.o
 obj-$(CONFIG_RISCV_64) += riscv64/
 obj-y += sbi.o
 obj-y += setup.o
+obj-y += stubs.o
 obj-y += traps.o
 obj-y += vm_event.o
 
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index fe3a43be20..2c3fb7d72e 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -1,5 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include 
 #include 
 #include 
 #include 
@@ -14,6 +15,9 @@
 #include 
 #include 
 
+unsigned long __ro_after_init frametable_base_pdx;
+unsigned long __ro_after_init frametable_virt_end;
+
 struct mmu_desc {
 unsigned int num_levels;
 unsigned int pgtbl_count;
@@ -294,3 +298,49 @@ unsigned long __init calc_phys_offset(void)
 phys_offset = load_start - XEN_VIRT_START;
 return phys_offset;
 }
+
+void put_page(struct page_info *page)
+{
+BUG_ON("unimplemented");
+}
+
+unsigned long get_upper_mfn_bound(void)
+{
+/* No memory hotplug yet, so current memory limit is the final one. */
+return max_page - 1;
+}
+
+void arch_dump_shared_mem_info(void)
+{
+BUG_ON("unimplemented");
+}
+
+int populate_pt_range(unsigned long virt, unsigned long nr_mfns)
+{
+BUG_ON("unimplemented");
+return -1;
+}
+
+int xenmem_add_to_physmap_one(struct domain *d, unsigned int space,
+  union add_to_physmap_extra extra,
+  unsigned long idx, gfn_t gfn)
+{
+BUG_ON("unimplemented");
+
+return 0;
+}
+
+int destroy_xen_mappings(unsigned long s, unsigned long e)
+{
+BUG_ON("unimplemented");
+return -1;
+}
+
+int map_pages_to_xen(unsigned long virt,
+ mfn_t mfn,
+ unsigned long nr_mfns,
+ unsigned int flags)
+{
+BUG_ON("unimplemented");
+return -1;
+}
diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
index 98a94c4c48..8bb5bdb2ae 100644
--- a/xen/arch/riscv/setup.c
+++ b/xen/arch/riscv/setup.c
@@ -1,11 +1,19 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
+#include 
 #include 
 #include 
 #include 
 
+#include 
+
 #include 
 
+void arch_get_xen_caps(xen_capabilities_info_t *info)
+{
+BUG_ON("unimplemented");
+}
+
 /* Xen stack for bringing up the first CPU. */
 unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
 __aligned(STACK_SIZE);
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
new file mode 100644
index 00..529f1dbe52
--- /dev/null
+++ b/xen/arch/riscv/stubs.c
@@ -0,0 +1,438 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/* smpboot.c */
+
+cpumask_t cpu_online_map;
+cpumask_t cpu_present_map;
+cpumask_t cpu_possible_map;
+
+/* ID of the PCPU we're running on */
+DEFINE_PER_CPU(unsigned int, cpu_id);
+/* XXX these seem awfully x86ish... */
+/* representing HT siblings of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_mask);
+/* representing HT and core siblings of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_mask);
+

[ovmf test] 184768: all pass - PUSHED

2024-02-26 Thread osstest service owner
flight 184768 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184768/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 44fdc4f3983be77dda709d7a0c3fb8905fbf920b
baseline version:
 ovmf ba96acd963e794e86633ffd5e7f96cbc03db673c

Last test of basis   184761  2024-02-25 20:12:59 Z0 days
Testing same since   184768  2024-02-26 15:43:07 Z0 days1 attempts


People who touched revisions under test:
  Yi Li 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   ba96acd963..44fdc4f398  44fdc4f3983be77dda709d7a0c3fb8905fbf920b -> 
xen-tested-master



[PATCH v5 23/23] xen/README: add compiler and binutils versions for RISC-V64

2024-02-26 Thread Oleksii Kurochko
This patch doesn't represent a strict lower bound for GCC and
GNU Binutils; rather, these versions are specifically employed by
the Xen RISC-V container and are anticipated to undergo continuous
testing.

While it is feasible to utilize Clang, it's important to note that,
currently, there is no Xen RISC-V CI job in place to verify the
seamless functioning of the build with Clang.

Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
 - update the commit message and README file with additional explanation about
   the GCC and GNU Binutils versions. Additionally, information about Clang
   was added.
---
 Changes in V4:
 - Update the versions of GCC (12.2) and GNU Binutils (2.39) to the versions
   which are in Xen's container for RISC-V
---
 Changes in V3:
  - new patch
---
 README | 9 +
 1 file changed, 9 insertions(+)

diff --git a/README b/README
index c8a108449e..7fd4173743 100644
--- a/README
+++ b/README
@@ -48,6 +48,15 @@ provided by your OS distributor:
   - For ARM 64-bit:
 - GCC 5.1 or later
 - GNU Binutils 2.24 or later
+  - For RISC-V 64-bit:
+- GCC 12.2 or later
+- GNU Binutils 2.39 or later
+This doesn't represent a strict lower bound for GCC and GNU Binutils;
+rather, these versions are specifically employed by the Xen RISC-V
+container and are anticipated to undergo continuous testing.
+While it is feasible to utilize Clang, it's important to note that,
+currently, there is no Xen RISC-V CI job in place to verify the
+seamless functioning of the build with Clang.
 * POSIX compatible awk
 * Development install of zlib (e.g., zlib-dev)
 * Development install of Python 2.7 or later (e.g., python-dev)
-- 
2.43.0




[PATCH v5 13/23] xen/riscv: introduce atomic.h

2024-02-26 Thread Oleksii Kurochko
Initially the patch was introduced by Bobby, who took the header from the
Linux kernel.

The following changes were done on top of the Linux kernel header:
 - atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
 to use __*xchg_generic()
 - drop casts in write_atomic() as they are unnecessary
 - drop introduction of WRITE_ONCE() and READ_ONCE().
   Xen provides ACCESS_ONCE()
 - remove zero-length array access in read_atomic()
 - drop defines similar to pattern
 - #define atomic_add_return_relaxed   atomic_add_return_relaxed
 - move not RISC-V specific functions to asm-generic/atomics-ops.h

Signed-off-by: Bobby Eshleman 
Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - fence.h changes were moved to a separate patch as the patches related to
   io.h and cmpxchg.h, which are dependencies for this patch, also needed
   changes in fence.h
 - remove accessing of zero-length array
 - drops cast in write_atomic()
 - drop introduction of WRITE_ONCE() and READ_ONCE().
 - drop defines similar to pattern #define atomic_add_return_relaxed   
atomic_add_return_relaxed
 - Xen code style fixes
 - move not RISC-V specific functions to asm-generic/atomics-ops.h
---
Changes in V4:
 - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce 
cmpxchg.h
 - drop casts in read_atomic_size(), write_atomic(), add_sized()
 - tabs -> spaces
 - drop #ifdef CONFIG_SMP ... #endif in fence.h as it is simpler to handle
   NR_CPUS=1 the same as NR_CPUS>1, accepting less than ideal performance.
---
Changes in V3:
  - update the commit message
  - add SPDX for fence.h
  - code style fixes
  - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
  - re-order the patch
  - merge to this patch fence.h
---
Changes in V2:
 - Change an author of commit. I got this header from Bobby's old repo.
---
 xen/arch/riscv/include/asm/atomic.h  | 296 +++
 xen/include/asm-generic/atomic-ops.h |  92 +
 2 files changed, 388 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/atomic.h
 create mode 100644 xen/include/asm-generic/atomic-ops.h

diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
new file mode 100644
index 00..8007ae4c90
--- /dev/null
+++ b/xen/arch/riscv/include/asm/atomic.h
@@ -0,0 +1,296 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Taken and modified from Linux.
+ *
+ * The following changes were done:
+ * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
+ * to use __*xchg_generic()
+ * - drop casts in write_atomic() as they are unnecessary
+ * - drop introduction of WRITE_ONCE() and READ_ONCE().
+ *   Xen provides ACCESS_ONCE()
+ * - remove zero-length array access in read_atomic()
+ * - drop defines similar to pattern
+ *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
+ * - move not RISC-V specific functions to asm-generic/atomics-ops.h
+ * 
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ * Copyright (C) 2024 Vates SAS
+ */
+
+#ifndef _ASM_RISCV_ATOMIC_H
+#define _ASM_RISCV_ATOMIC_H
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+void __bad_atomic_size(void);
+
+/*
+ * Legacy from Linux kernel. For some reason they wanted to have ordered
+ * read/write access. Thereby read* is used instead of read_cpu()
+ */
+static always_inline void read_atomic_size(const volatile void *p,
+   void *res,
+   unsigned int size)
+{
+switch ( size )
+{
+case 1: *(uint8_t *)res = readb(p); break;
+case 2: *(uint16_t *)res = readw(p); break;
+case 4: *(uint32_t *)res = readl(p); break;
+case 8: *(uint64_t *)res = readq(p); break;
+default: __bad_atomic_size(); break;
+}
+}
+
+#define read_atomic(p) ({   \
+union { typeof(*p) val; char c[sizeof(*p)]; } x_;   \
+read_atomic_size(p, x_.c, sizeof(*p));  \
+x_.val; \
+})
+
+#define write_atomic(p, x)  \
+({  \
+typeof(*p) x__ = (x);   \
+switch ( sizeof(*p) )   \
+{   \
+case 1: writeb(x__,  p); break; \
+case 2: writew(x__, p); break;  \
+case 4: writel(x__, p); break;  \
+case 8: writeq(x__, p); break;  \
+default: __bad_atomic_size(); break;\
+}   \
+x__;\
+})
+
+#define add_sized(p, x) \
+({  \
+typeo

[PATCH v5 17/23] xen/riscv: add minimal stuff to page.h to build full Xen

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
Acked-by: Jan Beulich 
---
Changes in V5:
 - Nothing changed. Only rebase.
---
Changes in V4:
---
 - Change message -> subject in "Changes in V3"
 - s/BUG/BUG_ON("...")
 - Do proper rebase ( pfn_to_paddr() and paddr_to_pfn() aren't removed ).
---
Changes in V3:
 - update the commit subject
 - add implementation of PAGE_HYPERVISOR macros
 - add Acked-by: Jan Beulich 
 - drop definition of pfn_to_addr, and paddr_to_pfn in 
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/page.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h
index 95074e29b3..c831e16417 100644
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -6,6 +6,7 @@
 #ifndef __ASSEMBLY__
 
 #include 
+#include 
 #include 
 
 #include 
@@ -32,6 +33,10 @@
 #define PTE_LEAF_DEFAULT(PTE_VALID | PTE_READABLE | PTE_WRITABLE)
 #define PTE_TABLE   (PTE_VALID)
 
+#define PAGE_HYPERVISOR_RW  (PTE_VALID | PTE_READABLE | PTE_WRITABLE)
+
+#define PAGE_HYPERVISOR PAGE_HYPERVISOR_RW
+
 /* Calculate the offsets into the pagetables for a given VA */
 #define pt_linear_offset(lvl, va)   ((va) >> XEN_PT_LEVEL_SHIFT(lvl))
 
@@ -62,6 +67,20 @@ static inline bool pte_is_valid(pte_t p)
 return p.pte & PTE_VALID;
 }
 
+static inline void invalidate_icache(void)
+{
+BUG_ON("unimplemented");
+}
+
+#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
+#define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
+
+/* TODO: Flush the dcache for an entire page. */
+static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
+{
+BUG_ON("unimplemented");
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_RISCV_PAGE_H */
-- 
2.43.0




[PATCH v5 20/23] xen/riscv: introduce vm_event_*() functions

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - Only rebase was done.
---
Changes in V4:
  - New patch.
---
 xen/arch/riscv/Makefile   |  1 +
 xen/arch/riscv/vm_event.c | 19 +++
 2 files changed, 20 insertions(+)
 create mode 100644 xen/arch/riscv/vm_event.c

diff --git a/xen/arch/riscv/Makefile b/xen/arch/riscv/Makefile
index 2fefe14e7c..1ed1a8369b 100644
--- a/xen/arch/riscv/Makefile
+++ b/xen/arch/riscv/Makefile
@@ -5,6 +5,7 @@ obj-$(CONFIG_RISCV_64) += riscv64/
 obj-y += sbi.o
 obj-y += setup.o
 obj-y += traps.o
+obj-y += vm_event.o
 
 $(TARGET): $(TARGET)-syms
$(OBJCOPY) -O binary -S $< $@
diff --git a/xen/arch/riscv/vm_event.c b/xen/arch/riscv/vm_event.c
new file mode 100644
index 00..bb1fc73bc1
--- /dev/null
+++ b/xen/arch/riscv/vm_event.c
@@ -0,0 +1,19 @@
+#include 
+
+struct vm_event_st;
+struct vcpu;
+
+void vm_event_fill_regs(struct vm_event_st *req)
+{
+BUG_ON("unimplemented");
+}
+
+void vm_event_set_registers(struct vcpu *v, struct vm_event_st *rsp)
+{
+BUG_ON("unimplemented");
+}
+
+void vm_event_monitor_next_interrupt(struct vcpu *v)
+{
+/* Not supported on RISCV. */
+}
-- 
2.43.0




[PATCH v5 15/23] xen/riscv: add definition of __read_mostly

2024-02-26 Thread Oleksii Kurochko
The definition of __read_mostly should be dropped once the following is merged:
https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b3...@suse.com/

The patch introduces it in arch-specific header to not
block enabling of full Xen build for RISC-V.

Signed-off-by: Oleksii Kurochko 
---
- [PATCH] move __read_mostly to xen/cache.h  [2]

Right now, the patch series doesn't have a direct dependency on [2] and it
provides __read_mostly in the patch:
[PATCH v3 26/34] xen/riscv: add definition of __read_mostly
However, it will be dropped as soon as [2] is merged or at least when the
final version of the patch [2] is provided.

[2] 
https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b3...@suse.com/
---
Changes in V4-V6:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/cache.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/riscv/include/asm/cache.h b/xen/arch/riscv/include/asm/cache.h
index 69573eb051..94bd94db53 100644
--- a/xen/arch/riscv/include/asm/cache.h
+++ b/xen/arch/riscv/include/asm/cache.h
@@ -3,4 +3,6 @@
 #ifndef _ASM_RISCV_CACHE_H
 #define _ASM_RISCV_CACHE_H
 
+#define __read_mostly __section(".data.read_mostly")
+
 #endif /* _ASM_RISCV_CACHE_H */
-- 
2.43.0




[PATCH v5 19/23] xen/riscv: add minimal stuff to mm.h to build full Xen

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - update the comment around "struct domain *domain;" : zero -> NULL
 - fix indentation of "unsigned long val;"
 - put page_to_virt() and virt_to_page() close to each other.
 - drop the unnecessary leading underscore
 - drop a space before the comment: /* Count of uses of this frame as its current type. */
 - drop the comment about a page 'not as a shadow'; it is not necessary for RISC-V
---
Changes in V4:
 - update the argument name of the PFN_ORDER macro.
 - drop pad at the end of 'struct page_info'.
 - Change message -> subject in "Changes in V3"
 - delete duplicated macros from riscv/mm.h
 - fix identation in struct page_info
 - align comment for PGC_ macros
 - update definitions of domain_set_alloc_bitsize() and 
domain_clamp_alloc_bitsize()
 - drop unnecessary comments.
 - s/BUG/BUG_ON("...")
 - define __virt_to_maddr, __maddr_to_virt as stubs
 - add inclusion of xen/mm-frame.h for mfn_x and others
 - include "xen/mm.h" instead of "asm/mm.h" to fix compilation issues:
 In file included from arch/riscv/setup.c:7:
./arch/riscv/include/asm/mm.h:60:28: error: field 'list' has incomplete 
type
   60 | struct page_list_entry list;
  |^~~~
./arch/riscv/include/asm/mm.h:81:43: error: 'MAX_ORDER' undeclared here 
(not in a function)
   81 | unsigned long first_dirty:MAX_ORDER + 1;
  |   ^
./arch/riscv/include/asm/mm.h:81:31: error: bit-field 'first_dirty' 
width not an integer constant
   81 | unsigned long first_dirty:MAX_ORDER + 1;
 - Define __virt_to_mfn() and __mfn_to_virt() using maddr_to_mfn() and 
mfn_to_maddr().
---
Changes in V3:
 - update the commit title
 - introduce DIRECTMAP_VIRT_START.
 - drop changes related to pfn_to_paddr() and paddr_to_pfn() as they were removed in
   [PATCH v2 32/39] xen/riscv: add minimal stuff to asm/page.h to build full Xen
 - code style fixes.
 - drop get_page_nr and put_page_nr as they aren't needed for the time being
 - drop CONFIG_STATIC_MEMORY related things
 - code style fixes
---
Changes in V2:
 - define stub for arch_get_dma_bitsize(void)
---
 xen/arch/riscv/include/asm/mm.h | 246 
 xen/arch/riscv/mm.c |   2 +-
 xen/arch/riscv/setup.c  |   2 +-
 3 files changed, 248 insertions(+), 2 deletions(-)

diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 07c7a0abba..2f13c1c3c2 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -3,11 +3,252 @@
 #ifndef _ASM_RISCV_MM_H
 #define _ASM_RISCV_MM_H
 
+#include 
+#include 
+#include 
+#include 
+#include 
+
 #include 
 
 #define pfn_to_paddr(pfn) ((paddr_t)(pfn) << PAGE_SHIFT)
 #define paddr_to_pfn(pa)  ((unsigned long)((pa) >> PAGE_SHIFT))
 
+#define paddr_to_pdx(pa)mfn_to_pdx(maddr_to_mfn(pa))
+#define gfn_to_gaddr(gfn)   pfn_to_paddr(gfn_x(gfn))
+#define gaddr_to_gfn(ga)_gfn(paddr_to_pfn(ga))
+#define mfn_to_maddr(mfn)   pfn_to_paddr(mfn_x(mfn))
+#define maddr_to_mfn(ma)_mfn(paddr_to_pfn(ma))
+#define vmap_to_mfn(va) maddr_to_mfn(virt_to_maddr((vaddr_t)va))
+#define vmap_to_page(va)mfn_to_page(vmap_to_mfn(va))
+
+static inline unsigned long __virt_to_maddr(unsigned long va)
+{
+BUG_ON("unimplemented");
+return 0;
+}
+
+static inline void *__maddr_to_virt(unsigned long ma)
+{
+BUG_ON("unimplemented");
+return NULL;
+}
+
+#define virt_to_maddr(va) __virt_to_maddr((unsigned long)(va))
+#define maddr_to_virt(pa) __maddr_to_virt((unsigned long)(pa))
+
+/* Convert between Xen-heap virtual addresses and machine frame numbers. */
+#define __virt_to_mfn(va)  mfn_x(maddr_to_mfn(virt_to_maddr(va)))
+#define __mfn_to_virt(mfn) maddr_to_virt(mfn_to_maddr(_mfn(mfn)))
+
+/*
+ * We define non-underscored wrappers for the above conversion functions.
+ * These are overridden in various source files while the underscored
+ * versions remain intact.
+ */
+#define virt_to_mfn(va) __virt_to_mfn(va)
+#define mfn_to_virt(mfn)__mfn_to_virt(mfn)
+
+struct page_info
+{
+/* Each frame can be threaded onto a doubly-linked list. */
+struct page_list_entry list;
+
+/* Reference count and various PGC_xxx flags and fields. */
+unsigned long count_info;
+
+/* Context-dependent fields follow... */
+union {
+/* Page is in use: ((count_info & PGC_count_mask) != 0). */
+struct {
+/* Type reference count and various PGT_xxx flags and fields. */
+unsigned long type_info;
+} inuse;
+/* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
+union {
+struct {
+/*
+ * Index of the first *possibly* unscrubbed page in the buddy.
+ * One more bit than maximum possible order to accommodate
+ * INVALID_DIRTY_IDX.
+ */
+#define

[PATCH v5 09/23] xen/riscv: introduce bitops.h

2024-02-26 Thread Oleksii Kurochko
Taken from Linux-6.4.0-rc1

Xen's bitops.h consists of several Linux's headers:
* linux/arch/include/asm/bitops.h:
  * The following function were removed as they aren't used in Xen:
* test_and_set_bit_lock
* clear_bit_unlock
* __clear_bit_unlock
  * The following functions were renamed to match how they are
used by common code:
* __test_and_set_bit
* __test_and_clear_bit
  * The declaration and implementation of the following functions
were updated to make the Xen build happy:
* clear_bit
* set_bit
* __test_and_clear_bit
* __test_and_set_bit
  * linux/include/asm-generic/bitops/generic-non-atomic.h with the
following changes:
 * Only functions that can be reused in Xen were left;
   others were removed.
 * the message inside #ifndef ... #endif was updated.
 * __always_inline -> always_inline to align with the definition in
   xen/compiler.h.
 * update function prototypes from
   generic___test_and_*(unsigned long nr, volatile unsigned long *addr)
   to
   generic___test_and_*(unsigned long nr, volatile void *addr) to be
   consistent with other related macros/defines.
 * convert indentation from tabs to spaces.
 * inside generic__test_and_* use 'bitops_uint_t' instead of 'unsigned long'
   to be generic.

Signed-off-by: Oleksii Kurochko 
---
  Patches 04 - 08 of this patch series are prerequisite for this patch.
---
Changes in V5:
 - Code style fixes
 - s/__NOP/NOP/g
 - s/__NOT/NOT/g
 - update the comments above test_and_set_bit, test_and_clear_bit, set_bit,
   and clear_bit: all of them use an atomic operation and a memory barrier,
   so the operations cannot be reordered.
 - s/volatile uint32_t/volatile bitops_uint_t in  test_and_set_bit, 
test_and_clear_bit, set_bit, clear_bit.
 - update the commit message
 - split introduction of asm-generic functions to separate patches:
   Patches 04 - 08 of this patch series are prerequisite for this patch.
---
Changes in V4:
  - updated the commit message: dropped the part about what was taken from
    linux/include/asm-generic/bitops/find.h, as the related changes are now
    located in xen/bitops.h. These changes were also removed from riscv/bitops.h
  - switch tabs to spaces.
  - update the return type of the __ffs function and format it according to
    Xen code style. Move the function to the respective asm-generic header.
  - format ffsl() according to Xen code style; update the type of num:
    int -> unsigned, to align with the return type of the function. Move the
    function to the respective asm-generic header.
  - add new line for the files:
  asm-generic/bitops-bits.h
  asm-generic/ffz.h
  asm-generic/find-first-bit-set.h
  asm-generic/fls.h
  asm-generic/flsl.h
  asm-generic/test-bit.h
  - rename asm-generic/find-first-bit-set.h to asm-generic/find-first-set-bit.h 
to be aligned with the function
name implemented inside.
  - introduce generic___test_and_*() operations for non-atomic bitops.
  - rename the current __test_and_*() -> test_and_*() as their implementations
    are atomic-aware.
  - define __test_and_*() to generic___test_and_*().
  - introduce test_and_change_bit().
  - update asm-generic/bitops/bitops-bits.h to give architectures the
    possibility to change the BITOP_*() macros.
    Also, the bitops_uint_t type was introduced to make generic___test_and_*()
    generic.
  - "include asm-generic/bitops/bitops-bits.h" was added to files which use
    its definitions.
  - add comment why generic ffz is defined as __ffs().
  - update the commit message.
  - switch ffsl() to generic_ffsl().
---
Changes in V3:
 - update the commit message
 - Introduce the following asm-generic bitops headers:
create mode 100644 xen/arch/riscv/include/asm/bitops.h
create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
create mode 100644 xen/include/asm-generic/bitops/ffs.h
create mode 100644 xen/include/asm-generic/bitops/ffz.h
create mode 100644 xen/include/asm-generic/bitops/find-first-bit-set.h
create mode 100644 xen/include/asm-generic/bitops/fls.h
create mode 100644 xen/include/asm-generic/bitops/flsl.h
create mode 100644 xen/include/asm-generic/bitops/hweight.h
create mode 100644 xen/include/asm-generic/bitops/test-bit.h
 - switch some bitops functions to asm-generic's versions.
 - re-sync some macros with Linux kernel version mentioned in the commit 
message.
 - Xen code style fixes.
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/bitops.h | 152 
 xen/arch/riscv/include/asm/config.h |   2 +
 2 files changed, 154 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/bitops.h

diff --git a/xen/arch/riscv/include/asm/bitops.h b/xen/arch/riscv/include/asm/bitops.h
new file mode 100644
index 00..17b3cf5be5
--- /dev/null
+++ b/xen/arch/riscv/include/asm/bitops.h
@@ -0,0 +1,152 

[PATCH v5 11/23] xen/riscv: introduce cmpxchg.h

2024-02-26 Thread Oleksii Kurochko
The header was taken from Linux kernel 6.4.0-rc1.

Additionally, the following were updated:
* add emulation of {cmp}xchg for 1/2 byte types using 32-bit atomic
  access.
* replace tabs with spaces
* replace __* variables with *__
* introduce generic versions of xchg_* and cmpxchg_*.

The implementation of the 4- and 8-byte cases was left as it is done in the
Linux kernel, as per the RISC-V spec:
```
Table A.5 ( only part of the table was copied here )

Linux Construct   RVWMO Mapping
atomic  relaxedamo.{w|d}
atomic  acquireamo.{w|d}.aq
atomic  releaseamo.{w|d}.rl
atomic amo.{w|d}.aqrl

Linux Construct   RVWMO LR/SC Mapping
atomic  relaxedloop: lr.{w|d}; ; sc.{w|d}; bnez loop
atomic  acquireloop: lr.{w|d}.aq; ; sc.{w|d}; bnez loop
atomic  releaseloop: lr.{w|d}; ; sc.{w|d}.aqrl∗ ; bnez loop OR
   fence.tso; loop: lr.{w|d}; ; sc.{w|d}∗ ; bnez loop
atomic loop: lr.{w|d}.aq; ; sc.{w|d}.aqrl; bnez loop

The Linux mappings for release operations may seem stronger than necessary,
but these mappings are needed to cover some cases in which Linux requires
stronger orderings than the more intuitive mappings would provide.
In particular, as of the time this text is being written, Linux is actively
debating whether to require load-load, load-store, and store-store orderings
between accesses in one critical section and accesses in a subsequent critical
section in the same hart and protected by the same synchronization object.
Not all combinations of FENCE RW,W/FENCE R,RW mappings with aq/rl mappings
combine to provide such orderings.
There are a few ways around this problem, including:
1. Always use FENCE RW,W/FENCE R,RW, and never use aq/rl. This suffices
   but is undesirable, as it defeats the purpose of the aq/rl modifiers.
2. Always use aq/rl, and never use FENCE RW,W/FENCE R,RW. This does not
   currently work due to the lack of load and store opcodes with aq and rl
   modifiers.
3. Strengthen the mappings of release operations such that they would
   enforce sufficient orderings in the presence of either type of acquire 
mapping.
   This is the currently-recommended solution, and the one shown in Table A.5.
```

But in the Linux kernel, atomics were strengthened with fences:
```
Atomics present the same issue with locking: release and acquire
variants need to be strengthened to meet the constraints defined
by the Linux-kernel memory consistency model [1].

Atomics present a further issue: implementations of atomics such
as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
which do not give full-ordering with .aqrl; for example, current
implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
below to end up with the state indicated in the "exists" clause.

In order to "synchronize" LKMM and RISC-V's implementation, this
commit strengthens the implementations of the atomics operations
by replacing .rl and .aq with the use of ("lightweight") fences,
and by replacing .aqrl LR/SC pairs in sequences such as:

0:  lr.w.aqrl  %0, %addr
bne%0, %old, 1f
...
sc.w.aqrl  %1, %new, %addr
bnez   %1, 0b
1:

with sequences of the form:

0:  lr.w   %0, %addr
bne%0, %old, 1f
  ...
sc.w.rl%1, %new, %addr   /* SC-release   */
bnez   %1, 0b
fence  rw, rw/* "full" fence */
1:

following Daniel's suggestion.

These modifications were validated with simulation of the RISC-V
memory consistency model.

C lr-sc-aqrl-pair-vs-full-barrier

{}

P0(int *x, int *y, atomic_t *u)
{
int r0;
int r1;

WRITE_ONCE(*x, 1);
r0 = atomic_cmpxchg(u, 0, 1);
r1 = READ_ONCE(*y);
}

P1(int *x, int *y, atomic_t *v)
{
int r0;
int r1;

WRITE_ONCE(*y, 1);
r0 = atomic_cmpxchg(v, 0, 1);
r1 = READ_ONCE(*x);
}

exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)

[1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
https://marc.info/?l=linux-kernel&m=151633436614259&w=2
```

Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - update the commit message.
 - drop ALIGN_DOWN().
 - update the definition of emulate_xchg_1_2(): 
   - lr.d -> lr.w, sc.d -> sc.w.
   - drop ret argument.
   - code style fixes around asm volatile.
   - update prototype.
   - use named asm operands.
   - rename local variables.
   - add comment above the macros
 - update the definition of __xchg_generic:
   - drop local ptr__ variable.
   - code style fixes around switch()
   - update prototype.
 - introduce RISCV_FULL_BARRIES.
 - redefine cmpxchg()
 - update emulate_cmpxchg_1_2():
   - update prototype
   - update local variables names and usage of them
   - use named asm operands.
   - add comment above the macros
---
Changes in V4:
 - Code style fixes.
 - enforce that new and *ptr have the same type in __xchg_*(); also "\n"
   was removed at the en

[PATCH v5 03/23] xen/riscv: introduce nospec.h

2024-02-26 Thread Oleksii Kurochko
From the unprivileged doc:
  No standard hints are presently defined.
  We anticipate standard hints to eventually include memory-system spatial
  and temporal locality hints, branch prediction hints, thread-scheduling
  hints, security tags, and instrumentation flags for simulation/emulation.

Also, there are no speculative execution barriers.

Therefore, the functions evaluate_nospec() and block_speculation() should
remain empty until a specific platform has an extension to deal with
speculative execution.

Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
   - new patch
---
 xen/arch/riscv/include/asm/nospec.h | 25 +
 1 file changed, 25 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/nospec.h

diff --git a/xen/arch/riscv/include/asm/nospec.h 
b/xen/arch/riscv/include/asm/nospec.h
new file mode 100644
index 00..4fb404a0a2
--- /dev/null
+++ b/xen/arch/riscv/include/asm/nospec.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2024 Vates */
+
+#ifndef _ASM_GENERIC_NOSPEC_H
+#define _ASM_GENERIC_NOSPEC_H
+
+static inline bool evaluate_nospec(bool condition)
+{
+return condition;
+}
+
+static inline void block_speculation(void)
+{
+}
+
+#endif /* _ASM_GENERIC_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.43.0




[PATCH v5 16/23] xen/riscv: add required things to current.h

2024-02-26 Thread Oleksii Kurochko
Add the minimal required things to be able to build full Xen.

Signed-off-by: Oleksii Kurochko 
Acked-by: Jan Beulich 
---
Changes in V5:
 - Nothing changed. Only rebase.
---
Changes in V4:
 - BUG() was changed to BUG_ON("unimplemented");
 - Change "xen/bug.h" to "xen/lib.h" as BUG_ON is defined in xen/lib.h.
 - Add Acked-by: Jan Beulich 
---
Changes in V3:
 - add SPDX
 - drop a forward declaration of struct vcpu;
 - update guest_cpu_user_regs() macros
 - replace get_processor_id with smp_processor_id
 - update the commit message
 - code style fixes
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/current.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/xen/arch/riscv/include/asm/current.h 
b/xen/arch/riscv/include/asm/current.h
index d84f15dc50..aedb6dc732 100644
--- a/xen/arch/riscv/include/asm/current.h
+++ b/xen/arch/riscv/include/asm/current.h
@@ -3,6 +3,21 @@
 #ifndef __ASM_CURRENT_H
 #define __ASM_CURRENT_H
 
+#include 
+#include 
+#include 
+
+#ifndef __ASSEMBLY__
+
+/* Which VCPU is "current" on this PCPU. */
+DECLARE_PER_CPU(struct vcpu *, curr_vcpu);
+
#define current               this_cpu(curr_vcpu)
+#define set_current(vcpu)  do { current = (vcpu); } while (0)
+#define get_cpu_current(cpu)  per_cpu(curr_vcpu, cpu)
+
+#define guest_cpu_user_regs() ({ BUG_ON("unimplemented"); NULL; })
+
 #define switch_stack_and_jump(stack, fn) do {   \
 asm volatile (  \
 "mv sp, %0\n"   \
@@ -10,4 +25,8 @@
 unreachable();  \
 } while ( false )
 
+#define get_per_cpu_offset() __per_cpu_offset[smp_processor_id()]
+
+#endif /* __ASSEMBLY__ */
+
 #endif /* __ASM_CURRENT_H */
-- 
2.43.0




[PATCH v5 07/23] xen/asm-generic: introduce generic hweight64()

2024-02-26 Thread Oleksii Kurochko
The generic hweight() function can be useful for architectures
that don't have corresponding arch-specific instructions.

Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/hweight.h | 13 +
 1 file changed, 13 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/hweight.h

diff --git a/xen/include/asm-generic/bitops/hweight.h 
b/xen/include/asm-generic/bitops/hweight.h
new file mode 100644
index 00..0d7577054e
--- /dev/null
+++ b/xen/include/asm-generic/bitops/hweight.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_HWEIGHT_H_
+#define _ASM_GENERIC_BITOPS_HWEIGHT_H_
+
+/*
+ * hweightN - returns the hamming weight of a N-bit word
+ * @x: the word to weigh
+ *
+ * The Hamming Weight of a number is the total number of bits set in it.
+ */
+#define hweight64(x) generic_hweight64(x)
+
+#endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */
-- 
2.43.0




[PATCH v5 12/23] xen/riscv: introduce io.h

2024-02-26 Thread Oleksii Kurochko
The header is taken from Linux 6.4.0-rc1 and is based on
arch/riscv/include/asm/mmio.h with the following changes:
- drop forcing of endianness for the read*()/write*() functions, as
  whatever the CPU endianness, the endianness a particular device
  (and hence its MMIO region(s)) uses is entirely independent.
  Hence conversion, where necessary, needs to occur a layer up.
  Another reason to drop the endianness conversion here is:
  
https://patchwork.kernel.org/project/linux-riscv/patch/2019045623.5749-3-...@lst.de/
  One of the answers of the author of the commit:
And we don't know if Linux will be around if that ever changes.
The point is:
 a) the current RISC-V spec is LE only
 b) the current linux port is LE only except for this little bit
There is no point in leaving just this bitrotting code around.  It
just confuses developers, (very very slightly) slows down compiles
and will bitrot.  It also won't be any significant help to a future
developer down the road doing a hypothetical BE RISC-V Linux port.
- drop the unused argument of the __io_ar() macro.
- drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
  as they are unnecessary.
- Adopt the Xen code style for this header, considering that significant changes
  are not anticipated in the future.
  In the event of any issues, adapting them to Xen style should be easily
  manageable.
- drop unnecessary __r variables in the read*_cpu() macros

Additionally, definitions of ioremap_*() were added to the header.

Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - Xen code style related fixes
 - drop #define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}
 - drop cpu_to_le16()
 - remove unused argument in __io_ar()
 - update the commit message 
 - drop unnecessary __r variables in the read*_cpu() macros
 - update the comments at the top of the header.
---
Changes in V4:
 - delete inner parentheses in macros.
 - s/u/uint.
---
Changes in V3:
 - re-sync with linux kernel
 - update the commit message
---
Changes in V2:
 - Nothing changed. Only rebase.
---
 xen/arch/riscv/include/asm/io.h | 157 
 1 file changed, 157 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/io.h

diff --git a/xen/arch/riscv/include/asm/io.h b/xen/arch/riscv/include/asm/io.h
new file mode 100644
index 00..95a459432c
--- /dev/null
+++ b/xen/arch/riscv/include/asm/io.h
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ *  The header is taken from Linux 6.4.0-rc1 and is based on
+ *  arch/riscv/include/asm/mmio.h with the following changes:
+ *   - drop forcing of endianness for the read*()/write*() functions, as
+ * whatever the CPU endianness, the endianness a particular device
+ * (and hence its MMIO region(s)) uses is entirely independent.
+ * Hence conversion, where necessary, needs to occur a layer up.
+ * Another reason to drop the endianness conversion is:
+ * 
https://patchwork.kernel.org/project/linux-riscv/patch/2019045623.5749-3-...@lst.de/
+ * One of the answers of the author of the commit:
+ *   And we don't know if Linux will be around if that ever changes.
+ *   The point is:
+ *a) the current RISC-V spec is LE only
+ *b) the current linux port is LE only except for this little bit
+ *   There is no point in leaving just this bitrotting code around.  It
+ *   just confuses developers, (very very slightly) slows down compiles
+ *  and will bitrot.  It also won't be any significant help to a future
+ *   developer down the road doing a hypothetical BE RISC-V Linux port.
+ *   - drop unused argument of __io_ar() macros.
+ *   - drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
+ * as they are unnecessary.
+ *   - Adopt the Xen code style for this header, considering that significant 
changes
+ * are not anticipated in the future.
+ * In the event of any issues, adapting them to Xen style should be easily
+ * manageable.
+ *   - drop unnecessary __r variables in the read*_cpu() macros
+ *
+ * Copyright (C) 1996-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2014 Regents of the University of California
+ * Copyright (C) 2024 Vates
+ */
+
+#ifndef _ASM_RISCV_IO_H
+#define _ASM_RISCV_IO_H
+
+#include 
+
+/*
+ * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
+ * change the properties of memory regions.  This should be fixed by the
+ * upcoming platform spec.
+ */
+#define ioremap_nocache(addr, size) ioremap(addr, size)
+#define ioremap_wc(addr, size) ioremap(addr, size)
+#define ioremap_wt(addr, size) ioremap(addr, size)
+
+/* Generic IO read/write.  These perform native-endian accesses. */
+static inline void __raw_writeb(uint8_t val, volatile void __iomem *addr)
+{
+asm volatile ( "sb %0, 0(%1)" : : "r" (val), "r" (addr) );
+}
+
+static inline void __raw_writew(uint16_t val, volatile void __iomem *addr)
+{
+as

[PATCH v5 05/23] xen/asm-generic: introduce generic find first set bit functions

2024-02-26 Thread Oleksii Kurochko
These functions can be useful for architectures that don't
have corresponding arch-specific instructions.

Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/__ffs.h| 47 +++
 xen/include/asm-generic/bitops/ffs.h  |  9 
 xen/include/asm-generic/bitops/ffsl.h | 16 +++
 .../asm-generic/bitops/find-first-set-bit.h   | 17 +++
 4 files changed, 89 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/__ffs.h
 create mode 100644 xen/include/asm-generic/bitops/ffs.h
 create mode 100644 xen/include/asm-generic/bitops/ffsl.h
 create mode 100644 xen/include/asm-generic/bitops/find-first-set-bit.h

diff --git a/xen/include/asm-generic/bitops/__ffs.h 
b/xen/include/asm-generic/bitops/__ffs.h
new file mode 100644
index 00..fecb4484d9
--- /dev/null
+++ b/xen/include/asm-generic/bitops/__ffs.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS___FFS_H_
+#define _ASM_GENERIC_BITOPS___FFS_H_
+
+/**
+ * __ffs - find first set bit in word.
+ * @word: The word to search
+ *
+ * Returns the 0-indexed position of the least significant set bit.
+ * Undefined if no bit is set.
+ */
+static inline unsigned int __ffs(unsigned long word)
+{
+unsigned int num = 0;
+
+#if BITS_PER_LONG == 64
+if ( (word & 0xffffffff) == 0 )
+{
+num += 32;
+word >>= 32;
+}
+#endif
+if ( (word & 0xffff) == 0 )
+{
+num += 16;
+word >>= 16;
+}
+if ( (word & 0xff) == 0 )
+{
+num += 8;
+word >>= 8;
+}
+if ( (word & 0xf) == 0 )
+{
+num += 4;
+word >>= 4;
+}
+if ( (word & 0x3) == 0 )
+{
+num += 2;
+word >>= 2;
+}
+if ( (word & 0x1) == 0 )
+num += 1;
+return num;
+}
+
+#endif /* _ASM_GENERIC_BITOPS___FFS_H_ */
diff --git a/xen/include/asm-generic/bitops/ffs.h 
b/xen/include/asm-generic/bitops/ffs.h
new file mode 100644
index 00..3f75fded14
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffs.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFS_H_
+#define _ASM_GENERIC_BITOPS_FFS_H_
+
+#include 
+
+#define ffs(x) ({ unsigned int t_ = (x); fls(ISOLATE_LSB(t_)); })
+
+#endif /* _ASM_GENERIC_BITOPS_FFS_H_ */
diff --git a/xen/include/asm-generic/bitops/ffsl.h 
b/xen/include/asm-generic/bitops/ffsl.h
new file mode 100644
index 00..d0996808f5
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffsl.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFSL_H_
+#define _ASM_GENERIC_BITOPS_FFSL_H_
+
+/**
+ * ffsl - find first bit in long.
+ * @word: The word to search
+ *
+ * Returns 0 if no bit exists, otherwise returns 1-indexed bit location.
+ */
+static inline unsigned int ffsl(unsigned long word)
+{
+return generic_ffsl(word);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FFSL_H_ */
diff --git a/xen/include/asm-generic/bitops/find-first-set-bit.h 
b/xen/include/asm-generic/bitops/find-first-set-bit.h
new file mode 100644
index 00..7d28b8a89b
--- /dev/null
+++ b/xen/include/asm-generic/bitops/find-first-set-bit.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_
+#define _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_
+
+/**
+ * find_first_set_bit - find the first set bit in @word
+ * @word: the word to search
+ *
+ * Returns the bit-number of the first set bit (first bit being 0).
+ * The input must *not* be zero.
+ */
+static inline unsigned int find_first_set_bit(unsigned long word)
+{
+return ffsl(word) - 1;
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FIND_FIRST_SET_BIT_H_ */
-- 
2.43.0




[PATCH v5 01/23] xen/riscv: disable unnecessary configs

2024-02-26 Thread Oleksii Kurochko
This patch disables unnecessary configs for two cases:
1. By utilizing EXTRA_FIXED_RANDCONFIG for randconfig builds (GitLab CI jobs).
2. By using tiny64_defconfig for non-randconfig builds.

Signed-off-by: Oleksii Kurochko 
---
Changes in V5:
 - Rebase and drop duplicated configs in EXTRA_FIXED_RANDCONFIG list
 - Update the commit message
---
Changes in V4:
 - Nothing changed. Only rebase
---
Changes in V3:
 - Remove EXTRA_FIXED_RANDCONFIG for non-randconfig jobs.
   For non-randconfig jobs, it is sufficient to disable configs by using the 
defconfig.
 - Remove double blank lines in build.yaml file before 
archlinux-current-gcc-riscv64-debug
---
Changes in V2:
 - update the commit message.
 - remove xen/arch/riscv/Kconfig changes.
---
 automation/gitlab-ci/build.yaml | 24 
 xen/arch/riscv/configs/tiny64_defconfig | 17 +
 2 files changed, 41 insertions(+)

diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
index aac29ee13a..3b3d2c47dc 100644
--- a/automation/gitlab-ci/build.yaml
+++ b/automation/gitlab-ci/build.yaml
@@ -519,6 +519,30 @@ alpine-3.18-gcc-debug-arm64-boot-cpupools:
   CONFIG_EXPERT=y
   CONFIG_GRANT_TABLE=n
   CONFIG_MEM_ACCESS=n
+  CONFIG_SCHED_CREDIT=n
+  CONFIG_SCHED_CREDIT2=n
+  CONFIG_SCHED_RTDS=n
+  CONFIG_SCHED_NULL=n
+  CONFIG_SCHED_ARINC653=n
+  CONFIG_TRACEBUFFER=n
+  CONFIG_HYPFS=n
+  CONFIG_SPECULATIVE_HARDEN_ARRAY=n
+  CONFIG_ARGO=n
+  CONFIG_HYPFS_CONFIG=n
+  CONFIG_CORE_PARKING=n
+  CONFIG_DEBUG_TRACE=n
+  CONFIG_IOREQ_SERVER=n
+  CONFIG_CRASH_DEBUG=n
+  CONFIG_KEXEC=n
+  CONFIG_LIVEPATCH=n
+  CONFIG_NUMA=n
+  CONFIG_PERF_COUNTERS=n
+  CONFIG_HAS_PMAP=n
+  CONFIG_XENOPROF=n
+  CONFIG_COMPAT=n
+  CONFIG_UBSAN=n
+  CONFIG_NEEDS_LIBELF=n
+  CONFIG_XSM=n
 
 archlinux-current-gcc-riscv64:
   extends: .gcc-riscv64-cross-build
diff --git a/xen/arch/riscv/configs/tiny64_defconfig 
b/xen/arch/riscv/configs/tiny64_defconfig
index 09defe236b..35915255e6 100644
--- a/xen/arch/riscv/configs/tiny64_defconfig
+++ b/xen/arch/riscv/configs/tiny64_defconfig
@@ -7,6 +7,23 @@
 # CONFIG_GRANT_TABLE is not set
 # CONFIG_SPECULATIVE_HARDEN_ARRAY is not set
 # CONFIG_MEM_ACCESS is not set
+# CONFIG_ARGO is not set
+# CONFIG_HYPFS_CONFIG is not set
+# CONFIG_CORE_PARKING is not set
+# CONFIG_DEBUG_TRACE is not set
+# CONFIG_IOREQ_SERVER is not set
# CONFIG_CRASH_DEBUG is not set
+# CONFIG_KEXEC is not set
+# CONFIG_LIVEPATCH is not set
+# CONFIG_NUMA is not set
+# CONFIG_PERF_COUNTERS is not set
+# CONFIG_HAS_PMAP is not set
+# CONFIG_TRACEBUFFER is not set
+# CONFIG_XENOPROF is not set
+# CONFIG_COMPAT is not set
+# CONFIG_COVERAGE is not set
+# CONFIG_UBSAN is not set
+# CONFIG_NEEDS_LIBELF is not set
 
 CONFIG_RISCV_64=y
 CONFIG_DEBUG=y
-- 
2.43.0




[PATCH v5 10/23] xen/riscv: introduces acquire, release and full barriers

2024-02-26 Thread Oleksii Kurochko
Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
   - new patch
---
 xen/arch/riscv/include/asm/fence.h | 9 +
 1 file changed, 9 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/fence.h

diff --git a/xen/arch/riscv/include/asm/fence.h 
b/xen/arch/riscv/include/asm/fence.h
new file mode 100644
index 00..27f46fa897
--- /dev/null
+++ b/xen/arch/riscv/include/asm/fence.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_RISCV_FENCE_H
+#define _ASM_RISCV_FENCE_H
+
+#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
+#define RISCV_RELEASE_BARRIER   "\tfence rw, w\n"
+#define RISCV_FULL_BARRIER  "\tfence rw, rw\n"
+
+#endif /* _ASM_RISCV_FENCE_H */
-- 
2.43.0




[PATCH v5 08/23] xen/asm-generic: introduce generic non-atomic test_*bit()

2024-02-26 Thread Oleksii Kurochko
The patch introduces the following generic functions:
* test_bit
* generic___test_and_set_bit
* generic___test_and_clear_bit
* generic___test_and_change_bit

Also, the patch introduces the following generics which are
used by the functions mentioned above:
* BITOP_BITS_PER_WORD
* BITOP_MASK
* BITOP_WORD
* BITOP_TYPE

These functions and macros can be useful for architectures
that don't have corresponding arch-specific instructions.

Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/bitops-bits.h  | 21 +
 .../asm-generic/bitops/generic-non-atomic.h   | 89 +++
 xen/include/asm-generic/bitops/test-bit.h | 18 
 3 files changed, 128 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/bitops-bits.h
 create mode 100644 xen/include/asm-generic/bitops/generic-non-atomic.h
 create mode 100644 xen/include/asm-generic/bitops/test-bit.h

diff --git a/xen/include/asm-generic/bitops/bitops-bits.h 
b/xen/include/asm-generic/bitops/bitops-bits.h
new file mode 100644
index 00..4ece2affd6
--- /dev/null
+++ b/xen/include/asm-generic/bitops/bitops-bits.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_BITS_H_
+#define _ASM_GENERIC_BITOPS_BITS_H_
+
+#ifndef BITOP_BITS_PER_WORD
+#define BITOP_BITS_PER_WORD 32
+#endif
+
+#ifndef BITOP_MASK
+#define BITOP_MASK(nr)  (1U << ((nr) % BITOP_BITS_PER_WORD))
+#endif
+
+#ifndef BITOP_WORD
+#define BITOP_WORD(nr)  ((nr) / BITOP_BITS_PER_WORD)
+#endif
+
+#ifndef BITOP_TYPE
+typedef uint32_t bitops_uint_t;
+#endif
+
+#endif /* _ASM_GENERIC_BITOPS_BITS_H_ */
diff --git a/xen/include/asm-generic/bitops/generic-non-atomic.h 
b/xen/include/asm-generic/bitops/generic-non-atomic.h
new file mode 100644
index 00..42569d0d7c
--- /dev/null
+++ b/xen/include/asm-generic/bitops/generic-non-atomic.h
@@ -0,0 +1,89 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * The file is based on the Linux (6.4.0) header:
+ *   include/asm-generic/bitops/generic-non-atomic.h
+ *
+ * Only functions that can be reused in Xen were left; others were removed.
+ *
+ * Also, the following changes were done:
+ *  - the message inside #ifndef ... #endif was updated.
+ *  - __always_inline -> always_inline to be aligned with the definition in
+ *    xen/compiler.h.
+ *  - update function prototypes from
+ *    generic___test_and_*(unsigned long nr, volatile unsigned long *addr) to
+ *    generic___test_and_*(unsigned long nr, volatile void *addr) to be
+ *    consistent with other related macros/defines.
+ *  - indentation was converted from tabs to spaces.
+ *  - inside generic___test_and_* use 'bitops_uint_t' instead of 'unsigned long'
+ *    to be generic.
+ */
+
+#ifndef __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+#define __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+
+#include 
+
+#include 
+
+#ifndef _LINUX_BITOPS_H
+#error only  can be included directly
+#endif
+
+/*
+ * Generic definitions for bit operations, should not be used in regular code
+ * directly.
+ */
+
+/**
+ * generic___test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static always_inline bool
+generic___test_and_set_bit(unsigned long nr, volatile void *addr)
+{
+bitops_uint_t mask = BITOP_MASK(nr);
+bitops_uint_t *p = ((bitops_uint_t *)addr) + BITOP_WORD(nr);
+bitops_uint_t old = *p;
+
+*p = old | mask;
+return (old & mask) != 0;
+}
+
+/**
+ * generic___test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static always_inline bool
+generic___test_and_clear_bit(unsigned long nr, volatile void *addr)
+{
+bitops_uint_t mask = BITOP_MASK(nr);
+bitops_uint_t *p = ((bitops_uint_t *)addr) + BITOP_WORD(nr);
+bitops_uint_t old = *p;
+
+*p = old & ~mask;
+return (old & mask) != 0;
+}
+
+/* WARNING: non atomic and it can be reordered! */
+static always_inline bool
+generic___test_and_change_bit(unsigned long nr, volatile void *addr)
+{
+bitops_uint_t mask = BITOP_MASK(nr);
+bitops_uint_t *p = ((bitops_uint_t *)addr) + BITOP_WORD(nr);
+bitops_uint_t old = *p;
+
+*p = old ^ mask;
+return (old & mask) != 0;
+}
+
+#endif /* __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H */
diff --git a/xen/include/asm-generic/bitops/test-bit.h 
b/xen/include/asm-generic/bitops/test-bit.h
new file mode 100644
index 00..6fb414d808
--- /dev/null
+++ b/xen/include/asm-generic/bitops/test-bit.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0

[PATCH v5 06/23] xen/asm-generic: introduce generic ffz()

2024-02-26 Thread Oleksii Kurochko
The generic ffz() can be useful for architectures
that don't have corresponding arch-specific instruction.

Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/ffz.h | 18 ++
 1 file changed, 18 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/ffz.h

diff --git a/xen/include/asm-generic/bitops/ffz.h 
b/xen/include/asm-generic/bitops/ffz.h
new file mode 100644
index 00..5932fe6695
--- /dev/null
+++ b/xen/include/asm-generic/bitops/ffz.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FFZ_H_
+#define _ASM_GENERIC_BITOPS_FFZ_H_
+
+/*
+ * ffz - find first zero in word.
+ * @word: The word to search
+ *
+ * Undefined if no zero exists, so code should check against ~0UL first.
+ *
+ * ffz() is defined in terms of __ffs() rather than ffs() because that is
+ * how the Linux kernel (6.4.0), from which this header was taken, defines
+ * it, so this header stays aligned with the Linux version.
+ * Also, most architectures in Xen define it the same way.
+ */
+#define ffz(x)  __ffs(~(x))
+
+#endif /* _ASM_GENERIC_BITOPS_FFZ_H_ */
-- 
2.43.0




[PATCH v5 04/23] xen/asm-generic: introduce generic fls() and flsl() functions

2024-02-26 Thread Oleksii Kurochko
These functions can be useful for architectures that don't
have corresponding arch-specific instructions.

Signed-off-by: Oleksii Kurochko 
---
 Changes in V5:
   - new patch
---
 xen/include/asm-generic/bitops/fls.h  | 18 ++
 xen/include/asm-generic/bitops/flsl.h | 10 ++
 2 files changed, 28 insertions(+)
 create mode 100644 xen/include/asm-generic/bitops/fls.h
 create mode 100644 xen/include/asm-generic/bitops/flsl.h

diff --git a/xen/include/asm-generic/bitops/fls.h 
b/xen/include/asm-generic/bitops/fls.h
new file mode 100644
index 00..369a4c790c
--- /dev/null
+++ b/xen/include/asm-generic/bitops/fls.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FLS_H_
+#define _ASM_GENERIC_BITOPS_FLS_H_
+
+/**
+ * fls - find last (most-significant) bit set
+ * @x: the word to search
+ *
+ * This is defined the same way as ffs.
+ * Note fls(0) = 0, fls(1) = 1, fls(0x80000000) = 32.
+ */
+
+static inline int fls(unsigned int x)
+{
+return generic_fls(x);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FLS_H_ */
diff --git a/xen/include/asm-generic/bitops/flsl.h 
b/xen/include/asm-generic/bitops/flsl.h
new file mode 100644
index 00..d0a2e9c729
--- /dev/null
+++ b/xen/include/asm-generic/bitops/flsl.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_BITOPS_FLSL_H_
+#define _ASM_GENERIC_BITOPS_FLSL_H_
+
+static inline int flsl(unsigned long x)
+{
+return generic_flsl(x);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FLSL_H_ */
-- 
2.43.0




[PATCH v5 02/23] xen/riscv: use some asm-generic headers

2024-02-26 Thread Oleksii Kurochko
The following headers end up the same as asm-generic's version:
* altp2m.h
* device.h
* div64.h
* hardirq.h
* hypercall.h
* iocap.h
* paging.h
* percpu.h
* random.h
* softirq.h
* vm_event.h

RISC-V should utilize the asm-generic's version of the mentioned
headers instead of introducing them in the arch-specific folder.

Signed-off-by: Oleksii Kurochko 
Acked-by: Jan Beulich 
---
Changes in V5:
 - Nothing changed. Only rebase.
 - update the commit message.
 - drop the message above the revision log as this patch has no dependency
   on other patch series.
---
Changes in V4:
- removed numa.h from asm/include/Makefile because of the patch: [PATCH v2] 
NUMA: no need for asm/numa.h when !NUMA
- updated the commit message
---
Changes in V3:
 - remove monitor.h from the RISC-V asm/Makefile list.
 - add Acked-by: Jan Beulich 
---
Changes in V2:
 - New commit introduced in V2.
---
 xen/arch/riscv/include/asm/Makefile | 12 
 1 file changed, 12 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/Makefile

diff --git a/xen/arch/riscv/include/asm/Makefile 
b/xen/arch/riscv/include/asm/Makefile
new file mode 100644
index 00..ced02e26ed
--- /dev/null
+++ b/xen/arch/riscv/include/asm/Makefile
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+generic-y += altp2m.h
+generic-y += device.h
+generic-y += div64.h
+generic-y += hardirq.h
+generic-y += hypercall.h
+generic-y += iocap.h
+generic-y += paging.h
+generic-y += percpu.h
+generic-y += random.h
+generic-y += softirq.h
+generic-y += vm_event.h
-- 
2.43.0




[PATCH v5 00/23] [PATCH v4 00/30] Enable build of full Xen for RISC-V

2024-02-26 Thread Oleksii Kurochko
This patch series performs all of the additions necessary to drop the
build overrides for RISCV and enable the full Xen build. Except in cases
where compatible implementations already exist (e.g. atomic.h and
bitops.h), the newly added definitions are simple.

The patch series is based on the following patch series:
- [PATCH v6 0/9] Introduce generic headers [1]
- [PATCH] move __read_mostly to xen/cache.h [2]
- [XEN PATCH v2 1/3] xen: introduce STATIC_ASSERT_UNREACHABLE() [3]
- [PATCH] xen/lib: introduce generic find next bit operations [4]

Right now, the patch series doesn't have a direct dependency on [2] and it
provides __read_mostly in the patch:
[PATCH v3 26/34] xen/riscv: add definition of __read_mostly
However, it will be dropped as soon as [2] is merged or at least when the
final version of the patch [2] is provided.

[1] 
https://lore.kernel.org/xen-devel/cover.1703072575.git.oleksii.kuroc...@gmail.com/
[2] 
https://lore.kernel.org/xen-devel/f25eb5c9-7c14-6e23-8535-2c66772b3...@suse.com/
[3] 
https://lore.kernel.org/xen-devel/42fc6ae8d3eb802429d29c774502ff232340dc84.1706259490.git.federico.seraf...@bugseng.com/
[4] 
https://lore.kernel.org/xen-devel/52730e6314210ba4164a9934a720c4fda201447b.1706266854.git.oleksii.kuroc...@gmail.com/

---
Changes in V5:
 - Update the cover letter as one of the dependencies were merged to staging.
 - Introduced asm-generic for atomic ops and separate patches for
   asm-generic bit ops.
 - Moved fence.h to a separate patch to resolve dependencies of some
   patches on fence.h.
 - Patches were dropped as they were merged to staging:
   * [PATCH v4 03/30] xen: add support in public/hvm/save.h for PPC and RISC-V
   * [PATCH v4 04/30] xen/riscv: introduce cpufeature.h
   * [PATCH v4 05/30] xen/riscv: introduce guest_atomics.h
   * [PATCH v4 06/30] xen: avoid generation of empty asm/iommu.h
   * [PATCH v4 08/30] xen/riscv: introduce setup.h
   * [PATCH v4 10/30] xen/riscv: introduce flushtlb.h
   * [PATCH v4 11/30] xen/riscv: introduce smp.h
   * [PATCH v4 15/30] xen/riscv: introduce irq.h
   * [PATCH v4 16/30] xen/riscv: introduce p2m.h
   * [PATCH v4 17/30] xen/riscv: introduce regs.h
   * [PATCH v4 18/30] xen/riscv: introduce time.h
   * [PATCH v4 19/30] xen/riscv: introduce event.h
   * [PATCH v4 22/30] xen/riscv: define an address of frame table
 - Other changes are specific to individual patches; please look at each patch.
---
Changes in V4:
 - Update the cover letter message: new patch series dependencies.
 - Some patches were merged to staging, so they were dropped in this patch 
series:
 [PATCH v3 09/34] xen/riscv: introduce system.h
 [PATCH v3 18/34] xen/riscv: introduce domain.h
 [PATCH v3 19/34] xen/riscv: introduce guest_access.h
 - Was sent out of this patch series:
 [PATCH v3 16/34] xen/lib: introduce generic find next bit operations
 - [PATCH v3 17/34] xen/riscv: add compilation of generic find-next-bit.c was
   droped as CONFIG_GENERIC_FIND_NEXT_BIT was dropped.
 - All other changes are specific to a specific patch.
---
Changes in V3:
 - Update the cover letter message
 - The following patches were dropped as they were merged to staging:
[PATCH v2 03/39] xen/riscv:introduce asm/byteorder.h
[PATCH v2 04/39] xen/riscv: add public arch-riscv.h
[PATCH v2 05/39] xen/riscv: introduce spinlock.h
[PATCH v2 20/39] xen/riscv: define bug frame tables in xen.lds.S
[PATCH v2 34/39] xen: add RISCV support for pmu.h
[PATCH v2 35/39] xen: add necessary headers to common to build full Xen for 
RISC-V
 - Instead of the following patches were introduced new:
[PATCH v2 10/39] xen/riscv: introduce asm/iommu.h
[PATCH v2 11/39] xen/riscv: introduce asm/nospec.h
 - remove "asm/"  for commit messages which start with "xen/riscv:"
 - code style updates.
 - add emulation of {cmp}xchg_* for 1 and 2 bytes types.
 - code style fixes.
 - add SPDX and footer for the newly added headers.
 - introduce generic find-next-bit.c.
 - some other minor changes (see individual patches for details).
---
Changes in V2:
  - Drop the following patches as they are the part of [2]:
  [PATCH v1 06/57] xen/riscv: introduce paging.h
  [PATCH v1 08/57] xen/riscv: introduce asm/device.h
  [PATCH v1 10/57] xen/riscv: introduce asm/grant_table.h
  [PATCH v1 12/57] xen/riscv: introduce asm/hypercall.h
  [PATCH v1 13/57] xen/riscv: introduce asm/iocap.h
  [PATCH v1 15/57] xen/riscv: introduce asm/mem_access.h
  [PATCH v1 18/57] xen/riscv: introduce asm/random.h
  [PATCH v1 21/57] xen/riscv: introduce asm/xenoprof.h
  [PATCH v1 24/57] xen/riscv: introduce asm/percpu.h
  [PATCH v1 29/57] xen/riscv: introduce asm/hardirq.h
  [PATCH v1 33/57] xen/riscv: introduce asm/altp2m.h
  [PATCH v1 38/57] xen/riscv: introduce asm/monitor.h
  [PATCH v1 39/57] xen/riscv: introduce asm/numa.h
  [PATCH v1 42/57] xen/riscv: introduce asm/softirq.h
  - xen/lib.h in most of the cases were changed to xen/bug.h as
mostly

Re: [PATCH] docs/sphinx: Start an FAQ, and add Kconfig/CET details

2024-02-26 Thread Andrew Cooper
On 26/02/2024 4:52 pm, Jan Beulich wrote:
> On 26.02.2024 17:25, Andrew Cooper wrote:
>> This is long overdue, and we need to start somewhere.
>>
>> Signed-off-by: Andrew Cooper 
> Acked-by: Jan Beulich 

Thanks.

> perhaps (nit) with ...
>
>> --- /dev/null
>> +++ b/docs/faq.rst
>> @@ -0,0 +1,71 @@
>> +.. SPDX-License-Identifier: CC-BY-4.0
>> +
>> +Frequently Asked Questions
>> +==
>> +
>> +How do I...
>> +---
>> +
>> +... check whether a Kconfig option is active?
>> +^
>> +
>> +  Kconfig is a build time configuration system, combining inherent 
>> knowledge,
>> +  the capabilities of the toolchain, and explicit user choice to form a
>> +  configuration of a build of Xen.
>> +
>> +  A file, by default ``.config``, is produced by the build identifying the
>> +  configuration used.  Kconfig symbols all start with ``CONFIG_``, and come 
>> in
>> +  a variety of types including strings, integers and booleans.  Booleans are
>> +  the most common, and when active are expressed with ``...=y``.  e.g.::
>> +
>> +xen.git/xen$ grep CONFIG_FOO .config
>> +CONFIG_FOO_BOOLEAN=y
>> +CONFIG_FOO_STRING="lorem ipsum"
>> +CONFIG_FOO_INTEGER=42
>> +
>> +  Symbols which are either absent, or expressed as ``... is not set`` are
>> +  disabled.  e.g.::
>> +
>> +xen.git/xen$ grep CONFIG_BAR .config
>> +# CONFIG_BAR is not set
>> +
>> +  Builds of Xen configured with ``CONFIG_HYPFS_CONFIG=y`` embed their own
>> +  ``.config`` at build time, and can provide it to the :term:`control domain`
>> +  upon request.  e.g.::
>> +
>> +[root@host ~]# xenhypfs cat /buildinfo/config | grep -e FOO -e BAR
>> +CONFIG_FOO=y
>> +# CONFIG_BAR is not set
>> +
>> +
>> +... tell if CET is active?
>> +^^^^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +  Control-flow Enforcement Technology support was added to Xen 4.14.  It is
>> +  build time conditional, dependent on both having a new-enough toolchain and
>> +  an explicit Kconfig option, and also requires capable hardware.  See
>> +  :term:`CET`.
>> +
>> +  For CET-SS, Shadow Stacks, the minimum toolchain requirements are ``binutils
>> +  >= 2.29`` or ``LLVM >= 6``.  No specific compiler support is required.
>> +  Check for ``CONFIG_XEN_SHSTK`` being active.
>> +
>> +  For CET-IBT, Indirect Branch Tracking, the minimum toolchain requirements
>> +  are ``GCC >= 9`` and ``binutils >= 2.29``.  Xen relies on a compiler feature
>> +  which is specific to GCC at the time of writing.  Check for
>> +  ``CONFIG_XEN_IBT`` being active.
>> +
>> +  If a capable Xen with booted on capable hardware, and CET is not disabled by
> ... s/with/is/ (or "was").

Oops yes.  Will fix.

~Andrew



Re: [PATCH] docs/sphinx: Start an FAQ, and add Kconfig/CET details

2024-02-26 Thread Jan Beulich
On 26.02.2024 17:25, Andrew Cooper wrote:
> This is long overdue, and we need to start somewhere.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Jan Beulich 
perhaps (nit) with ...

> --- /dev/null
> +++ b/docs/faq.rst
> @@ -0,0 +1,71 @@
> +.. SPDX-License-Identifier: CC-BY-4.0
> +
> +Frequently Asked Questions
> +==========================
> +
> +How do I...
> +-----------
> +
> +... check whether a Kconfig option is active?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +  Kconfig is a build time configuration system, combining inherent knowledge,
> +  the capabilities of the toolchain, and explicit user choice to form a
> +  configuration of a build of Xen.
> +
> +  A file, by default ``.config``, is produced by the build identifying the
> +  configuration used.  Kconfig symbols all start with ``CONFIG_``, and come 
> in
> +  a variety of types including strings, integers and booleans.  Booleans are
> +  the most common, and when active are expressed with ``...=y``.  e.g.::
> +
> +xen.git/xen$ grep CONFIG_FOO .config
> +CONFIG_FOO_BOOLEAN=y
> +CONFIG_FOO_STRING="lorem ipsum"
> +CONFIG_FOO_INTEGER=42
> +
> +  Symbols which are either absent, or expressed as ``... is not set`` are
> +  disabled.  e.g.::
> +
> +xen.git/xen$ grep CONFIG_BAR .config
> +# CONFIG_BAR is not set
> +
> +  Builds of Xen configured with ``CONFIG_HYPFS_CONFIG=y`` embed their own
> +  ``.config`` at build time, and can provide it to the :term:`control domain`
> +  upon request.  e.g.::
> +
> +[root@host ~]# xenhypfs cat /buildinfo/config | grep -e FOO -e BAR
> +CONFIG_FOO=y
> +# CONFIG_BAR is not set
> +
> +
> +... tell if CET is active?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +  Control-flow Enforcement Technology support was added to Xen 4.14.  It is
> +  build time conditional, dependent on both having a new-enough toolchain and
> +  an explicit Kconfig option, and also requires capable hardware.  See
> +  :term:`CET`.
> +
> +  For CET-SS, Shadow Stacks, the minimum toolchain requirements are ``binutils
> +  >= 2.29`` or ``LLVM >= 6``.  No specific compiler support is required.
> +  Check for ``CONFIG_XEN_SHSTK`` being active.
> +
> +  For CET-IBT, Indirect Branch Tracking, the minimum toolchain requirements
> +  are ``GCC >= 9`` and ``binutils >= 2.29``.  Xen relies on a compiler feature
> +  which is specific to GCC at the time of writing.  Check for
> +  ``CONFIG_XEN_IBT`` being active.
> +
> +  If a capable Xen with booted on capable hardware, and CET is not disabled by

... s/with/is/ (or "was").

Jan



[PATCH v3 10/12] VMX: convert vmx_vmentry_control

2024-02-26 Thread Jan Beulich
... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
 static uint64_t __read_mostly vmx_vmfunc;
 
@@ -259,7 +258,6 @@ static int vmx_init_vmcs_config(bool bsp
 struct vmx_caps caps = {};
 u64 _vmx_ept_vpid_cap = 0;
 u64 _vmx_misc_cap = 0;
-u32 _vmx_vmentry_control;
 u64 _vmx_vmfunc = 0;
 bool mismatch = false;
 
@@ -480,7 +478,7 @@ static int vmx_init_vmcs_config(bool bsp
 min = 0;
 opt = (VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_GUEST_EFER |
VM_ENTRY_LOAD_BNDCFGS);
-_vmx_vmentry_control = adjust_vmx_controls(
+caps.vmentry_control = adjust_vmx_controls(
 "VMEntry Control", min, opt, MSR_IA32_VMX_ENTRY_CTLS, &mismatch);
 
 if ( mismatch )
@@ -491,7 +489,6 @@ static int vmx_init_vmcs_config(bool bsp
 /* First time through. */
 vmx_caps = caps;
 vmx_ept_vpid_cap   = _vmx_ept_vpid_cap;
-vmx_vmentry_control= _vmx_vmentry_control;
 vmx_caps.basic_msr = ((uint64_t)vmx_basic_msr_high << 32) |
  vmx_basic_msr_low;
 vmx_vmfunc = _vmx_vmfunc;
@@ -531,7 +528,7 @@ static int vmx_init_vmcs_config(bool bsp
 vmx_caps.vmexit_control, caps.vmexit_control);
 mismatch |= cap_check(
 "VMEntry Control",
-vmx_vmentry_control, _vmx_vmentry_control);
+vmx_caps.vmentry_control, caps.vmentry_control);
 mismatch |= cap_check(
 "EPT and VPID Capability",
 vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
@@ -1091,7 +1088,7 @@ static int construct_vmcs(struct vcpu *v
 {
 struct domain *d = v->domain;
 uint32_t vmexit_ctl = vmx_caps.vmexit_control;
-u32 vmentry_ctl = vmx_vmentry_control;
+u32 vmentry_ctl = vmx_caps.vmentry_control;
 int rc = 0;
 
 vmx_vmcs_enter(v);
@@ -2202,7 +2199,6 @@ int __init vmx_vmcs_init(void)
  * Make sure all dependent features are off as well.
  */
 memset(&vmx_caps, 0, sizeof(vmx_caps));
-vmx_vmentry_control= 0;
 vmx_ept_vpid_cap   = 0;
 vmx_vmfunc = 0;
 }
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -235,7 +235,6 @@ void vmx_vmcs_reload(struct vcpu *v);
 #define VM_ENTRY_LOAD_GUEST_PAT 0x4000
 #define VM_ENTRY_LOAD_GUEST_EFER0x8000
 #define VM_ENTRY_LOAD_BNDCFGS   0x0001
-extern u32 vmx_vmentry_control;
 
 #define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x0001U
 #define SECONDARY_EXEC_ENABLE_EPT   0x0002U
@@ -297,6 +296,7 @@ struct vmx_caps {
 uint32_t secondary_exec_control;
 uint64_t tertiary_exec_control;
 uint32_t vmexit_control;
+uint32_t vmentry_control;
 };
 extern struct vmx_caps vmx_caps;
 
@@ -325,9 +325,9 @@ extern struct vmx_caps vmx_caps;
 #define cpu_has_monitor_trap_flag \
 (vmx_caps.cpu_based_exec_control & CPU_BASED_MONITOR_TRAP_FLAG)
 #define cpu_has_vmx_pat \
-(vmx_vmentry_control & VM_ENTRY_LOAD_GUEST_PAT)
+(vmx_caps.vmentry_control & VM_ENTRY_LOAD_GUEST_PAT)
 #define cpu_has_vmx_efer \
-(vmx_vmentry_control & VM_ENTRY_LOAD_GUEST_EFER)
+(vmx_caps.vmentry_control & VM_ENTRY_LOAD_GUEST_EFER)
 #define cpu_has_vmx_unrestricted_guest \
 (vmx_caps.secondary_exec_control & SECONDARY_EXEC_UNRESTRICTED_GUEST)
 #define vmx_unrestricted_guest(v)   \
@@ -355,7 +355,7 @@ extern struct vmx_caps vmx_caps;
 (vmx_caps.secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 #define cpu_has_vmx_mpx \
 ((vmx_caps.vmexit_control & VM_EXIT_CLEAR_BNDCFGS) && \
- (vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS))
+ (vmx_caps.vmentry_control & VM_ENTRY_LOAD_BNDCFGS))
 #define cpu_has_vmx_xsaves \
 (vmx_caps.secondary_exec_control & SECONDARY_EXEC_XSAVES)
 #define cpu_has_vmx_tsc_scaling \




Re: [PATCH 5/6] nestedsvm: Remove bogus debug message from nestedsvm_check_intercepts

2024-02-26 Thread Andrew Cooper
On 06/02/2024 1:20 am, George Dunlap wrote:
> Changeset ef3e8db8068 ("x86/hvm: Corrections and improvements to
> unhandled vmexit logging") introduced a printk to the default path of
> the switch statement in nestedsvm_check_intercepts(), complaining of
> an unknown exit reason.
>
> Unfortunately, the "core" switch statement which is meant to handle
> all vmexit reasons is in nsvm_vmcb_guest_intercepts_exitcode(); the
> switch statement in nestedsvm_check_intercepts() is only meant to
> superimpose on top of that some special-casing for how the interaction
> between L1 and L0 vmexits is handled.
>
> Remove the printk, and add a comment to prevent future confusion.
>
> Signed-off-by: George Dunlap 

Erm...   The addition of this printk was very deliberate, to point out
where security fixes are needed.

It's not bogus in the slightest.  It is an error for exit reasons to not
be inspected for safety in this path.

Please revert this patch.

~Andrew



[PATCH v3 12/12] VMX: convert vmx_vmfunc

2024-02-26 Thread Jan Beulich
... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-static uint64_t __read_mostly vmx_vmfunc;
 
 static DEFINE_PER_CPU_READ_MOSTLY(paddr_t, vmxon_region);
 static DEFINE_PER_CPU(paddr_t, current_vmcs);
@@ -256,7 +255,6 @@ static int vmx_init_vmcs_config(bool bsp
 u32 vmx_basic_msr_low, vmx_basic_msr_high, min, opt;
 struct vmx_caps caps = {};
 u64 _vmx_misc_cap = 0;
-u64 _vmx_vmfunc = 0;
 bool mismatch = false;
 
 rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
@@ -458,14 +456,14 @@ static int vmx_init_vmcs_config(bool bsp
 /* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
 if ( caps.secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
 {
-rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
+rdmsrl(MSR_IA32_VMX_VMFUNC, caps.vmfunc);
 
 /*
  * VMFUNC leaf 0 (EPTP switching) must be supported.
  *
  * Or we just don't use VMFUNC.
  */
-if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
+if ( !(caps.vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
 caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
 }
 
@@ -488,7 +486,6 @@ static int vmx_init_vmcs_config(bool bsp
 vmx_caps = caps;
 vmx_caps.basic_msr = ((uint64_t)vmx_basic_msr_high << 32) |
  vmx_basic_msr_low;
-vmx_vmfunc = _vmx_vmfunc;
 
 vmx_display_features();
 
@@ -530,7 +527,7 @@ static int vmx_init_vmcs_config(bool bsp
 mismatch |= cap_check("VPID Capability", vmx_caps.vpid, caps.vpid);
 mismatch |= cap_check(
 "VMFUNC Capability",
-vmx_vmfunc, _vmx_vmfunc);
+vmx_caps.vmfunc, caps.vmfunc);
 if ( cpu_has_vmx_ins_outs_instr_info !=
  !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
 {
@@ -2195,7 +2192,6 @@ int __init vmx_vmcs_init(void)
  * Make sure all dependent features are off as well.
  */
 memset(&vmx_caps, 0, sizeof(vmx_caps));
-vmx_vmfunc = 0;
 }
 
 return ret;
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -298,6 +298,7 @@ struct vmx_caps {
 uint32_t vmentry_control;
 uint32_t ept;
 uint32_t vpid;
+uint64_t vmfunc;
 };
 extern struct vmx_caps vmx_caps;
 




[PATCH v3 11/12] VMX: convert vmx_ept_vpid_cap

2024-02-26 Thread Jan Beulich
... to fields in the capability/controls struct: Take the opportunity
and split the two halves into separate EPT and VPID fields.

Signed-off-by: Jan Beulich 
---
v3: Re-base.
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-u64 vmx_ept_vpid_cap __read_mostly;
 static uint64_t __read_mostly vmx_vmfunc;
 
 static DEFINE_PER_CPU_READ_MOSTLY(paddr_t, vmxon_region);
@@ -256,7 +255,6 @@ static int vmx_init_vmcs_config(bool bsp
 {
 u32 vmx_basic_msr_low, vmx_basic_msr_high, min, opt;
 struct vmx_caps caps = {};
-u64 _vmx_ept_vpid_cap = 0;
 u64 _vmx_misc_cap = 0;
 u64 _vmx_vmfunc = 0;
 bool mismatch = false;
@@ -365,10 +363,10 @@ static int vmx_init_vmcs_config(bool bsp
 if ( caps.secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
 SECONDARY_EXEC_ENABLE_VPID) )
 {
-rdmsrl(MSR_IA32_VMX_EPT_VPID_CAP, _vmx_ept_vpid_cap);
+rdmsr(MSR_IA32_VMX_EPT_VPID_CAP, caps.ept, caps.vpid);
 
 if ( !opt_ept_ad )
-_vmx_ept_vpid_cap &= ~VMX_EPT_AD_BIT;
+caps.ept &= ~VMX_EPT_AD_BIT;
 
 /*
  * Additional sanity checking before using EPT:
@@ -381,9 +379,9 @@ static int vmx_init_vmcs_config(bool bsp
  *
  * Or we just don't use EPT.
  */
-if ( !(_vmx_ept_vpid_cap & VMX_EPT_MEMORY_TYPE_WB) ||
- !(_vmx_ept_vpid_cap & VMX_EPT_WALK_LENGTH_4_SUPPORTED) ||
- !(_vmx_ept_vpid_cap & VMX_EPT_INVEPT_ALL_CONTEXT) )
+if ( !(caps.ept & VMX_EPT_MEMORY_TYPE_WB) ||
+ !(caps.ept & VMX_EPT_WALK_LENGTH_4_SUPPORTED) ||
+ !(caps.ept & VMX_EPT_INVEPT_ALL_CONTEXT) )
 caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_EPT;
 
 /*
@@ -392,11 +390,11 @@ static int vmx_init_vmcs_config(bool bsp
  *
  * Or we just don't use VPID.
  */
-if ( !(_vmx_ept_vpid_cap & VMX_VPID_INVVPID_ALL_CONTEXT) )
+if ( !(caps.vpid & VMX_VPID_INVVPID_ALL_CONTEXT) )
 caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
 
 /* EPT A/D bits is required for PML */
-if ( !(_vmx_ept_vpid_cap & VMX_EPT_AD_BIT) )
+if ( !(caps.ept & VMX_EPT_AD_BIT) )
 caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
 }
 
@@ -488,7 +486,6 @@ static int vmx_init_vmcs_config(bool bsp
 {
 /* First time through. */
 vmx_caps = caps;
-vmx_ept_vpid_cap   = _vmx_ept_vpid_cap;
 vmx_caps.basic_msr = ((uint64_t)vmx_basic_msr_high << 32) |
  vmx_basic_msr_low;
 vmx_vmfunc = _vmx_vmfunc;
@@ -529,9 +526,8 @@ static int vmx_init_vmcs_config(bool bsp
 mismatch |= cap_check(
 "VMEntry Control",
 vmx_caps.vmentry_control, caps.vmentry_control);
-mismatch |= cap_check(
-"EPT and VPID Capability",
-vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
+mismatch |= cap_check("EPT Capability", vmx_caps.ept, caps.ept);
+mismatch |= cap_check("VPID Capability", vmx_caps.vpid, caps.vpid);
 mismatch |= cap_check(
 "VMFUNC Capability",
 vmx_vmfunc, _vmx_vmfunc);
@@ -2199,7 +2195,6 @@ int __init vmx_vmcs_init(void)
  * Make sure all dependent features are off as well.
  */
 memset(&vmx_caps, 0, sizeof(vmx_caps));
-vmx_ept_vpid_cap   = 0;
 vmx_vmfunc = 0;
 }
 
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -274,12 +274,11 @@ void vmx_vmcs_reload(struct vcpu *v);
 #define VMX_EPT_AD_BIT  0x0020
 #define VMX_EPT_INVEPT_SINGLE_CONTEXT   0x0200
 #define VMX_EPT_INVEPT_ALL_CONTEXT  0x0400
-#define VMX_VPID_INVVPID_INSTRUCTION 0x001ULL
-#define VMX_VPID_INVVPID_INDIVIDUAL_ADDR 0x100ULL
-#define VMX_VPID_INVVPID_SINGLE_CONTEXT  0x200ULL
-#define VMX_VPID_INVVPID_ALL_CONTEXT 0x400ULL
-#define VMX_VPID_INVVPID_SINGLE_CONTEXT_RETAINING_GLOBAL 0x800ULL
-extern u64 vmx_ept_vpid_cap;
+#define VMX_VPID_INVVPID_INSTRUCTION0x0001
+#define VMX_VPID_INVVPID_INDIVIDUAL_ADDR0x0100
+#define VMX_VPID_INVVPID_SINGLE_CONTEXT 0x0200
+#define VMX_VPID_INVVPID_ALL_CONTEXT0x0400
+#define VMX_VPID_INVVPID_SINGLE_CONTEXT_RETAINING_GLOBAL0x0800
 
 #define VMX_MISC_ACTIVITY_MASK  0x01c0
 #define VMX_MISC_PROC_TRACE 0x4000
@@ -297,6 +296,8 @@ struct vmx_caps {
 uint64_t 

[PATCH v3 09/12] VMX: convert vmx_vmexit_control

2024-02-26 Thread Jan Beulich
... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
 static uint64_t __read_mostly vmx_vmfunc;
@@ -260,7 +259,6 @@ static int vmx_init_vmcs_config(bool bsp
 struct vmx_caps caps = {};
 u64 _vmx_ept_vpid_cap = 0;
 u64 _vmx_misc_cap = 0;
-u32 _vmx_vmexit_control;
 u32 _vmx_vmentry_control;
 u64 _vmx_vmfunc = 0;
 bool mismatch = false;
@@ -442,7 +440,7 @@ static int vmx_init_vmcs_config(bool bsp
 opt = (VM_EXIT_SAVE_GUEST_PAT | VM_EXIT_LOAD_HOST_PAT |
VM_EXIT_LOAD_HOST_EFER | VM_EXIT_CLEAR_BNDCFGS);
 min |= VM_EXIT_IA32E_MODE;
-_vmx_vmexit_control = adjust_vmx_controls(
+caps.vmexit_control = adjust_vmx_controls(
 "VMExit Control", min, opt, MSR_IA32_VMX_EXIT_CTLS, &mismatch);
 
 /*
@@ -493,7 +491,6 @@ static int vmx_init_vmcs_config(bool bsp
 /* First time through. */
 vmx_caps = caps;
 vmx_ept_vpid_cap   = _vmx_ept_vpid_cap;
-vmx_vmexit_control = _vmx_vmexit_control;
 vmx_vmentry_control= _vmx_vmentry_control;
 vmx_caps.basic_msr = ((uint64_t)vmx_basic_msr_high << 32) |
  vmx_basic_msr_low;
@@ -531,7 +528,7 @@ static int vmx_init_vmcs_config(bool bsp
 vmx_caps.tertiary_exec_control, caps.tertiary_exec_control);
 mismatch |= cap_check(
 "VMExit Control",
-vmx_vmexit_control, _vmx_vmexit_control);
+vmx_caps.vmexit_control, caps.vmexit_control);
 mismatch |= cap_check(
 "VMEntry Control",
 vmx_vmentry_control, _vmx_vmentry_control);
@@ -1093,7 +1090,7 @@ void nocall vmx_asm_vmexit_handler(void)
 static int construct_vmcs(struct vcpu *v)
 {
 struct domain *d = v->domain;
-u32 vmexit_ctl = vmx_vmexit_control;
+uint32_t vmexit_ctl = vmx_caps.vmexit_control;
 u32 vmentry_ctl = vmx_vmentry_control;
 int rc = 0;
 
@@ -2205,7 +2202,6 @@ int __init vmx_vmcs_init(void)
  * Make sure all dependent features are off as well.
  */
 memset(&vmx_caps, 0, sizeof(vmx_caps));
-vmx_vmexit_control = 0;
 vmx_vmentry_control= 0;
 vmx_ept_vpid_cap   = 0;
 vmx_vmfunc = 0;
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1060,7 +1060,7 @@ static void load_shadow_control(struct v
 nvmx_update_pin_control(v, vmx_caps.pin_based_exec_control);
 vmx_update_cpu_exec_control(v);
 vmx_update_secondary_exec_control(v);
-nvmx_update_exit_control(v, vmx_vmexit_control);
+nvmx_update_exit_control(v, vmx_caps.vmexit_control);
 nvmx_update_entry_control(v);
 vmx_update_exception_bitmap(v);
 nvmx_update_apic_access_address(v);
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -227,7 +227,6 @@ void vmx_vmcs_reload(struct vcpu *v);
 #define VM_EXIT_LOAD_HOST_EFER  0x0020
 #define VM_EXIT_SAVE_PREEMPT_TIMER  0x0040
 #define VM_EXIT_CLEAR_BNDCFGS   0x0080
-extern u32 vmx_vmexit_control;
 
 #define VM_ENTRY_IA32E_MODE 0x0200
 #define VM_ENTRY_SMM                0x0400
@@ -297,6 +296,7 @@ struct vmx_caps {
 uint32_t cpu_based_exec_control;
 uint32_t secondary_exec_control;
 uint64_t tertiary_exec_control;
+uint32_t vmexit_control;
 };
 extern struct vmx_caps vmx_caps;
 
@@ -354,7 +354,7 @@ extern struct vmx_caps vmx_caps;
 #define cpu_has_vmx_pml \
 (vmx_caps.secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 #define cpu_has_vmx_mpx \
-((vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) && \
+((vmx_caps.vmexit_control & VM_EXIT_CLEAR_BNDCFGS) && \
  (vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS))
 #define cpu_has_vmx_xsaves \
 (vmx_caps.secondary_exec_control & SECONDARY_EXEC_XSAVES)




[PATCH v3 08/12] VMX: convert vmx_tertiary_exec_control

2024-02-26 Thread Jan Beulich
... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich 
---
v3: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-uint64_t vmx_tertiary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
@@ -259,7 +258,6 @@ static int vmx_init_vmcs_config(bool bsp
 {
 u32 vmx_basic_msr_low, vmx_basic_msr_high, min, opt;
 struct vmx_caps caps = {};
-uint64_t _vmx_tertiary_exec_control = 0;
 u64 _vmx_ept_vpid_cap = 0;
 u64 _vmx_misc_cap = 0;
 u32 _vmx_vmexit_control;
@@ -362,7 +360,7 @@ static int vmx_init_vmcs_config(bool bsp
 {
 uint64_t opt = 0;
 
-_vmx_tertiary_exec_control = adjust_vmx_controls2(
+caps.tertiary_exec_control = adjust_vmx_controls2(
 "Tertiary Exec Control", 0, opt,
 MSR_IA32_VMX_PROCBASED_CTLS3, &mismatch);
 }
@@ -494,7 +492,6 @@ static int vmx_init_vmcs_config(bool bsp
 {
 /* First time through. */
 vmx_caps = caps;
-vmx_tertiary_exec_control  = _vmx_tertiary_exec_control;
 vmx_ept_vpid_cap   = _vmx_ept_vpid_cap;
 vmx_vmexit_control = _vmx_vmexit_control;
 vmx_vmentry_control= _vmx_vmentry_control;
@@ -531,7 +528,7 @@ static int vmx_init_vmcs_config(bool bsp
 vmx_caps.secondary_exec_control, caps.secondary_exec_control);
 mismatch |= cap_check(
 "Tertiary Exec Control",
-vmx_tertiary_exec_control, _vmx_tertiary_exec_control);
+vmx_caps.tertiary_exec_control, caps.tertiary_exec_control);
 mismatch |= cap_check(
 "VMExit Control",
 vmx_vmexit_control, _vmx_vmexit_control);
@@ -1110,7 +1107,7 @@ static int construct_vmcs(struct vcpu *v
 v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
 
 v->arch.hvm.vmx.secondary_exec_control = vmx_caps.secondary_exec_control;
-v->arch.hvm.vmx.tertiary_exec_control  = vmx_tertiary_exec_control;
+v->arch.hvm.vmx.tertiary_exec_control  = vmx_caps.tertiary_exec_control;
 
 /*
  * Disable features which we don't want active by default:
@@ -2208,7 +2205,6 @@ int __init vmx_vmcs_init(void)
  * Make sure all dependent features are off as well.
  */
 memset(&vmx_caps, 0, sizeof(vmx_caps));
-vmx_tertiary_exec_control  = 0;
 vmx_vmexit_control = 0;
 vmx_vmentry_control= 0;
 vmx_ept_vpid_cap   = 0;
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -265,7 +265,6 @@ extern u32 vmx_vmentry_control;
 #define TERTIARY_EXEC_GUEST_PAGING_VERIFY   BIT(3, UL)
 #define TERTIARY_EXEC_IPI_VIRT  BIT(4, UL)
 #define TERTIARY_EXEC_VIRT_SPEC_CTRLBIT(7, UL)
-extern uint64_t vmx_tertiary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED 0x0001
 #define VMX_EPT_WALK_LENGTH_4_SUPPORTED 0x0040
@@ -297,6 +296,7 @@ struct vmx_caps {
 uint32_t pin_based_exec_control;
 uint32_t cpu_based_exec_control;
 uint32_t secondary_exec_control;
+uint64_t tertiary_exec_control;
 };
 extern struct vmx_caps vmx_caps;
 




[PATCH v3 07/12] VMX: convert vmx_secondary_exec_control

2024-02-26 Thread Jan Beulich
... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-u32 vmx_secondary_exec_control __read_mostly;
 uint64_t vmx_tertiary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
@@ -260,7 +259,6 @@ static int vmx_init_vmcs_config(bool bsp
 {
 u32 vmx_basic_msr_low, vmx_basic_msr_high, min, opt;
 struct vmx_caps caps = {};
-u32 _vmx_secondary_exec_control = 0;
 uint64_t _vmx_tertiary_exec_control = 0;
 u64 _vmx_ept_vpid_cap = 0;
 u64 _vmx_misc_cap = 0;
@@ -355,7 +353,7 @@ static int vmx_init_vmcs_config(bool bsp
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
 
-_vmx_secondary_exec_control = adjust_vmx_controls(
+caps.secondary_exec_control = adjust_vmx_controls(
 "Secondary Exec Control", min, opt,
 MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
 }
@@ -370,7 +368,7 @@ static int vmx_init_vmcs_config(bool bsp
 }
 
 /* The IA32_VMX_EPT_VPID_CAP MSR exists only when EPT or VPID available */
-if ( _vmx_secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
+if ( caps.secondary_exec_control & (SECONDARY_EXEC_ENABLE_EPT |
 SECONDARY_EXEC_ENABLE_VPID) )
 {
 rdmsrl(MSR_IA32_VMX_EPT_VPID_CAP, _vmx_ept_vpid_cap);
@@ -392,7 +390,7 @@ static int vmx_init_vmcs_config(bool bsp
 if ( !(_vmx_ept_vpid_cap & VMX_EPT_MEMORY_TYPE_WB) ||
  !(_vmx_ept_vpid_cap & VMX_EPT_WALK_LENGTH_4_SUPPORTED) ||
  !(_vmx_ept_vpid_cap & VMX_EPT_INVEPT_ALL_CONTEXT) )
-_vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_EPT;
+caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_EPT;
 
 /*
  * the CPU must support INVVPID all context invalidation, because we
@@ -401,14 +399,14 @@ static int vmx_init_vmcs_config(bool bsp
  * Or we just don't use VPID.
  */
 if ( !(_vmx_ept_vpid_cap & VMX_VPID_INVVPID_ALL_CONTEXT) )
-_vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
+caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
 
 /* EPT A/D bits is required for PML */
 if ( !(_vmx_ept_vpid_cap & VMX_EPT_AD_BIT) )
-_vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
+caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
 }
 
-if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT )
+if ( caps.secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT )
 {
 /*
  * To use EPT we expect to be able to clear certain intercepts.
@@ -421,25 +419,25 @@ static int vmx_init_vmcs_config(bool bsp
 if ( must_be_one & (CPU_BASED_INVLPG_EXITING |
 CPU_BASED_CR3_LOAD_EXITING |
 CPU_BASED_CR3_STORE_EXITING) )
-_vmx_secondary_exec_control &=
+caps.secondary_exec_control &=
 ~(SECONDARY_EXEC_ENABLE_EPT |
   SECONDARY_EXEC_UNRESTRICTED_GUEST);
 }
 
 /* PML cannot be supported if EPT is not used */
-if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT) )
-_vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
+if ( !(caps.secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT) )
+caps.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
 
 /* Turn off opt_ept_pml if PML feature is not present. */
-if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML) )
+if ( !(caps.secondary_exec_control & SECONDARY_EXEC_ENABLE_PML) )
 opt_ept_pml = false;
 
-if ( (_vmx_secondary_exec_control & SECONDARY_EXEC_PAUSE_LOOP_EXITING) &&
+if ( (caps.secondary_exec_control & SECONDARY_EXEC_PAUSE_LOOP_EXITING) &&
   ple_gap == 0 )
 {
 if ( !vmx_caps.pin_based_exec_control )
 printk(XENLOG_INFO "Disable Pause-Loop Exiting.\n");
-_vmx_secondary_exec_control &= ~ SECONDARY_EXEC_PAUSE_LOOP_EXITING;
+caps.secondary_exec_control &= ~ SECONDARY_EXEC_PAUSE_LOOP_EXITING;
 }
 
 min = VM_EXIT_ACK_INTR_ON_EXIT;
@@ -454,7 +452,7 @@ static int vmx_init_vmcs_config(bool bsp
  * delivery" and "acknowledge interrupt on exit" is set. For the latter
  * is a minimal requirement, only check the former, which is optional.
  */
-if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) )
+if ( !(caps.secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) )
 caps.pin_based_exec_control &= ~PIN_BASED_POSTED_INTERRUPT;
 
 if ( iommu_intpost &&
@@ -466,7 +464,7 @@ static int vmx_init_

[PATCH v3 06/12] VMX: convert vmx_cpu_based_exec_control

2024-02-26 Thread Jan Beulich
... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-u32 vmx_cpu_based_exec_control __read_mostly;
 u32 vmx_secondary_exec_control __read_mostly;
 uint64_t vmx_tertiary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
@@ -261,7 +260,6 @@ static int vmx_init_vmcs_config(bool bsp
 {
 u32 vmx_basic_msr_low, vmx_basic_msr_high, min, opt;
 struct vmx_caps caps = {};
-u32 _vmx_cpu_based_exec_control;
 u32 _vmx_secondary_exec_control = 0;
 uint64_t _vmx_tertiary_exec_control = 0;
 u64 _vmx_ept_vpid_cap = 0;
@@ -299,12 +297,12 @@ static int vmx_init_vmcs_config(bool bsp
CPU_BASED_MONITOR_TRAP_FLAG |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
CPU_BASED_ACTIVATE_TERTIARY_CONTROLS);
-_vmx_cpu_based_exec_control = adjust_vmx_controls(
+caps.cpu_based_exec_control = adjust_vmx_controls(
 "CPU-Based Exec Control", min, opt,
 MSR_IA32_VMX_PROCBASED_CTLS, &mismatch);
-_vmx_cpu_based_exec_control &= ~CPU_BASED_RDTSC_EXITING;
-if ( _vmx_cpu_based_exec_control & CPU_BASED_TPR_SHADOW )
-_vmx_cpu_based_exec_control &=
+caps.cpu_based_exec_control &= ~CPU_BASED_RDTSC_EXITING;
+if ( caps.cpu_based_exec_control & CPU_BASED_TPR_SHADOW )
+caps.cpu_based_exec_control &=
 ~(CPU_BASED_CR8_LOAD_EXITING | CPU_BASED_CR8_STORE_EXITING);
 
 rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
@@ -321,7 +319,7 @@ static int vmx_init_vmcs_config(bool bsp
 return -EINVAL;
 }
 
-if ( _vmx_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS )
+if ( caps.cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS )
 {
 min = 0;
 opt = (SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
@@ -351,7 +349,7 @@ static int vmx_init_vmcs_config(bool bsp
  * "APIC Register Virtualization" and "Virtual Interrupt Delivery"
  * can be set only when "use TPR shadow" is set
  */
-if ( (_vmx_cpu_based_exec_control & CPU_BASED_TPR_SHADOW) &&
+if ( (caps.cpu_based_exec_control & CPU_BASED_TPR_SHADOW) &&
  opt_apicv_enabled )
 opt |= SECONDARY_EXEC_APIC_REGISTER_VIRT |
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
@@ -362,7 +360,7 @@ static int vmx_init_vmcs_config(bool bsp
 MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
 }
 
-if ( _vmx_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS )
+if ( caps.cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS )
 {
 uint64_t opt = 0;
 
@@ -498,7 +496,6 @@ static int vmx_init_vmcs_config(bool bsp
 {
 /* First time through. */
 vmx_caps = caps;
-vmx_cpu_based_exec_control = _vmx_cpu_based_exec_control;
 vmx_secondary_exec_control = _vmx_secondary_exec_control;
 vmx_tertiary_exec_control  = _vmx_tertiary_exec_control;
 vmx_ept_vpid_cap   = _vmx_ept_vpid_cap;
@@ -531,7 +528,7 @@ static int vmx_init_vmcs_config(bool bsp
 vmx_caps.pin_based_exec_control, caps.pin_based_exec_control);
 mismatch |= cap_check(
 "CPU-Based Exec Control",
-vmx_cpu_based_exec_control, _vmx_cpu_based_exec_control);
+vmx_caps.cpu_based_exec_control, caps.cpu_based_exec_control);
 mismatch |= cap_check(
 "Secondary Exec Control",
 vmx_secondary_exec_control, _vmx_secondary_exec_control);
@@ -1111,7 +1108,7 @@ static int construct_vmcs(struct vcpu *v
 /* VMCS controls. */
 __vmwrite(PIN_BASED_VM_EXEC_CONTROL, vmx_caps.pin_based_exec_control);
 
-v->arch.hvm.vmx.exec_control = vmx_cpu_based_exec_control;
+v->arch.hvm.vmx.exec_control = vmx_caps.cpu_based_exec_control;
 if ( d->arch.vtsc && !cpu_has_vmx_tsc_scaling )
 v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
 
@@ -2214,7 +2211,6 @@ int __init vmx_vmcs_init(void)
  * Make sure all dependent features are off as well.
  */
 memset(&vmx_caps, 0, sizeof(vmx_caps));
-vmx_cpu_based_exec_control = 0;
 vmx_secondary_exec_control = 0;
 vmx_tertiary_exec_control  = 0;
 vmx_vmexit_control = 0;
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -210,7 +210,6 @@ void vmx_vmcs_reload(struct vcpu *v);
 #define CPU_BASED_MONITOR_EXITING 0x2000U
 #define CPU_BASED_PAUSE_EXITING   0x4000U
 #define CPU_BASED_ACTIVATE_SECONDARY_CONTROLS 0x8000U
-extern u32 vmx_cpu_based_exec_control;
 
 #define PIN_BASED_EXT_INTR_MASK 0x0001
 #define PIN_BASED_NMI_EXITING   0x0008
@@ -297,6 +296,7 @@ extern u64 vmx_ep

[PATCH v3 05/12] VMX: convert vmx_pin_based_exec_control

2024-02-26 Thread Jan Beulich
... to a field in the capability/controls struct. Use an instance of
that struct also in vmx_init_vmcs_config().

Signed-off-by: Jan Beulich 
---
v3: Add initializer to vmx_init_vmcs_config() new local var.
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -162,7 +162,6 @@ static int cf_check parse_ept_param_runt
 
 /* Dynamic (run-time adjusted) execution control flags. */
 struct vmx_caps __ro_after_init vmx_caps;
-u32 vmx_pin_based_exec_control __read_mostly;
 u32 vmx_cpu_based_exec_control __read_mostly;
 u32 vmx_secondary_exec_control __read_mostly;
 uint64_t vmx_tertiary_exec_control __read_mostly;
@@ -261,7 +260,7 @@ static bool cap_check(
 static int vmx_init_vmcs_config(bool bsp)
 {
 u32 vmx_basic_msr_low, vmx_basic_msr_high, min, opt;
-u32 _vmx_pin_based_exec_control;
+struct vmx_caps caps = {};
 u32 _vmx_cpu_based_exec_control;
 u32 _vmx_secondary_exec_control = 0;
 uint64_t _vmx_tertiary_exec_control = 0;
@@ -278,7 +277,7 @@ static int vmx_init_vmcs_config(bool bsp
PIN_BASED_NMI_EXITING);
 opt = (PIN_BASED_VIRTUAL_NMIS |
PIN_BASED_POSTED_INTERRUPT);
-_vmx_pin_based_exec_control = adjust_vmx_controls(
+caps.pin_based_exec_control = adjust_vmx_controls(
 "Pin-Based Exec Control", min, opt,
 MSR_IA32_VMX_PINBASED_CTLS, &mismatch);
 
@@ -440,7 +439,7 @@ static int vmx_init_vmcs_config(bool bsp
 if ( (_vmx_secondary_exec_control & SECONDARY_EXEC_PAUSE_LOOP_EXITING) &&
   ple_gap == 0 )
 {
-if ( !vmx_pin_based_exec_control )
+if ( !vmx_caps.pin_based_exec_control )
 printk(XENLOG_INFO "Disable Pause-Loop Exiting.\n");
 _vmx_secondary_exec_control &= ~ SECONDARY_EXEC_PAUSE_LOOP_EXITING;
 }
@@ -458,10 +457,10 @@ static int vmx_init_vmcs_config(bool bsp
  * is a minimal requirement, only check the former, which is optional.
  */
 if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) )
-_vmx_pin_based_exec_control &= ~PIN_BASED_POSTED_INTERRUPT;
+caps.pin_based_exec_control &= ~PIN_BASED_POSTED_INTERRUPT;
 
 if ( iommu_intpost &&
- !(_vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT) )
+ !(caps.pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT) )
 {
 printk("Intel VT-d Posted Interrupt is disabled for CPU-side Posted "
"Interrupt is not enabled\n");
@@ -495,10 +494,10 @@ static int vmx_init_vmcs_config(bool bsp
 if ( mismatch )
 return -EINVAL;
 
-if ( !vmx_pin_based_exec_control )
+if ( !vmx_caps.pin_based_exec_control )
 {
 /* First time through. */
-vmx_pin_based_exec_control = _vmx_pin_based_exec_control;
+vmx_caps = caps;
 vmx_cpu_based_exec_control = _vmx_cpu_based_exec_control;
 vmx_secondary_exec_control = _vmx_secondary_exec_control;
 vmx_tertiary_exec_control  = _vmx_tertiary_exec_control;
@@ -529,7 +528,7 @@ static int vmx_init_vmcs_config(bool bsp
 vmcs_revision_id, vmx_basic_msr_low & VMX_BASIC_REVISION_MASK);
 mismatch |= cap_check(
 "Pin-Based Exec Control",
-vmx_pin_based_exec_control, _vmx_pin_based_exec_control);
+vmx_caps.pin_based_exec_control, caps.pin_based_exec_control);
 mismatch |= cap_check(
 "CPU-Based Exec Control",
 vmx_cpu_based_exec_control, _vmx_cpu_based_exec_control);
@@ -1110,7 +1109,7 @@ static int construct_vmcs(struct vcpu *v
 vmx_vmcs_enter(v);
 
 /* VMCS controls. */
-__vmwrite(PIN_BASED_VM_EXEC_CONTROL, vmx_pin_based_exec_control);
+__vmwrite(PIN_BASED_VM_EXEC_CONTROL, vmx_caps.pin_based_exec_control);
 
 v->arch.hvm.vmx.exec_control = vmx_cpu_based_exec_control;
 if ( d->arch.vtsc && !cpu_has_vmx_tsc_scaling )
@@ -2136,7 +2135,7 @@ void vmcs_dump_vcpu(struct vcpu *v)
 printk("TSC Offset = 0x%016lx  TSC Multiplier = 0x%016lx\n",
vmr(TSC_OFFSET), vmr(TSC_MULTIPLIER));
 if ( (v->arch.hvm.vmx.exec_control & CPU_BASED_TPR_SHADOW) ||
- (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT) )
+ (vmx_caps.pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT) )
 printk("TPR Threshold = 0x%02x  PostedIntrVec = 0x%02x\n",
vmr32(TPR_THRESHOLD), vmr16(POSTED_INTR_NOTIFICATION_VECTOR));
 if ( (v->arch.hvm.vmx.secondary_exec_control &
@@ -2215,7 +2214,6 @@ int __init vmx_vmcs_init(void)
  * Make sure all dependent features are off as well.
  */
 memset(&vmx_caps, 0, sizeof(vmx_caps));
-vmx_pin_based_exec_control = 0;
 vmx_cpu_based_exec_control = 0;
 vmx_secondary_exec_control = 0;
 vmx_tertiary_exec_control  = 0;
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1057,7 +1057,7 @@ static void load_shadow_control(struct v
  * and EXCEPTION
  * Enforce the removed features
  */
-  

[PATCH v3 04/12] VMX: convert vmx_basic_msr

2024-02-26 Thread Jan Beulich
... to a struct field, which is then going to be accompanied by other
capability/control data presently living in individual variables. As
this structure isn't supposed to be altered post-boot, put it in
.data.ro_after_init right away.

Suggested-by: Roger Pau Monné 
Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -161,6 +161,7 @@ static int cf_check parse_ept_param_runt
 #endif
 
 /* Dynamic (run-time adjusted) execution control flags. */
+struct vmx_caps __ro_after_init vmx_caps;
 u32 vmx_pin_based_exec_control __read_mostly;
 u32 vmx_cpu_based_exec_control __read_mostly;
 u32 vmx_secondary_exec_control __read_mostly;
@@ -175,8 +176,7 @@ static DEFINE_PER_CPU(paddr_t, current_v
 static DEFINE_PER_CPU(struct list_head, active_vmcs_list);
 DEFINE_PER_CPU(bool, vmxon);
 
-#define vmcs_revision_id (vmx_basic_msr & VMX_BASIC_REVISION_MASK)
-u64 __read_mostly vmx_basic_msr;
+#define vmcs_revision_id (vmx_caps.basic_msr & VMX_BASIC_REVISION_MASK)
 
 static void __init vmx_display_features(void)
 {
@@ -505,8 +505,8 @@ static int vmx_init_vmcs_config(bool bsp
 vmx_ept_vpid_cap   = _vmx_ept_vpid_cap;
 vmx_vmexit_control = _vmx_vmexit_control;
 vmx_vmentry_control= _vmx_vmentry_control;
-vmx_basic_msr  = ((u64)vmx_basic_msr_high << 32) |
- vmx_basic_msr_low;
+vmx_caps.basic_msr = ((uint64_t)vmx_basic_msr_high << 32) |
+ vmx_basic_msr_low;
 vmx_vmfunc = _vmx_vmfunc;
 
 vmx_display_features();
@@ -560,7 +560,7 @@ static int vmx_init_vmcs_config(bool bsp
 mismatch = 1;
 }
 if ( (vmx_basic_msr_high & (VMX_BASIC_VMCS_SIZE_MASK >> 32)) !=
- ((vmx_basic_msr & VMX_BASIC_VMCS_SIZE_MASK) >> 32) )
+ ((vmx_caps.basic_msr & VMX_BASIC_VMCS_SIZE_MASK) >> 32) )
 {
 printk("VMX: CPU%d unexpected VMCS size %Lu\n",
smp_processor_id(),
@@ -2214,7 +2214,7 @@ int __init vmx_vmcs_init(void)
  * _vmx_vcpu_up() may have made it past feature identification.
  * Make sure all dependent features are off as well.
  */
-vmx_basic_msr  = 0;
+memset(&vmx_caps, 0, sizeof(vmx_caps));
 vmx_pin_based_exec_control = 0;
 vmx_cpu_based_exec_control = 0;
 vmx_secondary_exec_control = 0;
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -294,6 +294,12 @@ extern u64 vmx_ept_vpid_cap;
 
 #define VMX_TSC_MULTIPLIER_MAX  0xULL
 
+/* Capabilities and dynamic (run-time adjusted) execution control flags. */
+struct vmx_caps {
+uint64_t basic_msr;
+};
+extern struct vmx_caps vmx_caps;
+
 #define cpu_has_wbinvd_exiting \
 (vmx_secondary_exec_control & SECONDARY_EXEC_WBINVD_EXITING)
 #define cpu_has_vmx_virtualize_apic_accesses \
@@ -379,9 +385,8 @@ extern u64 vmx_ept_vpid_cap;
  */
 #define VMX_BASIC_DEFAULT1_ZERO(1ULL << 55)
 
-extern u64 vmx_basic_msr;
 #define cpu_has_vmx_ins_outs_instr_info \
-(!!(vmx_basic_msr & VMX_BASIC_INS_OUT_INFO))
+(!!(vmx_caps.basic_msr & VMX_BASIC_INS_OUT_INFO))
 
 /* Guest interrupt status */
 #define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK  0x0FF
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1561,7 +1561,7 @@ static int nvmx_handle_vmxon(struct cpu_
 rc = hvm_copy_from_guest_phys(&nvmcs_revid, gpa, sizeof(nvmcs_revid));
 if ( rc != HVMTRANS_okay ||
  (nvmcs_revid & ~VMX_BASIC_REVISION_MASK) ||
- ((nvmcs_revid ^ vmx_basic_msr) & VMX_BASIC_REVISION_MASK) )
+ ((nvmcs_revid ^ vmx_caps.basic_msr) & VMX_BASIC_REVISION_MASK) )
 {
 vmfail_invalid(regs);
 return X86EMUL_OKAY;
@@ -1799,7 +1799,7 @@ static int nvmx_handle_vmptrld(struct cp
 {
 struct vmcs_struct *vvmcs = vvmcx;
 
-if ( ((vvmcs->revision_id ^ vmx_basic_msr) &
+if ( ((vvmcs->revision_id ^ vmx_caps.basic_msr) &
   VMX_BASIC_REVISION_MASK) ||
  (!cpu_has_vmx_vmcs_shadowing &&
   (vvmcs->revision_id & ~VMX_BASIC_REVISION_MASK)) )
@@ -2192,7 +2192,7 @@ int nvmx_msr_read_intercept(unsigned int
 case MSR_IA32_VMX_TRUE_PROCBASED_CTLS:
 case MSR_IA32_VMX_TRUE_EXIT_CTLS:
 case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
-if ( !(vmx_basic_msr & VMX_BASIC_DEFAULT1_ZERO) )
+if ( !(vmx_caps.basic_msr & VMX_BASIC_DEFAULT1_ZERO) )
 return 0;
 break;
 




[PATCH v3 03/12] VMX: drop vmcs_revision_id

2024-02-26 Thread Jan Beulich
It's effectively redundant with vmx_basic_msr. For the #define
replacement to work, struct vmcs_struct's respective field name also
needs to change: Drop the not really meaningful "vmcs_" prefix from it.

Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -175,7 +175,7 @@ static DEFINE_PER_CPU(paddr_t, current_v
 static DEFINE_PER_CPU(struct list_head, active_vmcs_list);
 DEFINE_PER_CPU(bool, vmxon);
 
-static u32 vmcs_revision_id __read_mostly;
+#define vmcs_revision_id (vmx_basic_msr & VMX_BASIC_REVISION_MASK)
 u64 __read_mostly vmx_basic_msr;
 
 static void __init vmx_display_features(void)
@@ -498,7 +498,6 @@ static int vmx_init_vmcs_config(bool bsp
 if ( !vmx_pin_based_exec_control )
 {
 /* First time through. */
-vmcs_revision_id   = vmx_basic_msr_low & VMX_BASIC_REVISION_MASK;
 vmx_pin_based_exec_control = _vmx_pin_based_exec_control;
 vmx_cpu_based_exec_control = _vmx_cpu_based_exec_control;
 vmx_secondary_exec_control = _vmx_secondary_exec_control;
@@ -610,7 +609,7 @@ static paddr_t vmx_alloc_vmcs(void)
 
 vmcs = __map_domain_page(pg);
 clear_page(vmcs);
-vmcs->vmcs_revision_id = vmcs_revision_id;
+vmcs->revision_id = vmcs_revision_id;
 unmap_domain_page(vmcs);
 
 return page_to_maddr(pg);
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1166,7 +1166,7 @@ static void nvmx_set_vmcs_pointer(struct
 paddr_t vvmcs_maddr = v->arch.hvm.vmx.vmcs_shadow_maddr;
 
 __vmpclear(vvmcs_maddr);
-vvmcs->vmcs_revision_id |= VMCS_RID_TYPE_MASK;
+vvmcs->revision_id |= VMCS_RID_TYPE_MASK;
 v->arch.hvm.vmx.secondary_exec_control |=
 SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
 __vmwrite(SECONDARY_VM_EXEC_CONTROL,
@@ -1181,7 +1181,7 @@ static void nvmx_clear_vmcs_pointer(stru
 paddr_t vvmcs_maddr = v->arch.hvm.vmx.vmcs_shadow_maddr;
 
 __vmpclear(vvmcs_maddr);
-vvmcs->vmcs_revision_id &= ~VMCS_RID_TYPE_MASK;
+vvmcs->revision_id &= ~VMCS_RID_TYPE_MASK;
 v->arch.hvm.vmx.secondary_exec_control &=
 ~SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
 __vmwrite(SECONDARY_VM_EXEC_CONTROL,
@@ -1799,10 +1799,10 @@ static int nvmx_handle_vmptrld(struct cp
 {
 struct vmcs_struct *vvmcs = vvmcx;
 
-if ( ((vvmcs->vmcs_revision_id ^ vmx_basic_msr) &
- VMX_BASIC_REVISION_MASK) ||
+if ( ((vvmcs->revision_id ^ vmx_basic_msr) &
+  VMX_BASIC_REVISION_MASK) ||
  (!cpu_has_vmx_vmcs_shadowing &&
-  (vvmcs->vmcs_revision_id & ~VMX_BASIC_REVISION_MASK)) )
+  (vvmcs->revision_id & ~VMX_BASIC_REVISION_MASK)) )
 {
 hvm_unmap_guest_frame(vvmcx, 1);
 vmfail(regs, VMX_INSN_VMPTRLD_INCORRECT_VMCS_ID);
@@ -2214,7 +2214,7 @@ int nvmx_msr_read_intercept(unsigned int
 map_domain_page(_mfn(PFN_DOWN(v->arch.hvm.vmx.vmcs_pa)));
 
 data = (host_data & (~0ul << 32)) |
-   (vmcs->vmcs_revision_id & 0x7fff);
+   (vmcs->revision_id & 0x7fff);
 unmap_domain_page(vmcs);
 
 if ( !cpu_has_vmx_vmcs_shadowing )
--- a/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
+++ b/xen/arch/x86/include/asm/hvm/vmx/vmcs.h
@@ -17,7 +17,7 @@ int cf_check vmx_cpu_up(void);
 void cf_check vmx_cpu_down(void);
 
 struct vmcs_struct {
-u32 vmcs_revision_id;
+uint32_t revision_id;
 unsigned char data [0]; /* vmcs size is read from MSR */
 };
 




[PATCH v3 02/12] x86/HVM: improve CET-IBT pruning of ENDBR

2024-02-26 Thread Jan Beulich
__init{const,data}_cf_clobber can have an effect only for pointers
actually populated in the respective tables. While not the case for SVM
right now, VMX installs a number of pointers only under certain
conditions. Hence the respective functions would have their ENDBR purged
only when those conditions are met. Invoke "pruning" functions after
having copied the respective tables, for them to install any "missing"
pointers.

Signed-off-by: Jan Beulich 
---
This is largely cosmetic for present hardware, which when supporting
CET-IBT likely also supports all of the advanced VMX features for which
hook pointers are installed conditionally. The only case this would make
a difference there is when use of respective features was suppressed via
command line option (where available). For future hooks it may end up
relevant even by default, and it also would be if AMD started supporting
CET-IBT; right now it matters only for .pi_update_irte, as iommu_intpost
continues to default to off.

Originally I had meant to put the SVM and VMX functions in presmp-
initcalls, but hvm/{svm,vmx}/built_in.o are linked into hvm/built_in.o
before hvm/hvm.o. And I don't think I want to fiddle with link order
here.
---
v3: Re-base.
v2: Use cpu_has_xen_ibt in prune_{svm,vmx}().

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -161,10 +161,15 @@ static int __init cf_check hvm_enable(vo
 else if ( cpu_has_svm )
 fns = start_svm();
 
+if ( fns )
+hvm_funcs = *fns;
+
+prune_vmx();
+prune_svm();
+
 if ( fns == NULL )
 return 0;
 
-hvm_funcs = *fns;
 hvm_enabled = 1;
 
 printk("HVM: %s enabled\n", fns->name);
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2587,6 +2587,19 @@ const struct hvm_function_table * __init
 return &svm_function_table;
 }
 
+void __init prune_svm(void)
+{
+/*
+ * Now that svm_function_table was copied, populate all function pointers
+ * which may have been left at NULL, for __initdata_cf_clobber to have as
+ * much of an effect as possible.
+ */
+if ( !cpu_has_xen_ibt )
+return;
+
+/* Nothing at present. */
+}
+
 void asmlinkage svm_vmexit_handler(void)
 {
 struct cpu_user_regs *regs = guest_cpu_user_regs();
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3042,6 +3042,30 @@ const struct hvm_function_table * __init
 return &vmx_function_table;
 }
 
+void __init prune_vmx(void)
+{
+/*
+ * Now that vmx_function_table was copied, populate all function pointers
+ * which may have been left at NULL, for __initdata_cf_clobber to have as
+ * much of an effect as possible.
+ */
+if ( !cpu_has_xen_ibt )
+return;
+
+vmx_function_table.set_descriptor_access_exiting =
+vmx_set_descriptor_access_exiting;
+
+vmx_function_table.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap;
+vmx_function_table.process_isr= vmx_process_isr;
+vmx_function_table.handle_eoi = vmx_handle_eoi;
+
+vmx_function_table.pi_update_irte = vmx_pi_update_irte;
+
+vmx_function_table.deliver_posted_intr = vmx_deliver_posted_intr;
+vmx_function_table.sync_pir_to_irr = vmx_sync_pir_to_irr;
+vmx_function_table.test_pir= vmx_test_pir;
+}
+
 /*
  * Not all cases receive valid value in the VM-exit instruction length field.
  * Callers must know what they're doing!
--- a/xen/arch/x86/include/asm/hvm/hvm.h
+++ b/xen/arch/x86/include/asm/hvm/hvm.h
@@ -250,6 +250,9 @@ extern s8 hvm_port80_allowed;
 extern const struct hvm_function_table *start_svm(void);
 extern const struct hvm_function_table *start_vmx(void);
 
+void prune_svm(void);
+void prune_vmx(void);
+
 int hvm_domain_initialise(struct domain *d,
   const struct xen_domctl_createdomain *config);
 void hvm_domain_relinquish_resources(struct domain *d);




[PATCH v3 01/12] VMX: don't run with CR4.VMXE set when VMX could not be enabled

2024-02-26 Thread Jan Beulich
While generally benign, doing so is still at best misleading.

Signed-off-by: Jan Beulich 
---
Using set_in_cr4() seems favorable over updating mmu_cr4_features
despite the resulting redundant CR4 update. But I certainly could be
talked into going the alternative route.
---
v2: Actually clear CR4.VMXE for the BSP on the error path.

--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2961,14 +2961,18 @@ static bool __init has_if_pschange_mc(vo
 
 const struct hvm_function_table * __init start_vmx(void)
 {
-set_in_cr4(X86_CR4_VMXE);
+write_cr4(read_cr4() | X86_CR4_VMXE);
 
 if ( vmx_vmcs_init() )
 {
+write_cr4(read_cr4() & ~X86_CR4_VMXE);
 printk("VMX: failed to initialise.\n");
 return NULL;
 }
 
+/* Arrange for APs to have CR4.VMXE set early on. */
+set_in_cr4(X86_CR4_VMXE);
+
 vmx_function_table.singlestep_supported = cpu_has_monitor_trap_flag;
 
 if ( cpu_has_vmx_dt_exiting )




[PATCH v3 00/12] x86/HVM: misc tidying

2024-02-26 Thread Jan Beulich
01: VMX: don't run with CR4.VMXE set when VMX could not be enabled
02: x86/HVM: improve CET-IBT pruning of ENDBR
03: VMX: drop vmcs_revision_id
04: VMX: convert vmx_basic_msr
05: VMX: convert vmx_pin_based_exec_control
06: VMX: convert vmx_cpu_based_exec_control
07: VMX: convert vmx_secondary_exec_control
08: VMX: convert vmx_tertiary_exec_control
09: VMX: convert vmx_vmexit_control
10: VMX: convert vmx_vmentry_control
11: VMX: convert vmx_ept_vpid_cap
12: VMX: convert vmx_vmfunc

Jan



[PATCH] docs/sphinx: Start an FAQ, and add Kconfig/CET details

2024-02-26 Thread Andrew Cooper
This is long overdue, and we need to start somewhere.

Signed-off-by: Andrew Cooper 
---
CC: George Dunlap 
CC: Jan Beulich 
CC: Stefano Stabellini 
CC: Wei Liu 
CC: Julien Grall 
---
 docs/faq.rst  | 71 +++
 docs/glossary.rst | 15 ++
 docs/index.rst|  1 +
 3 files changed, 87 insertions(+)
 create mode 100644 docs/faq.rst

diff --git a/docs/faq.rst b/docs/faq.rst
new file mode 100644
index 000000000000..75cd70328a5c
--- /dev/null
+++ b/docs/faq.rst
@@ -0,0 +1,71 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Frequently Asked Questions
+==
+
+How do I...
+---
+
+... check whether a Kconfig option is active?
+^
+
+  Kconfig is a build time configuration system, combining inherent knowledge,
+  the capabilities of the toolchain, and explicit user choice to form a
+  configuration of a build of Xen.
+
+  A file, by default ``.config``, is produced by the build identifying the
+  configuration used.  Kconfig symbols all start with ``CONFIG_``, and come in
+  a variety of types including strings, integers and booleans.  Booleans are
+  the most common, and when active are expressed with ``...=y``.  e.g.::
+
+xen.git/xen$ grep CONFIG_FOO .config
+CONFIG_FOO_BOOLEAN=y
+CONFIG_FOO_STRING="lorem ipsum"
+CONFIG_FOO_INTEGER=42
+
+  Symbols which are either absent, or expressed as ``... is not set`` are
+  disabled.  e.g.::
+
+xen.git/xen$ grep CONFIG_BAR .config
+# CONFIG_BAR is not set
+
+  Builds of Xen configured with ``CONFIG_HYPFS_CONFIG=y`` embed their own
+  ``.config`` at build time, and can provide it to the :term:`control domain`
+  upon request.  e.g.::
+
+[root@host ~]# xenhypfs cat /buildinfo/config | grep -e FOO -e BAR
+CONFIG_FOO=y
+# CONFIG_BAR is not set
+
+
+... tell if CET is active?
+^^
+
+  Control-flow Enforcement Technology support was added to Xen 4.14.  It is
+  build time conditional, dependent on both having a new-enough toolchain and
+  an explicit Kconfig option, and also requires capable hardware.  See
+  :term:`CET`.
+
+  For CET-SS, Shadow Stacks, the minimum toolchain requirements are ``binutils
+  >= 2.29`` or ``LLVM >= 6``.  No specific compiler support is required.
+  Check for ``CONFIG_XEN_SHSTK`` being active.
+
+  For CET-IBT, Indirect Branch Tracking, the minimum toolchain requirements
+  are ``GCC >= 9`` and ``binutils >= 2.29``.  Xen relies on a compiler feature
+  which is specific to GCC at the time of writing.  Check for
+  ``CONFIG_XEN_IBT`` being active.
+
+  If a capable Xen was booted on capable hardware, and CET is not disabled by
+  command line option or errata, Xen will print some details early on boot
+  about which CET facilities have been turned on::
+
+...
+(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 143 (0x8f), Stepping 8 (raw 000806f8)
+(XEN) Enabling Supervisor Shadow Stacks
+(XEN) Enabling Indirect Branch Tracking
+(XEN)   - IBT disabled in UEFI Runtime Services
+(XEN) EFI RAM map:
+...
+
+  This can be obtained from the control domain with ``xl dmesg``, but remember
+  to confirm that the console ring hasn't wrapped.
diff --git a/docs/glossary.rst b/docs/glossary.rst
index 8ddbdab160a1..6adeec77e14c 100644
--- a/docs/glossary.rst
+++ b/docs/glossary.rst
@@ -28,6 +28,21 @@ Glossary
  single instance of Xen, used as the identifier in various APIs, and is
  typically allocated sequentially from 0.
 
+   CET
+ Control-flow Enforcement Technology is a facility in x86 CPUs for
+ defending against memory safety vulnerabilities.  It is formed of two
+ independent features:
+
+ * CET-SS, Shadow Stacks, are designed to protect against Return Oriented
+   Programming (ROP) attacks.
+
+ * CET-IBT, Indirect Branch Tracking, is designed to protect against Call
+   or Jump Oriented Programming (COP/JOP) attacks.
+
+ Intel support CET-SS and CET-IBT from the Tiger Lake (Client, 2020) and
+ Sapphire Rapids (Server, 2023) CPUs.  AMD support only CET-SS, starting
+ with Zen3 (Both client and server, 2020) CPUs.
+
guest
  The term 'guest' has two different meanings, depending on context, and
  should not be confused with :term:`domain`.
diff --git a/docs/index.rst b/docs/index.rst
index 22fdde80590c..ab051a0f3833 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -72,4 +72,5 @@ Miscellanea
 
 .. toctree::
 
+   faq
glossary

base-commit: 60e00f77a5cc671d30c5ef3318f5b8e9b74e4aa3
-- 
2.30.2




[REGRESSION] Re: [XEN PATCH v4 4/4] eclair: move function and macro properties outside ECLAIR

2024-02-26 Thread Andrew Cooper
On 02/02/2024 3:16 pm, Simone Ballarin wrote:
> From: Maria Celeste Cesario 
>
> Function and macro properties contained in ECLAIR/call_properties.ecl are of
> general interest: this patch moves these annotations to a generic JSON file
> in docs. In this way, they can be exploited for other purposes (e.g.
> documentation, other tools).
>
> Add rst file containing explanation on how to update 
> function_macro_properties.json.
> Add script to convert the JSON file in ECL configurations.
> Remove ECLAIR/call_properties.ecl: the file is now automatically generated 
> from
> the JSON file.
>
> Signed-off-by: Maria Celeste Cesario  
> Signed-off-by: Simone Ballarin  
>
> ---
> Changes in v4:
> - add missing script for converting the JSON file in ECL configurations;
> - improve commit message;
> - remove call_properties.ecs.
> ---
>  .../eclair_analysis/ECLAIR/analysis.ecl   |   1 +
>  .../ECLAIR/call_properties.ecl| 128 ---
>  automation/eclair_analysis/prepare.sh |   2 +
>  automation/eclair_analysis/propertyparser.py  |  37 +
>  docs/function_macro_properties.json   | 841 ++
>  docs/function_macro_properties.rst|  58 ++
>  6 files changed, 939 insertions(+), 128 deletions(-)
>  delete mode 100644 automation/eclair_analysis/ECLAIR/call_properties.ecl
>  create mode 100644 automation/eclair_analysis/propertyparser.py
>  create mode 100644 docs/function_macro_properties.json
>  create mode 100644 docs/function_macro_properties.rst

This breaks the Sphinx build.

checking consistency...
/local/xen.git/docs/function_macro_properties.rst: WARNING: document
isn't included in any toctree

Also, the top level of docs really isn't an appropriate place to put
it.  Everything else is in docs/misra/.

When you've moved the files, you'll need to edit docs/misra/index.rst to
fix the build.

~Andrew



Re: [XEN PATCH] automation: Rework "build-each-commit-gcc" test

2024-02-26 Thread Jan Beulich
On 26.02.2024 15:49, Anthony PERARD wrote:
> On Mon, Feb 26, 2024 at 10:23:37AM +0100, Jan Beulich wrote:
>> On 20.02.2024 15:07, Anthony PERARD wrote:
>>> --- a/automation/scripts/build-test.sh
>>> +++ b/automation/scripts/build-test.sh
>>> @@ -9,6 +9,37 @@
>>>  # Set NON_SYMBOLIC_REF=1 if you want to use this script in detached HEAD 
>>> state.
>>>  # This is currently used by automated test system.
>>>  
>>> +# Colors with ANSI escape sequences
>>> +txt_info=''
>>> +txt_err=''
>>> +txt_clr=''
>>> +
>>> +# $GITLAB_CI should be "true" or "false".
>>> +if [ "$GITLAB_CI" != true ]; then
>>> +GITLAB_CI=false
>>> +fi
>>> +
>>> +gitlab_log_section() {
>>> +if $GITLAB_CI; then
>>> +echo -n "section_$1:$(date +%s):$2
>>> "
>>
>> ... there was either corruption on transmit here, or there's an embedded
>> newline that I don't know how to deal with.
> 
> No corruption here, there is a \r in this string. There's also \e a few
> times. Most tools can deal with these characters just fine, so I didn't
> even think there could be an issue.

Okay, in an entirely different 2nd attempt I got it committed fine (I
hope). It's probably too old fashioned of me / my scripts that I
demand for every patch to pass a "patch --dry-run" first.

Jan

> If that byte is really an issue, I could rewrite the patch to use
> printf, and I think it would read as:
> 
> printf "\e[0Ksection_$1:$(date +%s):$2\r\e[0K"
> 
> Thanks,
> 




Re: [PATCH] tools/xentop: Add VBD3 support to xentop

2024-02-26 Thread Andrew Cooper
On 26/02/2024 2:22 pm, Jan Beulich wrote:
> On 26.02.2024 15:12, Fouad Hilly wrote:
>> From: Pritha Srivastava 
>>
>> xl now knows how to drive tapdisk, so modified libxenstat to
>> understand vbd3 statistics.
>>
>> Signed-off-by: Jorge Martin 
>> Signed-off-by: Pritha Srivastava 
> Just a formal question (I'm not really qualified to review this change):
> With the above two S-o-b and the earlier From: - who is the original
> author of this patch? The general expectation is for the 1st S-o-b to
> be the author's.

This patch has been sat in the XenServer patchqueue for a decade. 
Neither Pritha nor Jorge are with us any more.

Sadly the review system we used back then is also no longer with us. 
From ticketing, I think it was co-developed at the same time.

This is the form the patch has existed in the patchqueue for that time,
so I'm tempted to say we reorder the SoB chain to make it match.  That's
the best I can figure out.

~Andrew



Re: [PATCH 3/3] x86/entry: Introduce EFRAME_* constants

2024-02-26 Thread Andrew Cooper
On 26/02/2024 2:32 pm, Jan Beulich wrote:
> On 26.02.2024 13:55, Andrew Cooper wrote:
>> restore_all_guest() does a lot of manipulation of the stack after popping the
>> GPRs, and uses raw %rsp displacements to do so.  Also, almost all entrypaths
>> use raw %rsp displacements prior to pushing GPRs.
>>
>> Provide better mnemonics, to aid readability and reduce the chance of errors
>> when editing.
>>
>> No functional change.  The resulting binary is identical.
>>
>> Signed-off-by: Andrew Cooper 
> Reviewed-by: Jan Beulich 

Thanks.

> with one small request:
>
>> --- a/xen/arch/x86/x86_64/asm-offsets.c
>> +++ b/xen/arch/x86/x86_64/asm-offsets.c
>> @@ -51,6 +51,23 @@ void __dummy__(void)
>>  OFFSET(UREGS_kernel_sizeof, struct cpu_user_regs, es);
>>  BLANK();
>>  
>> +/*
>> + * EFRAME_* is for the entry/exit logic where %rsp is pointing at
>> + * UREGS_error_code and GPRs are still guest values.
>> + */
> "still/already" or some such to match "entry/exit"?

Ok.

~Andrew



Re: [XEN PATCH] automation: Rework "build-each-commit-gcc" test

2024-02-26 Thread Anthony PERARD
On Mon, Feb 26, 2024 at 10:23:37AM +0100, Jan Beulich wrote:
> On 20.02.2024 15:07, Anthony PERARD wrote:
> > --- a/automation/scripts/build-test.sh
> > +++ b/automation/scripts/build-test.sh
> > @@ -9,6 +9,37 @@
> >  # Set NON_SYMBOLIC_REF=1 if you want to use this script in detached HEAD 
> > state.
> >  # This is currently used by automated test system.
> >  
> > +# Colors with ANSI escape sequences
> > +txt_info=''
> > +txt_err=''
> > +txt_clr=''
> > +
> > +# $GITLAB_CI should be "true" or "false".
> > +if [ "$GITLAB_CI" != true ]; then
> > +GITLAB_CI=false
> > +fi
> > +
> > +gitlab_log_section() {
> > +if $GITLAB_CI; then
> > +echo -n "section_$1:$(date +%s):$2
> > "
> 
> ... there was either corruption on transmit here, or there's an embedded
> newline that I don't know how to deal with.

No corruption here, there is a \r in this string. There's also \e a few
times. Most tools can deal with these characters just fine, so I didn't
even think there could be an issue.

If that byte is really an issue, I could rewrite the patch to use
printf, and I think it would read as:

printf "\e[0Ksection_$1:$(date +%s):$2\r\e[0K"

Thanks,

-- 
Anthony PERARD



Re: [PATCH v13.3 01/14] vpci: use per-domain PCI lock to protect vpci structure

2024-02-26 Thread Jan Beulich
On 21.02.2024 03:45, Stewart Hildebrand wrote:
> From: Oleksandr Andrushchenko 
> 
> Use the per-domain PCI read/write lock to protect the presence of the
> pci device vpci field. This lock can be used (and in a few cases is used
> right away) so that vpci removal can be performed while holding the lock
> in write mode. Previously such removal could race with vpci_read for
> example.
> 
> When taking both d->pci_lock and pdev->vpci->lock, they should be
> taken in this exact order: d->pci_lock then pdev->vpci->lock to avoid
> possible deadlock situations.
> 
> 1. Per-domain's pci_lock is used to protect pdev->vpci structure
> from being removed.
> 
> 2. Writing the command register and ROM BAR register may trigger
> modify_bars to run, which in turn may access multiple pdevs while
> checking for the existing BAR's overlap. The overlapping check, if
> done under the read lock, requires vpci->lock to be acquired on both
> devices being compared, which may produce a deadlock. It is not
> possible to upgrade read lock to write lock in such a case. So, in
> order to prevent the deadlock, use d->pci_lock in write mode instead.
> 
> All other code, which doesn't lead to pdev->vpci destruction and does
> not access multiple pdevs at the same time, can still use a
> combination of the read lock and pdev->vpci->lock.
> 
> 3. Drop const qualifier where the new rwlock is used and this is
> appropriate.
> 
> 4. Do not call process_pending_softirqs with any locks held. For that
> unlock prior the call and re-acquire the locks after. After
> re-acquiring the lock there is no need to check if pdev->vpci exists:
>  - in apply_map because of the context it is called (no race condition
>possible)
>  - for MSI/MSI-X debug code because it is called at the end of
>pdev->vpci access and no further access to pdev->vpci is made
> 
> 5. Use d->pci_lock around for_each_pdev and pci_get_pdev()
> while accessing pdevs in vpci code.
> 
> 6. Switch vPCI functions to use per-domain pci_lock for ensuring pdevs
> do not go away. The vPCI functions call several MSI-related functions
> which already have existing non-vPCI callers. Change those MSI-related
> functions to allow using either pcidevs_lock() or d->pci_lock for
> ensuring pdevs do not go away. Holding d->pci_lock in read mode is
> sufficient. Note that this pdev protection mechanism does not protect
> other state or critical sections. These MSI-related functions already
> have other race condition and state protection mechanisms (e.g.
> d->event_lock and msixtbl RCU), so we deduce that the use of the global
> pcidevs_lock() is to ensure that pdevs do not go away.
> 
> 7. Introduce wrapper construct, pdev_list_is_read_locked(), for checking
> that pdevs do not go away. The purpose of this wrapper is to aid
> readability and document the intent of the pdev protection mechanism.
> 
> 8. When possible, the existing non-vPCI callers of these MSI-related
> functions haven't been switched to use the newly introduced per-domain
> pci_lock, and will continue to use the global pcidevs_lock(). This is
> done to reduce the risk of the new locking scheme introducing
> regressions. Those users will be adjusted in due time. One exception
> is where the pcidevs_lock() in allocate_and_map_msi_pirq() is moved to
> the caller, physdev_map_pirq(): this instance is switched to
> read_lock(&d->pci_lock) right away.
> 
> Suggested-by: Roger Pau Monné 
> Suggested-by: Jan Beulich 
> Signed-off-by: Oleksandr Andrushchenko 
> Signed-off-by: Volodymyr Babchuk 
> Signed-off-by: Stewart Hildebrand 

Acked-by: Jan Beulich 
with two small remaining remarks (below) and on the assumption that an
R-b from Roger in particular for the vPCI code is going to turn up
eventually.

> @@ -895,6 +891,15 @@ int vpci_msix_arch_print(const struct vpci_msix *msix)
>  {
>  unsigned int i;
>  
> +/*
> + * Assert that pdev_list doesn't change. ASSERT_PDEV_LIST_IS_READ_LOCKED
> + * is not suitable here because it may allow either pcidevs_lock() or
> + * pci_lock to be held, but here we rely on pci_lock being held, not
> + * pcidevs_lock().
> + */
> +ASSERT(rw_is_locked(&msix->pdev->domain->pci_lock));
> +ASSERT(spin_is_locked(&msix->pdev->vpci->lock));

As to the comment, I think it's not really "may". I also think referral to
...

> @@ -913,13 +918,23 @@ int vpci_msix_arch_print(const struct vpci_msix *msix)
>  struct pci_dev *pdev = msix->pdev;
>  
>  spin_unlock(&msix->pdev->vpci->lock);
> +read_unlock(&pdev->domain->pci_lock);
>  process_pending_softirqs();
> +
> +if ( !read_trylock(&pdev->domain->pci_lock) )
> +return -EBUSY;
> +
>  /* NB: we assume that pdev cannot go away for an alive domain. */
>  if ( !pdev->vpci || !spin_trylock(&pdev->vpci->lock) )
> +{
> +read_unlock(&pdev->domain->pci_lock);
>  return -EBUSY;
> +}
> +
>   

[RFC PATCH 66/73] x86/pvm: Use new cpu feature to describe XENPV and PVM

2024-02-26 Thread Lai Jiangshan
From: Hou Wenlong 

Some PVOPS are patched as the native version directly if the guest is
not a XENPV guest. However, this approach will not work after
introducing a PVM guest. To address this, use a new CPU feature to
describe XENPV and PVM, and ensure that those PVOPS are patched only
when it is not a paravirtual guest.

Signed-off-by: Hou Wenlong 
Signed-off-by: Lai Jiangshan 
---
 arch/x86/entry/entry_64.S  |  5 ++---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/paravirt.h| 14 +++---
 arch/x86/kernel/pvm.c  |  1 +
 arch/x86/xen/enlighten_pv.c|  1 +
 5 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index fe12605b3c05..6b41a1837698 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -127,9 +127,8 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, 
SYM_L_GLOBAL)
 * In the PVM guest case we must use eretu synthetic instruction.
 */
 
-   ALTERNATIVE_2 "testb %al, %al; jz 
swapgs_restore_regs_and_return_to_usermode", \
-   "jmp swapgs_restore_regs_and_return_to_usermode", 
X86_FEATURE_XENPV, \
-   "jmp swapgs_restore_regs_and_return_to_usermode", 
X86_FEATURE_KVM_PVM_GUEST
+   ALTERNATIVE "testb %al, %al; jz 
swapgs_restore_regs_and_return_to_usermode", \
+   "jmp swapgs_restore_regs_and_return_to_usermode", 
X86_FEATURE_PV_GUEST
 
/*
 * We win! This label is here just for ease of understanding
diff --git a/arch/x86/include/asm/cpufeatures.h 
b/arch/x86/include/asm/cpufeatures.h
index e17e72f13423..72ef58a2db19 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -238,6 +238,7 @@
 #define X86_FEATURE_VCPUPREEMPT( 8*32+21) /* "" PV 
vcpu_is_preempted function */
 #define X86_FEATURE_TDX_GUEST  ( 8*32+22) /* Intel Trust Domain 
Extensions Guest */
 #define X86_FEATURE_KVM_PVM_GUEST  ( 8*32+23) /* KVM Pagetable-based 
Virtual Machine guest */
+#define X86_FEATURE_PV_GUEST   ( 8*32+24) /* "" Paravirtual guest */
 
 /* Intel-defined CPU features, CPUID level 0x0007:0 (EBX), word 9 */
 #define X86_FEATURE_FSGSBASE   ( 9*32+ 0) /* RDFSBASE, WRFSBASE, 
RDGSBASE, WRGSBASE instructions*/
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index deaee9ec575e..a864ee481ca2 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -143,7 +143,7 @@ static __always_inline unsigned long read_cr2(void)
 {
return PVOP_ALT_CALLEE0(unsigned long, mmu.read_cr2,
"mov %%cr2, %%rax;",
-   ALT_NOT(X86_FEATURE_XENPV));
+   ALT_NOT(X86_FEATURE_PV_GUEST));
 }
 
 static __always_inline void write_cr2(unsigned long x)
@@ -154,13 +154,13 @@ static __always_inline void write_cr2(unsigned long x)
 static inline unsigned long __read_cr3(void)
 {
return PVOP_ALT_CALL0(unsigned long, mmu.read_cr3,
- "mov %%cr3, %%rax;", ALT_NOT(X86_FEATURE_XENPV));
+ "mov %%cr3, %%rax;", 
ALT_NOT(X86_FEATURE_PV_GUEST));
 }
 
 static inline void write_cr3(unsigned long x)
 {
PVOP_ALT_VCALL1(mmu.write_cr3, x,
-   "mov %%rdi, %%cr3", ALT_NOT(X86_FEATURE_XENPV));
+   "mov %%rdi, %%cr3", ALT_NOT(X86_FEATURE_PV_GUEST));
 }
 
 static inline void __write_cr4(unsigned long x)
@@ -694,17 +694,17 @@ bool __raw_callee_save___native_vcpu_is_preempted(long 
cpu);
 static __always_inline unsigned long arch_local_save_flags(void)
 {
return PVOP_ALT_CALLEE0(unsigned long, irq.save_fl, "pushf; pop %%rax;",
-   ALT_NOT(X86_FEATURE_XENPV));
+   ALT_NOT(X86_FEATURE_PV_GUEST));
 }
 
 static __always_inline void arch_local_irq_disable(void)
 {
-   PVOP_ALT_VCALLEE0(irq.irq_disable, "cli;", ALT_NOT(X86_FEATURE_XENPV));
+   PVOP_ALT_VCALLEE0(irq.irq_disable, "cli;", 
ALT_NOT(X86_FEATURE_PV_GUEST));
 }
 
 static __always_inline void arch_local_irq_enable(void)
 {
-   PVOP_ALT_VCALLEE0(irq.irq_enable, "sti;", ALT_NOT(X86_FEATURE_XENPV));
+   PVOP_ALT_VCALLEE0(irq.irq_enable, "sti;", 
ALT_NOT(X86_FEATURE_PV_GUEST));
 }
 
 static __always_inline unsigned long arch_local_irq_save(void)
@@ -776,7 +776,7 @@ void native_pv_lock_init(void) __init;
 .endm
 
 #define SAVE_FLAGS ALTERNATIVE "PARA_IRQ_save_fl;", "pushf; pop %rax;", \
-   ALT_NOT(X86_FEATURE_XENPV)
+   ALT_NOT(X86_FEATURE_PV_GUEST)
 #endif
 #endif /* CONFIG_PARAVIRT_XXL */
 #endif /* CONFIG_X86_64 */
diff --git a/arch/x86/kernel/pvm.c b/arch/x86/kernel/pvm.c
index c38e46a96ad3..d39550a8159f 100644
--- a/arch/x86/kernel/pvm.c
+++ b/arch/x86/kernel/pvm.c
@@ -300,6 +300,7 @@ void __init pvm_early_setup(void)
 

Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h

2024-02-26 Thread Oleksii
On Mon, 2024-02-26 at 15:20 +0100, Jan Beulich wrote:
> On 26.02.2024 13:58, Oleksii wrote:
> > On Mon, 2024-02-26 at 12:28 +0100, Jan Beulich wrote:
> > > On 26.02.2024 12:18, Oleksii wrote:
> > > > On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
> > > > > On 23.02.2024 13:23, Oleksii wrote:
> > > > > > > 
> > > > > > > > > > As 1- and 2-byte cases are emulated, I decided not
> > > > > > > > > > to provide the sfx argument for the emulation macros,
> > > > > > > > > > as it will not have much effect on emulated types and
> > > > > > > > > > will just consume more performance on the acquire and
> > > > > > > > > > release versions of the sc/ld instructions.
> > > > > > > > > 
> > > > > > > > > Question is whether the common case (4- and 8-byte
> > > > > > > > > accesses) shouldn't be valued higher, with 1- and 2-byte
> > > > > > > > > emulation being there just to allow things to not break
> > > > > > > > > altogether.
> > > > > > > > If I understand you correctly, it would make sense to add
> > > > > > > > the 'sfx' argument for the 1/2-byte access case, ensuring
> > > > > > > > that all options are available for the 1/2-byte access
> > > > > > > > case as well.
> > > > > > > 
> > > > > > > That's one of the possibilities. As said, I'm not overly
> > > > > > > worried about the emulated cases. For the initial
> > > > > > > implementation I'd recommend going with what is easiest
> > > > > > > there, yielding the best possible result for the 4- and
> > > > > > > 8-byte cases. If later it turns out repeated acquire/release
> > > > > > > accesses are a problem in the emulation loop, things can be
> > > > > > > changed to explicit barriers, without touching the 4- and
> > > > > > > 8-byte cases.
> > > > > > Then I am a little confused, if the emulated case is not an
> > > > > > issue.
> > > > > > 
> > > > > > For the 4- and 8-byte cases of xchg, .aqrl is used; for the
> > > > > > relaxed and acquire versions of xchg, barriers are used.
> > > > > > 
> > > > > > The same is done for cmpxchg.
> > > > > > 
> > > > > > If something needs to change in the emulation loop, it won't
> > > > > > require changing the 4- and 8-byte cases.
> > > > > 
> > > > > I'm afraid I don't understand your reply.
> > > > IIUC, the emulated cases are implemented correctly in terms of
> > > > barrier usage. And it is also OK not to use sfx for the lr/sc
> > > > instructions and to use only barriers.
> > > > 
> > > > For the 4- and 8-byte cases, sfx + barrier are used depending on
> > > > the specific case (relaxed, acquire, release, generic
> > > > xchg/cmpxchg), which also looks correct to me. But you suggested
> > > > providing the best possible result for the 4- and 8-byte cases.
> > > > 
> > > > So I don't understand what the best possible result is, as the
> > > > current usage of __{cmp}xchg_generic for each specific case
> > > > (relaxed, acquire, release, generic xchg/cmpxchg) looks correct
> > > > to me:
> > > > xchg -> (..., ".aqrl", "", "") just the .aqrl suffix, no barriers
> > > > xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "") only the
> > > > release barrier
> > > > xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER) only the
> > > > acquire barrier
> > > > xchg_relaxed -> (..., "", "", "") no barriers, no sfx
> > > 
> > > So first: While explicit barriers are technically okay, I don't
> > > follow why you insist on using them when you can achieve the same
> > > by suitably tagging the actual insn doing the exchange. Then
> > > second: It's somewhat hard for me to see the final effect on the
> > > emulation paths without you actually having done the switch. Maybe
> > > no special handling is necessary there anymore then. And as said,
> > > it may actually be acceptable for the emulation paths to "only" be
> > > correct, but not be ideal in terms of performance. After all, if
> > > you use the normal 4-byte primitive in there, more (non-explicit)
> > > barriers than needed would occur if the involved loop has to take
> > > more than one iteration. Which could (but imo doesn't need to be)
> > > avoided by using a more relaxed 4-byte primitive there and an
> > > explicit barrier outside of the loop.
> > 
> > According to the spec:
> > Table A.5 ( part of the table only I copied here )
> > 
> > Linux Construct       RVWMO AMO Mapping
> > atomic <op> relaxed   amo<op>.{w|d}

[RFC PATCH 56/73] x86/pvm: Relocate kernel image early in PVH entry

2024-02-26 Thread Lai Jiangshan
From: Hou Wenlong 

A PIE kernel runs at a high virtual address in the PVH entry path, so
the kernel image needs to be relocated early in the PVH entry for a
PVM guest.

Signed-off-by: Hou Wenlong 
Signed-off-by: Lai Jiangshan 
---
 arch/x86/include/asm/init.h   |  5 +
 arch/x86/kernel/head64_identity.c |  5 -
 arch/x86/platform/pvh/enlighten.c | 22 ++
 arch/x86/platform/pvh/head.S  |  4 
 4 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index cc9ccf61b6bd..f78edef60253 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -4,6 +4,11 @@
 
 #define __head __section(".head.text")
 
+#define SYM_ABS_VA(sym) ({ \
+   unsigned long __v;  \
+   asm("movabsq $" __stringify(sym) ", %0":"=r"(__v)); \
+   __v; })
+
 struct x86_mapping_info {
void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
void *context;   /* context for alloc_pgt_page */
diff --git a/arch/x86/kernel/head64_identity.c 
b/arch/x86/kernel/head64_identity.c
index 4e6a073d9e6c..f69f9904003c 100644
--- a/arch/x86/kernel/head64_identity.c
+++ b/arch/x86/kernel/head64_identity.c
@@ -82,11 +82,6 @@ static void __head set_kernel_map_base(unsigned long 
text_base)
 }
 #endif
 
-#define SYM_ABS_VA(sym) ({ \
-   unsigned long __v;  \
-   asm("movabsq $" __stringify(sym) ", %0":"=r"(__v)); \
-   __v; })
-
 static unsigned long __head sme_postprocess_startup(struct boot_params *bp, 
pmdval_t *pmd)
 {
unsigned long vaddr, vaddr_end;
diff --git a/arch/x86/platform/pvh/enlighten.c 
b/arch/x86/platform/pvh/enlighten.c
index 00a92cb2c814..8c64c31c971b 100644
--- a/arch/x86/platform/pvh/enlighten.c
+++ b/arch/x86/platform/pvh/enlighten.c
@@ -1,8 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
+#include 
 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -113,6 +115,26 @@ static void __init hypervisor_specific_init(bool xen_guest)
xen_pvh_init(&pvh_bootparams);
 }
 
+#ifdef CONFIG_PVM_GUEST
+void pvm_relocate_kernel(unsigned long physbase);
+
+void __init pvm_update_pgtable(unsigned long physbase)
+{
+   pgdval_t *pgd;
+   pudval_t *pud;
+   unsigned long base;
+
+   pvm_relocate_kernel(physbase);
+
+   pgd = (pgdval_t *)init_top_pgt;
+   base = SYM_ABS_VA(_text);
+   pgd[pgd_index(base)] = pgd[0];
+   pgd[pgd_index(page_offset_base)] = pgd[0];
+   pud = (pudval_t *)level3_ident_pgt;
+   pud[pud_index(base)] = (unsigned long)level2_ident_pgt + 
_KERNPG_TABLE_NOENC;
+}
+#endif
+
 /*
  * This routine (and those that it might call) should not use
  * anything that lives in .bss since that segment will be cleared later.
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index baaa3fe34a00..127f297f7257 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -109,6 +109,10 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
wrmsr
 
 #ifdef CONFIG_X86_PIE
+#ifdef CONFIG_PVM_GUEST
+   leaq_text(%rip), %rdi
+   callpvm_update_pgtable
+#endif
movabs  $2f, %rax
ANNOTATE_RETPOLINE_SAFE
jmp *%rax
-- 
2.19.1.6.gb485710b




Re: [PATCH 3/3] x86/entry: Introduce EFRAME_* constants

2024-02-26 Thread Jan Beulich
On 26.02.2024 13:55, Andrew Cooper wrote:
> restore_all_guest() does a lot of manipulation of the stack after popping the
> GPRs, and uses raw %rsp displacements to do so.  Also, almost all entrypaths
> use raw %rsp displacements prior to pushing GPRs.
> 
> Provide better mnemonics, to aid readability and reduce the chance of errors
> when editing.
> 
> No functional change.  The resulting binary is identical.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 
with one small request:

> --- a/xen/arch/x86/x86_64/asm-offsets.c
> +++ b/xen/arch/x86/x86_64/asm-offsets.c
> @@ -51,6 +51,23 @@ void __dummy__(void)
>  OFFSET(UREGS_kernel_sizeof, struct cpu_user_regs, es);
>  BLANK();
>  
> +/*
> + * EFRAME_* is for the entry/exit logic where %rsp is pointing at
> + * UREGS_error_code and GPRs are still guest values.
> + */

"still/already" or some such to match "entry/exit"?

Jan



Re: [PATCH 2/3] x86/entry: Simplify expressions in compat_restore_all_guest()

2024-02-26 Thread Jan Beulich
On 26.02.2024 13:55, Andrew Cooper wrote:
> compat_restore_all_guest() already has SPEC_CTRL_EXIT_TO_PV with a documented
> requirement for %rsp to be both regs and cpuinfo.
> 
> Use the now-normal annotations and simplify the expressions which happen to be
> a subtraction of 0.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 





Re: [PATCH 1/3] x86/entry: Adjustments to "reduce assembly code size of entry points"

2024-02-26 Thread Jan Beulich
On 26.02.2024 13:54, Andrew Cooper wrote:
> Some retroactive review, for if I'd got to the patch in time.
> 
>  * The new ASM-friendly BUILD_BUG_ON() should be in a header file.
>  * entry_int82() wants the movl->movb treatment too.
> 
> Fixes: c144b9e32427 ("x86: Reduce assembly code size of entry points")
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 

> Fixes just to link the two patches.

Hmm, I see.

> --- a/xen/arch/x86/include/asm/asm_defns.h
> +++ b/xen/arch/x86/include/asm/asm_defns.h
> @@ -56,6 +56,18 @@ register unsigned long current_stack_pointer asm("rsp");
>  #define ASSERT_INTERRUPTS_DISABLED \
>  ASSERT_INTERRUPT_STATUS(z, "INTERRUPTS DISABLED")
>  
> +#ifdef __ASSEMBLY__
> +
> +.macro BUILD_BUG_ON condstr cond:vararg
> +.if \cond
> +.error "Condition \"\condstr\" not satisfied"
> +.endif
> +.endm
> +/* preprocessor macro to make error message more user friendly */
> +#define BUILD_BUG_ON(cond) BUILD_BUG_ON #cond cond
> +
> +#endif /* __ASSEMBLY__ */
> +
>  #ifdef __ASSEMBLY__

Minor remark: Seeing these back to back is slightly odd. A little
higher up there's an "#ifndef __ASSEMBLY__" (producing e.g.
ASM_CALL_CONSTRAINT) in an #else of which this would imo be a
pretty good fit. Further down where SUBSECTION_LBL() is defined would
also look to be more suitable than adding yet another #ifdef.

Jan
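To see the macro pair from the hunk above in action, one can assemble a tiny standalone file with it (a sketch, not Xen's build; paths and condition are made up). A satisfied condition assembles cleanly, while a violated one trips the .error directive:

```shell
# Reproduce the BUILD_BUG_ON gas macro + CPP wrapper from the patch in a
# scratch file, then assemble it twice: once holding, once violated.
cat > /tmp/bbo.S <<'EOF'
.macro BUILD_BUG_ON condstr cond:vararg
.if \cond
.error "Condition \"\condstr\" not satisfied"
.endif
.endm
#define BUILD_BUG_ON(cond) BUILD_BUG_ON #cond cond

BUILD_BUG_ON(1 != 1)
EOF
gcc -c -x assembler-with-cpp /tmp/bbo.S -o /tmp/bbo.o && echo pass

# Flip the condition so it is violated: assembly must now fail.
sed 's/1 != 1/1 == 1/' /tmp/bbo.S > /tmp/bbo2.S
gcc -c -x assembler-with-cpp /tmp/bbo2.S -o /tmp/bbo2.o 2>/dev/null || echo caught
```

Note how the CPP wrapper stringifies the condition so the eventual .error message quotes it back, which is the "more user friendly" part.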



Re: [PATCH] tools/xentop: Add VBD3 support to xentop

2024-02-26 Thread Frediano Ziglio
On Mon, Feb 26, 2024 at 2:12 PM Fouad Hilly  wrote:
>
> From: Pritha Srivastava 
>
> xl now knows how to drive tapdisk, so libxenstat is modified to
> understand vbd3 statistics.
>
> Signed-off-by: Jorge Martin 
> Signed-off-by: Pritha Srivastava 
> Signed-off-by: Fouad Hilly 
> ---
> CC: Wei Liu 
> CC: Anthony PERARD 
> CC: Juergen Gross 
> ---
>  tools/libs/stat/xenstat_linux.c | 69 -
>  tools/libs/stat/xenstat_priv.h  | 16 
>  tools/xentop/xentop.c   |  1 +
>  3 files changed, 85 insertions(+), 1 deletion(-)
>
> diff --git a/tools/libs/stat/xenstat_linux.c b/tools/libs/stat/xenstat_linux.c
> index cbba54aa83ee..5a4a03634182 100644
> --- a/tools/libs/stat/xenstat_linux.c
> +++ b/tools/libs/stat/xenstat_linux.c
> @@ -390,6 +390,39 @@ void xenstat_uninit_networks(xenstat_handle * handle)
> fclose(priv->procnetdev);
>  }
>
> +static int read_attributes_vbd3(char *vbd3_path, xenstat_vbd *vbd)
> +{
> +   FILE *fp;
> +   struct vbd3_stats vbd3_stats;
> +
> +   fp = fopen(vbd3_path, "rb");
> +
> +   if (fp == NULL)
> +   {

The syntax of this if statement is different from the ones below.

> +   return -1;
> +   }
> +
> +   if (fread(&vbd3_stats, sizeof(struct vbd3_stats), 1, fp) != 1) {
> +   fclose(fp);
> +   return -1;
> +   }
> +
> +   if (vbd3_stats.version != 1) {
> +   fclose(fp);
> +   return -1;
> +   }
> +
> +   vbd->oo_reqs = vbd3_stats.oo_reqs;
> +   vbd->rd_reqs = vbd3_stats.read_reqs_submitted;
> +   vbd->rd_sects = vbd3_stats.read_sectors;
> +   vbd->wr_reqs = vbd3_stats.write_reqs_submitted;
> +   vbd->wr_sects = vbd3_stats.write_sectors;
> +
> +   fclose(fp);
> +
> +   return 0;
> +}
> +
>  static int read_attributes_vbd(const char *vbd_directory, const char *what, 
> char *ret, int cap)
>  {
> static char file_name[80];
> @@ -438,7 +471,7 @@ int xenstat_collect_vbds(xenstat_node * node)
> int ret;
> char buf[256];
>
> -   ret = sscanf(dp->d_name, "%3s-%u-%u", buf, &domid, &vbd.dev);
> +   ret = sscanf(dp->d_name, "%255[^-]-%u-%u", buf, &domid, 
> &vbd.dev);
> if (ret != 3)
> continue;
> if (!(strstr(buf, "vbd")) && !(strstr(buf, "tap")))
> @@ -448,6 +481,8 @@ int xenstat_collect_vbds(xenstat_node * node)
> vbd.back_type = 1;
> else if (strcmp(buf,"tap") == 0)
> vbd.back_type = 2;
> +   else if (strcmp(buf,"vbd3") == 0)
> +   vbd.back_type = 3;
> else
> vbd.back_type = 0;
>
> @@ -479,6 +514,38 @@ int xenstat_collect_vbds(xenstat_node * node)
> vbd.error = 1;
> }
> }
> +   else if (vbd.back_type == 3)
> +   {
> +
> +   char *td3_pid;
> +   char *path;
> +
> +   vbd.back_type = 3;
> +   vbd.error = 0;
> +
> +   if (asprintf(&path, 
> "/local/domain/0/backend/vbd3/%u/%u/kthread-pid", domid, vbd.dev) < 0)
> +   continue;
> +
> +   td3_pid = xs_read(node->handle->xshandle, XBT_NULL, 
> path, NULL);
> +
> +   if (td3_pid == NULL) {
> +   free(path);
> +   continue;
> +   }
> +
> +   free(path);

Why not freeing "path" unconditionally? We free it in all cases.

> +
> +   if (asprintf(&path, "/dev/shm/td3-%s/vbd-%u-%u", 
> td3_pid, domid, vbd.dev) < 0) {
> +   free(td3_pid);
> +   continue;
> +   }
> +
> +   if (read_attributes_vbd3(path, &vbd) < 0)
> +   vbd.error = 1;
> +
> +   free(td3_pid);
> +   free(path);
> +   }
> else
> {
> vbd.error = 1;
> diff --git a/tools/libs/stat/xenstat_priv.h b/tools/libs/stat/xenstat_priv.h
> index 4eb44a8ebb84..c3a9635240e9 100644
> --- a/tools/libs/stat/xenstat_priv.h
> +++ b/tools/libs/stat/xenstat_priv.h
> @@ -98,6 +98,22 @@ struct xenstat_vbd {
> unsigned long long wr_sects;
>  };
>
> +struct vbd3_stats {
> +   uint32_t version;
> +   uint32_t __pad;
> +   uint64_t oo_reqs;
> +   uint64_t read_reqs_submitted;
> +   uint64_t read_reqs_completed;
> +   uint64_t read_sectors;
> +   uint64_t read_total_ticks;
> +   uint64_t write_reqs_submitted;
> +   uint64_t write_reqs_completed;
> +   uint64_t write_sectors;
> +   uint64_t write_total_ticks;
> +   uint64_t io_errors;
> +  

Re: [PATCH] tools/xentop: Add VBD3 support to xentop

2024-02-26 Thread Jan Beulich
On 26.02.2024 15:12, Fouad Hilly wrote:
> From: Pritha Srivastava 
> 
> xl now knows how to drive tapdisk, so libxenstat is modified to
> understand vbd3 statistics.
> 
> Signed-off-by: Jorge Martin 
> Signed-off-by: Pritha Srivastava 

Just a formal question (I'm not really qualified to review this change):
With the above two S-o-b and the earlier From: - who is the original
author of this patch? The general expectation is for the 1st S-o-b to
be the author's.

Jan

> Signed-off-by: Fouad Hilly 
> ---
> CC: Wei Liu 
> CC: Anthony PERARD 
> CC: Juergen Gross 
> ---
>  tools/libs/stat/xenstat_linux.c | 69 -
>  tools/libs/stat/xenstat_priv.h  | 16 
>  tools/xentop/xentop.c   |  1 +
>  3 files changed, 85 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/libs/stat/xenstat_linux.c b/tools/libs/stat/xenstat_linux.c
> index cbba54aa83ee..5a4a03634182 100644
> --- a/tools/libs/stat/xenstat_linux.c
> +++ b/tools/libs/stat/xenstat_linux.c
> @@ -390,6 +390,39 @@ void xenstat_uninit_networks(xenstat_handle * handle)
>   fclose(priv->procnetdev);
>  }
>  
> +static int read_attributes_vbd3(char *vbd3_path, xenstat_vbd *vbd)
> +{
> + FILE *fp;
> + struct vbd3_stats vbd3_stats;
> +
> + fp = fopen(vbd3_path, "rb");
> +
> + if (fp == NULL)
> + {
> + return -1;
> + }
> +
> + if (fread(&vbd3_stats, sizeof(struct vbd3_stats), 1, fp) != 1) {
> + fclose(fp);
> + return -1;
> + }
> +
> + if (vbd3_stats.version != 1) {
> + fclose(fp);
> + return -1;
> + }
> +
> + vbd->oo_reqs = vbd3_stats.oo_reqs;
> + vbd->rd_reqs = vbd3_stats.read_reqs_submitted;
> + vbd->rd_sects = vbd3_stats.read_sectors;
> + vbd->wr_reqs = vbd3_stats.write_reqs_submitted;
> + vbd->wr_sects = vbd3_stats.write_sectors;
> +
> + fclose(fp);
> +
> + return 0;
> +}
> +
>  static int read_attributes_vbd(const char *vbd_directory, const char *what, 
> char *ret, int cap)
>  {
>   static char file_name[80];
> @@ -438,7 +471,7 @@ int xenstat_collect_vbds(xenstat_node * node)
>   int ret;
>   char buf[256];
>  
> - ret = sscanf(dp->d_name, "%3s-%u-%u", buf, &domid, &vbd.dev);
> + ret = sscanf(dp->d_name, "%255[^-]-%u-%u", buf, &domid, 
> &vbd.dev);
>   if (ret != 3)
>   continue;
>   if (!(strstr(buf, "vbd")) && !(strstr(buf, "tap")))
> @@ -448,6 +481,8 @@ int xenstat_collect_vbds(xenstat_node * node)
>   vbd.back_type = 1;
>   else if (strcmp(buf,"tap") == 0)
>   vbd.back_type = 2;
> + else if (strcmp(buf,"vbd3") == 0)
> + vbd.back_type = 3;
>   else
>   vbd.back_type = 0;
>  
> @@ -479,6 +514,38 @@ int xenstat_collect_vbds(xenstat_node * node)
>   vbd.error = 1;
>   }
>   }
> + else if (vbd.back_type == 3)
> + {
> +
> + char *td3_pid;
> + char *path;
> +
> + vbd.back_type = 3;
> + vbd.error = 0;
> +
> + if (asprintf(&path, 
> "/local/domain/0/backend/vbd3/%u/%u/kthread-pid", domid, vbd.dev) < 0)
> + continue;
> +
> + td3_pid = xs_read(node->handle->xshandle, XBT_NULL, 
> path, NULL);
> +
> + if (td3_pid == NULL) {
> + free(path);
> + continue;
> + }
> +
> + free(path);
> +
> + if (asprintf(&path, "/dev/shm/td3-%s/vbd-%u-%u", 
> td3_pid, domid, vbd.dev) < 0) {
> + free(td3_pid);
> + continue;
> + }
> +
> + if (read_attributes_vbd3(path, &vbd) < 0)
> + vbd.error = 1;
> +
> + free(td3_pid);
> + free(path);
> + }
>   else
>   {
>   vbd.error = 1;
> diff --git a/tools/libs/stat/xenstat_priv.h b/tools/libs/stat/xenstat_priv.h
> index 4eb44a8ebb84..c3a9635240e9 100644
> --- a/tools/libs/stat/xenstat_priv.h
> +++ b/tools/libs/stat/xenstat_priv.h
> @@ -98,6 +98,22 @@ struct xenstat_vbd {
>   unsigned long long wr_sects;
>  };
>  
> +struct vbd3_stats {
> + uint32_t version;
> + uint32_t __pad;
> + uint64_t oo_reqs;
> + uint64_t read_reqs_submitted;
> + uint64_t read_reqs_completed;
> + uint64_t read_sectors;
> + uint64_t read_total_ticks;
> + uint64_t write_reqs_submitted;
> + uint64_t write_reqs_completed;
> + uint64_t write_sectors;
> + uint64_t write_total_ticks;
> + uint64_t io_errors;
> + uint64_t flags;
> +};
> +
>  extern int xenstat_collect_net

Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h

2024-02-26 Thread Jan Beulich
On 26.02.2024 13:58, Oleksii wrote:
> On Mon, 2024-02-26 at 12:28 +0100, Jan Beulich wrote:
>> On 26.02.2024 12:18, Oleksii wrote:
>>> On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
 On 23.02.2024 13:23, Oleksii wrote:
>>
> As 1- and 2-byte cases are emulated, I decided not to provide the
> sfx argument for the emulation macros, as it will not have much
> effect on emulated types and will just consume more performance on
> the acquire and release versions of the sc/ld instructions.

 Question is whether the common case (4- and 8-byte accesses)
 shouldn't be valued higher, with 1- and 2-byte emulation being there
 just to allow things to not break altogether.
>>> If I understand you correctly, it would make sense to add the
>>> 'sfx' argument for the 1/2-byte access case, ensuring that all
>>> options are available for the 1/2-byte access case as well.
>>
>> That's one of the possibilities. As said, I'm not overly worried
>> about the emulated cases. For the initial implementation I'd
>> recommend going with what is easiest there, yielding the best
>> possible result for the 4- and 8-byte cases. If later it turns out
>> repeated acquire/release accesses are a problem in the emulation
>> loop, things can be changed to explicit barriers, without touching
>> the 4- and 8-byte cases.
> Then I am a little confused, if the emulated case is not an issue.
>
> For the 4- and 8-byte cases of xchg, .aqrl is used; for the relaxed
> and acquire versions of xchg, barriers are used.
>
> The same is done for cmpxchg.
>
> If something needs to change in the emulation loop, it won't require
> changing the 4- and 8-byte cases.

 I'm afraid I don't understand your reply.
>>> IIUC, the emulated cases are implemented correctly in terms of
>>> barrier usage. And it is also OK not to use sfx for the lr/sc
>>> instructions and to use only barriers.
>>>
>>> For the 4- and 8-byte cases, sfx + barrier are used depending on the
>>> specific case (relaxed, acquire, release, generic xchg/cmpxchg),
>>> which also looks correct to me. But you suggested providing the best
>>> possible result for the 4- and 8-byte cases.
>>>
>>> So I don't understand what the best possible result is, as the
>>> current usage of __{cmp}xchg_generic for each specific case (relaxed,
>>> acquire, release, generic xchg/cmpxchg) looks correct to me:
>>> xchg -> (..., ".aqrl", "", "") just the .aqrl suffix, no barriers
>>> xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "") only the release
>>> barrier
>>> xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER) only the acquire
>>> barrier
>>> xchg_relaxed -> (..., "", "", "") no barriers, no sfx
>>
>> So first: While explicit barriers are technically okay, I don't
>> follow why you insist on using them when you can achieve the same by
>> suitably tagging the actual insn doing the exchange. Then second:
>> It's somewhat hard for me to see the final effect on the emulation
>> paths without you actually having done the switch. Maybe no special
>> handling is necessary there anymore then. And as said, it may
>> actually be acceptable for the emulation paths to "only" be correct,
>> but not be ideal in terms of performance. After all, if you use the
>> normal 4-byte primitive in there, more (non-explicit) barriers than
>> needed would occur if the involved loop has to take more than one
>> iteration. Which could (but imo doesn't need to be) avoided by using
>> a more relaxed 4-byte primitive there and an explicit barrier outside
>> of the loop.
> 
> According to the spec:
> Table A.5 ( part of the table only I copied here )
> 
> Linux Construct       RVWMO AMO Mapping
> atomic <op> relaxed   amo<op>.{w|d}
> atomic <op> acquire   amo<op>.{w|d}.aq
> atomic <op> release   amo<op>.{w|d}.rl
> atomic <op>           amo<op>.{w|d}.aqrl
> 
> Linux Construct       RVWMO LR/SC Mapping
> atomic <op> relaxed   loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
> atomic <op> acquire   loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
> atomic <op> release   loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗; bnez loop OR
>                       fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗; bnez loop
> atomic <op>           loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop

In your consideration what to implement you will want to limit
things to constructs we actually use. I can't find any use of the
relaxed, acquire, or release forms of atomics as mentioned above.

> The Linux mappings for release operations may seem stronger than
> necessary, but these mappings
> are needed to cover some cases in which Linux requires stronger
> orderings than the more in

[PATCH] tools/xentop: Add VBD3 support to xentop

2024-02-26 Thread Fouad Hilly
From: Pritha Srivastava 

xl now knows how to drive tapdisk, so libxenstat is modified to
understand vbd3 statistics.

Signed-off-by: Jorge Martin 
Signed-off-by: Pritha Srivastava 
Signed-off-by: Fouad Hilly 
---
CC: Wei Liu 
CC: Anthony PERARD 
CC: Juergen Gross 
---
 tools/libs/stat/xenstat_linux.c | 69 -
 tools/libs/stat/xenstat_priv.h  | 16 
 tools/xentop/xentop.c   |  1 +
 3 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/tools/libs/stat/xenstat_linux.c b/tools/libs/stat/xenstat_linux.c
index cbba54aa83ee..5a4a03634182 100644
--- a/tools/libs/stat/xenstat_linux.c
+++ b/tools/libs/stat/xenstat_linux.c
@@ -390,6 +390,39 @@ void xenstat_uninit_networks(xenstat_handle * handle)
fclose(priv->procnetdev);
 }
 
+static int read_attributes_vbd3(char *vbd3_path, xenstat_vbd *vbd)
+{
+   FILE *fp;
+   struct vbd3_stats vbd3_stats;
+
+   fp = fopen(vbd3_path, "rb");
+
+   if (fp == NULL)
+   {
+   return -1;
+   }
+
+   if (fread(&vbd3_stats, sizeof(struct vbd3_stats), 1, fp) != 1) {
+   fclose(fp);
+   return -1;
+   }
+
+   if (vbd3_stats.version != 1) {
+   fclose(fp);
+   return -1;
+   }
+
+   vbd->oo_reqs = vbd3_stats.oo_reqs;
+   vbd->rd_reqs = vbd3_stats.read_reqs_submitted;
+   vbd->rd_sects = vbd3_stats.read_sectors;
+   vbd->wr_reqs = vbd3_stats.write_reqs_submitted;
+   vbd->wr_sects = vbd3_stats.write_sectors;
+
+   fclose(fp);
+
+   return 0;
+}
+
 static int read_attributes_vbd(const char *vbd_directory, const char *what, 
char *ret, int cap)
 {
static char file_name[80];
@@ -438,7 +471,7 @@ int xenstat_collect_vbds(xenstat_node * node)
int ret;
char buf[256];
 
-   ret = sscanf(dp->d_name, "%3s-%u-%u", buf, &domid, &vbd.dev);
+   ret = sscanf(dp->d_name, "%255[^-]-%u-%u", buf, &domid, 
&vbd.dev);
if (ret != 3)
continue;
if (!(strstr(buf, "vbd")) && !(strstr(buf, "tap")))
@@ -448,6 +481,8 @@ int xenstat_collect_vbds(xenstat_node * node)
vbd.back_type = 1;
else if (strcmp(buf,"tap") == 0)
vbd.back_type = 2;
+   else if (strcmp(buf,"vbd3") == 0)
+   vbd.back_type = 3;
else
vbd.back_type = 0;
 
@@ -479,6 +514,38 @@ int xenstat_collect_vbds(xenstat_node * node)
vbd.error = 1;
}
}
+   else if (vbd.back_type == 3)
+   {
+
+   char *td3_pid;
+   char *path;
+
+   vbd.back_type = 3;
+   vbd.error = 0;
+
+   if (asprintf(&path, 
"/local/domain/0/backend/vbd3/%u/%u/kthread-pid", domid, vbd.dev) < 0)
+   continue;
+
+   td3_pid = xs_read(node->handle->xshandle, XBT_NULL, 
path, NULL);
+
+   if (td3_pid == NULL) {
+   free(path);
+   continue;
+   }
+
+   free(path);
+
+   if (asprintf(&path, "/dev/shm/td3-%s/vbd-%u-%u", 
td3_pid, domid, vbd.dev) < 0) {
+   free(td3_pid);
+   continue;
+   }
+
+   if (read_attributes_vbd3(path, &vbd) < 0)
+   vbd.error = 1;
+
+   free(td3_pid);
+   free(path);
+   }
else
{
vbd.error = 1;
diff --git a/tools/libs/stat/xenstat_priv.h b/tools/libs/stat/xenstat_priv.h
index 4eb44a8ebb84..c3a9635240e9 100644
--- a/tools/libs/stat/xenstat_priv.h
+++ b/tools/libs/stat/xenstat_priv.h
@@ -98,6 +98,22 @@ struct xenstat_vbd {
unsigned long long wr_sects;
 };
 
+struct vbd3_stats {
+   uint32_t version;
+   uint32_t __pad;
+   uint64_t oo_reqs;
+   uint64_t read_reqs_submitted;
+   uint64_t read_reqs_completed;
+   uint64_t read_sectors;
+   uint64_t read_total_ticks;
+   uint64_t write_reqs_submitted;
+   uint64_t write_reqs_completed;
+   uint64_t write_sectors;
+   uint64_t write_total_ticks;
+   uint64_t io_errors;
+   uint64_t flags;
+};
+
 extern int xenstat_collect_networks(xenstat_node * node);
 extern void xenstat_uninit_networks(xenstat_handle * handle);
 extern int xenstat_collect_vbds(xenstat_node * node);
diff --git a/tools/xentop/xentop.c b/tools/xentop/xentop.c
index 0a2fab7f15a3..f5a456fd4dfd 100644
--- a/tools/xentop/xentop.c
+++ b/tools/xentop/xentop.c
@@ -1124,6 +1124,7 @@ void do_vbd(xenstat_domain *domain)
"Unidentified",   /* number 0

Re: [PATCH v2] x86/HVM: limit upcall vector related verbosity

2024-02-26 Thread Andrew Cooper
On 18/12/2023 7:26 am, Jan Beulich wrote:
> Avoid logging all-identical messages for every vCPU, but make sure to
> log unusual events like the vector differing from vCPU 0's (note that
> the respective condition also makes sure vCPU 0 itself will have the
> vector setting logged), or it changing after it was once set. (Arguably
> a downside is that some vCPU not having its vector set would no longer
> be recognizable from the logs. But I think that's tolerable as
> sufficiently unlikely outside of people actively fiddling with related
> code.)
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



Re: [PATCH v2] x86: Dom0 "broken ELF" reporting adjustments

2024-02-26 Thread Andrew Cooper
On 22/01/2024 1:41 pm, Jan Beulich wrote:
> elf_load_binary() isn't the primary source of brokenness being
> indicated. Therefore make the respective PVH log message there
> conditional (much like PV has it), and add another instance when
> elf_xen_parse() failed (again matching behavior in the PV case).
>
> Make the PV side match the (new) use of %pd here.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h

2024-02-26 Thread Oleksii
On Mon, 2024-02-26 at 12:28 +0100, Jan Beulich wrote:
> On 26.02.2024 12:18, Oleksii wrote:
> > On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
> > > On 23.02.2024 13:23, Oleksii wrote:
> > > > > 
> > > > > > > > As 1- and 2-byte cases are emulated I decided that is
> > > > > > > > not
> > > > > > > > to
> > > > > > > > provide
> > > > > > > > sfx argument for emulation macros as it will not have
> > > > > > > > to
> > > > > > > > much
> > > > > > > > affect on
> > > > > > > > emulated types and just consume more performance on
> > > > > > > > acquire
> > > > > > > > and
> > > > > > > > release
> > > > > > > > version of sc/ld instructions.
> > > > > > > 
> > > > > > > Question is whether the common case (4- and 8-byte
> > > > > > > accesses)
> > > > > > > shouldn't
> > > > > > > be valued higher, with 1- and 2-byte emulation being
> > > > > > > there
> > > > > > > just
> > > > > > > to
> > > > > > > allow things to not break altogether.
> > > > > > If I understand you correctly, it would make sense to add
> > > > > > the
> > > > > > 'sfx'
> > > > > > argument for the 1/2-byte access case, ensuring that all
> > > > > > options
> > > > > > are
> > > > > > available for 1/2-byte access case as well.
> > > > > 
> > > > > That's one of the possibilities. As said, I'm not overly
> > > > > worried
> > > > > about
> > > > > the emulated cases. For the initial implementation I'd
> > > > > recommend
> > > > > going
> > > > > with what is easiest there, yielding the best possible result
> > > > > for
> > > > > the
> > > > > 4- and 8-byte cases. If later it turns out repeated
> > > > > acquire/release
> > > > > accesses are a problem in the emulation loop, things can be
> > > > > changed
> > > > > to explicit barriers, without touching the 4- and 8-byte
> > > > > cases.
> > > > I am confused then a little bit if emulated case is not an
> > > > issue.
> > > > 
> > > > For 4- and 8-byte cases for xchg .aqrl is used, for relaxed and
> > > > aqcuire
> > > > version of xchg barries are used.
> > > > 
> > > > The similar is done for cmpxchg.
> > > > 
> > > > If something will be needed to change in emulation loop it
> > > > won't
> > > > require to change 4- and 8-byte cases.
> > > 
> > > I'm afraid I don't understand your reply.
> > IIUC, emulated cases it is implemented correctly in terms of usage
> > barriers. And it also OK not to use sfx for lr/sc instructions and
> > use
> > only barriers.
> > 
> > For 4- and 8-byte cases are used sfx + barrier depending on the
> > specific case ( relaxed, acquire, release, generic xchg/cmpxchg ).
> > What also looks to me correct. But you suggested to provide the
> > best
> > possible result for 4- and 8-byte cases. 
> > 
> > So I don't understand what the best possible result is as the
> > current
> > one usage of __{cmp}xchg_generic for each specific case  ( relaxed,
> > acquire, release, generic xchg/cmpxchg ) looks correct to me:
> > xchg -> (..., ".aqrl", "", "") just suffix .aqrl suffix without
> > barriers.
> > xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) use only
> > release
> > barrier
> > xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ), only acquire
> > barrier
> > xchg_relaxed ->  (..., "", "", "") - no barries, no sfx
> 
> So first: While explicit barriers are technically okay, I don't
> follow why
> you insist on using them when you can achieve the same by suitably
> tagging
> the actual insn doing the exchange. Then second: It's somewhat hard
> for me
> to see the final effect on the emulation paths without you actually
> having
> done the switch. Maybe no special handling is necessary there anymore
> then. And as said, it may actually be acceptable for the emulation
> paths
> to "only" be correct, but not be ideal in terms of performance. After
> all,
> if you use the normal 4-byte primitive in there, more (non-explicit)
> barriers than needed would occur if the involved loop has to take
> more
> than one iteration. Which could (but imo doesn't need to be) avoided
> by
> using a more relaxed 4-byte primitive there and an explicit barrier
> outside of the loop.

According to the spec:
Table A.5 ( part of the table only I copied here )

Linux Construct  RVWMO Mapping
atomic <op> relaxed   amo<op>.{w|d}
atomic <op> acquire   amo<op>.{w|d}.aq
atomic <op> release   amo<op>.{w|d}.rl
atomic <op>           amo<op>.{w|d}.aqrl

Linux Construct  RVWMO LR/SC Mapping
atomic <op> relaxed   loop: lr.{w|d}; <op>; sc.{w|d}; bnez loop
atomic <op> acquire   loop: lr.{w|d}.aq; <op>; sc.{w|d}; bnez loop
atomic <op> release   loop: lr.{w|d}; <op>; sc.{w|d}.aqrl∗; bnez loop OR
                      fence.tso; loop: lr.{w|d}; <op>; sc.{w|d}∗; bnez loop
atomic <op>           loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop

The Linux mappings for release operations may seem stronger than
necessary, but these mappings
are needed to cover some cases in which Linux requires stronger
orderings than the more intuitive
mappings would provide. In particular, as of the time t

[xen-unstable test] 184763: tolerable FAIL

2024-02-26 Thread osstest service owner
flight 184763 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184763/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 12 debian-hvm-install fail in 
184756 pass in 184763
 test-armhf-armhf-xl-credit2   8 xen-boot   fail pass in 184756

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-credit2 15 migrate-support-check fail in 184756 never pass
 test-armhf-armhf-xl-credit2 16 saverestore-support-check fail in 184756 never 
pass
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 184756
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 184756
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 184756
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 184756
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 184756
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 184756
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 184756
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 184756
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 184756
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 184756
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 184756
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 184756
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  92babc88f67ed0ef3dc575a8b9534040274678ee
baseline version:
 xen  92babc88f67ed0ef3dc575a8b9534040274678ee

Last test of basis   184763  2024-02-26 01:52:19 Z

[PATCH 1/3] x86/entry: Adjustments to "reduce assembly code size of entry points"

2024-02-26 Thread Andrew Cooper
Some retroactive review, for if I'd got to the patch in time.

 * The new ASM-friendly BUILD_BUG_ON() should be in a header file.
 * entry_int82() wants the movl->movb treatment too.

Fixes: c144b9e32427 ("x86: Reduce assembly code size of entry points")
Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

Fixes just to link the two patches.
---
 xen/arch/x86/include/asm/asm_defns.h | 12 
 xen/arch/x86/x86_64/compat/entry.S   |  2 +-
 xen/arch/x86/x86_64/entry.S  |  8 
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/include/asm/asm_defns.h b/xen/arch/x86/include/asm/asm_defns.h
index a9a6c21c76cd..f18a11b36198 100644
--- a/xen/arch/x86/include/asm/asm_defns.h
+++ b/xen/arch/x86/include/asm/asm_defns.h
@@ -56,6 +56,18 @@ register unsigned long current_stack_pointer asm("rsp");
 #define ASSERT_INTERRUPTS_DISABLED \
 ASSERT_INTERRUPT_STATUS(z, "INTERRUPTS DISABLED")
 
+#ifdef __ASSEMBLY__
+
+.macro BUILD_BUG_ON condstr cond:vararg
+.if \cond
+.error "Condition \"\condstr\" not satisfied"
+.endif
+.endm
+/* preprocessor macro to make error message more user friendly */
+#define BUILD_BUG_ON(cond) BUILD_BUG_ON #cond cond
+
+#endif /* __ASSEMBLY__ */
+
 #ifdef __ASSEMBLY__
 # define _ASM_EX(p) p-.
 #else
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index d4f0e4804090..93fbbeb4ae18 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -15,7 +15,7 @@ FUNC(entry_int82)
 ENDBR64
 ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
 pushq $0
-movl  $HYPERCALL_VECTOR, 4(%rsp)
+movb  $HYPERCALL_VECTOR, 4(%rsp)
 SAVE_ALL compat=1 /* DPL1 gate, restricted to 32bit PV guests only. */
 
 SPEC_CTRL_ENTRY_FROM_PV /* Req: %rsp=regs/cpuinfo, %rdx=0, Clob: acd */
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index a7bd8f0ca5b1..f8938b0b42fd 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -22,14 +22,6 @@
 #endif
 .endm
 
-.macro BUILD_BUG_ON condstr cond:vararg
-.if \cond
-.error "Condition \"\condstr\" not satisfied"
-.endif
-.endm
-/* preprocessor macro to make error message more user friendly */
-#define BUILD_BUG_ON(cond) BUILD_BUG_ON #cond cond
-
 #ifdef CONFIG_PV
 /* %rbx: struct vcpu */
 FUNC_LOCAL(switch_to_kernel)
-- 
2.30.2




[PATCH 2/3] x86/entry: Simplify expressions in compat_restore_all_guest()

2024-02-26 Thread Andrew Cooper
compat_restore_all_guest() already has SPEC_CTRL_EXIT_TO_PV with a documented
requirement for %rsp to be both regs and cpuinfo.

Use the now-normal annotations and simplify the expressions which happen to be
a subtraction of 0.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/x86_64/compat/entry.S | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 93fbbeb4ae18..727ab65290de 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -117,19 +117,19 @@ compat_process_trap:
 jmp   compat_test_all_events
 END(compat_test_all_events)
 
-/* %rbx: struct vcpu, interrupts disabled */
+/* %rbx: struct vcpu, interrupts disabled, %rsp=regs/cpuinfo */
 FUNC(compat_restore_all_guest)
 ASSERT_INTERRUPTS_DISABLED
 mov   $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), %r11d
 and   UREGS_eflags(%rsp),%r11d
 
-.macro alt_cr4_pv32
+.macro alt_cr4_pv32 /* %rsp=regs/cpuinfo */
 testb $3,UREGS_cs(%rsp)
 jpe   2f
-mov   CPUINFO_cr4-CPUINFO_guest_cpu_user_regs(%rsp), %rax
+mov   CPUINFO_cr4(%rsp), %rax
 and   $~XEN_CR4_PV32_BITS, %rax
 1:
-mov   %rax, CPUINFO_cr4-CPUINFO_guest_cpu_user_regs(%rsp)
+mov   %rax, CPUINFO_cr4(%rsp)
 mov   %rax, %cr4
 /*
  * An NMI or MCE may have occurred between the previous two
@@ -141,7 +141,7 @@ FUNC(compat_restore_all_guest)
  * loop to cause a live lock: If NMIs/MCEs occurred at that
  * high a rate, we'd be live locked anyway.
  */
-cmp   %rax, CPUINFO_cr4-CPUINFO_guest_cpu_user_regs(%rsp)
+cmp   %rax, CPUINFO_cr4(%rsp)
 jne   1b
 2:
 .endm
-- 
2.30.2




[PATCH 3/3] x86/entry: Introduce EFRAME_* constants

2024-02-26 Thread Andrew Cooper
restore_all_guest() does a lot of manipulation of the stack after popping the
GPRs, and uses raw %rsp displacements to do so.  Also, almost all entrypaths
use raw %rsp displacements prior to pushing GPRs.

Provide better mnemonics, to aid readability and reduce the chance of errors
when editing.

No functional change.  The resulting binary is identical.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/x86_64/asm-offsets.c  | 17 
 xen/arch/x86/x86_64/compat/entry.S |  2 +-
 xen/arch/x86/x86_64/entry.S| 68 +++---
 3 files changed, 52 insertions(+), 35 deletions(-)

diff --git a/xen/arch/x86/x86_64/asm-offsets.c b/xen/arch/x86/x86_64/asm-offsets.c
index fee0edc61abb..4cc23cd032c1 100644
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -51,6 +51,23 @@ void __dummy__(void)
 OFFSET(UREGS_kernel_sizeof, struct cpu_user_regs, es);
 BLANK();
 
+/*
+ * EFRAME_* is for the entry/exit logic where %rsp is pointing at
+ * UREGS_error_code and GPRs are still guest values.
+ */
+#define OFFSET_EF(sym, mem) \
+DEFINE(sym, offsetof(struct cpu_user_regs, mem) -   \
+offsetof(struct cpu_user_regs, error_code))
+
+OFFSET_EF(EFRAME_entry_vector,entry_vector);
+OFFSET_EF(EFRAME_rip, rip);
+OFFSET_EF(EFRAME_cs,  cs);
+OFFSET_EF(EFRAME_eflags,  eflags);
+OFFSET_EF(EFRAME_rsp, rsp);
+BLANK();
+
+#undef OFFSET_EF
+
 OFFSET(VCPU_processor, struct vcpu, processor);
 OFFSET(VCPU_domain, struct vcpu, domain);
 OFFSET(VCPU_vcpu_info, struct vcpu, vcpu_info_area.map);
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 727ab65290de..2f8fe5ebfbe4 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -15,7 +15,7 @@ FUNC(entry_int82)
 ENDBR64
 ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
 pushq $0
-movb  $HYPERCALL_VECTOR, 4(%rsp)
+movb  $HYPERCALL_VECTOR, EFRAME_entry_vector(%rsp)
 SAVE_ALL compat=1 /* DPL1 gate, restricted to 32bit PV guests only. */
 
 SPEC_CTRL_ENTRY_FROM_PV /* Req: %rsp=regs/cpuinfo, %rdx=0, Clob: acd */
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index f8938b0b42fd..1b846f3aaff0 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -188,15 +188,15 @@ FUNC_LOCAL(restore_all_guest)
 
 RESTORE_ALL
 BUILD_BUG_ON(TRAP_syscall & 0xff)
-testb $TRAP_syscall >> 8, 4+1(%rsp)
+testb $TRAP_syscall >> 8, EFRAME_entry_vector + 1(%rsp)
 jziret_exit_to_guest
 
-movq  24(%rsp),%r11   # RFLAGS
+mov   EFRAME_eflags(%rsp), %r11
 andq  $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), %r11
 orq   $X86_EFLAGS_IF,%r11
 
 /* Don't use SYSRET path if the return address is not canonical. */
-movq  8(%rsp),%rcx
+mov   EFRAME_rip(%rsp), %rcx
 sarq  $47,%rcx
 incl  %ecx
 cmpl  $1,%ecx
@@ -211,19 +211,19 @@ FUNC_LOCAL(restore_all_guest)
 ALTERNATIVE "", rag_clrssbsy, X86_FEATURE_XEN_SHSTK
 #endif
 
-movq  8(%rsp), %rcx   # RIP
-cmpw  $FLAT_USER_CS32,16(%rsp)# CS
-movq  32(%rsp),%rsp   # RSP
+mov   EFRAME_rip(%rsp), %rcx
+cmpw  $FLAT_USER_CS32, EFRAME_cs(%rsp)
+mov   EFRAME_rsp(%rsp), %rsp
 je1f
 sysretq
 1:  sysretl
 
 LABEL_LOCAL(.Lrestore_rcx_iret_exit_to_guest)
-movq  8(%rsp), %rcx   # RIP
+mov   EFRAME_rip(%rsp), %rcx
 /* No special register assumptions. */
 iret_exit_to_guest:
-andl  $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), 24(%rsp)
-orl   $X86_EFLAGS_IF,24(%rsp)
+andl  $~(X86_EFLAGS_IOPL | X86_EFLAGS_VM), EFRAME_eflags(%rsp)
+orl   $X86_EFLAGS_IF, EFRAME_eflags(%rsp)
 addq  $8,%rsp
 .Lft0:  iretq
 _ASM_PRE_EXTABLE(.Lft0, handle_exception)
@@ -256,7 +256,7 @@ FUNC(lstar_enter)
 pushq %rcx
 pushq $0
 BUILD_BUG_ON(TRAP_syscall & 0xff)
-movb  $TRAP_syscall >> 8, 4+1(%rsp)
+movb  $TRAP_syscall >> 8, EFRAME_entry_vector + 1(%rsp)
 SAVE_ALL
 
 SPEC_CTRL_ENTRY_FROM_PV /* Req: %rsp=regs/cpuinfo, %rdx=0, Clob: acd */
@@ -295,7 +295,7 @@ FUNC(cstar_enter)
 pushq %rcx
 pushq $0
 BUILD_BUG_ON(TRAP_syscall & 0xff)
-movb  $TRAP_syscall >> 8, 4+1(%rsp)
+movb  $TRAP_syscall >> 8, EFRAME_entry_vector + 1(%rsp)
 SAVE_ALL
 
 SPEC_CTRL_ENTRY_FROM_PV /* Req: %rsp=regs/cpuinfo, %rdx=0, Clob: acd */
@@ -338,7 +338,7 @@ LABEL(sysenter_eflags_saved, 0)
 pushq $0 /* null rip */
 pushq $0
 BUILD_BUG_ON(TRAP_syscall & 0xff)
-movb  $TRAP_syscall >> 8, 4+1(%rsp)

[PATCH 0/3] x86/entry: More cleanup

2024-02-26 Thread Andrew Cooper
Misc adjustments and cleanup.  No functional change.

Andrew Cooper (3):
  x86/entry: Adjustments to "reduce assembly code size of entry points"
  x86/entry: Simplify expressions in compat_restore_all_guest()
  x86/entry: Introduce EFRAME_* constants

 xen/arch/x86/include/asm/asm_defns.h | 12 +
 xen/arch/x86/x86_64/asm-offsets.c| 17 +++
 xen/arch/x86/x86_64/compat/entry.S   | 12 ++---
 xen/arch/x86/x86_64/entry.S  | 76 +---
 4 files changed, 69 insertions(+), 48 deletions(-)


base-commit: 8de3afc0b402bc17f65093a53e5870862707a8c7
-- 
2.30.2




Re: [PATCH v3 4/4] x86/spec: do not print thunk option selection if not built-in

2024-02-26 Thread Jan Beulich
On 26.02.2024 12:07, Roger Pau Monne wrote:
> Now that the thunk built-in enable is printed as part of the "Compiled-in
> support:" line, avoid printing anything in "Xen settings:" if the thunk is
> disabled at build time.

Why "Now that ..."? It's other logging the earlier patch adds there.

> Note the BTI-Thunk option printing is also adjusted to print a colon in the
> same way the other options on the line do.
> 
> Requested-by: Jan Beulich 
> Signed-off-by: Roger Pau Monné 

With either a clarification of what's meant or e.g. s/Now that/Since/
Reviewed-by: Jan Beulich 

Jan



Re: [PATCH v3 3/4] x86/spec: fix INDIRECT_THUNK option to only be set when build-enabled

2024-02-26 Thread Jan Beulich
On 26.02.2024 12:07, Roger Pau Monne wrote:
> Attempt to provide a more helpful error message when the user attempts to set
> spec-ctrl=bti-thunk option but the support is build-time disabled.
> 
> While there also adjust the command line documentation to mention
> CONFIG_INDIRECT_THUNK instead of INDIRECT_THUNK.
> 
> Reported-by: Andrew Cooper 
> Signed-off-by: Roger Pau Monné 

Reviewed-by: Jan Beulich 
with one minor remark:

> --- a/xen/arch/x86/spec_ctrl.c
> +++ b/xen/arch/x86/spec_ctrl.c
> @@ -241,7 +241,12 @@ static int __init cf_check parse_spec_ctrl(const char *s)
>  {
>  s += 10;
>  
> -if ( !cmdline_strcmp(s, "retpoline") )
> +if ( !IS_ENABLED(CONFIG_INDIRECT_THUNK) )
> +{
> +no_config_param("INDIRECT_THUNK", "spec-ctrl=bti-thunk", s, ss);
> +rc = -EINVAL;
> +}
> +else if ( !cmdline_strcmp(s, "retpoline") )
>  opt_thunk = THUNK_RETPOLINE;
>  else if ( !cmdline_strcmp(s, "lfence") )
>  opt_thunk = THUNK_LFENCE;

How about

if ( !IS_ENABLED(CONFIG_INDIRECT_THUNK) )
{
no_config_param("INDIRECT_THUNK", "spec-ctrl", s - 10, ss);
rc = -EINVAL;
}
else if ( !cmdline_strcmp(s, "retpoline") )

or (likely less liked by you and Andrew) "s += 10;" dropped and then

if ( !IS_ENABLED(CONFIG_INDIRECT_THUNK) )
{
no_config_param("INDIRECT_THUNK", "spec-ctrl", s, ss);
rc = -EINVAL;
}
else if ( !cmdline_strcmp(s += 10, "retpoline") )

conserving a little on string literal space (sadly, despite the function
being __init, string literals remain post-init due to living in .rodata)?

Jan



Re: [PATCH v3 2/4] x86/spec: fix BRANCH_HARDEN option to only be set when build-enabled

2024-02-26 Thread Jan Beulich
On 26.02.2024 12:07, Roger Pau Monne wrote:
> The current logic to handle the BRANCH_HARDEN option will report it as enabled
> even when build-time disabled. Fix this by only allowing the option to be set
> when support for it is built into Xen.
> 
> Fixes: 2d6f36daa086 ('x86/nospec: Introduce CONFIG_SPECULATIVE_HARDEN_BRANCH')
> Signed-off-by: Roger Pau Monné 

Reviewed-by: Jan Beulich 





Re: [PATCH v3 1/4] x86/spec: print the built-in SPECULATIVE_HARDEN_* options

2024-02-26 Thread Jan Beulich
On 26.02.2024 12:07, Roger Pau Monne wrote:
> Just like it's done for INDIRECT_THUNK and SHADOW_PAGING.
> 
> Reported-by: Jan Beulich 
> Signed-off-by: Roger Pau Monné 

In principle
Reviewed-by: Jan Beulich 
but ...

> --- a/xen/arch/x86/spec_ctrl.c
> +++ b/xen/arch/x86/spec_ctrl.c
> @@ -466,13 +466,25 @@ static void __init print_details(enum ind_thunk thunk)
> (e21a & cpufeat_mask(X86_FEATURE_SBPB))   ? " SBPB"   
> : "");
>  
>  /* Compiled-in support which pertains to mitigations. */
> -if ( IS_ENABLED(CONFIG_INDIRECT_THUNK) || IS_ENABLED(CONFIG_SHADOW_PAGING) )
> +if ( IS_ENABLED(CONFIG_INDIRECT_THUNK) || IS_ENABLED(CONFIG_SHADOW_PAGING) ||
> + IS_ENABLED(CONFIG_SPECULATIVE_HARDEN_ARRAY) ||
> + IS_ENABLED(CONFIG_SPECULATIVE_HARDEN_BRANCH) ||
> + IS_ENABLED(CONFIG_SPECULATIVE_HARDEN_GUEST_ACCESS) )
>  printk("  Compiled-in support:"
>  #ifdef CONFIG_INDIRECT_THUNK
> " INDIRECT_THUNK"
>  #endif
>  #ifdef CONFIG_SHADOW_PAGING
> " SHADOW_PAGING"
> +#endif
> +#ifdef CONFIG_SPECULATIVE_HARDEN_ARRAY
> +   " SPECULATIVE_HARDEN_ARRAY"
> +#endif
> +#ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
> +   " SPECULATIVE_HARDEN_BRANCH"
> +#endif
> +#ifdef CONFIG_SPECULATIVE_HARDEN_GUEST_ACCESS
> +   " SPECULATIVE_HARDEN_GUEST_ACCESS"
>  #endif

... I'd like to suggest to drop the SPECULATIVE_ from the string literals.
They're relevant in the Kconfig identifiers, but they're imo redundant in
the context of these log messages. (Happy to adjust while committing, if
need be.)

Jan



Re: [PATCH v6 3/3] xen/livepatch: align functions to ensure minimal distance between entry points

2024-02-26 Thread Jan Beulich
On 26.02.2024 12:32, Roger Pau Monné wrote:
> On Tue, Feb 13, 2024 at 04:58:38PM +0100, Jan Beulich wrote:
>> On 07.02.2024 15:55, Roger Pau Monne wrote:
>>> The minimal function size requirements for an x86 livepatch are either 5 
>>> bytes
>>> (for jmp) or 9 bytes (for endbr + jmp), and always 4 bytes on Arm.  Ensure 
>>> that
>>> distance between functions entry points is always at least of the minimal
>>> required size for livepatch instruction replacement to be successful.
>>>
>>> Add an additional align directive to the linker scripts, in order to ensure 
>>> that
>>> the next section placed after the .text.* (per-function sections) is also
>>> aligned to the required boundary, so that the distance of the last function
>>> entry point with the next symbol is also of minimal size.
>>
>> Perhaps "... minimal required size"?
> 
> Yes.
> 
>>> --- a/xen/common/Kconfig
>>> +++ b/xen/common/Kconfig
>>> @@ -395,8 +395,11 @@ config CRYPTO
>>>  config LIVEPATCH
>>> bool "Live patching support"
>>> default X86
>>> -   depends on "$(XEN_HAS_BUILD_ID)" = "y"
>>> +   depends on "$(XEN_HAS_BUILD_ID)" = "y" && CC_HAS_FUNCTION_ALIGNMENT
>>> select CC_SPLIT_SECTIONS
>>> +   select FUNCTION_ALIGNMENT_16B if XEN_IBT
>>> +   select FUNCTION_ALIGNMENT_8B  if X86
>>> +   select FUNCTION_ALIGNMENT_4B  if ARM
>>
>> This isn't strictly needed, is it? Would be nice to avoid re-selection
>> of what the default for an arch is anyway, as otherwise this will start
>> looking clumsy when a couple more architectures are added.
> 
> My worry was that the default per-arch could change, ie: for example
> x86 moving from 16 to 8 and then it would hamper livepatch support if
> IBT is also enabled.  I however think it's very unlikely to reduce the
> default alignment, and in any case we would hit a build time assert if
> that ever happens.
> 
> So yes, I'm fine with dropping those.

Oh, no - not "those", only "that", i.e. only the last (Arm) one.

Jan



[xen-unstable-smoke test] 184764: tolerable all pass - PUSHED

2024-02-26 Thread osstest service owner
flight 184764 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/184764/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  8de3afc0b402bc17f65093a53e5870862707a8c7
baseline version:
 xen  92babc88f67ed0ef3dc575a8b9534040274678ee

Last test of basis   184745  2024-02-23 23:03:51 Z2 days
Testing same since   184764  2024-02-26 10:02:04 Z0 days1 attempts


People who touched revisions under test:
  Daniel P. Smith 
  Jan Beulich 
  Julien Grall 
  Nicola Vetrini 
  Oleksii Kurochko 
  Roger Pau Monné 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   92babc88f6..8de3afc0b4  8de3afc0b402bc17f65093a53e5870862707a8c7 -> smoke



Re: [PATCH v6 1/3] xen: introduce Kconfig function alignment option

2024-02-26 Thread Jan Beulich
On 26.02.2024 12:33, Roger Pau Monné wrote:
> On Tue, Feb 13, 2024 at 04:51:13PM +0100, Jan Beulich wrote:
>> On 07.02.2024 15:55, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/Kconfig
>>> +++ b/xen/arch/x86/Kconfig
>>> @@ -29,6 +29,7 @@ config X86
>>> select HAS_UBSAN
>>> select HAS_VPCI if HVM
>>> select NEEDS_LIBELF
>>> +   select FUNCTION_ALIGNMENT_16B
>>
>> With the insertion here as well as for Arm and PPC obeying alphabetic
>> sorting:
>> Reviewed-by: Jan Beulich 
> 
> Would you like me to resend with that adjusted?

I guess it can be taken care of while committing; I've taken note of
this. Sadly there is still at least one missing ack.

Jan



Re: [PATCH v6 1/3] xen: introduce Kconfig function alignment option

2024-02-26 Thread Roger Pau Monné
On Tue, Feb 13, 2024 at 04:51:13PM +0100, Jan Beulich wrote:
> On 07.02.2024 15:55, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/Kconfig
> > +++ b/xen/arch/x86/Kconfig
> > @@ -29,6 +29,7 @@ config X86
> > select HAS_UBSAN
> > select HAS_VPCI if HVM
> > select NEEDS_LIBELF
> > +   select FUNCTION_ALIGNMENT_16B
> 
> With the insertion here as well as for Arm and PPC obeying alphabetic
> sorting:
> Reviewed-by: Jan Beulich 

Would you like me to resend with that adjusted?

Thanks, Roger.



Re: [PATCH v6 3/3] xen/livepatch: align functions to ensure minimal distance between entry points

2024-02-26 Thread Roger Pau Monné
On Tue, Feb 13, 2024 at 04:58:38PM +0100, Jan Beulich wrote:
> On 07.02.2024 15:55, Roger Pau Monne wrote:
> > The minimal function size requirements for an x86 livepatch are either 5 
> > bytes
> > (for jmp) or 9 bytes (for endbr + jmp), and always 4 bytes on Arm.  Ensure 
> > that
> > distance between functions entry points is always at least of the minimal
> > required size for livepatch instruction replacement to be successful.
> > 
> > Add an additional align directive to the linker scripts, in order to ensure 
> > that
> > the next section placed after the .text.* (per-function sections) is also
> > aligned to the required boundary, so that the distance of the last function
> > entry point with the next symbol is also of minimal size.
> 
> Perhaps "... minimal required size"?

Yes.

> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -395,8 +395,11 @@ config CRYPTO
> >  config LIVEPATCH
> > bool "Live patching support"
> > default X86
> > -   depends on "$(XEN_HAS_BUILD_ID)" = "y"
> > +   depends on "$(XEN_HAS_BUILD_ID)" = "y" && CC_HAS_FUNCTION_ALIGNMENT
> > select CC_SPLIT_SECTIONS
> > +   select FUNCTION_ALIGNMENT_16B if XEN_IBT
> > +   select FUNCTION_ALIGNMENT_8B  if X86
> > +   select FUNCTION_ALIGNMENT_4B  if ARM
> 
> This isn't strictly needed, is it? Would be nice to avoid re-selection
> of what the default for an arch is anyway, as otherwise this will start
> looking clumsy when a couple more architectures are added.

My worry was that the default per-arch could change, ie: for example
x86 moving from 16 to 8 and then it would hamper livepatch support if
IBT is also enabled.  I however think it's very unlikely to reduce the
default alignment, and in any case we would hit a build time assert if
that ever happens.

So yes, I'm fine with dropping those.

> Preferably
> with that dropped (or it being clarified why it's still desirable to
> have):
> Reviewed-by: Jan Beulich 

Thanks, Roger.



Re: [PATCH v4 12/30] xen/riscv: introduce cmpxchg.h

2024-02-26 Thread Jan Beulich
On 26.02.2024 12:18, Oleksii wrote:
> On Mon, 2024-02-26 at 10:45 +0100, Jan Beulich wrote:
>> On 23.02.2024 13:23, Oleksii wrote:

>>> As the 1- and 2-byte cases are emulated, I decided not to provide the
>>> sfx argument for the emulation macros, as it would not have much
>>> effect on the emulated types and would just cost performance on the
>>> acquire and release versions of the sc/ld instructions.
>>
>> Question is whether the common case (4- and 8-byte accesses)
>> shouldn't
>> be valued higher, with 1- and 2-byte emulation being there
>> just
>> to
>> allow things to not break altogether.
> If I understand you correctly, it would make sense to add the 'sfx'
> argument for the 1/2-byte access case, ensuring that all options are
> available for the 1/2-byte access case as well.

 That's one of the possibilities. As said, I'm not overly worried about
 the emulated cases. For the initial implementation I'd recommend going
 with what is easiest there, yielding the best possible result for the
 4- and 8-byte cases. If later it turns out repeated acquire/release
 accesses are a problem in the emulation loop, things can be changed to
 explicit barriers, without touching the 4- and 8-byte cases.
>>> Then I am a little bit confused, if the emulated case is not an issue.
>>>
>>> For the 4- and 8-byte cases of xchg, .aqrl is used; for the relaxed
>>> and acquire versions of xchg, barriers are used.
>>>
>>> The same is done for cmpxchg.
>>>
>>> If something needs to be changed in the emulation loop, it won't
>>> require changing the 4- and 8-byte cases.
>>
>> I'm afraid I don't understand your reply.
> IIUC, the emulated cases are implemented correctly in terms of barrier
> usage. And it is also OK not to use sfx for the lr/sc instructions and
> to use only barriers.
> 
> For the 4- and 8-byte cases, sfx + barrier are used depending on the
> specific case ( relaxed, acquire, release, generic xchg/cmpxchg ),
> which also looks correct to me. But you suggested providing the best
> possible result for the 4- and 8-byte cases.
> 
> So I don't understand what the best possible result is, as the current
> usage of __{cmp}xchg_generic for each specific case ( relaxed, acquire,
> release, generic xchg/cmpxchg ) looks correct to me:
> xchg -> (..., ".aqrl", "", "") just the .aqrl suffix, without barriers.
> xchg_release -> (..., "", RISCV_RELEASE_BARRIER, "" ) only the release
> barrier.
> xchg_acquire -> (..., "", "", RISCV_ACQUIRE_BARRIER ) only the acquire
> barrier.
> xchg_relaxed -> (..., "", "", "") no barriers, no sfx.

So first: While explicit barriers are technically okay, I don't follow why
you insist on using them when you can achieve the same by suitably tagging
the actual insn doing the exchange. Then second: It's somewhat hard for me
to see the final effect on the emulation paths without you actually having
done the switch. Maybe no special handling is necessary there anymore
then. And as said, it may actually be acceptable for the emulation paths
to "only" be correct, but not be ideal in terms of performance. After all,
if you use the normal 4-byte primitive in there, more (non-explicit)
barriers than needed would occur if the involved loop has to take more
than one iteration. Which could (but imo doesn't need to be) avoided by
using a more relaxed 4-byte primitive there and an explicit barrier
outside of the loop.

Jan


