[xen-unstable test] 179018: tolerable trouble: fail/pass/starved - PUSHED

2023-03-02 Thread osstest service owner
flight 179018 xen-unstable real [real]
flight 179059 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/179018/
http://logs.test-lab.xenproject.org/osstest/logs/179059/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-migrupgrade 10 xen-install/src_host fail pass in 179059-retest
 test-amd64-amd64-xl-qcow2 22 guest-start.2 fail pass in 179059-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop fail like 178965
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 178965
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop fail like 178965
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 178965
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 178965
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 178965
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop fail like 178965
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 178965
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop fail like 178965
 test-amd64-i386-xl-pvshim 14 guest-start fail never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt-xsm 15 migrate-support-check fail never pass
 test-amd64-i386-libvirt 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-xsm 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit1 15 migrate-support-check fail never pass
 test-arm64-arm64-xl 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-credit1 16 saverestore-support-check fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check fail never pass
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail never pass
 test-arm64-arm64-xl-vhd 14 migrate-support-check fail never pass
 test-arm64-arm64-xl-vhd 15 saverestore-support-check fail never pass
 test-amd64-amd64-libvirt 15 migrate-support-check fail never pass
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  380a8c0c65bfb84dab54ab4641cca1387cc41edb
baseline version:
 xen  b84fdf521b306cce64388fe57ee6c7c00f9d3e76

Last test of basis   178965  2023-03-02 09:32:57 Z    0 days
Testing same since   179018  2023-03-02 21:09:15 Z    0 days    1 attempts


People who touched revisions under test:
  Oleksii Kurochko 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64 

Re: [PATCH 1/2] automation: arm64: Create a test job for testing static heap on qemu

2023-03-02 Thread Michal Orzel
Hi Jiamei,

On 03/03/2023 07:49, Jiamei Xie wrote:
> 
> 
> Hi Stefano,
> 
>> -Original Message-
>> From: Stefano Stabellini 
>> Sent: Friday, March 3, 2023 9:51 AM
>> To: Jiamei Xie 
>> Cc: xen-devel@lists.xenproject.org; Wei Chen ;
>> sstabell...@kernel.org; Bertrand Marquis ;
>> Doug Goldstein 
>> Subject: Re: [PATCH 1/2] automation: arm64: Create a test job for testing
>> static heap on qemu
>>
>> On Thu, 2 Mar 2023, jiamei.xie wrote:
>>> From: Jiamei Xie 
>>>
>>> Create a new test job, called qemu-smoke-dom0less-arm64-gcc-staticheap.
>>>
>>> Add property "xen,static-heap" under /chosen node to enable static-heap.
>>> If the domU can start successfully with static-heap enabled, then this
>>> test pass.
>>>
>>> Signed-off-by: Jiamei Xie 
>>
>> Hi Jiamei, thanks for the patch!
>>
>>
>>> ---
>>>  automation/gitlab-ci/test.yaml | 16 
>>>  .../scripts/qemu-smoke-dom0less-arm64.sh   | 18
>> ++
>>>  2 files changed, 34 insertions(+)
>>>
>>> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
>>> index 1c5f400b68..5a9b88477a 100644
>>> --- a/automation/gitlab-ci/test.yaml
>>> +++ b/automation/gitlab-ci/test.yaml
>>> @@ -133,6 +133,22 @@ qemu-smoke-dom0less-arm64-gcc-debug-
>> staticmem:
>>>  - *arm64-test-needs
>>>  - alpine-3.12-gcc-debug-arm64-staticmem
>>>
>>> +qemu-smoke-dom0less-arm64-gcc-staticheap:
>>> + extends: .qemu-arm64
>>> + script:
>>> +   - ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-heap
>> 2>&1 | tee ${LOGFILE}
>>> + needs:
>>> +   - *arm64-test-needs
>>> +   - alpine-3.12-gcc-arm64
>>> +
>>> +qemu-smoke-dom0less-arm64-gcc-debug-staticheap:
>>> + extends: .qemu-arm64
>>> + script:
>>> +   - ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-heap
>> 2>&1 | tee ${LOGFILE}
>>> + needs:
>>> +   - *arm64-test-needs
>>> +   - alpine-3.12-gcc-debug-arm64
>>> +
>>>  qemu-smoke-dom0less-arm64-gcc-boot-cpupools:
>>>extends: .qemu-arm64
>>>script:
>>> diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh
>> b/automation/scripts/qemu-smoke-dom0less-arm64.sh
>>> index 182a4b6c18..4e73857199 100755
>>> --- a/automation/scripts/qemu-smoke-dom0less-arm64.sh
>>> +++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh
>>> @@ -27,6 +27,11 @@ fi
>>>  "
>>>  fi
>>>
>>> +if [[ "${test_variant}" == "static-heap" ]]; then
>>> +passed="${test_variant} test passed"
>>> +domU_check="echo \"${passed}\""
>>> +fi
>>> +
>>>  if [[ "${test_variant}" == "boot-cpupools" ]]; then
>>>  # Check if domU0 (id=1) is assigned to Pool-1 with null scheduler
>>>  passed="${test_variant} test passed"
>>> @@ -128,6 +133,19 @@ if [[ "${test_variant}" == "static-mem" ]]; then
>>>  echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >>
>> binaries/config
>>>  fi
>>>
>>> +if [[ "${test_variant}" == "static-heap" ]]; then
>>> +# ImageBuilder uses the config file to create the uboot script. 
>>> Devicetree
>>> +# will be set via the generated uboot script.
>>> +# The valid memory range is 0x4000 to 0x8000 as defined
>> before.
>>> +# ImageBuillder sets the kernel and ramdisk range based on the file 
>>> size.
>>> +# It will use the memory range between 0x4560 to 0x47AED1E8, so
>> set
>>> +# memory range between 0x5000 and 0x8000 as static heap.
>>
>> I think this is OK. One suggestion to make things more reliable would be
>> to change MEMORY_END to be 0x5000 so that you can be sure that
>> ImageBuilder won't go over the limit. You could do it just for this
>> test, which would be safer, but to be honest you could limit MEMORY_END
>> to 0x5000 for all tests in qemu-smoke-dom0less-arm64.sh because it
>> shouldn't really cause any problems.
>>
> [Jiamei Xie]
> Thanks for your comments. I am a little confused about " to change MEMORY_END 
> to be 0x5000".
>  I set 0STATIC_HEAP="0x5000 0x3000" where is the start address. Why 
> change MEMORY_END
>  to be 0x5000?
Let me answer that question so that you do not need to wait another day for
Stefano.
ImageBuilder uses MEMORY_START and MEMORY_END from the cfg file as the range in
which it can instruct u-boot to place the images. It is safer to set MEMORY_END
to 0x5000 rather than to 0xC000, because you can then be sure that no image
will be placed in the region that you defined as static heap.
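
As a rough illustration of the overlap concern (a sketch only: the helper
names are hypothetical and this is not ImageBuilder code), two half-open
ranges [start, end) overlap exactly when each one starts before the other
ends. Capping MEMORY_END at the static heap base makes the placement window
and the heap disjoint by construction:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if [a_start, a_end) and [b_start, b_end) share an address. */
bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                    uint64_t b_start, uint64_t b_end)
{
    assert(a_start <= a_end && b_start <= b_end);
    return a_start < b_end && b_start < a_end;
}

/*
 * Hypothetical check: the window in which images may be placed
 * ([mem_start, mem_end)) must not intersect the static heap region
 * ([heap_base, heap_base + heap_size)).
 */
bool placement_is_safe(uint64_t mem_start, uint64_t mem_end,
                       uint64_t heap_base, uint64_t heap_size)
{
    return !ranges_overlap(mem_start, mem_end, heap_base, heap_base + heap_size);
}
```

With mem_end equal to heap_base the check passes for any image size, which
is the point of lowering MEMORY_END.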

~Michal




Re: [XEN PATCH v7 13/20] xen/arm: ffa: support mapping guest RX/TX buffers

2023-03-02 Thread Jens Wiklander
Hi Bertrand,

On Thu, Mar 2, 2023 at 4:05 PM Bertrand Marquis
 wrote:
>
> Hi Jens,
>
> > On 22 Feb 2023, at 16:33, Jens Wiklander  wrote:
> >
> > Adds support in the mediator to map and unmap the RX and TX buffers
> > provided by the guest using the two FF-A functions FFA_RXTX_MAP and
> > FFA_RXTX_UNMAP.
> >
> > These buffers are later used to transmit data that cannot be passed in
> > registers only.
> >
> > Signed-off-by: Jens Wiklander 
> > ---
> > xen/arch/arm/tee/ffa.c | 127 +
> > 1 file changed, 127 insertions(+)
> >
> > diff --git a/xen/arch/arm/tee/ffa.c b/xen/arch/arm/tee/ffa.c
> > index f1b014b6c7f4..953b6dfd5eca 100644
> > --- a/xen/arch/arm/tee/ffa.c
> > +++ b/xen/arch/arm/tee/ffa.c
> > @@ -149,10 +149,17 @@ struct ffa_partition_info_1_1 {
> > };
> >
> > struct ffa_ctx {
> > +void *rx;
> > +const void *tx;
> > +struct page_info *rx_pg;
> > +struct page_info *tx_pg;
> > +unsigned int page_count;
> > uint32_t guest_vers;
> > +bool tx_is_mine;
> > bool interrupted;
> > };
> >
> > +
> Newline probably added by mistake.

Yes, I'll remove it.

>
> > /* Negotiated FF-A version to use with the SPMC */
> > static uint32_t ffa_version __ro_after_init;
> >
> > @@ -337,6 +344,11 @@ static void set_regs(struct cpu_user_regs *regs, 
> > register_t v0, register_t v1,
> > set_user_reg(regs, 7, v7);
> > }
> >
> > +static void set_regs_error(struct cpu_user_regs *regs, uint32_t error_code)
> > +{
> > +set_regs(regs, FFA_ERROR, 0, error_code, 0, 0, 0, 0, 0);
> > +}
> > +
> > static void set_regs_success(struct cpu_user_regs *regs, uint32_t w2,
> >  uint32_t w3)
> > {
> > @@ -358,6 +370,99 @@ static void handle_version(struct cpu_user_regs *regs)
> > set_regs(regs, vers, 0, 0, 0, 0, 0, 0, 0);
> > }
> >
> > +static uint32_t handle_rxtx_map(uint32_t fid, register_t tx_addr,
> > +register_t rx_addr, uint32_t page_count)
> > +{
> > +uint32_t ret = FFA_RET_INVALID_PARAMETERS;
> > +struct domain *d = current->domain;
> > +struct ffa_ctx *ctx = d->arch.tee;
> > +struct page_info *tx_pg;
> > +struct page_info *rx_pg;
> > +p2m_type_t t;
> > +void *rx;
> > +void *tx;
> > +
> > +if ( !smccc_is_conv_64(fid) )
> > +{
> > +tx_addr &= UINT32_MAX;
> > +rx_addr &= UINT32_MAX;
> > +}
>
> I am wondering a bit what we should do here:
> - we could just say that the 32-bit version of the call is not allowed for
>   non-32-bit guests
> - we could check that the highest bits are 0 for 64-bit guests and return an
>   error if not
> - we can just mask, hoping that if there was a mistake the address after the
>   mask does not exist in the guest space
>
> In the end, nothing in the spec prevents a 64-bit guest from using the 32-bit
> version, so it might be a good idea to return an error if the highest 32 bits
> are not 0 here?

The SMC Calling Convention says:
When an SMC32/HVC32 call is made from AArch64:
- A Function Identifier is passed in register W0.
- Arguments are passed in registers W1-W7.

So masking off the higher bits is all that should be done.
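
A stand-alone sketch of that rule, for illustration only (the sketch_ names
are hypothetical stand-ins, not the Xen helpers; bit 30 of the function
identifier selects the 64-bit convention per SMCCC):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 30 of an SMCCC function identifier selects the SMC64/HVC64 convention. */
#define SKETCH_SMCCC_CONV_64 (UINT32_C(1) << 30)

/* Hypothetical stand-in for smccc_is_conv_64(). */
bool sketch_is_conv_64(uint32_t fid)
{
    return (fid & SKETCH_SMCCC_CONV_64) != 0;
}

/*
 * For an SMC32/HVC32 call only W1-W7 carry arguments, so the upper
 * 32 bits of the X register are not part of the argument and are dropped.
 */
uint64_t sketch_arg(uint32_t fid, uint64_t reg)
{
    if ( !sketch_is_conv_64(fid) )
        reg &= UINT32_MAX;

    return reg;
}
```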

>
> > +
> > +/* For now to keep things simple, only deal with a single page */
> > +if ( page_count != 1 )
> > +return FFA_RET_NOT_SUPPORTED;
>
> Please add a TODO here and a print as this is a limitation we will probably 
> have to
> work on soon.

I'll add an arbitrary upper limit and a print if it's exceeded.

>
>
> > +
> > +/* Already mapped */
> > +if ( ctx->rx )
> > +return FFA_RET_DENIED;
> > +
> > +tx_pg = get_page_from_gfn(d, gfn_x(gaddr_to_gfn(tx_addr)), &t,
> > P2M_ALLOC);
> > +if ( !tx_pg )
> > +return FFA_RET_INVALID_PARAMETERS;
> > +/* Only normal RAM for now */
> > +if ( !p2m_is_ram(t) )
> > +goto err_put_tx_pg;
> > +
> > +rx_pg = get_page_from_gfn(d, gfn_x(gaddr_to_gfn(rx_addr)), &t,
> > P2M_ALLOC);
> > +if ( !tx_pg )
> > +goto err_put_tx_pg;
> > +/* Only normal RAM for now */
> > +if ( !p2m_is_ram(t) )
> > +goto err_put_rx_pg;
> > +
> > +tx = __map_domain_page_global(tx_pg);
> > +if ( !tx )
> > +goto err_put_rx_pg;
> > +
> > +rx = __map_domain_page_global(rx_pg);
> > +if ( !rx )
> > +goto err_unmap_tx;
> > +
> > +ctx->rx = rx;
> > +ctx->tx = tx;
> > +ctx->rx_pg = rx_pg;
> > +ctx->tx_pg = tx_pg;
> > +ctx->page_count = 1;
>
> please use page_count here instead of 1 so that this is not forgotten once
> we add support for more pages.

OK

Cheers,
Jens

>
>
> Cheers
> Bertrand
>
> > +ctx->tx_is_mine = true;
> > +return FFA_RET_OK;
> > +
> > +err_unmap_tx:
> > +unmap_domain_page_global(tx);
> > +err_put_rx_pg:
> > +put_page(rx_pg);
> > +err_put_tx_pg:
> > +put_page(tx_pg);
> > +
> > +return ret;
> > +}
> > +
> > +static void rxtx_unmap(struct ffa_ctx *ctx)
> > +{
> > +unmap_domain_page_global(ctx->rx);
> > +

Re: [PATCH v2 4/6] docs/about/deprecated: Deprecate the qemu-system-arm binary

2023-03-02 Thread Thomas Huth

On 02/03/2023 23.16, Philippe Mathieu-Daudé wrote:

On 2/3/23 17:31, Thomas Huth wrote:

qemu-system-aarch64 is a proper superset of qemu-system-arm,
and the latter was mainly still required for 32-bit KVM support.
But this 32-bit KVM arm support was dropped from the Linux
kernel a couple of years ago already, so we don't really need
qemu-system-arm anymore, thus deprecate it now.

Signed-off-by: Thomas Huth 
---
  docs/about/deprecated.rst | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index a30aa8dfdf..21ce70b5c9 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -45,6 +45,16 @@ run 32-bit guests by selecting a 32-bit CPU model, 
including KVM support

  on x86_64 hosts. Thus users are recommended to reconfigure their systems
  to use the ``qemu-system-x86_64`` binary instead.
+``qemu-system-arm`` binary (since 8.0)
+''''''''''''''''''''''''''''''''''''''
+
+``qemu-system-aarch64`` is a proper superset of ``qemu-system-arm``. The
+latter was mainly a requirement for running KVM on 32-bit arm hosts, but
+this 32-bit KVM support has been removed some years ago already (see:


s/some/few/?


I can also use "three years ago" since the patch had been merged in March 2020.

+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=541ad0150ca4 


+). Thus the QEMU project will drop the ``qemu-system-arm`` binary in a
+future release. Use ``qemu-system-aarch64`` instead.


If we unify, wouldn't it be simpler to name the single qemu-system
binary emulating various ARM architectures as 'qemu-system-arm'?


That would be more intuitive for people who are completely new to QEMU, but 
I guess it will cause a lot of "you broke my script that uses the -aarch64 
binary" troubles again. So I think it's likely better to not go down that road.


 Thomas




[PATCH] x86/altp2m: help gcc13 to avoid it emitting a warning

2023-03-02 Thread Jan Beulich
Switches of altp2m-s always expect a valid altp2m to be in place (and
indeed altp2m_vcpu_initialise() sets the active one to be at index 0).
The compiler, however, cannot know that, and hence it cannot eliminate
p2m_get_altp2m()'s case of returning (literal) NULL. If then the compiler
decides to special case that code path in the caller, the dereference in
instances of

atomic_dec(&p2m_get_altp2m(v)->active_vcpus);

can, to the code generator, appear to be NULL dereferences, leading to

In function 'atomic_dec',
inlined from '...' at ...:
./arch/x86/include/asm/atomic.h:182:5: error: array subscript 0 is outside 
array bounds of 'int[0]' [-Werror=array-bounds=]

Aid the compiler by adding a BUG_ON() checking the return value of the
problematic p2m_get_altp2m(). Since with the use of the local variable
the 2nd p2m_get_altp2m() each will look questionable at the first glance
(Why is the local variable not used here?), open-code the only relevant
piece of p2m_get_altp2m() there.

To avoid repeatedly doing these transformations, and also to limit how
"bad" the open-coding really is, convert the entire operation to an
inline helper, used by all three instances (and accepting the redundant
BUG_ON(idx >= MAX_ALTP2M) in two of the three cases).

Reported-by: Charles Arnold 
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -4128,13 +4128,7 @@ void vmx_vmexit_handler(struct cpu_user_
 }
 }
 
-if ( idx != vcpu_altp2m(v).p2midx )
-{
-BUG_ON(idx >= MAX_ALTP2M);
-atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
-vcpu_altp2m(v).p2midx = idx;
-atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
-}
+p2m_set_altp2m(v, idx);
 }
 
 if ( unlikely(currd->arch.monitor.vmexit_enabled) )
--- a/xen/arch/x86/include/asm/p2m.h
+++ b/xen/arch/x86/include/asm/p2m.h
@@ -879,6 +879,26 @@ static inline struct p2m_domain *p2m_get
 return v->domain->arch.altp2m_p2m[index];
 }
 
+/* set current alternate p2m table */
+static inline bool p2m_set_altp2m(struct vcpu *v, unsigned int idx)
+{
+struct p2m_domain *orig;
+
+BUG_ON(idx >= MAX_ALTP2M);
+
+if ( idx == vcpu_altp2m(v).p2midx )
+return false;
+
+orig = p2m_get_altp2m(v);
+BUG_ON(!orig);
+atomic_dec(&orig->active_vcpus);
+
+vcpu_altp2m(v).p2midx = idx;
+atomic_inc(&v->domain->arch.altp2m_p2m[idx]->active_vcpus);
+
+return true;
+}
+
 /* Switch alternate p2m for a single vcpu */
 bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, unsigned int idx);
 
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1789,13 +1789,8 @@ bool_t p2m_switch_vcpu_altp2m_by_id(stru
 
 if ( d->arch.altp2m_eptp[idx] != mfn_x(INVALID_MFN) )
 {
-if ( idx != vcpu_altp2m(v).p2midx )
-{
-atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
-vcpu_altp2m(v).p2midx = idx;
-atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+if ( p2m_set_altp2m(v, idx) )
 altp2m_vcpu_update_p2m(v);
-}
 rc = 1;
 }
 
@@ -2072,13 +2067,8 @@ int p2m_switch_domain_altp2m_by_id(struc
 if ( d->arch.altp2m_visible_eptp[idx] != mfn_x(INVALID_MFN) )
 {
 for_each_vcpu( d, v )
-if ( idx != vcpu_altp2m(v).p2midx )
-{
-atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
-vcpu_altp2m(v).p2midx = idx;
-atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+if ( p2m_set_altp2m(v, idx) )
 altp2m_vcpu_update_p2m(v);
-}
 
 rc = 0;
 }
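
For readers unfamiliar with the pattern, a stand-alone sketch of the helper's
shape in plain C follows. It is an illustration under assumptions, not the Xen
code: every sketch_ name is hypothetical, assert() stands in for BUG_ON(), and
plain increments stand in for the atomic operations. Unlike BUG_ON(), assert()
compiles away under NDEBUG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_TABLES 4

struct sketch_table {
    int active_users;
};

struct sketch_ctx {
    struct sketch_table *tables[SKETCH_MAX_TABLES];
    unsigned int cur;            /* index of the table currently in use */
};

/* As far as the compiler can tell, this may return NULL. */
struct sketch_table *sketch_get(struct sketch_ctx *c)
{
    return c->cur < SKETCH_MAX_TABLES ? c->tables[c->cur] : NULL;
}

/* Switch to table idx; returns true if a switch actually happened. */
bool sketch_set(struct sketch_ctx *c, unsigned int idx)
{
    struct sketch_table *orig;

    assert(idx < SKETCH_MAX_TABLES);

    if ( idx == c->cur )
        return false;

    orig = sketch_get(c);
    assert(orig);                /* rules out the NULL path before the dereference */
    orig->active_users--;

    c->cur = idx;
    c->tables[idx]->active_users++;

    return true;
}
```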



[PATCH RFC] bunzip: work around gcc13 warning

2023-03-02 Thread Jan Beulich
While provable that length[0] is always initialized (because symCount
cannot be zero), upcoming gcc13 fails to recognize this and warns about
the unconditional use of the value immediately following the loop.

See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106511.

Reported-by: Martin Liška 
Signed-off-by: Jan Beulich 
---
RFC: We've cloned this code from Linux and the code is unchanged there.
 Therefore the same issue should exist there, and we may better get
 whatever workaround is going to be applied there. But I'm unaware
 of the issue, so far, having been observed in and reported against
 Linux. This may be because they disable the maybe-uninitialized
 warning by default, and they re-enable it only when building with
 W=2.

--- a/xen/common/bunzip2.c
+++ b/xen/common/bunzip2.c
@@ -233,7 +233,7 @@ static int __init get_next_block(struct
   becomes negative, so an unsigned inequality catches
   it.) */
t = get_bits(bd, 5)-1;
-   for (i = 0; i < symCount; i++) {
+   for (length[0] = i = 0; i < symCount; i++) {
for (;;) {
if (((unsigned)t) > (MAX_HUFCODE_BITS-1))
return RETVAL_DATA_ERROR;
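
A minimal stand-alone sketch of the same workaround, for illustration only
(the function and macro names are hypothetical, not taken from bunzip2.c):
the first array element is given a value in the loop's init clause, so the
read after the loop is defined even on the path where the loop body never
runs.

```c
#include <assert.h>

#define SKETCH_MAX_SYMS 8

/*
 * Fill length[0..count-1] and return length[0]. Callers are expected to
 * pass count >= 1, but the compiler may not be able to prove that.
 */
int sketch_fill(int length[SKETCH_MAX_SYMS], int count, int start)
{
    int i;

    assert(count <= SKETCH_MAX_SYMS);

    /* Initialise length[0] together with i, as the patch does. */
    for (length[0] = i = 0; i < count; i++)
        length[i] = start + i;

    return length[0];
}
```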



Re: [PATCH 1/2] x86/cpuid: Infrastructure for leaves 7:1{ecx,edx}

2023-03-02 Thread Jan Beulich
On 04.01.2023 12:11, Andrew Cooper wrote:
> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -288,6 +288,9 @@ XEN_CPUFEATURE(NSCB,   11*32+ 6) /*A  Null 
> Selector Clears Base (and
>  /* Intel-defined CPU features, CPUID level 0x0007:1.ebx, word 12 */
>  XEN_CPUFEATURE(INTEL_PPIN, 12*32+ 0) /*   Protected Processor 
> Inventory Number */
>  
> +/* Intel-defined CPU features, CPUID level 0x0007:1.ecx, word 14 */
> +/* Intel-defined CPU features, CPUID level 0x0007:1.edx, word 15 */

While committing the backports of this (where I normally test-build
every commit individually) I came to notice that this introduces a
transient (until the next commit) build breakage: FEATURESET_NR_ENTRIES
is calculated from the highest entry found; the comments here don't
matter at all. Therefore ...

> @@ -343,6 +352,8 @@ static inline void cpuid_policy_to_featureset(
>  fs[FEATURESET_e21a] = p->extd.e21a;
>  fs[FEATURESET_7b1] = p->feat._7b1;
>  fs[FEATURESET_7d2] = p->feat._7d2;
> +fs[FEATURESET_7c1] = p->feat._7c1;
> +fs[FEATURESET_7d1] = p->feat._7d1;
>  }
>  
>  /* Fill in a CPUID policy from a featureset bitmap. */
> @@ -363,6 +374,8 @@ static inline void cpuid_featureset_to_policy(
>  p->extd.e21a  = fs[FEATURESET_e21a];
>  p->feat._7b1  = fs[FEATURESET_7b1];
>  p->feat._7d2  = fs[FEATURESET_7d2];
> +p->feat._7c1  = fs[FEATURESET_7c1];
> +p->feat._7d1  = fs[FEATURESET_7d1];
>  }

... the compiler legitimately complains about out-of-bounds array
accesses here. This is just fyi for the future (to arrange patch
splitting differently); I've left the backports as they were.
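
To make the failure mode concrete, a small sketch (hypothetical names and a
made-up feature list, not the real generated header): when the array length
is derived from the highest feature actually listed, words that are only
mentioned in comments do not extend it, so indexing them is out of bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature list: (word * 32 + bit) for each known feature. */
static const unsigned int sketch_features[] = {
    0 * 32 + 3,
    11 * 32 + 6,
    12 * 32 + 0,   /* highest populated word is 12 */
};

/* Number of 32-bit words needed to hold the highest feature listed. */
unsigned int sketch_nr_entries(void)
{
    unsigned int max_word = 0;
    size_t i;

    for ( i = 0; i < sizeof(sketch_features) / sizeof(sketch_features[0]); i++ )
        if ( sketch_features[i] / 32 > max_word )
            max_word = sketch_features[i] / 32;

    return max_word + 1;
}

/* A word index is usable only if it is below the derived array length. */
int sketch_word_in_bounds(unsigned int word)
{
    return word < sketch_nr_entries();
}
```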

Jan



Re: [XEN PATCH v7 07/20] xen/arm: ffa: add defines for framework direct request/response messages

2023-03-02 Thread Jens Wiklander
Hi Bertrand,

On Fri, Feb 24, 2023 at 10:39 AM Bertrand Marquis
 wrote:
>
> Hi Jens,
>
> > On 22 Feb 2023, at 16:33, Jens Wiklander  wrote:
> >
> > Adds defines for framework direct request/response messages.
> >
> > Signed-off-by: Jens Wiklander 
> > ---
> > xen/arch/arm/tee/ffa.c | 9 +
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/xen/arch/arm/tee/ffa.c b/xen/arch/arm/tee/ffa.c
> > index f4562ed2defc..d04bac9cc47f 100644
> > --- a/xen/arch/arm/tee/ffa.c
> > +++ b/xen/arch/arm/tee/ffa.c
> > @@ -56,6 +56,15 @@
> > #define FFA_MY_VERSION  MAKE_FFA_VERSION(FFA_MY_VERSION_MAJOR, \
> >  FFA_MY_VERSION_MINOR)
> >
> > +/* Framework direct request/response */
>
> In the previous patch you were more verbose in the comment which was nice.
> I would suggest here to use the same "format":
>
> Flags used for the MSG_SEND_DIRECT_REQ/RESP:
> BIT(31): Framework or partition message
> BIT(7-0): Message type for frameworks messages

OK, I'll update.

>
> > +#define FFA_MSG_FLAG_FRAMEWORK  BIT(31, U)
> > +#define FFA_MSG_TYPE_MASK   0xFFU;
>
> Maybe more coherent to name this FFA_MSG_FLAG_TYPE_MASK ?

This is a balancing act, in this case, I don't think that adding FLAG_
helps much.

>
> I am a bit unsure here because we could also keep it like that and just
> add _TYPE to other definitions after.
>
> What do you think ?

I think the defines are long enough as they are.

Cheers,
Jens

>
> > +#define FFA_MSG_PSCI                0x0U
> > +#define FFA_MSG_SEND_VM_CREATED 0x4U
> > +#define FFA_MSG_RESP_VM_CREATED 0x5U
> > +#define FFA_MSG_SEND_VM_DESTROYED   0x6U
> > +#define FFA_MSG_RESP_VM_DESTROYED   0x7U
> > +
> > /*
> >  * Flags used for the FFA_PARTITION_INFO_GET return message:
> >  * BIT(0): Supports receipt of direct requests
> > --
> > 2.34.1
> >
>
> Cheers
> Bertrand
>



RE: [ImageBuilder][PATCH 1/2] uboot-script-gen: Add support for static heap

2023-03-02 Thread Jiamei Xie
Hi Stefano and Michal,

> -Original Message-
> From: Stefano Stabellini 
> Sent: Friday, March 3, 2023 7:42 AM
> To: Michal Orzel 
> Cc: Jiamei Xie ; xen-devel@lists.xenproject.org; Wei
> Chen ; sstabell...@kernel.org
> Subject: Re: [ImageBuilder][PATCH 1/2] uboot-script-gen: Add support for
> static heap
> 
> On Thu, 2 Mar 2023, Michal Orzel wrote:
> > Hi Jiamei,
> >
> > Patch looks good apart from minor comments down below.
> 
> Just wanted to add that the patch looks OK to me too, and I don't have any
> further comments beyond the ones Michal has already made.
> 
> 
> > On 02/03/2023 05:46, jiamei.xie wrote:
> > >
> > >
> > > From: jiamei Xie 
> > >
> > > Add a new config parameter to configure static heap.
> > > STATIC_HEAP="baseaddr1 size1 ... baseaddrN sizeN"
> > > if specified, indicates the host physical address regions
> > > [baseaddr, baseaddr + size) to be reserved as static heap.
> > >
> > > For instance, STATIC_HEAP="0x5000 0x3000", if specified,
> > > indicates the host memory region starting from paddr 0x5000
> > > with a size of 0x3000 to be reserved as static heap.
> > >
> > > Signed-off-by: jiamei Xie 
> > > ---
> > >  README.md|  4 
> > >  scripts/uboot-script-gen | 20 
> > >  2 files changed, 24 insertions(+)
> > >
> > > diff --git a/README.md b/README.md
> > > index 814a004..787f413 100644
> > > --- a/README.md
> > > +++ b/README.md
> > > @@ -256,6 +256,10 @@ Where:
> > >
> > >  - NUM_CPUPOOLS specifies the number of boot-time cpupools to create.
> > >
> > > +- STATIC_HEAP="baseaddr1 size1 ... baseaddrN sizeN"
> > > +  if specified, indicates the host physical address regions
> > > +  [baseaddr, baseaddr + size) to be reserved as static heap.
> > As this option impacts Xen and not domUs, please call it XEN_STATIC_HEAP
> and move
> > it right after XEN_CMD documentation.
Thanks for your comments. Ack.
> >
> > > +
> > >  Then you can invoke uboot-script-gen as follows:
> > >
> > >  ```
> > > diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
> > > index f07e334..4775293 100755
> > > --- a/scripts/uboot-script-gen
> > > +++ b/scripts/uboot-script-gen
> > > @@ -189,6 +189,21 @@ function add_device_tree_static_mem()
> > >  dt_set "$path" "xen,static-mem" "hex" "${cells[*]}"
> > >  }
> > >
> > > +function add_device_tree_static_heap()
> > > +{
> > > +local path=$1
> > > +local regions=$2
> > > +local cells=()
> > > +local val
> > > +
> > > +for val in ${regions[@]}
> > > +do
> > > +cells+=("$(printf "0x%x 0x%x" $(($val >> 32)) $(($val & ((1 << 
> > > 32) -
> 1")
> > Please use split_value function instead of opencoding it.
> > It will then become:
> > cells+=("$(split_value $val)")

Thanks for your comments. Ack.
> >
> > ~Michal
> >



RE: [PATCH 1/2] automation: arm64: Create a test job for testing static heap on qemu

2023-03-02 Thread Jiamei Xie
Hi Stefano,

> -Original Message-
> From: Stefano Stabellini 
> Sent: Friday, March 3, 2023 9:51 AM
> To: Jiamei Xie 
> Cc: xen-devel@lists.xenproject.org; Wei Chen ;
> sstabell...@kernel.org; Bertrand Marquis ;
> Doug Goldstein 
> Subject: Re: [PATCH 1/2] automation: arm64: Create a test job for testing
> static heap on qemu
> 
> On Thu, 2 Mar 2023, jiamei.xie wrote:
> > From: Jiamei Xie 
> >
> > Create a new test job, called qemu-smoke-dom0less-arm64-gcc-staticheap.
> >
> > Add property "xen,static-heap" under /chosen node to enable static-heap.
> > If the domU can start successfully with static-heap enabled, then this
> > test pass.
> >
> > Signed-off-by: Jiamei Xie 
> 
> Hi Jiamei, thanks for the patch!
> 
> 
> > ---
> >  automation/gitlab-ci/test.yaml | 16 
> >  .../scripts/qemu-smoke-dom0less-arm64.sh   | 18
> ++
> >  2 files changed, 34 insertions(+)
> >
> > diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> > index 1c5f400b68..5a9b88477a 100644
> > --- a/automation/gitlab-ci/test.yaml
> > +++ b/automation/gitlab-ci/test.yaml
> > @@ -133,6 +133,22 @@ qemu-smoke-dom0less-arm64-gcc-debug-
> staticmem:
> >  - *arm64-test-needs
> >  - alpine-3.12-gcc-debug-arm64-staticmem
> >
> > +qemu-smoke-dom0less-arm64-gcc-staticheap:
> > + extends: .qemu-arm64
> > + script:
> > +   - ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-heap
> 2>&1 | tee ${LOGFILE}
> > + needs:
> > +   - *arm64-test-needs
> > +   - alpine-3.12-gcc-arm64
> > +
> > +qemu-smoke-dom0less-arm64-gcc-debug-staticheap:
> > + extends: .qemu-arm64
> > + script:
> > +   - ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-heap
> 2>&1 | tee ${LOGFILE}
> > + needs:
> > +   - *arm64-test-needs
> > +   - alpine-3.12-gcc-debug-arm64
> > +
> >  qemu-smoke-dom0less-arm64-gcc-boot-cpupools:
> >extends: .qemu-arm64
> >script:
> > diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh
> b/automation/scripts/qemu-smoke-dom0less-arm64.sh
> > index 182a4b6c18..4e73857199 100755
> > --- a/automation/scripts/qemu-smoke-dom0less-arm64.sh
> > +++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh
> > @@ -27,6 +27,11 @@ fi
> >  "
> >  fi
> >
> > +if [[ "${test_variant}" == "static-heap" ]]; then
> > +passed="${test_variant} test passed"
> > +domU_check="echo \"${passed}\""
> > +fi
> > +
> >  if [[ "${test_variant}" == "boot-cpupools" ]]; then
> >  # Check if domU0 (id=1) is assigned to Pool-1 with null scheduler
> >  passed="${test_variant} test passed"
> > @@ -128,6 +133,19 @@ if [[ "${test_variant}" == "static-mem" ]]; then
> >  echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >>
> binaries/config
> >  fi
> >
> > +if [[ "${test_variant}" == "static-heap" ]]; then
> > +# ImageBuilder uses the config file to create the uboot script. 
> > Devicetree
> > +# will be set via the generated uboot script.
> > +# The valid memory range is 0x4000 to 0x8000 as defined
> before.
> > +# ImageBuillder sets the kernel and ramdisk range based on the file 
> > size.
> > +# It will use the memory range between 0x4560 to 0x47AED1E8, so
> set
> > +# memory range between 0x5000 and 0x8000 as static heap.
> 
> I think this is OK. One suggestion to make things more reliable would be
> to change MEMORY_END to be 0x5000 so that you can be sure that
> ImageBuilder won't go over the limit. You could do it just for this
> test, which would be safer, but to be honest you could limit MEMORY_END
> to 0x5000 for all tests in qemu-smoke-dom0less-arm64.sh because it
> shouldn't really cause any problems.
> 
[Jiamei Xie] 
Thanks for your comments. I am a little confused about "to change MEMORY_END
to be 0x5000".
I set STATIC_HEAP="0x5000 0x3000", where 0x5000 is the start address. Why
change MEMORY_END to be 0x5000?


Best wishes
Jiamei Xie


> 
> > +echo  '
> > +STATIC_HEAP="0x5000 0x3000"
> > +# The size of static heap should be greater than the guest memory
> > +DOMU_MEM[0]="128"' >> binaries/config
> > +fi
> > +
> >  if [[ "${test_variant}" == "boot-cpupools" ]]; then
> >  echo '
> >  CPUPOOL[0]="cpu@1 null"
> > --
> > 2.25.1
> >



[linux-linus test] 178998: regressions - trouble: fail/pass/starved

2023-03-02 Thread osstest service owner
flight 178998 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/178998/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-freebsd12-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-qemuu-nested-intel  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-ws16-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-amd  14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl-pvshim  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-vhd  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-intel  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-win7-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-win7-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-xsm  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-credit1  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-shadow  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-amd  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-qemuu-nested-amd  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-ws16-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-examine-uefi  8 reboot  fail REGR. vs. 178042
 test-amd64-amd64-freebsd11-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-raw  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-pygrub  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-pair  12 xen-boot/src_host  fail REGR. vs. 178042
 test-amd64-amd64-pair  13 xen-boot/dst_host  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-qcow2  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-xsm  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-pair  12 xen-boot/src_host  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-pair  13 xen-boot/dst_host  fail REGR. vs. 178042
 test-amd64-coresched-amd64-xl  8 xen-boot  fail REGR. vs. 178042
 test-arm64-arm64-xl-credit1  14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl  14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl-credit2  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-libvirt  8 xen-boot  fail REGR. vs. 178042
 test-arm64-arm64-xl-thunderx  14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-xsm  18 guest-start/debian.repeat  fail REGR. vs. 178042
 test-amd64-amd64-examine-bios  8 reboot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-ovmf-amd64  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-examine  8 reboot  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-xl-multivcpu  8 xen-boot  fail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-intel  14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-vhd  12 debian-di-install  fail REGR. vs. 178042
 test-arm64-arm64-libvirt-raw  12 debian-di-install  fail REGR. vs. 178042
 test-arm64-arm64-libvirt-xsm  17 guest-stop  fail in 178910 REGR. vs. 178042
 test-arm64-arm64-xl-credit2  18 guest-start/debian.repeat  fail in 178910 REGR. vs. 178042

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-xl-xsm  14 guest-start  fail in 178910 pass in 178998
 test-arm64-arm64-xl-credit2  14 guest-start  fail pass in 178910
 test-arm64-arm64-libvirt-xsm  14 guest-start  fail pass in 178951

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  8 xen-boot fail REGR. vs. 178042

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-credit2  15 migrate-support-check  fail in 178910 never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check  fail in 178910 never pass
 test-arm64-arm64-libvirt-xsm  15 migrate-support-check  fail in 178910 never pass
 

Re: [RFC XEN PATCH 0/7] automation, RFC prototype, Have GitLab CI built its own containers

2023-03-02 Thread Stefano Stabellini
On Thu, 2 Mar 2023, Anthony PERARD wrote:
> Patch series available in this git branch:
> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
> br.gitlab-containers-auto-rebuild-v1
> 
> Hi,
> 
> I have done some research to be able to build containers in the CI. This works
> only for x86 containers as I've setup only a single x86 gitlab-runner to be
> able to run docker-in-docker.
> 
> The runner is setup to only accept jobs from a branch that is "protected" in
> gitlab. Also, one need credential to push to the container register, those are
> called "Group deploy tokens", and I've set the variables CI_DEPLOY_USER and
> CI_DEPLOY_PASSWORD in the project "xen-project/xen" (variables only visible on
> a pipeline running on a protected branch).
> 
> These patch introduce quite a lot of redundancies in jobs, 2 new jobs per
> containers which build/push containers, and duplicate most of build.yaml.
> This mean that if we go with this, we have to duplicate and keep in sync many
> jobs.
> 
> To avoid having to do the duplicated jobs by hand, I could look at
> creating a script that use "build.yaml" as input and produce the 3
> stages needed to update a container, but that script would need to be
> run by hand, as gitlab can't really use it, unless ..
> 
> I've look at generated pipeline, and they look like this in gitlab:
> https://gitlab.com/xen-project/people/anthonyper/xen/-/pipelines/777665738
> But the result of the generated/child pipeline doesn't seems to be taken into
> account in the original pipeline, which make me think that we can't use them
> to generate "build.yaml". But maybe they could be used for generating the
> pipeline that will update a container.
> Doc:
> 
> https://docs.gitlab.com/ee/ci/pipelines/downstream_pipelines.html#dynamic-child-pipelines
> 
> So, with all of this, is it reasonable to test containers before
> pushing them to production? Or is it to much work? We could simply have jobs
> tacking care of rebuilding a container and pushing them to production without
> testing.

I don't think it is a good idea to duplicate build.yaml, also because
some of the containers are used in the testing stage too, so an updated
container could be OK during the build phase and break the testing
phase. We would need to duplicate both build.yaml and test.yaml, which
is not feasible.

In practice today people either:
1) re-build a container locally & test it locally before pushing
2) re-build a container locally, docker push it, then run a private
   gitlab pipeline, if it passes send out a patch to xen-devel

1) is not affected by this series
2) is also not affected because by the time the pipeline is created, the
container is already updated

However, there are cases where it would definitely be nice to have a
"button" to press to update a container. For instance, when a pipeline
fails due to a Debian unstable apt-get failure, which can be easily fixed
by updating the Debian unstable container.

So I think it would be nice to have jobs that can automatically update
the build containers but I would set them to manually trigger instead of
automatically (when: manual).


Alternatively, we could move the "containers.yaml" stage to be the first
step: rebuild the containers and push them to a "staging" area
(registry.gitlab.com/xen-project/xen/staging), then run the build and test
steps fetching from the staging area instead of the regular one. If all
tests pass, push the containers to registry.gitlab.com/xen-project/xen
as the last step.
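
To make the last step of that flow concrete, here is a rough sketch of what
"promoting" a container from staging to production could look like. It is only
an illustration: promote_container and the DOCKER variable are made-up names,
the image name is an example, and by default the script is a dry run that just
prints the docker commands it would execute.

```bash
#!/bin/bash
# Dry run by default: DOCKER only prints the commands.
# Set DOCKER=docker to really run them (assumes a prior "docker login").
DOCKER=${DOCKER:-echo docker}

STAGING=registry.gitlab.com/xen-project/xen/staging
PRODUCTION=registry.gitlab.com/xen-project/xen

# promote_container IMAGE
# Hypothetical helper: re-tag an already tested staging image and push it
# to the production registry.
promote_container()
{
    local image=$1   # for example "debian:unstable"

    $DOCKER pull "${STAGING}/${image}" &&
    $DOCKER tag "${STAGING}/${image}" "${PRODUCTION}/${image}" &&
    $DOCKER push "${PRODUCTION}/${image}"
}

promote_container debian:unstable
```

In a pipeline this would only run after the build and test stages that used
the staging image have passed.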


> An example with the variable DO_REBUILD_CONTAINER and PUSH_CONTAINER set (and
> existing build/test jobs disabled):
> https://gitlab.com/xen-project/people/anthonyper/xen/-/pipelines/791711467
> 
> Cheers,
> 
> Anthony PERARD (7):
>   automation: Automatically rebuild debian:unstable container
>   automation: Introduce test-containers stage
>   automation: Add a template per container for build jobs.
>   automation: Adding containers build jobs and test of thoses
>   automation: Introduce DO_REBUILD_CONTAINER, to allow to rebuild a
> container
>   automation: Push container been tested
>   automation: Add some more push containers jobs
> 
>  .gitlab-ci.yml|   6 +
>  automation/build/Makefile |  14 +-
>  automation/gitlab-ci/build.yaml   | 327 --
>  automation/gitlab-ci/containers.yaml  |  98 +
>  automation/gitlab-ci/push-containers.yaml |  79 
>  automation/gitlab-ci/test-containers.yaml | 496 ++
>  6 files changed, 894 insertions(+), 126 deletions(-)
>  create mode 100644 automation/gitlab-ci/containers.yaml
>  create mode 100644 automation/gitlab-ci/push-containers.yaml
>  create mode 100644 automation/gitlab-ci/test-containers.yaml
> 
> -- 
> Anthony PERARD
> 



Re: [PATCH 2/2] automation: arm64: Create a test job for testing static shared memory on qemu

2023-03-02 Thread Stefano Stabellini
On Thu, 2 Mar 2023, jiamei.xie wrote:
> Create a new test job, called qemu-smoke-dom0less-arm64-gcc-static-shared-mem.
> 
> Adjust qemu-smoke-dom0less-arm64.sh script to accomodate the static
> shared memory test as a new test variant. The test variant is determined
> based on the first argument passed to the script. For testing static
> shared memory, the argument is 'static-shared-mem'.
> 
> The test configures two dom0less DOMUs with a static shared memory
> region and adds a check in the init script.
> 
> The check consists in comparing the contents of the 
> /proc/device-tree/reserved-memory
> xen-shmem entry with the static shared memory range and id with which
> DOMUs were configured. If the memory layout is correct, a message gets
> printed by DOMU.
> 
> At the end of the qemu run, the script searches for the specific message
> in the logs and fails if not found.
> 
> Signed-off-by: jiamei.xie 
> ---
>  automation/gitlab-ci/build.yaml   | 18 
>  automation/gitlab-ci/test.yaml| 16 ++
>  .../scripts/qemu-smoke-dom0less-arm64.sh  | 29 +++
>  3 files changed, 63 insertions(+)
> 
> diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
> index 38bb22d860..820cc0af83 100644
> --- a/automation/gitlab-ci/build.yaml
> +++ b/automation/gitlab-ci/build.yaml
> @@ -623,6 +623,24 @@ alpine-3.12-gcc-debug-arm64-staticmem:
>CONFIG_UNSUPPORTED=y
>CONFIG_STATIC_MEMORY=y
>  
> +alpine-3.12-gcc-arm64-static-shared-mem:
> +  extends: .gcc-arm64-build
> +  variables:
> +CONTAINER: alpine:3.12-arm64v8
> +EXTRA_XEN_CONFIG: |
> +  CONFIG_UNSUPPORTED=y
> +  CONFIG_STATIC_MEMORY=y
> +  CONFIG_STATIC_SHM=y
> +
> +alpine-3.12-gcc-debug-arm64-static-shared-mem:
> +  extends: .gcc-arm64-build-debug
> +  variables:
> +CONTAINER: alpine:3.12-arm64v8
> +EXTRA_XEN_CONFIG: |
> +  CONFIG_UNSUPPORTED=y
> +  CONFIG_STATIC_MEMORY=y
> +  CONFIG_STATIC_SHM=y
> +
>  alpine-3.12-gcc-arm64-boot-cpupools:
>extends: .gcc-arm64-build
>variables:
> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> index 5a9b88477a..f4d36babda 100644
> --- a/automation/gitlab-ci/test.yaml
> +++ b/automation/gitlab-ci/test.yaml
> @@ -149,6 +149,22 @@ qemu-smoke-dom0less-arm64-gcc-debug-staticheap:
> - *arm64-test-needs
> - alpine-3.12-gcc-debug-arm64
>  
> +qemu-smoke-dom0less-arm64-gcc-static-shared-mem:
> +  extends: .qemu-arm64
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-shared-mem 2>&1 | tee ${LOGFILE}
> +  needs:
> +- *arm64-test-needs
> +- alpine-3.12-gcc-arm64-static-shared-mem
> +
> +qemu-smoke-dom0less-arm64-gcc-debug-static-shared-mem:
> +  extends: .qemu-arm64
> +  script:
> +- ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-shared-mem 2>&1 | tee ${LOGFILE}
> +  needs:
> +- *arm64-test-needs
> +- alpine-3.12-gcc-debug-arm64-static-shared-mem
> +
>  qemu-smoke-dom0less-arm64-gcc-boot-cpupools:
>extends: .qemu-arm64
>script:
> diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh b/automation/scripts/qemu-smoke-dom0less-arm64.sh
> index 4e73857199..fe3a282726 100755
> --- a/automation/scripts/qemu-smoke-dom0less-arm64.sh
> +++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh
> @@ -32,6 +32,25 @@ if [[ "${test_variant}" == "static-heap" ]]; then
>  domU_check="echo \"${passed}\""
>  fi
>  
> +
> +if [[ "${test_variant}" == "static-shared-mem" ]]; then
> +passed="${test_variant} test passed"
> +SHARED_MEM_HOST="5000"
> +SHARED_MEM_GUEST="400"
> +SHARED_MEM_SIZE="1000"
> +SHARED_MEM_ID="my-shared-mem-0"
> +
> +domU_check="
> +current_id=\$(cat /proc/device-tree/reserved-memory/xen-shmem@400/xen,id 2>/dev/null)
> +expected_id=\"\$(echo ${SHARED_MEM_ID})\"
> +current_reg=\$(hexdump -e '16/1 \"%02x\"' /proc/device-tree/reserved-memory/xen-shmem@400/reg 2>/dev/null)
> +expected_reg=$(printf \"%016x%016x\" 0x${SHARED_MEM_GUEST} 0x${SHARED_MEM_SIZE})
> +if [[ \"\${expected_reg}\" == \"\${current_reg}\" && \"\${current_id}\" == \"\${expected_id}\" ]]; then
> +echo \"${passed}\"
> +fi
> +"
> +fi

all good so far
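
For readers following along, a small sketch of how the expected_reg string in
the check above is formed. It assumes the reg property holds two 64-bit cells
(guest address, then size), and it uses the values exactly as they are written
in the patch; nothing here is taken from the real test logs.

```bash
#!/bin/bash
# Values as written in the patch above.
SHARED_MEM_GUEST="400"
SHARED_MEM_SIZE="1000"

# Each cell is printed as 16 hex digits, so the result is 32 characters
# long: the address first, then the size. This is the same layout that
# hexdump -e '16/1 "%02x"' produces for a 16-byte reg property.
expected_reg=$(printf "%016x%016x" 0x${SHARED_MEM_GUEST} 0x${SHARED_MEM_SIZE})

echo "${expected_reg}"
```

The domU then only has to compare this string with the hexdump of the reg
property, as the patch does.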


>  if [[ "${test_variant}" == "boot-cpupools" ]]; then
>  # Check if domU0 (id=1) is assigned to Pool-1 with null scheduler
>  passed="${test_variant} test passed"
> @@ -133,6 +152,16 @@ if [[ "${test_variant}" == "static-mem" ]]; then
>  echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >> binaries/config
>  fi
>  
> +if [[ "${test_variant}" == "static-shared-mem" ]]; then
> +echo "NUM_DOMUS=2
> +DOMU_SHARED_MEM[0]=\""0x${SHARED_MEM_HOST} 0x${SHARED_MEM_GUEST} 
> 0x${SHARED_MEM_SIZE}"\"
> +DOMU_SHARED_MEM_ID[0]="${SHARED_MEM_ID}"
> +DOMU_KERNEL[1]=\"Image\"
> +DOMU_RAMDISK[1]=\"initrd\"

Please move the second domU creation above to the general ImageBuilder
script. It is fine to start 2 

Re: [PATCH 1/2] automation: arm64: Create a test job for testing static heap on qemu

2023-03-02 Thread Stefano Stabellini
On Thu, 2 Mar 2023, jiamei.xie wrote:
> From: Jiamei Xie 
> 
> Create a new test job, called qemu-smoke-dom0less-arm64-gcc-staticheap.
> 
> Add property "xen,static-heap" under /chosen node to enable static-heap.
> If the domU can start successfully with static-heap enabled, then this
> test pass.
> 
> Signed-off-by: Jiamei Xie 

Hi Jiamei, thanks for the patch!


> ---
>  automation/gitlab-ci/test.yaml | 16 
>  .../scripts/qemu-smoke-dom0less-arm64.sh   | 18 ++
>  2 files changed, 34 insertions(+)
> 
> diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> index 1c5f400b68..5a9b88477a 100644
> --- a/automation/gitlab-ci/test.yaml
> +++ b/automation/gitlab-ci/test.yaml
> @@ -133,6 +133,22 @@ qemu-smoke-dom0less-arm64-gcc-debug-staticmem:
>  - *arm64-test-needs
>  - alpine-3.12-gcc-debug-arm64-staticmem
>  
> +qemu-smoke-dom0less-arm64-gcc-staticheap:
> + extends: .qemu-arm64
> + script:
> +   - ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-heap 2>&1 | tee ${LOGFILE}
> + needs:
> +   - *arm64-test-needs
> +   - alpine-3.12-gcc-arm64
> +
> +qemu-smoke-dom0less-arm64-gcc-debug-staticheap:
> + extends: .qemu-arm64
> + script:
> +   - ./automation/scripts/qemu-smoke-dom0less-arm64.sh static-heap 2>&1 | tee ${LOGFILE}
> + needs:
> +   - *arm64-test-needs
> +   - alpine-3.12-gcc-debug-arm64
> +
>  qemu-smoke-dom0less-arm64-gcc-boot-cpupools:
>extends: .qemu-arm64
>script:
> diff --git a/automation/scripts/qemu-smoke-dom0less-arm64.sh b/automation/scripts/qemu-smoke-dom0less-arm64.sh
> index 182a4b6c18..4e73857199 100755
> --- a/automation/scripts/qemu-smoke-dom0less-arm64.sh
> +++ b/automation/scripts/qemu-smoke-dom0less-arm64.sh
> @@ -27,6 +27,11 @@ fi
>  "
>  fi
>  
> +if [[ "${test_variant}" == "static-heap" ]]; then
> +passed="${test_variant} test passed"
> +domU_check="echo \"${passed}\""
> +fi
> +
>  if [[ "${test_variant}" == "boot-cpupools" ]]; then
>  # Check if domU0 (id=1) is assigned to Pool-1 with null scheduler
>  passed="${test_variant} test passed"
> @@ -128,6 +133,19 @@ if [[ "${test_variant}" == "static-mem" ]]; then
>  echo -e "\nDOMU_STATIC_MEM[0]=\"${domu_base} ${domu_size}\"" >> binaries/config
>  fi
>  
> +if [[ "${test_variant}" == "static-heap" ]]; then
> +# ImageBuilder uses the config file to create the uboot script. Devicetree
> +# will be set via the generated uboot script.
> +# The valid memory range is 0x4000 to 0x8000 as defined before.
> +# ImageBuillder sets the kernel and ramdisk range based on the file size.
> +# It will use the memory range between 0x4560 to 0x47AED1E8, so set
> +# memory range between 0x5000 and 0x8000 as static heap.

I think this is OK. One suggestion to make things more reliable would be
to change MEMORY_END to be 0x5000 so that you can be sure that
ImageBuilder won't go over the limit. You could do it just for this
test, which would be safer, but to be honest you could limit MEMORY_END
to 0x5000 for all tests in qemu-smoke-dom0less-arm64.sh because it
shouldn't really cause any problems.
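
To spell out the reasoning: the point is that the range ImageBuilder may load
binaries into, [MEMORY_START, MEMORY_END), should not intersect the static
heap, [base, base + size). A minimal sketch of that check follows.
ranges_overlap is a hypothetical helper written for this mail, not something
ImageBuilder provides, and the addresses are the ones written in the patch
comments.

```bash
#!/bin/bash
# ranges_overlap A_START A_END B_BASE B_SIZE
# Succeeds if [A_START, A_END) and [B_BASE, B_BASE + B_SIZE) intersect.
# Hypothetical helper, for illustration only.
ranges_overlap()
{
    local a_start=$(($1))
    local a_end=$(($2))
    local b_start=$(($3))
    local b_end=$(($3 + $4))

    [ "$a_start" -lt "$b_end" ] && [ "$b_start" -lt "$a_end" ]
}

# Values as written in the patch comments.
STATIC_HEAP_BASE=0x5000
STATIC_HEAP_SIZE=0x3000

for memory_end in 0x8000 0x5000
do
    if ranges_overlap 0x4000 "$memory_end" "$STATIC_HEAP_BASE" "$STATIC_HEAP_SIZE"
    then
        echo "MEMORY_END=$memory_end: loaded binaries could land in the static heap"
    else
        echo "MEMORY_END=$memory_end: loaded binaries cannot land in the static heap"
    fi
done
```

With MEMORY_END lowered to the static heap base, the two ranges are disjoint by
construction, whatever the size of the kernel and ramdisk.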


> +echo  '
> +STATIC_HEAP="0x5000 0x3000"
> +# The size of static heap should be greater than the guest memory
> +DOMU_MEM[0]="128"' >> binaries/config
> +fi
> +
>  if [[ "${test_variant}" == "boot-cpupools" ]]; then
>  echo '
>  CPUPOOL[0]="cpu@1 null"
> -- 
> 2.25.1
> 



RE: [PATCH 3/7] hv: simplify sysctl registration

2023-03-02 Thread Michael Kelley (LINUX)
From: Luis Chamberlain  On Behalf Of Luis Chamberlain 
Sent: Thursday, March 2, 2023 12:46 PM
>
> register_sysctl_table() is a deprecated compatibility wrapper.
> register_sysctl() can do the directory creation for you so just use
> that.
> 
> Signed-off-by: Luis Chamberlain 
> ---
>  drivers/hv/vmbus_drv.c | 11 +--
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index d24dd65b33d4..229353f1e9c2 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1460,15 +1460,6 @@ static struct ctl_table hv_ctl_table[] = {
>   {}
>  };
> 
> -static struct ctl_table hv_root_table[] = {
> - {
> - .procname   = "kernel",
> - .mode   = 0555,
> - .child  = hv_ctl_table
> - },
> - {}
> -};
> -
>  /*
>   * vmbus_bus_init -Main vmbus driver initialization routine.
>   *
> @@ -1547,7 +1538,7 @@ static int vmbus_bus_init(void)
>* message recording won't be available in isolated
>* guests should the following registration fail.
>*/
> - hv_ctl_table_hdr = register_sysctl_table(hv_root_table);
> + hv_ctl_table_hdr = register_sysctl("kernel", hv_ctl_table);
>   if (!hv_ctl_table_hdr)
>   pr_err("Hyper-V: sysctl table register error");
> 
> --
> 2.39.1

Reviewed-by: Michael Kelley 



Re: [PATCH v2 1/2] xen/cppcheck: add a way to exclude files from the scan

2023-03-02 Thread Stefano Stabellini
On Thu, 2 Mar 2023, Luca Fancellu wrote:
> >> +Exclude file list for xen-analysis script
> >> +=
> >> +
> >> +The code analysis is performed on the Xen codebase for both MISRA 
> >> checkers and
> >> +static analysis checkers, there are some files however that needs to be 
> >> removed
> >> +from the findings report because they are not owned by Xen and they must 
> >> be kept
> >> +in sync with their origin (completely or even partially), hence we can't 
> >> easily
> >> +fix findings or deviate from them.
> > 
> > I would stay more generic and say something like:
> > 
> > The code analysis is performed on the Xen codebase for both MISRA
> > checkers and static analysis checkers, there are some files however that
> > needs to be removed from the findings report for various reasons (e.g.
> > they are imported from external sources, they generate too many false
> > positive results, etc.).
> > 
> > But what you wrote is also OK.
> 
> I’m ok with that too, I can update with your wordings
> >> 
> >> +++ b/xen/scripts/xen_analysis/exclusion_file_list.py
> >> @@ -0,0 +1,79 @@
> >> +#!/usr/bin/env python3
> >> +
> >> +import os, glob, json
> >> +from . import settings
> >> +
> >> +class ExclusionFileListError(Exception):
> >> +pass
> >> +
> >> +
> >> +def __cppcheck_path_exclude_syntax(path):
> >> +# Prepending * to the relative path to match every path where the Xen
> >> +# codebase could be
> >> +path = "*" + path
> >> +
> >> +# Check if the path is to a folder without the wildcard at the end
> >> +if not (path.endswith(".c") or path.endswith(".h") or 
> >> path.endswith("*")):
> > 
> > Isn't there a python call to check that it is actually a folder? I think
> > that would be more resilient because otherwise if someone passes a .S or
> > .cpp file it would be detected as directory.
> > 
> > 
> >> +# The path is to a folder, if it doesn't have the final /, add it
> >> +if not path.endswith("/"):
> >> +path = path + "/"
> >> +# Since the path is a folder, add a wildcard to the end so that
> >> +# cppcheck will remove every issue related with this path
> >> +path = path + "*"
> >> +
> >> +return path
> 
> Yes you are very right, here I wanted to accept the relative path to a folder
> with or without the ending '/*', but it carries on much more complexity
> because here the relative path can contain wildcards in it, so we can't use
> os.path.isdir() which would fail.
> 
> At cost of being more strict on how folders shall be declared, I think it's
> better to enforce the '/*' at the end of a path that is excluding a folder.
> 
> We have a previous check using glob() to ensure path with wildcards are real
> path so we are safe that the passed relative path are OK.
> 
> Dropping the requirement of passing folder paths with or without the '/*'
> simplifies the code and this would be the final result:
> 
> 
> diff --git a/docs/misra/exclude-list.rst b/docs/misra/exclude-list.rst
> index 969539c46beb..c97431a86120 100644
> --- a/docs/misra/exclude-list.rst
> +++ b/docs/misra/exclude-list.rst
> @@ -3,11 +3,11 @@
>  Exclude file list for xen-analysis script
>  =
>  
> -The code analysis is performed on the Xen codebase for both MISRA checkers 
> and
> -static analysis checkers, there are some files however that needs to be 
> removed
> -from the findings report because they are not owned by Xen and they must be 
> kept
> -in sync with their origin (completely or even partially), hence we can't 
> easily
> -fix findings or deviate from them.
> +The code analysis is performed on the Xen codebase for both MISRA
> +checkers and static analysis checkers, there are some files however that
> +needs to be removed from the findings report for various reasons (e.g.
> +they are imported from external sources, they generate too many false
> +positive results, etc.).
>  
>  For this reason the file docs/misra/exclude-list.json is used to exclude 
> every
>  entry listed in that file from the final report.
> @@ -42,3 +42,5 @@ Here is an explanation of the fields inside an object of 
> the "content" array:
>  
>  To ease the review and the modifications of the entries, they shall be 
> listed in
>  alphabetical order referring to the rel_path field.
> +Excluded folder paths shall end with '/*' in order to match everything on 
> that
> +folder.
> diff --git a/xen/scripts/xen_analysis/exclusion_file_list.py 
> b/xen/scripts/xen_analysis/exclusion_file_list.py
> index 4a47a90f5944..871e480586bb 100644
> --- a/xen/scripts/xen_analysis/exclusion_file_list.py
> +++ b/xen/scripts/xen_analysis/exclusion_file_list.py
> @@ -12,15 +12,6 @@ def __cppcheck_path_exclude_syntax(path):
>  # codebase could be
>  path = "*" + path
>  
> -# Check if the path is to a folder without the wildcard at the end
> -if not (path.endswith(".c") or path.endswith(".h") or 
> 

Re: [ImageBuilder][PATCH 2/2] uboot-script-gen: Add support for static shared memory

2023-03-02 Thread Stefano Stabellini
On Thu, 2 Mar 2023, jiamei.xie wrote:
> Introduce support for creating shared-mem node for dom0less domUs in
> the device tree. Add the following options:
> - DOMU_SHARED_MEM[number]="HPA GPA size"
>   if specified, indicates the host physical address HPA will get mapped
>   at guest address GPA in domU and the memory of size will be reserved
>   to be shared memory.
> - DOMU_SHARED_MEM_ID[number]
>   An arbitrary string that represents the unique identifier of the shared
>   memory region, with a strict limit on the number of characters(\0
>   included)
> 
> The static shared memory is used between two dom0less domUs.
> 
> Below is an example:
> NUM_DOMUS=2
> DOMU_SHARED_MEM[0]="0x5000 0x600 0x1000"
> DOMU_SHARED_MEM_ID[0]="my-shared-mem-0"
> DOMU_SHARED_MEM[1]="0x5000 0x600 0x1000"
> DOMU_SHARED_MEM_ID[1]="my-shared-mem-0"

Rather than two separate properties, do you think it would make sense to
just use one as follows?

NUM_DOMUS=2
DOMU_SHARED_MEM[0]="my-shared-mem-0 0x5000 0x600 0x1000"
DOMU_SHARED_MEM[1]="my-shared-mem-0 0x5000 0x600 0x1000"

The good thing about bash is that it doesn't care if they are numbers or
strings :-)
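
Something along these lines is what I have in mind for the parsing side. This
is only a sketch under the assumption that the ID comes first and contains no
spaces; parse_shared_mem is a made-up name and not existing ImageBuilder code.

```bash
#!/bin/bash
# parse_shared_mem "ID HPA GPA SIZE"
# Hypothetical helper: split a combined entry into its four fields.
parse_shared_mem()
{
    local entry=$1
    local id hpa gpa size

    # read splits on whitespace: the first word is the ID, the other
    # three are the host address, guest address and size.
    read -r id hpa gpa size <<EOF
$entry
EOF

    if [ -z "$id" ] || [ -z "$hpa" ] || [ -z "$gpa" ] || [ -z "$size" ]
    then
        echo "malformed shared memory entry: $entry" >&2
        return 1
    fi

    echo "id=$id hpa=$hpa gpa=$gpa size=$size"
}

parse_shared_mem "my-shared-mem-0 0x5000 0x600 0x1000"
```

The ID can then be passed to dt_set as a string and the remaining three words
converted to cells as before.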


> This static shared memory region is identified as "my-shared-mem-0", host
> physical address starting at 0x5000 of 256MB will be reserved to be
> shared between two domUs. It will get mapped at 0x600 in both guest
> physical address space. Both DomUs are the borrower domain, the owner
> domain is the default owner domain DOMID_IO.
> 
> Signed-off-by: jiamei.xie 
> ---
>  README.md| 18 ++
>  scripts/uboot-script-gen | 26 ++
>  2 files changed, 44 insertions(+)
> 
> diff --git a/README.md b/README.md
> index 787f413..48044ee 100644
> --- a/README.md
> +++ b/README.md
> @@ -192,6 +192,24 @@ Where:
>if specified, indicates the host physical address regions
>[baseaddr, baseaddr + size) to be reserved to the VM for static allocation.
>  
> +- DOMU_SHARED_MEM[number]="HPA GPA size" and DOMU_SHARED_MEM_ID[number]
> +  if specified, indicate the host physical address HPA will get mapped at
> +  guest address GPA in domU and the memory of size will be reserved to be
> +  shared memory. The shared memory is used between two dom0less domUs.
> +
> +  Below is an example:
> +  NUM_DOMUS=2
> +  DOMU_SHARED_MEM[0]="0x5000 0x600 0x1000"
> +  DOMU_SHARED_MEM_ID[0]="my-shared-mem-0"
> +  DOMU_SHARED_MEM[1]="0x5000 0x600 0x1000"
> +  DOMU_SHARED_MEM_ID[1]="my-shared-mem-0"
> +
> +  This static shared memory region is identified as "my-shared-mem-0", host
> +  physical address starting at 0x5000 of 256MB will be reserved to be
> +  shared between two domUs. It will get mapped at 0x600 in both guest
> +  physical address space. Both DomUs are the borrower domain, the owner
> +  domain is the default owner domain DOMID_IO.
> +
>  - DOMU_DIRECT_MAP[number] can be set to 1 or 0.
>If set to 1, the VM is direct mapped. The default is 1.
>This is only applicable when DOMU_STATIC_MEM is specified.
> diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
> index 4775293..46215c8 100755
> --- a/scripts/uboot-script-gen
> +++ b/scripts/uboot-script-gen
> @@ -204,6 +204,27 @@ function add_device_tree_static_heap()
>  dt_set "$path" "xen,static-heap" "hex" "${cells[*]}"
>  }
>  
> +function add_device_tree_static_shared_mem()
> +{
> +local path=$1
> +local domid=$2
> +local regions=$3
> +local SHARED_MEM_ID=$4
> +local cells=()
> +local SHARED_MEM_HOST=${regions%% *}
> +
> +dt_mknode "${path}" "domU${domid}-shared-mem@${SHARED_MEM_HOST}"
> +
> +for val in ${regions[@]}
> +do
> +cells+=("$(printf "0x%x 0x%x" $(($val >> 32)) $(($val & ((1 << 32) - 1))))")
> +done
> +
> +dt_set "${path}/domU${domid}-shared-mem@${SHARED_MEM_HOST}" "compatible" "str" "xen,domain-shared-memory-v1"
> +dt_set "${path}/domU${domid}-shared-mem@${SHARED_MEM_HOST}" "xen,shm-id" "str" "${SHARED_MEM_ID}"
> +dt_set "${path}/domU${domid}-shared-mem@${SHARED_MEM_HOST}" "xen,shared-mem" "hex" "${cells[*]}"
> +}
> +
>  function add_device_tree_cpupools()
>  {
>  local cpu
> @@ -329,6 +350,11 @@ function xen_device_tree_editing()
>  dt_set "/chosen/domU$i" "xen,enhanced" "str" "enabled"
>  fi
>  
> +if test -n "${DOMU_SHARED_MEM[i]}" -a -n "${DOMU_SHARED_MEM_ID[i]}"
> +then
> +add_device_tree_static_shared_mem "/chosen/domU${i}" "${i}" "${DOMU_SHARED_MEM[i]}" "${DOMU_SHARED_MEM_ID[i]}"
> +fi
> +
>  if test "${DOMU_COLORS[$i]}"
>  then
>  local startcolor=$(echo "${DOMU_COLORS[$i]}"  | cut -d "-" -f 1)
> -- 
> 2.25.1
> 



Re: [ImageBuilder][PATCH 1/2] uboot-script-gen: Add support for static heap

2023-03-02 Thread Stefano Stabellini
On Thu, 2 Mar 2023, Michal Orzel wrote:
> Hi Jiamei,
> 
> Patch looks good apart from minor comments down below.

Just wanted to add that the patch looks OK to me too, and I don't have any
further comments beyond the ones Michal has already made.


> On 02/03/2023 05:46, jiamei.xie wrote:
> > 
> > 
> > From: jiamei Xie 
> > 
> > Add a new config parameter to configure static heap.
> > STATIC_HEAP="baseaddr1 size1 ... baseaddrN sizeN"
> > if specified, indicates the host physical address regions
> > [baseaddr, baseaddr + size) to be reserved as static heap.
> > 
> > For instance, STATIC_HEAP="0x5000 0x3000", if specified,
> > indicates the host memory region starting from paddr 0x5000
> > with a size of 0x3000 to be reserved as static heap.
> > 
> > Signed-off-by: jiamei Xie 
> > ---
> >  README.md|  4 
> >  scripts/uboot-script-gen | 20 
> >  2 files changed, 24 insertions(+)
> > 
> > diff --git a/README.md b/README.md
> > index 814a004..787f413 100644
> > --- a/README.md
> > +++ b/README.md
> > @@ -256,6 +256,10 @@ Where:
> > 
> >  - NUM_CPUPOOLS specifies the number of boot-time cpupools to create.
> > 
> > +- STATIC_HEAP="baseaddr1 size1 ... baseaddrN sizeN"
> > +  if specified, indicates the host physical address regions
> > +  [baseaddr, baseaddr + size) to be reserved as static heap.
> As this option impacts Xen and not domUs, please call it XEN_STATIC_HEAP and 
> move
> it right after XEN_CMD documentation.
> 
> > +
> >  Then you can invoke uboot-script-gen as follows:
> > 
> >  ```
> > diff --git a/scripts/uboot-script-gen b/scripts/uboot-script-gen
> > index f07e334..4775293 100755
> > --- a/scripts/uboot-script-gen
> > +++ b/scripts/uboot-script-gen
> > @@ -189,6 +189,21 @@ function add_device_tree_static_mem()
> >  dt_set "$path" "xen,static-mem" "hex" "${cells[*]}"
> >  }
> > 
> > +function add_device_tree_static_heap()
> > +{
> > +local path=$1
> > +local regions=$2
> > +local cells=()
> > +local val
> > +
> > +for val in ${regions[@]}
> > +do
> > +cells+=("$(printf "0x%x 0x%x" $(($val >> 32)) $(($val & ((1 << 32) - 1))))")
> Please use split_value function instead of opencoding it.
> It will then become:
> cells+=("$(split_value $val)")
> 
> ~Michal
> 
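
For context, the open-coded expression splits one 64-bit value into two 32-bit
cells, high word first. A minimal sketch of such a helper is below. It is named
split_value_sketch on purpose: it only mirrors the arithmetic from the patch,
and the actual split_value in uboot-script-gen may be written differently.

```bash
#!/bin/bash
# split_value_sketch VALUE
# Print VALUE as two 32-bit cells: the upper 32 bits, then the lower 32 bits.
# Illustrative stand-in, not the ImageBuilder implementation.
split_value_sketch()
{
    local val=$1

    printf "0x%x 0x%x\n" $(($val >> 32)) $(($val & ((1 << 32) - 1)))
}

# A value above 4GB to show both cells being used.
split_value_sketch 0x180000000
# The base address as written in the commit message.
split_value_sketch 0x5000
```

Using one helper everywhere keeps the cell layout consistent between
xen,static-mem and xen,static-heap.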



Re: [PATCH 2/7] ipmi: simplify sysctl registration

2023-03-02 Thread Corey Minyard
On Thu, Mar 02, 2023 at 12:46:07PM -0800, Luis Chamberlain wrote:
> register_sysctl_table() is a deprecated compatibility wrapper.
> register_sysctl() can do the directory creation for you so just use
> that.

Thanks, I have included this in my tree for the next merge window.

-corey

> 
> Signed-off-by: Luis Chamberlain 
> ---
>  drivers/char/ipmi/ipmi_poweroff.c | 16 +---
>  1 file changed, 1 insertion(+), 15 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_poweroff.c 
> b/drivers/char/ipmi/ipmi_poweroff.c
> index 163ec9749e55..870659d91db2 100644
> --- a/drivers/char/ipmi/ipmi_poweroff.c
> +++ b/drivers/char/ipmi/ipmi_poweroff.c
> @@ -659,20 +659,6 @@ static struct ctl_table ipmi_table[] = {
>   { }
>  };
>  
> -static struct ctl_table ipmi_dir_table[] = {
> - { .procname = "ipmi",
> -   .mode = 0555,
> -   .child= ipmi_table },
> - { }
> -};
> -
> -static struct ctl_table ipmi_root_table[] = {
> - { .procname = "dev",
> -   .mode = 0555,
> -   .child= ipmi_dir_table },
> - { }
> -};
> -
>  static struct ctl_table_header *ipmi_table_header;
>  #endif /* CONFIG_PROC_FS */
>  
> @@ -689,7 +675,7 @@ static int __init ipmi_poweroff_init(void)
>   pr_info("Power cycle is enabled\n");
>  
>  #ifdef CONFIG_PROC_FS
> - ipmi_table_header = register_sysctl_table(ipmi_root_table);
> + ipmi_table_header = register_sysctl("dev/ipmi", ipmi_table);
>   if (!ipmi_table_header) {
>   pr_err("Unable to register powercycle sysctl\n");
>   rv = -ENOMEM;
> -- 
> 2.39.1
> 



Re: [PATCH v2 4/6] docs/about/deprecated: Deprecate the qemu-system-arm binary

2023-03-02 Thread Philippe Mathieu-Daudé

On 2/3/23 17:31, Thomas Huth wrote:

qemu-system-aarch64 is a proper superset of qemu-system-arm,
and the latter was mainly still required for 32-bit KVM support.
But this 32-bit KVM arm support has been dropped in the Linux
kernel a couple of years ago already, so we don't really need
qemu-system-arm anymore, thus deprecated it now.

Signed-off-by: Thomas Huth 
---
  docs/about/deprecated.rst | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index a30aa8dfdf..21ce70b5c9 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -45,6 +45,16 @@ run 32-bit guests by selecting a 32-bit CPU model, including 
KVM support
  on x86_64 hosts. Thus users are recommended to reconfigure their systems
  to use the ``qemu-system-x86_64`` binary instead.
  
+``qemu-system-arm`` binary (since 8.0)

+''
+
+``qemu-system-aarch64`` is a proper superset of ``qemu-system-arm``. The
+latter was mainly a requirement for running KVM on 32-bit arm hosts, but
+this 32-bit KVM support has been removed some years ago already (see:


s/some/few/?


+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=541ad0150ca4
+). Thus the QEMU project will drop the ``qemu-system-arm`` binary in a
+future release. Use ``qemu-system-aarch64`` instead.


If we unify, wouldn't it be simpler to name the single qemu-system
binary emulating various ARM architectures as 'qemu-system-arm'?



Re: [PATCH v2 0/6] Deprecate support for 32-bit x86 and arm hosts

2023-03-02 Thread Philippe Mathieu-Daudé

On 2/3/23 17:31, Thomas Huth wrote:

We're struggling quite badly with our CI minutes on the shared
gitlab runners, so we urgently need to think of ways to cut down
our supported build and target environments. qemu-system-i386 and
qemu-system-arm are not really required anymore, since nobody uses
KVM on the corresponding systems for production anymore, and the
-x86_64 and -aarch64 variants are a proper superset of those binaries.
So it's time to deprecate them and the corresponding 32-bit host
environments now.

This is a follow-up patch series from the previous discussion here:

  https://lore.kernel.org/qemu-devel/20230130114428.1297295-1-th...@redhat.com/

where people still mentioned that there is still interest in certain
support for 32-bit host hardware. But as far as I could see, there is
no real need for 32-bit x86 host support and for system emulation on
32-bit arm hosts anymore, so it should be fine if we drop these host
environments soon (these are also the two architectures that contribute
the most to the long test times in our CI, so we would benefit a lot by
dropping those).


It is not clear from your cover letter that the deprecation only concerns
system emulation on these hosts, not user emulation.

I wonder about tools. Apparently they depend on sysemu now. I used to
build with 'configure --enable-tools --disable-system', but now that
build is empty.



Re: [PATCH v2 6/6] gitlab-ci.d/crossbuilds: Drop the 32-bit arm system emulation jobs

2023-03-02 Thread Philippe Mathieu-Daudé

On 2/3/23 17:31, Thomas Huth wrote:

Hardly anybody still uses 32-bit arm environments for running QEMU,
so let's stop wasting our scarce CI minutes with these jobs.

Signed-off-by: Thomas Huth 
---
  .gitlab-ci.d/crossbuilds.yml | 14 --
  1 file changed, 14 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




[xen-unstable test] 178965: tolerable trouble: fail/pass/starved

2023-03-02 Thread osstest service owner
flight 178965 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/178965/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install    fail pass in 178922
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install    fail pass in 178922
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat  fail pass in 178922

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop            fail like 178922
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop             fail like 178922
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop            fail like 178922
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 178922
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop             fail like 178922
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop             fail like 178922
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop            fail like 178922
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop             fail like 178922
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop            fail like 178922
 test-amd64-i386-xl-pvshim    14 guest-start                  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt      15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl          15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl          16 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-check        fail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-check    fail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check    fail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check        fail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-check        fail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-vhd      14 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-vhd      15 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt     15 migrate-support-check        fail   never pass
 build-armhf-libvirt           1 build-check(1)               starved  n/a
 test-armhf-armhf-examine      1 build-check(1)               starved  n/a
 test-armhf-armhf-libvirt      1 build-check(1)               starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)             starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)               starved  n/a
 test-armhf-armhf-xl           1 build-check(1)               starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)               starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)               starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)             starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)              starved  n/a
 test-armhf-armhf-xl-rtds      1 build-check(1)               starved  n/a
 test-armhf-armhf-xl-vhd       1 build-check(1)               starved  n/a
 build-armhf                   2 hosts-allocate               starved  n/a

version targeted for testing:
 xen  b84fdf521b306cce64388fe57ee6c7c00f9d3e76
baseline version:
 xen  b84fdf521b306cce64388fe57ee6c7c00f9d3e76

Last test of basis   178965  2023-03-02 09:32:57 Z    0 days
Testing same since  (not found) 0 attempts

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64  pass
 build-armhf 

[PATCH 3/7] hv: simplify sysctl registration

2023-03-02 Thread Luis Chamberlain
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl() can do the directory creation for you so just use
that.

Signed-off-by: Luis Chamberlain 
---
 drivers/hv/vmbus_drv.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index d24dd65b33d4..229353f1e9c2 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1460,15 +1460,6 @@ static struct ctl_table hv_ctl_table[] = {
{}
 };
 
-static struct ctl_table hv_root_table[] = {
-   {
-   .procname   = "kernel",
-   .mode   = 0555,
-   .child  = hv_ctl_table
-   },
-   {}
-};
-
 /*
  * vmbus_bus_init -Main vmbus driver initialization routine.
  *
@@ -1547,7 +1538,7 @@ static int vmbus_bus_init(void)
 * message recording won't be available in isolated
 * guests should the following registration fail.
 */
-   hv_ctl_table_hdr = register_sysctl_table(hv_root_table);
+   hv_ctl_table_hdr = register_sysctl("kernel", hv_ctl_table);
if (!hv_ctl_table_hdr)
pr_err("Hyper-V: sysctl table register error");
 
-- 
2.39.1




[PATCH 0/7] sysctl: slowly deprecate register_sysctl_table()

2023-03-02 Thread Luis Chamberlain
As the large array of sysctls in kernel/sysctl.c is reduced we get to
the point of wanting to optimize how we register sysctls by only dealing
with flat simple structures, with no subdirectories. In particular the
last empty element should not be needed. We'll get there, and save some
memory, but as we move forward that path will become the more relevant
path to use in the sysctl registration. It is much simpler as it avoids
recursion.

Turns out we can also convert existing users of register_sysctl_table()
which just need their subdirectories created for them. This effort
addresses most users of register_sysctl_table() in drivers/ except
parport -- that needs a bit more review.

This is part of the process to deprecate older sysctl users which use
APIs which can incur recursion, but don't need it [0]. This is the
second effort.

Yes -- we'll get to the point where *each* of these conversions means
saving one empty sysctl, but that change needs a bit more careful review
before merging. But since these conversions are also deleting tables for
subdirectories, the delta in size of the kernel should not increase
really.

The most complex change is the sgi-xp change which does deal with
a case where we have a subdirectory with an entry; I just split
that into two registrations. No point in keeping recursion just for
a few minor entries if we can simplify the code around them. More
eyeballs / review / testing on that change is appreciated.

Sending these out early so they can get tested properly early on
linux-next. I'm happy to take these via sysctl-next [0] but since
I don't think register_sysctl_table() will be nuked on v6.4 I think
it's fine for each of these to go into each respective tree. I can
pick up last stragglers on sysctl-next. If you want me to take this
via sysctl-next too, just let me know and I'm happy to do that. Either
way works.

[0] https://lkml.kernel.org/r/20230302202826.776286-1-mcg...@kernel.org

Luis Chamberlain (7):
  scsi: simplify sysctl registration with register_sysctl()
  ipmi: simplify sysctl registration
  hv: simplify sysctl registration
  md: simplify sysctl registration
  sgi-xp: simplify sysctl registration
  tty: simplify sysctl registration
  xen: simplify sysctl registration for balloon

 drivers/char/ipmi/ipmi_poweroff.c | 16 +---
 drivers/hv/vmbus_drv.c| 11 +--
 drivers/md/md.c   | 22 +-
 drivers/misc/sgi-xp/xpc_main.c| 24 ++--
 drivers/scsi/scsi_sysctl.c| 16 +---
 drivers/tty/tty_io.c  | 20 +---
 drivers/xen/balloon.c | 20 +---
 7 files changed, 16 insertions(+), 113 deletions(-)

-- 
2.39.1
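
As a rough illustration of the idea in this cover letter, here is a user-space
sketch (not kernel code) of why a path string can replace the nested directory
tables. Everything in it is an assumption made for the example: struct
mock_ctl_table, walk_nested(), walk_flat() and the "example_knob" entry are
hypothetical names, and the real register_sysctl() does considerably more than
build paths. The sketch only shows that walking "dev" -> "ipmi" -> entries
recursively yields the same paths as pairing the flat table with "dev/ipmi".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct ctl_table; not the kernel type. */
struct mock_ctl_table {
    const char *procname;               /* NULL marks the terminating entry */
    const struct mock_ctl_table *child; /* non-NULL: this entry is a directory */
};

/* Collects every visited path, one per line. */
struct path_log {
    char text[512];
    size_t len;
};

static void record_path(struct path_log *log, const char *path)
{
    size_t room = sizeof(log->text) - log->len;
    int n = snprintf(log->text + log->len, room, "%s\n", path);

    if (n > 0 && (size_t)n < room)
        log->len += (size_t)n;
}

/* Old style: one table per directory level, walked recursively. */
static void walk_nested(const struct mock_ctl_table *t, const char *prefix,
                        struct path_log *log)
{
    char path[256];

    for (; t->procname; t++) {
        snprintf(path, sizeof(path), "%s%s%s",
                 prefix, *prefix ? "/" : "", t->procname);
        if (t->child)
            walk_nested(t->child, path, log); /* recursion per level */
        else
            record_path(log, path);
    }
}

/* New style: the directory is a string, the table is flat, no recursion. */
static void walk_flat(const char *dir, const struct mock_ctl_table *t,
                      struct path_log *log)
{
    char path[256];

    for (; t->procname; t++) {
        snprintf(path, sizeof(path), "%s/%s", dir, t->procname);
        record_path(log, path);
    }
}

/* The only table that carries real entries ("example_knob" is made up). */
static const struct mock_ctl_table leaf_table[] = {
    { .procname = "example_knob" },
    { 0 }
};

/* The two directory-only tables that the conversions delete. */
static const struct mock_ctl_table dir_table[] = {
    { .procname = "ipmi", .child = leaf_table },
    { 0 }
};

static const struct mock_ctl_table root_table[] = {
    { .procname = "dev", .child = dir_table },
    { 0 }
};
```

Calling walk_nested(root_table, "", &log) and walk_flat("dev/ipmi", leaf_table,
&log) on two zeroed logs records the same single path in both, which is the
property the conversions rely on.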




[PATCH 6/7] tty: simplify sysctl registration

2023-03-02 Thread Luis Chamberlain
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl_init() can do the directory creation for you so just use
that.

Signed-off-by: Luis Chamberlain 
---
 drivers/tty/tty_io.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 36fb945fdad4..766750e355ac 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3614,31 +3614,13 @@ static struct ctl_table tty_table[] = {
{ }
 };
 
-static struct ctl_table tty_dir_table[] = {
-   {
-   .procname   = "tty",
-   .mode   = 0555,
-   .child  = tty_table,
-   },
-   { }
-};
-
-static struct ctl_table tty_root_table[] = {
-   {
-   .procname   = "dev",
-   .mode   = 0555,
-   .child  = tty_dir_table,
-   },
-   { }
-};
-
 /*
  * Ok, now we can initialize the rest of the tty devices and can count
  * on memory allocations, interrupts etc..
  */
 int __init tty_init(void)
 {
-   register_sysctl_table(tty_root_table);
+   register_sysctl_init("dev/tty", tty_table);
cdev_init(&tty_cdev, &tty_fops);
if (cdev_add(&tty_cdev, MKDEV(TTYAUX_MAJOR, 0), 1) ||
register_chrdev_region(MKDEV(TTYAUX_MAJOR, 0), 1, "/dev/tty") < 0)
-- 
2.39.1




[PATCH 7/7] xen: simplify sysctl registration for balloon

2023-03-02 Thread Luis Chamberlain
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl_init() can do the directory creation for you so just
use that.

Signed-off-by: Luis Chamberlain 
---
 drivers/xen/balloon.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 617a7f4f07a8..586a1673459e 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -97,24 +97,6 @@ static struct ctl_table balloon_table[] = {
{ }
 };
 
-static struct ctl_table balloon_root[] = {
-   {
-   .procname   = "balloon",
-   .mode   = 0555,
-   .child  = balloon_table,
-   },
-   { }
-};
-
-static struct ctl_table xen_root[] = {
-   {
-   .procname   = "xen",
-   .mode   = 0555,
-   .child  = balloon_root,
-   },
-   { }
-};
-
 #else
 #define xen_hotplug_unpopulated 0
 #endif
@@ -747,7 +729,7 @@ static int __init balloon_init(void)
 #ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
set_online_page_callback(&xen_online_page);
register_memory_notifier(&xen_memory_nb);
-   register_sysctl_table(xen_root);
+   register_sysctl_init("xen/balloon", balloon_table);
 #endif
 
balloon_add_regions();
-- 
2.39.1




[PATCH 1/7] scsi: simplify sysctl registration with register_sysctl()

2023-03-02 Thread Luis Chamberlain
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl() can do the directory creation for you so just use that.

Signed-off-by: Luis Chamberlain 
---
 drivers/scsi/scsi_sysctl.c | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/scsi/scsi_sysctl.c b/drivers/scsi/scsi_sysctl.c
index 7259704a7f52..7f0914ea168f 100644
--- a/drivers/scsi/scsi_sysctl.c
+++ b/drivers/scsi/scsi_sysctl.c
@@ -21,25 +21,11 @@ static struct ctl_table scsi_table[] = {
{ }
 };
 
-static struct ctl_table scsi_dir_table[] = {
-   { .procname = "scsi",
- .mode = 0555,
- .child= scsi_table },
-   { }
-};
-
-static struct ctl_table scsi_root_table[] = {
-   { .procname = "dev",
- .mode = 0555,
- .child= scsi_dir_table },
-   { }
-};
-
 static struct ctl_table_header *scsi_table_header;
 
 int __init scsi_init_sysctl(void)
 {
-   scsi_table_header = register_sysctl_table(scsi_root_table);
+   scsi_table_header = register_sysctl("dev/scsi", scsi_table);
if (!scsi_table_header)
return -ENOMEM;
return 0;
-- 
2.39.1




[PATCH 2/7] ipmi: simplify sysctl registration

2023-03-02 Thread Luis Chamberlain
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl() can do the directory creation for you so just use
that.

Signed-off-by: Luis Chamberlain 
---
 drivers/char/ipmi/ipmi_poweroff.c | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_poweroff.c b/drivers/char/ipmi/ipmi_poweroff.c
index 163ec9749e55..870659d91db2 100644
--- a/drivers/char/ipmi/ipmi_poweroff.c
+++ b/drivers/char/ipmi/ipmi_poweroff.c
@@ -659,20 +659,6 @@ static struct ctl_table ipmi_table[] = {
{ }
 };
 
-static struct ctl_table ipmi_dir_table[] = {
-   { .procname = "ipmi",
- .mode = 0555,
- .child= ipmi_table },
-   { }
-};
-
-static struct ctl_table ipmi_root_table[] = {
-   { .procname = "dev",
- .mode = 0555,
- .child= ipmi_dir_table },
-   { }
-};
-
 static struct ctl_table_header *ipmi_table_header;
 #endif /* CONFIG_PROC_FS */
 
@@ -689,7 +675,7 @@ static int __init ipmi_poweroff_init(void)
pr_info("Power cycle is enabled\n");
 
 #ifdef CONFIG_PROC_FS
-   ipmi_table_header = register_sysctl_table(ipmi_root_table);
+   ipmi_table_header = register_sysctl("dev/ipmi", ipmi_table);
if (!ipmi_table_header) {
pr_err("Unable to register powercycle sysctl\n");
rv = -ENOMEM;
-- 
2.39.1




[PATCH 4/7] md: simplify sysctl registration

2023-03-02 Thread Luis Chamberlain
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl() can do the directory creation for you so just use
that.

Signed-off-by: Luis Chamberlain 
---
 drivers/md/md.c | 22 +-
 1 file changed, 1 insertion(+), 21 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 927a43db5dfb..546b1b81eb28 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -322,26 +322,6 @@ static struct ctl_table raid_table[] = {
{ }
 };
 
-static struct ctl_table raid_dir_table[] = {
-   {
-   .procname   = "raid",
-   .maxlen = 0,
-   .mode   = S_IRUGO|S_IXUGO,
-   .child  = raid_table,
-   },
-   { }
-};
-
-static struct ctl_table raid_root_table[] = {
-   {
-   .procname   = "dev",
-   .maxlen = 0,
-   .mode   = 0555,
-   .child  = raid_dir_table,
-   },
-   {  }
-};
-
 static int start_readonly;
 
 /*
@@ -9650,7 +9630,7 @@ static int __init md_init(void)
mdp_major = ret;
 
register_reboot_notifier(&md_notifier);
-   raid_table_header = register_sysctl_table(raid_root_table);
+   raid_table_header = register_sysctl("dev/raid", raid_table);
 
md_geninit();
return 0;
-- 
2.39.1




[PATCH 5/7] sgi-xp: simplify sysctl registration

2023-03-02 Thread Luis Chamberlain
Although this driver is a good use case for having a directory
that holds entries of its own as well as subdirectories with more
entries, the usage of register_sysctl_table() can recurse and
increases complexity, so to avoid that just split out the
registration to each directory with its own entries.

register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl() can do the directory creation for you so just use
that.

Signed-off-by: Luis Chamberlain 
---
 drivers/misc/sgi-xp/xpc_main.c | 24 ++--
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/misc/sgi-xp/xpc_main.c b/drivers/misc/sgi-xp/xpc_main.c
index b2c3c22fc13c..6da509d692bb 100644
--- a/drivers/misc/sgi-xp/xpc_main.c
+++ b/drivers/misc/sgi-xp/xpc_main.c
@@ -93,7 +93,7 @@ int xpc_disengage_timelimit = XPC_DISENGAGE_DEFAULT_TIMELIMIT;
 static int xpc_disengage_min_timelimit;/* = 0 */
 static int xpc_disengage_max_timelimit = 120;
 
-static struct ctl_table xpc_sys_xpc_hb_dir[] = {
+static struct ctl_table xpc_sys_xpc_hb[] = {
{
 .procname = "hb_interval",
 .data = &xpc_hb_interval,
@@ -112,11 +112,7 @@ static struct ctl_table xpc_sys_xpc_hb_dir[] = {
 .extra2 = &xpc_hb_check_max_interval},
{}
 };
-static struct ctl_table xpc_sys_xpc_dir[] = {
-   {
-.procname = "hb",
-.mode = 0555,
-.child = xpc_sys_xpc_hb_dir},
+static struct ctl_table xpc_sys_xpc[] = {
{
 .procname = "disengage_timelimit",
 .data = &xpc_disengage_timelimit,
@@ -127,14 +123,9 @@ static struct ctl_table xpc_sys_xpc_dir[] = {
 .extra2 = &xpc_disengage_max_timelimit},
{}
 };
-static struct ctl_table xpc_sys_dir[] = {
-   {
-.procname = "xpc",
-.mode = 0555,
-.child = xpc_sys_xpc_dir},
-   {}
-};
+
 static struct ctl_table_header *xpc_sysctl;
+static struct ctl_table_header *xpc_sysctl_hb;
 
 /* non-zero if any remote partition disengage was timed out */
 int xpc_disengage_timedout;
@@ -1041,6 +1032,8 @@ xpc_do_exit(enum xp_retval reason)
 
if (xpc_sysctl)
unregister_sysctl_table(xpc_sysctl);
+   if (xpc_sysctl_hb)
+   unregister_sysctl_table(xpc_sysctl_hb);
 
xpc_teardown_partitions();
 
@@ -1243,7 +1236,8 @@ xpc_init(void)
goto out_1;
}
 
-   xpc_sysctl = register_sysctl_table(xpc_sys_dir);
+   xpc_sysctl = register_sysctl("xpc", xpc_sys_xpc);
+   xpc_sysctl_hb = register_sysctl("xpc/hb", xpc_sys_xpc_hb);
 
/*
 * Fill the partition reserved page with the information needed by
@@ -1308,6 +1302,8 @@ xpc_init(void)
(void)unregister_die_notifier(&xpc_die_notifier);
(void)unregister_reboot_notifier(&xpc_reboot_notifier);
 out_2:
+   if (xpc_sysctl_hb)
+   unregister_sysctl_table(xpc_sysctl_hb);
if (xpc_sysctl)
unregister_sysctl_table(xpc_sysctl);
 
-- 
2.39.1




Re: [PATCH v2 2/3] xen/riscv: initialize .bss section

2023-03-02 Thread Andrew Cooper
On 02/03/2023 3:55 pm, Oleksii wrote:
> On Thu, 2023-03-02 at 15:22 +0100, Jan Beulich wrote:
>> On 02.03.2023 14:23, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/riscv64/head.S
>>> +++ b/xen/arch/riscv/riscv64/head.S
>>> @@ -13,6 +13,15 @@ ENTRY(start)
>>>  lla a6, _dtb_base
>>>  REG_S   a1, (a6)
>>>  
>>> +    la  a3, __bss_start
>>> +    la  a4, __bss_end
>>> +    ble a4, a3, clear_bss_done
>> While it may be that .bss is indeed empty right now, even short term
>> it won't be, and never will. I'd drop this conditional (and in
>> particular the label), inserting a transient item into .bss for the
>> time being. As soon as your patch introducing page tables has landed,
>> there will be multiple pages worth of .bss.
> If I understand you correctly, you suggested declaring some variable:
>int dummy_bss __attribute__((unused));
>
> Then .bss won't be zero:
>$ riscv64-linux-gnu-objdump -x xen/xen-syms | grep -i dummy_bss
>80205000 g O .bss   0004 .hidden dummy_bss
>
> And once the page tables are ready, dummy_bss will need to be
> removed.

Well - to be deleted when the first real bss user appears, but yes -
that will probably be the pagetable series.

>
> Another option is to update the linker script (looks better than the
> previous one):
> --- a/xen/arch/riscv/xen.lds.S
> +++ b/xen/arch/riscv/xen.lds.S
> @@ -140,6 +140,7 @@ SECTIONS
>  . = ALIGN(SMP_CACHE_BYTES);
>  __per_cpu_data_end = .;
>  *(.bss .bss.*)
> +. = . + 1;
>  . = ALIGN(POINTER_ALIGN);
>  __bss_end = .;
>  } :text
>
> If one of the options is fine then to be honest I am not sure that I
> understand why it is better than having 3 instructions which will
> become unnecessary when the first bss variable is introduced. And
> actually the same applies to the item in bss: it will become
> unnecessary when something else in bss is introduced.
>
> I am OK with either of the options mentioned above but would still
> like to understand what the advantages are.

A one-line deletion in a C file is the most obviously safe of the 3
options to perform at some later date, when we've started
forgetting the specific details of this patch.

>> Also are this and ...
>>
>>> +clear_bss:
>>> +    REG_S   zero, (a3)
>>> +    add a3, a3, RISCV_SZPTR
>>> +    blt a3, a4, clear_bss
>> ... this branch actually the correct ones? I'd expect the unsigned
>> flavors to be used when comparing addresses. It may not matter here
>> and/or right now, but it'll set a bad precedent unless you expect
>> to only ever work on addresses which have the sign bit clear.
> I'll change blt to bltu.

This should indeed be an unsigned compare.  It doesn't explode in practice
because paging is disabled and RISC-V's MAXPHYADDR is 56 bits so doesn't
set the sign bit.

~Andrew
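
To make the blt vs bltu point above concrete, here is a small C model of the
clear loop's control flow. It is a sketch under stated assumptions, not the
Xen code: count_stores(), below_signed() and below_unsigned() are hypothetical
helpers, the word size stands in for RISCV_SZPTR on RV64 (assumed to be 8), and
the addresses are plain integers so nothing is actually written to memory. It
counts how many stores the loop would issue for a given [start, end) range with
each kind of comparison.

```c
#include <stdint.h>

/*
 * Signed "less than", as blt evaluates it. Converting a value above
 * INT64_MAX to int64_t is assumed to wrap (two's complement), which is
 * what common compilers do.
 */
static int below_signed(uint64_t a, uint64_t b)
{
    return (int64_t)a < (int64_t)b;
}

/* Unsigned "less than", as bltu evaluates it. */
static int below_unsigned(uint64_t a, uint64_t b)
{
    return a < b;
}

/*
 * Model of the loop shape: store, advance, branch back while below the end.
 * Returns the number of word-sized stores that would be issued.
 */
static unsigned long count_stores(uint64_t start, uint64_t end,
                                  int use_unsigned)
{
    unsigned long stores = 0;
    uint64_t p = start;

    do {
        stores++;               /* REG_S zero, (a3) */
        p += sizeof(uint64_t);  /* add a3, a3, RISCV_SZPTR (assumed 8) */
    } while (use_unsigned ? below_unsigned(p, end) : below_signed(p, end));

    return stores;
}
```

For a low range such as 0x1000..0x1020 both variants issue 4 stores. For a
range that crosses into addresses with the top bit set, for example
0x7ffffffffffffff0..0x8000000000000010, the unsigned variant still issues 4
stores while the signed one stops after the first, leaving most of the range
uncleared.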



Re: [PATCH v2 1/3] xen/riscv: read/save hart_id and dtb_base passed by bootloader

2023-03-02 Thread Andrew Cooper
On 02/03/2023 2:53 pm, Oleksii wrote:
> On Thu, 2023-03-02 at 14:02 +, Andrew Cooper wrote:
>> On 02/03/2023 1:23 pm, Oleksii Kurochko wrote:
>>> +
>>> +    /*
>>> + * DTB base is passed by a bootloader
>>> + */
>>> +_dtb_base:
>>> +    RISCV_PTR 0x0
>>> diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
>>> index 1c87899e8e..d9723fe1c0 100644
>>> --- a/xen/arch/riscv/setup.c
>>> +++ b/xen/arch/riscv/setup.c
>>> @@ -7,7 +7,8 @@
>>>  unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>>>  __aligned(STACK_SIZE);
>>>  
>>> -void __init noreturn start_xen(void)
>>> +void __init noreturn start_xen(unsigned long bootcpu_id,
>>> +   unsigned long dtb_base)
>> To be clear, this change should be this hunk exactly as it is, and a
>> comment immediately ahead of ENTRY(start) describing the entry ABI.
>>
>> There is no need currently to change any of the asm code.
> I think that I'll use s2 and s3 to save bootcpu_id.
>
> But I am unsure I understand why the asm code shouldn't be changed.

Because nothing in the asm code (right now) touches any of the a registers.

Therefore the parameters that OpenSBI prepared for start() (i.e. a0 and
a1 here) are still correct for start_xen().

If, and only if, we need to modify a* for other reasons in start() do we
need to preserve their values somehow.

> If I understand you correctly I can write in a comment ahead of
> ENTRY(start) that a0 and a1 are reserved for hart_id and dtb_base,
> which are passed from a bootloader, but it will work only if start_xen
> is the only C function called from head.S.

Not quite.  You want a comment explaining what the OpenSBI -> start()
ABI is.  So people know what a0/etc is at ENTRY(start).

Here is an example from a different project:
https://github.com/TrenchBoot/secure-kernel-loader/blob/master/head.S#L52-L68


There is no need to do unnecessary work (i.e. preserving them right
now), until you find a reason to need to spill them.  Right now, there's
no need, and this isn't obviously going to change in the short term.

~Andrew



Re: [PATCH v2 6/6] gitlab-ci.d/crossbuilds: Drop the 32-bit arm system emulation jobs

2023-03-02 Thread Daniel P . Berrangé
On Thu, Mar 02, 2023 at 05:31:06PM +0100, Thomas Huth wrote:
> Hardly anybody still uses 32-bit arm environments for running QEMU,
> so let's stop wasting our scarce CI minutes with these jobs.
> 
> Signed-off-by: Thomas Huth 
> ---
>  .gitlab-ci.d/crossbuilds.yml | 14 --
>  1 file changed, 14 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 5/6] docs/about/deprecated: Deprecate 32-bit arm hosts

2023-03-02 Thread Daniel P . Berrangé
On Thu, Mar 02, 2023 at 05:31:05PM +0100, Thomas Huth wrote:
> For running QEMU in system emulation mode, the user needs a rather
> strong host system, i.e. not only an embedded low-frequency controller.
> All recent beefy arm host machines should support 64-bit now, it's
> unlikely that anybody is still seriously using QEMU on a 32-bit arm
> CPU, so we deprecate the 32-bit arm hosts here to finally save us
> some time and precious CI minutes.
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/about/deprecated.rst | 9 +
>  1 file changed, 9 insertions(+)

Reviewed-by: Daniel P. Berrangé 

> diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
> index 21ce70b5c9..c7113a7510 100644
> --- a/docs/about/deprecated.rst
> +++ b/docs/about/deprecated.rst
> @@ -229,6 +229,15 @@ discontinue it. Since all recent x86 hardware from the past >10 years
>  is capable of the 64-bit x86 extensions, a corresponding 64-bit OS
>  should be used instead.
>  
> +System emulation on 32-bit arm hosts (since 8.0)
> +''''''''''''''''''''''''''''''''''''''''''''''''
> +
> +Since QEMU needs a strong host machine for running full system emulation, and
> +all recent powerful arm hosts support 64-bit, the QEMU project deprecates the
> +support for running any system emulation on 32-bit arm hosts in general. Use
> +64-bit arm hosts for system emulation instead. (Note: "user" mode emulation
> +continuous to be supported on 32-bit arm hosts, too)

s/continuous/continues/

s/,too/, as well as command line tools like qemu-img, qemu-nbd, etc/

> +
>  
>  QEMU API (QAPI) events
> ----------------------
> -- 
> 2.31.1
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 4/6] docs/about/deprecated: Deprecate the qemu-system-arm binary

2023-03-02 Thread Daniel P . Berrangé
On Thu, Mar 02, 2023 at 05:31:04PM +0100, Thomas Huth wrote:
> qemu-system-aarch64 is a proper superset of qemu-system-arm,
> and the latter was mainly still required for 32-bit KVM support.
> But this 32-bit KVM arm support has been dropped in the Linux
> kernel a couple of years ago already, so we don't really need
> qemu-system-arm anymore, thus deprecated it now.
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/about/deprecated.rst | 10 ++
>  1 file changed, 10 insertions(+)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 3/6] gitlab-ci.d/crossbuilds: Drop the i386 jobs

2023-03-02 Thread Daniel P . Berrangé
On Thu, Mar 02, 2023 at 05:31:03PM +0100, Thomas Huth wrote:
> Hardly anybody still uses 32-bit x86 environments for running QEMU,
> so let's stop wasting our scarce CI minutes with these jobs.
> 
> Signed-off-by: Thomas Huth 
> ---
>  .gitlab-ci.d/crossbuilds.yml | 16 
>  1 file changed, 16 deletions(-)

Reviewed-by: Daniel P. Berrangé 

There's still the mingw 32-bit x86 build, but it's probably worth
keeping that until we actually stop 32-bit from a technical
POV, because Stefan still publishes the 32-bit windows
installers currently.

Similarly the dockerfile can stay in case someone wants to
reproduce a flaw locally.

Reviewed-by: Daniel P. Berrangé 


> diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml
> index 101416080c..3ce51adf77 100644
> --- a/.gitlab-ci.d/crossbuilds.yml
> +++ b/.gitlab-ci.d/crossbuilds.yml
> @@ -43,22 +43,6 @@ cross-arm64-user:
>variables:
>  IMAGE: debian-arm64-cross
>  
> -cross-i386-system:
> -  extends: .cross_system_build_job
> -  needs:
> -job: i386-fedora-cross-container
> -  variables:
> -IMAGE: fedora-i386-cross
> -MAKE_CHECK_ARGS: check-qtest
> -
> -cross-i386-user:
> -  extends: .cross_user_build_job
> -  needs:
> -job: i386-fedora-cross-container
> -  variables:
> -IMAGE: fedora-i386-cross
> -MAKE_CHECK_ARGS: check
> -
>  cross-i386-tci:
>extends: .cross_accel_build_job
>timeout: 60m
> -- 
> 2.31.1
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 2/6] docs/about/deprecated: Deprecate 32-bit x86 hosts

2023-03-02 Thread Daniel P . Berrangé
On Thu, Mar 02, 2023 at 05:31:02PM +0100, Thomas Huth wrote:
> Hardly anybody still uses 32-bit x86 hosts today, so we should start
> deprecating them to stop wasting our time and CI minutes here.
> For example, there are also still some unresolved problems with these:
> When emulating 64-bit binaries in user mode, TCG does not honor atomicity
> for 64-bit accesses, which is "perhaps worse than not working at all"
> (quoting Richard). Let's simply make it clear that people should use
> 64-bit x86 hosts nowadays and we do not intend to fix/maintain the old
> 32-bit stuff.
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/about/deprecated.rst | 12 
>  1 file changed, 12 insertions(+)

Reviewed-by: Daniel P. Berrangé 

> 
> diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
> index 11700adac9..a30aa8dfdf 100644
> --- a/docs/about/deprecated.rst
> +++ b/docs/about/deprecated.rst
> @@ -208,6 +208,18 @@ CI coverage support may bitrot away before the deprecation process
>  completes. The little endian variants of MIPS (both 32 and 64 bit) are
>  still a supported host architecture.
>  
> +32-bit x86 hosts (since 8.0)
> +''''''''''''''''''''''''''''
> +
> +Support for 32-bit x86 host deployments is increasingly uncommon in
> +mainstream OS distributions given the widespread availability of 64-bit
> +x86 hardware. The QEMU project no longer considers 32-bit x86 support
> +to be an effective use of its limited resources, and thus intends to
> +discontinue it. Since all recent x86 hardware from the past >10 years
> +is capable of the 64-bit x86 extensions, a corresponding 64-bit OS
> +should be used instead.
> +
> +
>  QEMU API (QAPI) events
> ----------------------
>  
> -- 
> 2.31.1
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 1/6] docs/about/deprecated: Deprecate the qemu-system-i386 binary

2023-03-02 Thread Daniel P . Berrangé
On Thu, Mar 02, 2023 at 05:31:01PM +0100, Thomas Huth wrote:
> Hardly anybody really requires the i386 binary anymore, since the
> qemu-system-x86_64 binary is a proper superset. So let's deprecate
> the 32-bit variant now, so that we can finally stop wasting our time
> and CI minutes with this.

The first sentence isn't quite true wrt KVM. Change slightly to:

Aside from not supporting KVM on 32-bit hosts, the qemu-system-x86_64
binary is a proper superset of the qemu-system-i386 binary. With the
32-bit host support being deprecated, it is now also possible to
deprecate the qemu-system-i386 binary.

> With regards to 32-bit KVM support in the x86 Linux kernel,
> the developers confirmed that they do not need a recent
> qemu-system-i386 binary here:
> 
>  https://lore.kernel.org/kvm/y%2ffkts5ajfy0h...@google.com/
> 
> Signed-off-by: Thomas Huth 
> ---
>  docs/about/deprecated.rst | 12 
>  1 file changed, 12 insertions(+)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[RFC XEN PATCH 7/7] automation: Add some more push containers jobs

2023-03-02 Thread Anthony PERARD
Signed-off-by: Anthony PERARD 
---

Notes:
WARNING: This is an incomplete list of jobs needed to push.

 automation/gitlab-ci/push-containers.yaml | 49 +++
 1 file changed, 49 insertions(+)

diff --git a/automation/gitlab-ci/push-containers.yaml 
b/automation/gitlab-ci/push-containers.yaml
index d7e7e2b9e2..3785e29250 100644
--- a/automation/gitlab-ci/push-containers.yaml
+++ b/automation/gitlab-ci/push-containers.yaml
@@ -18,6 +18,33 @@
   after_script:
 - docker logout
 
+push-archlinux-current-container:
+  variables:
+BUILD_CONTAINER: archlinux/current
+  extends:
+- .push-container-build-tmpl
+  needs:
+- test-archlinux-gcc
+- test-archlinux-gcc-debug
+
+push-debian-stretch-32-container:
+  variables:
+BUILD_CONTAINER: debian/stretch-i386
+  extends:
+- .push-container-build-tmpl
+  needs:
+- test-debian-stretch-32-clang-debug
+- test-debian-stretch-32-gcc-debug
+
+push-debian-unstable-32-container:
+  variables:
+BUILD_CONTAINER: debian/unstable-i386
+  extends:
+- .push-container-build-tmpl
+  needs:
+- test-debian-unstable-32-clang-debug
+- test-debian-unstable-32-gcc-debug
+
 push-ubuntu-xenial-container:
   variables:
 BUILD_CONTAINER: ubuntu/xenial
@@ -28,3 +55,25 @@ push-ubuntu-xenial-container:
 - test-ubuntu-xenial-clang-debug
 - test-ubuntu-xenial-gcc
 - test-ubuntu-xenial-gcc-debug
+
+push-ubuntu-bionic-container:
+  variables:
+BUILD_CONTAINER: ubuntu/bionic
+  extends:
+- .push-container-build-tmpl
+  needs:
+- test-ubuntu-bionic-clang
+- test-ubuntu-bionic-clang-debug
+- test-ubuntu-bionic-gcc
+- test-ubuntu-bionic-gcc-debug
+
+push-ubuntu-focal-container:
+  variables:
+BUILD_CONTAINER: ubuntu/focal
+  extends:
+- .push-container-build-tmpl
+  needs:
+- test-ubuntu-focal-gcc
+- test-ubuntu-focal-gcc-debug
+- test-ubuntu-focal-clang
+- test-ubuntu-focal-clang-debug
-- 
Anthony PERARD




[RFC XEN PATCH 6/7] automation: Push container been tested

2023-03-02 Thread Anthony PERARD
Now we can run a pipeline and set two variables to have a container
rebuilt, tested, and pushed.

Variables:
DO_REBUILD_CONTAINER = "ubuntu/xenial"
PUSH_CONTAINER = 1

Or, if PUSH_CONTAINER is set on the GitLab project "xen-project/xen", a
change to the dockerfile can result in the container being rebuilt when
the change is pushed to staging.

The push-containers stage pulls the container that has been tested and
retags it before pushing it. So the container tagged with the "-test"
suffix and the one tagged without it are the same.

Signed-off-by: Anthony PERARD 
---

Notes:
Something that could be added is to check that the container that we are
going to push is the same one that has been tested. Maybe by comparing
"digest", or maybe by using a suffix that is only generated by the
current pipeline.

 .gitlab-ci.yml|  2 ++
 automation/build/Makefile | 12 +
 automation/gitlab-ci/push-containers.yaml | 30 +++
 3 files changed, 44 insertions(+)
 create mode 100644 automation/gitlab-ci/push-containers.yaml

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index ed5383ab50..0cd45ad001 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -3,9 +3,11 @@ stages:
   - test
   - containers
   - test-containers
+  - push-containers
 
 include:
   - 'automation/gitlab-ci/build.yaml'
   - 'automation/gitlab-ci/test.yaml'
   - 'automation/gitlab-ci/containers.yaml'
   - 'automation/gitlab-ci/test-containers.yaml'
+  - 'automation/gitlab-ci/push-containers.yaml'
diff --git a/automation/build/Makefile b/automation/build/Makefile
index 5515938878..14d1320b23 100644
--- a/automation/build/Makefile
+++ b/automation/build/Makefile
@@ -21,6 +21,18 @@ include yocto/yocto.inc
$(DOCKER_CMD) push 
$(REGISTRY)/$(@D):$(@F)$(BUILD_CONTAINER_SUFFIX); \
fi
 
+# rule used by GitLab CI jobs, to push a container that has just been built and
+# tested. It overrides the rule used to build a container.
+ifdef PUSH_TEST_CONTAINER_SUFFIX
+%: %.dockerfile
+   $(if $(BUILD_CONTAINER_SUFFIX),$(error BUILD_CONTAINER_SUFFIX should 
not be set anymore))
+   $(DOCKER_CMD) pull $(REGISTRY)/$(@D):$(@F)$(PUSH_TEST_CONTAINER_SUFFIX)
+   $(DOCKER_CMD) image tag 
$(REGISTRY)/$(@D):$(@F)$(PUSH_TEST_CONTAINER_SUFFIX) $(REGISTRY)/$(@D):$(@F)
+   @if [ ! -z $${PUSH+x} ]; then \
+   $(DOCKER_CMD) push $(REGISTRY)/$(@D):$(@F); \
+   fi
+endif
+
 .PHONY: all clean
 all: $(CONTAINERS)
 
diff --git a/automation/gitlab-ci/push-containers.yaml 
b/automation/gitlab-ci/push-containers.yaml
new file mode 100644
index 00..d7e7e2b9e2
--- /dev/null
+++ b/automation/gitlab-ci/push-containers.yaml
@@ -0,0 +1,30 @@
+.push-container-build-tmpl:
+  stage: push-containers
+  image: docker:stable
+  tags:
+- container-builder
+  rules:
+- if: $PUSH_CONTAINER != "1"
+  when: never
+- !reference [.container-build-tmpl, rules]
+  services:
+- docker:dind
+  before_script:
+- apk add make
+- docker info
+- docker login -u $CI_DEPLOY_USER -p $CI_DEPLOY_PASSWORD $CI_REGISTRY
+  script:
+- make -C automation/build ${BUILD_CONTAINER} PUSH=1 
PUSH_TEST_CONTAINER_SUFFIX=-test
+  after_script:
+- docker logout
+
+push-ubuntu-xenial-container:
+  variables:
+BUILD_CONTAINER: ubuntu/xenial
+  extends:
+- .push-container-build-tmpl
+  needs:
+- test-ubuntu-xenial-clang
+- test-ubuntu-xenial-clang-debug
+- test-ubuntu-xenial-gcc
+- test-ubuntu-xenial-gcc-debug
-- 
Anthony PERARD
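
The retag-and-push rule added to automation/build/Makefile in this patch can be sketched in Python to show how a make target such as "ubuntu/xenial" maps onto image references. This is only an illustration: the REGISTRY value is an assumption taken from the image path used in test-containers.yaml, DOCKER_CMD is assumed to be plain "docker", and the function name is hypothetical.

```python
import posixpath

# Assumption: REGISTRY matches the image prefix used in test-containers.yaml.
REGISTRY = "registry.gitlab.com/xen-project/xen"


def push_tested_commands(target, test_suffix="-test", push=True):
    """Return the docker commands the retag rule would run for one target.

    `target` is a make target such as "ubuntu/xenial"; make's $(@D) and
    $(@F) correspond to the directory and file parts of that name.
    """
    directory, name = posixpath.split(target)  # $(@D), $(@F)
    tested = f"{REGISTRY}/{directory}:{name}{test_suffix}"
    final = f"{REGISTRY}/{directory}:{name}"
    commands = [
        ["docker", "pull", tested],
        ["docker", "image", "tag", tested, final],
    ]
    if push:  # mirrors the PUSH check in the Makefile rule
        commands.append(["docker", "push", final])
    return commands


if __name__ == "__main__":
    for command in push_tested_commands("ubuntu/xenial"):
        print(" ".join(command))
```

Because the final tag is created from the pulled "-test" image, both tags point at the same image, which is the property the commit message relies on.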




[RFC XEN PATCH 3/7] automation: Add a template per container for build jobs.

2023-03-02 Thread Anthony PERARD
Have one template per container, which each build job will extend.
This will allow adding more variables that are tied to the container
in use.

Signed-off-by: Anthony PERARD 
---
 automation/gitlab-ci/build.yaml | 327 
 1 file changed, 202 insertions(+), 125 deletions(-)

diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
index 38bb22d860..1f186bf346 100644
--- a/automation/gitlab-ci/build.yaml
+++ b/automation/gitlab-ci/build.yaml
@@ -255,285 +255,346 @@
 
 # Jobs below this line
 
-archlinux-gcc:
-  extends: .gcc-x86-64-build
+.container-archlinux:
   variables:
 CONTAINER: archlinux:current
 
+archlinux-gcc:
+  extends:
+- .gcc-x86-64-build
+- .container-archlinux
+
 archlinux-gcc-debug:
-  extends: .gcc-x86-64-build-debug
-  variables:
-CONTAINER: archlinux:current
+  extends:
+- .gcc-x86-64-build-debug
+- .container-archlinux
 
-centos-7-gcc:
-  extends: .gcc-x86-64-build
+.container-centos-7:
   variables:
 CONTAINER: centos:7
 
+centos-7-gcc:
+  extends:
+- .gcc-x86-64-build
+- .container-centos-7
+
 centos-7-gcc-debug:
-  extends: .gcc-x86-64-build-debug
-  variables:
-CONTAINER: centos:7
+  extends:
+- .gcc-x86-64-build-debug
+- .container-centos-7
 
-debian-stretch-clang:
-  extends: .clang-x86-64-build
+.container-debian-stretch:
   variables:
 CONTAINER: debian:stretch
 
+debian-stretch-clang:
+  extends:
+- .clang-x86-64-build
+- .container-debian-stretch
+
 debian-stretch-clang-debug:
-  extends: .clang-x86-64-build-debug
-  variables:
-CONTAINER: debian:stretch
+  extends:
+- .clang-x86-64-build-debug
+- .container-debian-stretch
 
 debian-stretch-clang-8:
-  extends: .clang-8-x86-64-build
-  variables:
-CONTAINER: debian:stretch
+  extends:
+- .clang-8-x86-64-build
+- .container-debian-stretch
 
 debian-stretch-clang-8-debug:
-  extends: .clang-8-x86-64-build-debug
-  variables:
-CONTAINER: debian:stretch
+  extends:
+- .clang-8-x86-64-build-debug
+- .container-debian-stretch
 
 debian-stretch-gcc:
-  extends: .gcc-x86-64-build
-  variables:
-CONTAINER: debian:stretch
+  extends:
+- .gcc-x86-64-build
+- .container-debian-stretch
 
 debian-stretch-gcc-debug:
-  extends: .gcc-x86-64-build-debug
-  variables:
-CONTAINER: debian:stretch
+  extends:
+- .gcc-x86-64-build-debug
+- .container-debian-stretch
 
-debian-stretch-32-clang-debug:
-  extends: .clang-x86-32-build-debug
+.container-debian-stretch-32:
   variables:
 CONTAINER: debian:stretch-i386
 
+debian-stretch-32-clang-debug:
+  extends:
+- .clang-x86-32-build-debug
+- .container-debian-stretch-32
+
 debian-stretch-32-gcc-debug:
-  extends: .gcc-x86-32-build-debug
-  variables:
-CONTAINER: debian:stretch-i386
+  extends:
+- .gcc-x86-32-build-debug
+- .container-debian-stretch-32
 
 debian-buster-gcc-ibt:
-  extends: .gcc-x86-64-build
+  extends:
+- .gcc-x86-64-build
   variables:
 CONTAINER: debian:buster-gcc-ibt
 RANDCONFIG: y
 EXTRA_FIXED_RANDCONFIG: |
   CONFIG_XEN_IBT=y
 
-debian-unstable-clang:
-  extends: .clang-x86-64-build
+.container-debian-unstable:
   variables:
 CONTAINER: debian:unstable
 
+debian-unstable-clang:
+  extends:
+- .clang-x86-64-build
+- .container-debian-unstable
+
 debian-unstable-clang-debug:
-  extends: .clang-x86-64-build-debug
-  variables:
-CONTAINER: debian:unstable
+  extends:
+- .clang-x86-64-build-debug
+- .container-debian-unstable
 
 debian-unstable-gcc:
-  extends: .gcc-x86-64-build
-  variables:
-CONTAINER: debian:unstable
+  extends:
+- .gcc-x86-64-build
+- .container-debian-unstable
 
 debian-unstable-gcc-debug:
-  extends: .gcc-x86-64-build-debug
-  variables:
-CONTAINER: debian:unstable
+  extends:
+- .gcc-x86-64-build-debug
+- .container-debian-unstable
 
 debian-unstable-gcc-randconfig:
-  extends: .gcc-x86-64-build
+  extends:
+- .gcc-x86-64-build
+- .container-debian-unstable
   variables:
-CONTAINER: debian:unstable
 RANDCONFIG: y
 
 debian-unstable-gcc-debug-randconfig:
-  extends: .gcc-x86-64-build-debug
+  extends:
+- .gcc-x86-64-build-debug
+- .container-debian-unstable
   variables:
-CONTAINER: debian:unstable
 RANDCONFIG: y
 
-debian-unstable-32-clang-debug:
-  extends: .clang-x86-32-build-debug
+.container-debian-unstable-32:
   variables:
 CONTAINER: debian:unstable-i386
 
+debian-unstable-32-clang-debug:
+  extends:
+- .clang-x86-32-build-debug
+- .container-debian-unstable-32
+
 debian-unstable-32-gcc-debug:
-  extends: .gcc-x86-32-build-debug
-  variables:
-CONTAINER: debian:unstable-i386
+  extends:
+- .gcc-x86-32-build-debug
+- .container-debian-unstable-32
 
 fedora-gcc:
-  extends: .gcc-x86-64-build
+  extends:
+- .gcc-x86-64-build
   variables:
 CONTAINER: fedora:29
 
 fedora-gcc-debug:
-  extends: .gcc-x86-64-build-debug
+  extends:
+- 

[RFC XEN PATCH 0/7] automation, RFC prototype, Have GitLab CI built its own containers

2023-03-02 Thread Anthony PERARD
Patch series available in this git branch:
https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
br.gitlab-containers-auto-rebuild-v1

Hi,

I have done some research into building containers in the CI. This works
only for x86 containers, as I've set up only a single x86 gitlab-runner
able to run docker-in-docker.

The runner is set up to only accept jobs from a branch that is "protected"
in GitLab. Also, one needs credentials to push to the container registry;
those are called "Group deploy tokens", and I've set the variables
CI_DEPLOY_USER and CI_DEPLOY_PASSWORD in the project "xen-project/xen"
(variables only visible to a pipeline running on a protected branch).

These patches introduce quite a lot of redundancy in jobs: 2 new jobs per
container, which build/push the container, plus a duplicate of most of
build.yaml. This means that if we go with this, we have to duplicate many
jobs and keep them in sync.

To avoid having to duplicate the jobs by hand, I could look at
creating a script that uses "build.yaml" as input and produces the 3
stages needed to update a container, but that script would need to be
run by hand, as GitLab can't really use it, unless ..

I've looked at generated pipelines, and they look like this in GitLab:
https://gitlab.com/xen-project/people/anthonyper/xen/-/pipelines/777665738
But the result of the generated/child pipeline doesn't seem to be taken into
account in the original pipeline, which makes me think that we can't use them
to generate "build.yaml". But maybe they could be used for generating the
pipeline that will update a container.
Doc:

https://docs.gitlab.com/ee/ci/pipelines/downstream_pipelines.html#dynamic-child-pipelines

So, with all of this, is it reasonable to test containers before
pushing them to production? Or is it too much work? We could simply have jobs
taking care of rebuilding a container and pushing it to production without
testing.

An example with the variable DO_REBUILD_CONTAINER and PUSH_CONTAINER set (and
existing build/test jobs disabled):
https://gitlab.com/xen-project/people/anthonyper/xen/-/pipelines/791711467

Cheers,

Anthony PERARD (7):
  automation: Automatically rebuild debian:unstable container
  automation: Introduce test-containers stage
  automation: Add a template per container for build jobs.
  automation: Adding containers build jobs and test of thoses
  automation: Introduce DO_REBUILD_CONTAINER, to allow to rebuild a
container
  automation: Push container been tested
  automation: Add some more push containers jobs

 .gitlab-ci.yml|   6 +
 automation/build/Makefile |  14 +-
 automation/gitlab-ci/build.yaml   | 327 --
 automation/gitlab-ci/containers.yaml  |  98 +
 automation/gitlab-ci/push-containers.yaml |  79 
 automation/gitlab-ci/test-containers.yaml | 496 ++
 6 files changed, 894 insertions(+), 126 deletions(-)
 create mode 100644 automation/gitlab-ci/containers.yaml
 create mode 100644 automation/gitlab-ci/push-containers.yaml
 create mode 100644 automation/gitlab-ci/test-containers.yaml

-- 
Anthony PERARD
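
The generator script floated in the cover letter could look roughly like the following Python sketch. It is not part of the series: the function name is hypothetical, the input is a hand-written list of build job names rather than a parsed build.yaml, and the job naming (replacing "/" with "-") is a simplification that would not reproduce names such as "debian-stretch-32-container". The real test jobs also extend compiler and per-container templates, which are left out here.

```python
import json


def generate_container_jobs(container, build_jobs):
    """Return job definitions for rebuilding, testing, and pushing a container.

    `container` is a name like "ubuntu/xenial"; `build_jobs` lists the
    existing build job names that use that container.
    """
    slug = container.replace("/", "-")  # simplified naming, see lead-in
    build_job = f"{slug}-container"
    test_jobs = [f"test-{job}" for job in build_jobs]

    jobs = {
        build_job: {
            "variables": {"BUILD_CONTAINER": container},
            "extends": [".container-build-tmpl"],
        },
    }
    # Each test job waits for the freshly built "-test" container.
    for job in test_jobs:
        jobs[job] = {
            "variables": {"BUILD_CONTAINER": container},
            "needs": [build_job],
        }
    # The push job only runs once every test job has succeeded.
    jobs[f"push-{slug}-container"] = {
        "variables": {"BUILD_CONTAINER": container},
        "extends": [".push-container-build-tmpl"],
        "needs": test_jobs,
    }
    return jobs


if __name__ == "__main__":
    generated = generate_container_jobs(
        "ubuntu/xenial", ["ubuntu-xenial-gcc", "ubuntu-xenial-gcc-debug"]
    )
    print(json.dumps(generated, indent=2))
```

The output is printed as JSON only to keep the sketch free of third-party dependencies; a real script would need to emit YAML for GitLab to consume.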




[RFC XEN PATCH 5/7] automation: Introduce DO_REBUILD_CONTAINER, to allow to rebuild a container

2023-03-02 Thread Anthony PERARD
This allows starting a pipeline manually and setting a variable to test the
build of a single container, e.g.
DO_REBUILD_CONTAINER = ubuntu/xenial

Signed-off-by: Anthony PERARD 
---
 automation/gitlab-ci/containers.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/automation/gitlab-ci/containers.yaml 
b/automation/gitlab-ci/containers.yaml
index a6d61980b1..9074bfe6f1 100644
--- a/automation/gitlab-ci/containers.yaml
+++ b/automation/gitlab-ci/containers.yaml
@@ -7,6 +7,7 @@
 - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == "staging"
   changes:
 - automation/build/${BUILD_CONTAINER}.dockerfile
+- if: $DO_REBUILD_CONTAINER == $BUILD_CONTAINER
   services:
 - docker:dind
   before_script:
-- 
Anthony PERARD
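
As a rough model of when a container build job is added to a pipeline after this patch, the two rules can be written as a small Python predicate. This is a simplified sketch, not GitLab's actual rule engine: the function name is hypothetical, and the changed-files check stands in for "rules:changes", which the earlier patch in the series only evaluates on pushes to staging.

```python
def container_job_runs(variables, build_container, changed_files=()):
    """Return True if the container build job would be added to the pipeline.

    `variables` holds the pipeline variables, `build_container` is the
    job's BUILD_CONTAINER value, and `changed_files` lists paths touched
    by the push (only relevant for push pipelines).
    """
    dockerfile = f"automation/build/{build_container}.dockerfile"

    # Rule 1: a push to staging that changes this container's dockerfile.
    on_staging_push = (
        variables.get("CI_PIPELINE_SOURCE") == "push"
        and variables.get("CI_COMMIT_BRANCH") == "staging"
    )
    if on_staging_push and dockerfile in changed_files:
        return True

    # Rule 2 (added by this patch): an explicit request for one container.
    return variables.get("DO_REBUILD_CONTAINER") == build_container


if __name__ == "__main__":
    manual = {"DO_REBUILD_CONTAINER": "ubuntu/xenial"}
    print(container_job_runs(manual, "ubuntu/xenial"))
    print(container_job_runs(manual, "debian/unstable"))
```

With DO_REBUILD_CONTAINER set to "ubuntu/xenial", only the job whose BUILD_CONTAINER matches is selected; every other container job is left out of the pipeline.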




[RFC XEN PATCH 4/7] automation: Adding containers build jobs and test of thoses

2023-03-02 Thread Anthony PERARD
For the test-containers jobs, mostly copy from "build.yaml", rename
'.container-*-tmpl' templates to '.container-*-testtmpl', prefix build
jobs with "test-", add BUILD_CONTAINER and "needs" to container
template.

Signed-off-by: Anthony PERARD 
---

Notes:
WARNING: This is an incomplete list of the containers that can be
rebuilt and tested.

 automation/gitlab-ci/containers.yaml  |  74 ++-
 automation/gitlab-ci/test-containers.yaml | 257 ++
 2 files changed, 330 insertions(+), 1 deletion(-)

diff --git a/automation/gitlab-ci/containers.yaml 
b/automation/gitlab-ci/containers.yaml
index ace93eaccf..a6d61980b1 100644
--- a/automation/gitlab-ci/containers.yaml
+++ b/automation/gitlab-ci/containers.yaml
@@ -18,8 +18,80 @@
   after_script:
 - docker logout
 
-debian-unstable-container:
+archlinux-current-container:
+  variables:
+BUILD_CONTAINER: archlinux/current
+  extends:
+- .container-build-tmpl
+
+centos-7-2-container:
+  variables:
+BUILD_CONTAINER: centos/7.2
+  extends:
+- .container-build-tmpl
+
+centos-7-container:
+  variables:
+BUILD_CONTAINER: centos/7
+  extends:
+- .container-build-tmpl
+
+debian-jessie-container:
+  variables:
+BUILD_CONTAINER: debian/jessie
+  extends:
+- .container-build-tmpl
+
+debian-jessie-32-container:
+  variables:
+BUILD_CONTAINER: debian/jessie-i386
+  extends:
+- .container-build-tmpl
+
+debian-stretch-container:
+  variables:
+BUILD_CONTAINER: debian/stretch
+  extends:
+- .container-build-tmpl
+
+debian-stretch-32-container:
+  variables:
+BUILD_CONTAINER: debian/stretch-i386
   extends:
 - .container-build-tmpl
+
+debian-unstable-container:
   variables:
 BUILD_CONTAINER: debian/unstable
+  extends:
+- .container-build-tmpl
+
+debian-unstable-32-container:
+  variables:
+BUILD_CONTAINER: debian/unstable-i386
+  extends:
+- .container-build-tmpl
+
+ubuntu-trusty-container:
+  variables:
+BUILD_CONTAINER: ubuntu/trusty
+  extends:
+- .container-build-tmpl
+
+ubuntu-xenial-container:
+  variables:
+BUILD_CONTAINER: ubuntu/xenial
+  extends:
+- .container-build-tmpl
+
+ubuntu-bionic-container:
+  variables:
+BUILD_CONTAINER: ubuntu/bionic
+  extends:
+- .container-build-tmpl
+
+ubuntu-focal-container:
+  variables:
+BUILD_CONTAINER: ubuntu/focal
+  extends:
+- .container-build-tmpl
diff --git a/automation/gitlab-ci/test-containers.yaml 
b/automation/gitlab-ci/test-containers.yaml
index 5dbf3902ff..4d5c6ba364 100644
--- a/automation/gitlab-ci/test-containers.yaml
+++ b/automation/gitlab-ci/test-containers.yaml
@@ -197,6 +197,148 @@
 
 # Jobs below this line
 
+.container-archlinux-testtmpl:
+  variables:
+CONTAINER: archlinux:current
+BUILD_CONTAINER: archlinux/current
+  needs:
+- archlinux-current-container
+
+test-archlinux-gcc:
+  extends:
+- .gcc-x86-64-testbuild
+- .container-archlinux-testtmpl
+
+test-archlinux-gcc-debug:
+  extends:
+- .gcc-x86-64-testbuild-debug
+- .container-archlinux-testtmpl
+
+.container-centos-7-testtmpl:
+  variables:
+CONTAINER: centos:7
+BUILD_CONTAINER: centos/7
+  needs:
+- centos-7-container
+
+test-centos-7-gcc:
+  extends:
+- .gcc-x86-64-testbuild
+- .container-centos-7-testtmpl
+
+test-centos-7-gcc-debug:
+  extends:
+- .gcc-x86-64-testbuild-debug
+- .container-centos-7-testtmpl
+
+.container-debian-jessie-testtmpl:
+  variables:
+CONTAINER: debian:jessie
+BUILD_CONTAINER: debian/jessie
+  needs:
+- debian-jessie-container
+
+test-debian-jessie-clang:
+  extends:
+- .clang-x86-64-testbuild
+- .container-debian-jessie-testtmpl
+
+test-debian-jessie-clang-debug:
+  extends:
+- .clang-x86-64-testbuild-debug
+- .container-debian-jessie-testtmpl
+
+test-debian-jessie-gcc:
+  extends:
+- .gcc-x86-64-testbuild
+- .container-debian-jessie-testtmpl
+
+test-debian-jessie-gcc-debug:
+  extends:
+- .gcc-x86-64-testbuild-debug
+- .container-debian-jessie-testtmpl
+
+.container-debian-32-jessie-testtmpl:
+  variables:
+CONTAINER: debian:jessie-i386
+BUILD_CONTAINER: debian/jessie-i386
+  needs:
+- debian-jessie-32-container
+
+test-debian-jessie-32-clang:
+  extends:
+- .clang-x86-32-testbuild
+- .container-debian-32-jessie-testtmpl
+
+test-debian-jessie-32-clang-debug:
+  extends:
+- .clang-x86-32-testbuild-debug
+- .container-debian-32-jessie-testtmpl
+
+test-debian-jessie-32-gcc:
+  extends:
+- .gcc-x86-32-testbuild
+- .container-debian-32-jessie-testtmpl
+
+test-debian-jessie-32-gcc-debug:
+  extends:
+- .gcc-x86-32-testbuild-debug
+- .container-debian-32-jessie-testtmpl
+
+.container-debian-stretch-testtmpl:
+  variables:
+CONTAINER: debian:stretch
+BUILD_CONTAINER: debian/stretch
+  needs:
+- debian-stretch-container
+
+test-debian-stretch-clang:
+  extends:
+- .clang-x86-64-testbuild
+- .container-debian-stretch-testtmpl
+

[RFC XEN PATCH 1/7] automation: Automatically rebuild debian:unstable container

2023-03-02 Thread Anthony PERARD
Only run this on the staging branch, whenever the dockerfile changes.

Allow setting a suffix when building containers, to be able to test a
container before changing the one in production.

Use "rules" instead of "only" as this allows using variables in the
"changes" rules. Also, "rules" is the preferred keyword as
"only/except" isn't being actively developed in GitLab.

Use $CI_PIPELINE_SOURCE==push to evaluate "rules:changes" only on
push. In most other cases, "rules:changes" evaluates to true, so
checking CI_PIPELINE_SOURCE is important.

Signed-off-by: Anthony PERARD 
---
 .gitlab-ci.yml   |  2 ++
 automation/build/Makefile|  4 ++--
 automation/gitlab-ci/containers.yaml | 25 +
 3 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 automation/gitlab-ci/containers.yaml

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index c8bd7519d5..c5d499b321 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -1,7 +1,9 @@
 stages:
   - build
   - test
+  - containers
 
 include:
   - 'automation/gitlab-ci/build.yaml'
   - 'automation/gitlab-ci/test.yaml'
+  - 'automation/gitlab-ci/containers.yaml'
diff --git a/automation/build/Makefile b/automation/build/Makefile
index 4df43b0407..5515938878 100644
--- a/automation/build/Makefile
+++ b/automation/build/Makefile
@@ -16,9 +16,9 @@ help:
 include yocto/yocto.inc
 
 %: %.dockerfile ## Builds containers
-   $(DOCKER_CMD) build --pull -t $(REGISTRY)/$(@D):$(@F) -f $< $(

[RFC XEN PATCH 2/7] automation: Introduce test-containers stage

2023-03-02 Thread Anthony PERARD
Jobs in the "test-containers" stage will be used to check that the
newly built container is working fine, and that it could be used in
production.

The job names need to be changed compared to "build.yaml", adding a
"test-" prefix to all build jobs.

The templates also need to be renamed, as many of them are used with
"extends", which looks for jobs and templates across all the yaml files.
Mostly change "build" to "testbuild".

Introduce a job template per container, as we've got three
"variables", CONTAINER, BUILD_CONTAINER, and a job dependency.

Signed-off-by: Anthony PERARD 
---

Notes:
It is probably possible to share many of the templates with
"build.yaml", by changing some of the templates and the way they are
linked together.

 .gitlab-ci.yml|   2 +
 automation/gitlab-ci/test-containers.yaml | 239 ++
 2 files changed, 241 insertions(+)
 create mode 100644 automation/gitlab-ci/test-containers.yaml

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index c5d499b321..ed5383ab50 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -2,8 +2,10 @@ stages:
   - build
   - test
   - containers
+  - test-containers
 
 include:
   - 'automation/gitlab-ci/build.yaml'
   - 'automation/gitlab-ci/test.yaml'
   - 'automation/gitlab-ci/containers.yaml'
+  - 'automation/gitlab-ci/test-containers.yaml'
diff --git a/automation/gitlab-ci/test-containers.yaml 
b/automation/gitlab-ci/test-containers.yaml
new file mode 100644
index 00..5dbf3902ff
--- /dev/null
+++ b/automation/gitlab-ci/test-containers.yaml
@@ -0,0 +1,239 @@
+.testbuild-tmpl: &testbuild
+  stage: test-containers
+  image: registry.gitlab.com/xen-project/xen/${CONTAINER}-test
+  script:
+- ./automation/scripts/build 2>&1 | tee build.log
+  artifacts:
+paths:
+  - binaries/
+  - xen-config
+  - '*.log'
+  - '*/*.log'
+when: always
+  rules: !reference [.container-build-tmpl, rules]
+
+.gcc-tmpl:
+  variables: &gcc
+CC: gcc
+CXX: g++
+
+.clang-tmpl:
+  variables: &clang
+CC: clang
+CXX: clang++
+clang: y
+
+.clang-8-tmpl:
+  variables: &clang-8
+CC: clang-8
+CXX: clang++-8
+LD: ld.lld-8
+clang: y
+
+.x86-64-testbuild-tmpl:
+  <<: *testbuild
+  variables:
+XEN_TARGET_ARCH: x86_64
+  tags:
+- x86_64
+
+.x86-64-testbuild:
+  extends: .x86-64-testbuild-tmpl
+  variables:
+debug: n
+
+.x86-64-testbuild-debug:
+  extends: .x86-64-testbuild-tmpl
+  variables:
+debug: y
+
+.x86-32-testbuild-tmpl:
+  <<: *testbuild
+  variables:
+XEN_TARGET_ARCH: x86_32
+  tags:
+- x86_32
+
+.x86-32-testbuild:
+  extends: .x86-32-testbuild-tmpl
+  variables:
+debug: n
+
+.x86-32-testbuild-debug:
+  extends: .x86-32-testbuild-tmpl
+  variables:
+debug: y
+
+.gcc-x86-64-testbuild:
+  extends: .x86-64-testbuild
+  variables:
+<<: *gcc
+
+.gcc-x86-64-testbuild-debug:
+  extends: .x86-64-testbuild-debug
+  variables:
+<<: *gcc
+
+.gcc-x86-32-testbuild:
+  extends: .x86-32-testbuild
+  variables:
+<<: *gcc
+
+.gcc-x86-32-testbuild-debug:
+  extends: .x86-32-testbuild-debug
+  variables:
+<<: *gcc
+
+.clang-x86-64-testbuild:
+  extends: .x86-64-testbuild
+  variables:
+<<: *clang
+
+.clang-x86-64-testbuild-debug:
+  extends: .x86-64-testbuild-debug
+  variables:
+<<: *clang
+
+.clang-8-x86-64-testbuild:
+  extends: .x86-64-testbuild
+  variables:
+<<: *clang-8
+
+.clang-8-x86-64-testbuild-debug:
+  extends: .x86-64-testbuild-debug
+  variables:
+<<: *clang-8
+
+.clang-x86-32-testbuild:
+  extends: .x86-32-testbuild
+  variables:
+<<: *clang
+
+.clang-x86-32-testbuild-debug:
+  extends: .x86-32-testbuild-debug
+  variables:
+<<: *clang
+
+.arm32-cross-testbuild-tmpl:
+  <<: *testbuild
+  variables:
+XEN_TARGET_ARCH: arm32
+  tags:
+- x86_64
+
+.arm32-cross-testbuild:
+  extends: .arm32-cross-testbuild-tmpl
+  variables:
+debug: n
+
+.arm32-cross-testbuild-debug:
+  extends: .arm32-cross-testbuild-tmpl
+  variables:
+debug: y
+
+.gcc-arm32-cross-testbuild:
+  extends: .arm32-cross-testbuild
+  variables:
+<<: *gcc
+
+.gcc-arm32-cross-testbuild-debug:
+  extends: .arm32-cross-testbuild-debug
+  variables:
+<<: *gcc
+
+.arm64-testbuild-tmpl:
+  <<: *testbuild
+  variables:
+XEN_TARGET_ARCH: arm64
+  tags:
+- arm64
+
+.arm64-testbuild:
+  extends: .arm64-testbuild-tmpl
+  variables:
+debug: n
+
+.arm64-testbuild-debug:
+  extends: .arm64-testbuild-tmpl
+  variables:
+debug: y
+
+.gcc-arm64-testbuild:
+  extends: .arm64-testbuild
+  variables:
+<<: *gcc
+
+.gcc-arm64-testbuild-debug:
+  extends: .arm64-testbuild-debug
+  variables:
+<<: *gcc
+
+.riscv64-cross-testbuild-tmpl:
+  <<: *testbuild
+  variables:
+XEN_TARGET_ARCH: riscv64
+  tags:
+- x86_64
+
+.riscv64-cross-testbuild:
+  extends: .riscv64-cross-testbuild-tmpl
+  variables:
+debug: n
+
+.riscv64-cross-testbuild-debug:
+  extends: .riscv64-cross-testbuild-tmpl
+  variables:
+debug: y
+
+.gcc-riscv64-cross-testbuild:
+  extends: 

Re: [PATCH v1] xen/arm: align *(.proc.info) in the linker script

2023-03-02 Thread Oleksii
On Thu, 2023-03-02 at 14:50 +, Julien Grall wrote:
> Hi Oleksii,
> 
> On 02/03/2023 07:34, Oleksii wrote:
> > Hi Julien,
> > > > > On Wed, 2023-03-01 at 16:21 +, Julien Grall wrote:
> > > > > > Hi Oleksii,
> > > > > > 
> > > > > > On 01/03/2023 16:14, Oleksii Kurochko wrote:
> > > > > > > During testing of bug.h's macros generic implementation
> > > > > > > yocto-qemuarm job crashed with data abort:
> > > > > > 
> > > > > > The commit message is lacking some information. You are telling
> > > > > > us that there is an error when building with your series, but
> > > > > > this doesn't tell me why this is the correct fix.
> > > > > > 
> > > > > > This is also why I asked to have the xen binary because I want
> > > > > > to check whether this was a latent bug in Xen or your series
> > > > > > effectively introduce a bug.
> > > > > > 
> > > > > > Note that regardless what I just wrote this is a good idea to
> > > > > > align __proc_info_start. I will try to have a closer look later
> > > > > > and propose a commit message and/or any action for your other
> > > > > > series.
> > > > > Regarding binaries please take a look here:
> > > > > https://lore.kernel.org/xen-devel/aa2862eacccfb0574859bf4cda8f4992baa5d2e1.ca...@gmail.com/
> > > > > 
> > > > > > I am not sure if you get my answer as I had the message from
> > > > > > delivery server that it was blocked for some reason.
> > > > 
> > > > > I got the answer. The problem now is gitlab only keep the artifact
> > > > > for the latest build and it only provide a zImage (having the ELF
> > > > > would be easier).
> > > > 
> > > > I will try to reproduce the error on my end.
> > > 
> > > > I managed to reproduce it. It looks like that after your bug patch,
> > > > *(.rodata.*) will not be end on a 4-byte boundary. Before your patch,
> > > > all the messages will be in .rodata.str. Now they are in
> > > > .bug_frames.*, so there some difference in .rodata.*.
> > > > 
> > > > That said, it is not entirely clear why we never seen the issue
> > > > before because my guessing there are no guarantee that .rodata.*
> > > > will be suitably aligned.
> > > 
> > > Anyway, here a proposal for the commit message:
> > > 
> > > "
> > > > xen/arm: Ensure the start of *(.proc.info) is 4-byte aligned
> > > 
> > > > The entries in *(.proc.info) are expected to be 4-byte aligned and
> > > > the compiler will access them using 4-byte load instructions. On
> > > > Arm32, the alignment is strictly enforced by the processor and will
> > > > result to a data abort if it is not correct.
> > > > 
> > > > However, the linker script doesn't encode this requirement. So we
> > > > are at the mercy of the compiler/linker to have aligned the previous
> > > > sections suitably.
> > > > 
> > > > This was spotted when trying to use the upcoming generic bug
> > > > infrastructure with the compiler provided by Yocto.
> > > 
> > > Link:
> > > https://lore.kernel.org/xen-devel/6735859208c6dcb7320f89664ae298005f70827b.ca...@gmail.com/
> > > "
> > > 
> > > > If you are happy with the proposed commit message, then I can update
> > > > it while committing.
> > I am happy with the proposed commit message.
> 
> Thanks. With that:
> 
> Reviewed-by: Julien Grall 
> 
> I have addressed Jan's comment and committed the patch.
> 
Thanks a lot.

Now the generic bug feature is unblocked.
I'll wait for comments until tomorrow.
If there aren't any, I will send a new patch series.

~ Oleksii
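
The alignment requirement discussed in this thread comes down to rounding an address up to the next multiple of 4 before the first *(.proc.info) entry is placed. The arithmetic can be sketched in Python as follows; the function name and the example addresses are made up for illustration and are not taken from the Xen linker script.

```python
def align_up(address, alignment):
    """Round `address` up to the next multiple of `alignment`.

    `alignment` must be a power of two, as is the case for the 4-byte
    requirement on the *(.proc.info) entries.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    # A preceding section ending 2 bytes past a 4-byte boundary
    # (hypothetical address) would leave the next entry misaligned.
    end_of_rodata = 0x1002
    print(hex(align_up(end_of_rodata, 4)))
    # An already aligned address is left unchanged.
    print(hex(align_up(0x1004, 4)))
```

Without such rounding, whether the section starts on a 4-byte boundary depends entirely on the size of whatever the linker placed before it, which is the "mercy of the compiler/linker" situation described in the proposed commit message.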



Re: [PATCH v2 3/3] xen/riscv: disable fpu

2023-03-02 Thread Oleksii
On Thu, 2023-03-02 at 14:20 +, Andrew Cooper wrote:
> On 02/03/2023 1:23 pm, Oleksii Kurochko wrote:
> > Disable FPU to detect illegal usage of floating point in kernel
> > space.
> > 
> > Signed-off-by: Oleksii Kurochko 
> > ---
> > Changes since v1:
> >  * Rebase on top of two previous patches.
> > ---
> 
> Apologies - I meant to ask these on the previous series, but didn't get
> around to it.
> 
> Why do we disable interrupts at the very start of start(), but only
> disable the FPU at the start of C ?
I decided to do it at the start of start_xen() as it's the first C
function, and before that there is only assembly, where we can make sure
the FPU is not used.

But to be 100% sure we can move it to the start() function.
Could you please share your thoughts on this?
> 
> To start with, doesn't OpenSBI have a starting ABI spec?  What does that
> say on the matter of the enablement of these features on entry into the
> environment?
I tried to find a specific OpenSBI ABI spec before and, unfortunately, I
didn't find any. Only docs in their repo:
https://github.com/riscv-software-src/opensbi/blob/master/docs/firmware/fw.md
My expectation was that such information should be part of the RISC-V
SBI/ABI which OpenSBI implements, but it only mentions the SBI functions
that should be implemented.

I looked at the OpenSBI code and it looks like it disables interrupts
before jumping to the hypervisor:
https://github.com/riscv-software-src/opensbi/blob/master/lib/sbi/sbi_hart.c#L805
But it doesn't do anything with the FPU.

Therefore I can't be sure whether it's mandatory for OpenSBI to
disable/enable interrupts, the FPU and so on.

If you have or have seen the OpenSBI starting ABI, please let me know.

> 
> Either way, my gut feeling is that these disables (if necessary to begin
> with) should be together, rather than split like this.
> 
> 
> That aside, while I can see the value of checking this now, won't we
> have to delete this again in order to allow for context switching a
> vCPUs FPU register state?
Not really.

My expectation is that we will have a function similar to:
void cpu_vcpu_fp_init(...)
{
riscv_regs(vcpu)->sstatus &= ~SSTATUS_FS;
if (riscv_isa_extension_available(riscv_priv(vcpu)->isa, f) ||
riscv_isa_extension_available(riscv_priv(vcpu)->isa, d))
riscv_regs(vcpu)->sstatus |= SSTATUS_FS_INITIAL;
else

memset(&riscv_priv(vcpu)->fp, 0, sizeof(riscv_priv(vcpu)->fp));
}
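
For illustration, the FS handling discussed above can be sketched in plain C. This is only a model under stated assumptions: it operates on an ordinary integer instead of the real sstatus CSR, the helper names fs_disable() and fs_guest_init() are hypothetical, and the constants follow the FS field layout (bits 14:13) from the RISC-V privileged spec.

```c
#include <assert.h>
#include <stdint.h>

/* sstatus.FS occupies bits 14:13 (RISC-V privileged spec). */
#define SSTATUS_FS          UINT64_C(0x6000) /* mask of the FS field */
#define SSTATUS_FS_OFF      UINT64_C(0x0000) /* FPU off: any FP use traps */
#define SSTATUS_FS_INITIAL  UINT64_C(0x2000) /* FPU on, state is initial */

/* Hypothetical helper: what "disable FPU" means for hypervisor code. */
static inline uint64_t fs_disable(uint64_t sstatus)
{
    return sstatus & ~SSTATUS_FS;
}

/*
 * Hypothetical helper: compute the sstatus image saved for a guest vCPU.
 * Only the per-vCPU copy is touched, so the hypervisor itself can keep
 * running with FS == Off.
 */
static inline uint64_t fs_guest_init(uint64_t sstatus, int has_f_or_d)
{
    sstatus &= ~SSTATUS_FS;
    if (has_f_or_d)
        sstatus |= SSTATUS_FS_INITIAL;
    /* Without F/D the field must stay Off. */
    assert(has_f_or_d || (sstatus & SSTATUS_FS) == SSTATUS_FS_OFF);
    return sstatus;
}
```

The point of the sketch is that the guest's FS state lives in its saved register image, so disabling the FPU for Xen's own code does not have to be undone later.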


~ Oleksii



[xen-unstable-smoke test] 178990: tolerable trouble: pass/starved - PUSHED

2023-03-02 Thread osstest service owner
flight 178990 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/178990/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-check fail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-check fail   never pass
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  380a8c0c65bfb84dab54ab4641cca1387cc41edb
baseline version:
 xen  b84fdf521b306cce64388fe57ee6c7c00f9d3e76

Last test of basis   178802  2023-02-28 20:01:47 Z    1 days
Testing same since   178990  2023-03-02 15:00:26 Z    0 days    1 attempts


People who touched revisions under test:
  Oleksii Kurochko 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  starved 
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  starved 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64    pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   b84fdf521b..380a8c0c65  380a8c0c65bfb84dab54ab4641cca1387cc41edb -> smoke



[PATCH v2 6/6] gitlab-ci.d/crossbuilds: Drop the 32-bit arm system emulation jobs

2023-03-02 Thread Thomas Huth
Hardly anybody still uses 32-bit arm environments for running QEMU,
so let's stop wasting our scarce CI minutes with these jobs.

Signed-off-by: Thomas Huth 
---
 .gitlab-ci.d/crossbuilds.yml | 14 --
 1 file changed, 14 deletions(-)

diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml
index 3ce51adf77..419b0c2fe1 100644
--- a/.gitlab-ci.d/crossbuilds.yml
+++ b/.gitlab-ci.d/crossbuilds.yml
@@ -1,13 +1,6 @@
 include:
   - local: '/.gitlab-ci.d/crossbuild-template.yml'
 
-cross-armel-system:
-  extends: .cross_system_build_job
-  needs:
-job: armel-debian-cross-container
-  variables:
-IMAGE: debian-armel-cross
-
 cross-armel-user:
   extends: .cross_user_build_job
   needs:
@@ -15,13 +8,6 @@ cross-armel-user:
   variables:
 IMAGE: debian-armel-cross
 
-cross-armhf-system:
-  extends: .cross_system_build_job
-  needs:
-job: armhf-debian-cross-container
-  variables:
-IMAGE: debian-armhf-cross
-
 cross-armhf-user:
   extends: .cross_user_build_job
   needs:
-- 
2.31.1




[PATCH v2 5/6] docs/about/deprecated: Deprecate 32-bit arm hosts

2023-03-02 Thread Thomas Huth
For running QEMU in system emulation mode, the user needs a rather
strong host system, i.e. not only an embedded low-frequency controller.
All recent beefy arm host machines should support 64-bit now, it's
unlikely that anybody is still seriously using QEMU on a 32-bit arm
CPU, so we deprecate the 32-bit arm hosts here to finally save us
some time and precious CI minutes.

Signed-off-by: Thomas Huth 
---
 docs/about/deprecated.rst | 9 +
 1 file changed, 9 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 21ce70b5c9..c7113a7510 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -229,6 +229,15 @@ discontinue it. Since all recent x86 hardware from the past >10 years
 is capable of the 64-bit x86 extensions, a corresponding 64-bit OS
 should be used instead.
 
+System emulation on 32-bit arm hosts (since 8.0)
+
+
+Since QEMU needs a strong host machine for running full system emulation, and
+all recent powerful arm hosts support 64-bit, the QEMU project deprecates the
+support for running any system emulation on 32-bit arm hosts in general. Use
+64-bit arm hosts for system emulation instead. (Note: "user" mode emulation
+continues to be supported on 32-bit arm hosts, too)
+
 
 QEMU API (QAPI) events
 --
-- 
2.31.1




[PATCH v2 4/6] docs/about/deprecated: Deprecate the qemu-system-arm binary

2023-03-02 Thread Thomas Huth
qemu-system-aarch64 is a proper superset of qemu-system-arm,
and the latter was mainly still required for 32-bit KVM support.
But this 32-bit KVM arm support has been dropped in the Linux
kernel a couple of years ago already, so we don't really need
qemu-system-arm anymore, thus deprecate it now.

Signed-off-by: Thomas Huth 
---
 docs/about/deprecated.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index a30aa8dfdf..21ce70b5c9 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -45,6 +45,16 @@ run 32-bit guests by selecting a 32-bit CPU model, including KVM support
 on x86_64 hosts. Thus users are recommended to reconfigure their systems
 to use the ``qemu-system-x86_64`` binary instead.
 
+``qemu-system-arm`` binary (since 8.0)
+''''''''''''''''''''''''''''''''''''''
+
+``qemu-system-aarch64`` is a proper superset of ``qemu-system-arm``. The
+latter was mainly a requirement for running KVM on 32-bit arm hosts, but
+this 32-bit KVM support has been removed some years ago already (see:
+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=541ad0150ca4
+). Thus the QEMU project will drop the ``qemu-system-arm`` binary in a
+future release. Use ``qemu-system-aarch64`` instead.
+
 
 System emulator command line arguments
 --
-- 
2.31.1




[PATCH v2 3/6] gitlab-ci.d/crossbuilds: Drop the i386 jobs

2023-03-02 Thread Thomas Huth
Hardly anybody still uses 32-bit x86 environments for running QEMU,
so let's stop wasting our scarce CI minutes with these jobs.

Signed-off-by: Thomas Huth 
---
 .gitlab-ci.d/crossbuilds.yml | 16 
 1 file changed, 16 deletions(-)

diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml
index 101416080c..3ce51adf77 100644
--- a/.gitlab-ci.d/crossbuilds.yml
+++ b/.gitlab-ci.d/crossbuilds.yml
@@ -43,22 +43,6 @@ cross-arm64-user:
   variables:
 IMAGE: debian-arm64-cross
 
-cross-i386-system:
-  extends: .cross_system_build_job
-  needs:
-job: i386-fedora-cross-container
-  variables:
-IMAGE: fedora-i386-cross
-MAKE_CHECK_ARGS: check-qtest
-
-cross-i386-user:
-  extends: .cross_user_build_job
-  needs:
-job: i386-fedora-cross-container
-  variables:
-IMAGE: fedora-i386-cross
-MAKE_CHECK_ARGS: check
-
 cross-i386-tci:
   extends: .cross_accel_build_job
   timeout: 60m
-- 
2.31.1




[PATCH v2 2/6] docs/about/deprecated: Deprecate 32-bit x86 hosts

2023-03-02 Thread Thomas Huth
Hardly anybody still uses 32-bit x86 hosts today, so we should start
deprecating them to stop wasting our time and CI minutes here.
For example, there are also still some unresolved problems with these:
When emulating 64-bit binaries in user mode, TCG does not honor atomicity
for 64-bit accesses, which is "perhaps worse than not working at all"
(quoting Richard). Let's simply make it clear that people should use
64-bit x86 hosts nowadays and we do not intend to fix/maintain the old
32-bit stuff.

Signed-off-by: Thomas Huth 
---
 docs/about/deprecated.rst | 12 
 1 file changed, 12 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 11700adac9..a30aa8dfdf 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -208,6 +208,18 @@ CI coverage support may bitrot away before the deprecation process
 completes. The little endian variants of MIPS (both 32 and 64 bit) are
 still a supported host architecture.
 
+32-bit x86 hosts (since 8.0)
+
+
+Support for 32-bit x86 host deployments is increasingly uncommon in
+mainstream OS distributions given the widespread availability of 64-bit
+x86 hardware. The QEMU project no longer considers 32-bit x86 support
+to be an effective use of its limited resources, and thus intends to
+discontinue it. Since all recent x86 hardware from the past >10 years
+is capable of the 64-bit x86 extensions, a corresponding 64-bit OS
+should be used instead.
+
+
 QEMU API (QAPI) events
 --
 
-- 
2.31.1




[PATCH v2 1/6] docs/about/deprecated: Deprecate the qemu-system-i386 binary

2023-03-02 Thread Thomas Huth
Hardly anybody really requires the i386 binary anymore, since the
qemu-system-x86_64 binary is a proper superset. So let's deprecate
the 32-bit variant now, so that we can finally stop wasting our time
and CI minutes with this.

With regards to 32-bit KVM support in the x86 Linux kernel,
the developers confirmed that they do not need a recent
qemu-system-i386 binary here:

 https://lore.kernel.org/kvm/y%2ffkts5ajfy0h...@google.com/

Signed-off-by: Thomas Huth 
---
 docs/about/deprecated.rst | 12 
 1 file changed, 12 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 15084f7bea..11700adac9 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -34,6 +34,18 @@ deprecating the build option and no longer defend it in CI. The
 ``--enable-gcov`` build option remains for analysis test case
 coverage.
 
+``qemu-system-i386`` binary (since 8.0)
+'''''''''''''''''''''''''''''''''''''''
+
+The ``qemu-system-i386`` binary was mainly useful for running with KVM
+on 32-bit x86 hosts, but most Linux distributions already removed their
+support for 32-bit x86 kernels, so hardly anybody still needs this. The
+``qemu-system-x86_64`` binary is a proper superset and can be used to
+run 32-bit guests by selecting a 32-bit CPU model, including KVM support
+on x86_64 hosts. Thus users are recommended to reconfigure their systems
+to use the ``qemu-system-x86_64`` binary instead.
+
+
 System emulator command line arguments
 --
 
-- 
2.31.1




[PATCH v2 0/6] Deprecate support for 32-bit x86 and arm hosts

2023-03-02 Thread Thomas Huth
We're struggling quite badly with our CI minutes on the shared
gitlab runners, so we urgently need to think of ways to cut down
our supported build and target environments. qemu-system-i386 and
qemu-system-arm are not really required anymore, since nobody uses
KVM on the corresponding systems for production anymore, and the
-x86_64 and -aarch64 variants are a proper superset of those binaries.
So it's time to deprecate them and the corresponding 32-bit host
environments now.

This is a follow-up patch series from the previous discussion here:

 https://lore.kernel.org/qemu-devel/20230130114428.1297295-1-th...@redhat.com/

where people still mentioned that there is still interest in certain
support for 32-bit host hardware. But as far as I could see, there is
no real need for 32-bit x86 host support and for system emulation on
32-bit arm hosts anymore, so it should be fine if we drop these host
environments soon (these are also the two architectures that contribute
the most to the long test times in our CI, so we would benefit a lot by
dropping those).

v2:
- Split binary and host deprecation into separate patches
- Added patches to immediately drop the jobs from the CI

Thomas Huth (6):
  docs/about/deprecated: Deprecate the qemu-system-i386 binary
  docs/about/deprecated: Deprecate 32-bit x86 hosts
  gitlab-ci.d/crossbuilds: Drop the i386 jobs
  docs/about/deprecated: Deprecate the qemu-system-arm binary
  docs/about/deprecated: Deprecate 32-bit arm hosts
  gitlab-ci.d/crossbuilds: Drop the 32-bit arm system emulation jobs

 docs/about/deprecated.rst| 43 
 .gitlab-ci.d/crossbuilds.yml | 30 -
 2 files changed, 43 insertions(+), 30 deletions(-)

-- 
2.31.1




[linux-linus test] 178951: regressions - trouble: fail/pass/starved

2023-03-02 Thread osstest service owner
flight 178951 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/178951/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-freebsd12-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-qemuu-nested-intel  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-ws16-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-amd 14 guest-start fail REGR. vs. 178042
 test-amd64-amd64-xl-pvshim  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-vhd  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-intel  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-win7-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-win7-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-xsm  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-credit1  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-shadow  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-amd  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-qemuu-nested-amd  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-ws16-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-examine-uefi  8 reboot fail REGR. vs. 178042
 test-amd64-amd64-freebsd11-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-libvirt-raw  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-pygrub  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-pair 12 xen-boot/src_host fail REGR. vs. 178042
 test-amd64-amd64-pair 13 xen-boot/dst_host fail REGR. vs. 178042
 test-amd64-amd64-libvirt-qcow2  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-libvirt-xsm  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-libvirt-pair 12 xen-boot/src_host fail REGR. vs. 178042
 test-amd64-amd64-libvirt-pair 13 xen-boot/dst_host fail REGR. vs. 178042
 test-amd64-coresched-amd64-xl  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-amd64  8 xen-boot fail REGR. vs. 178042
 test-arm64-arm64-xl-credit1 14 guest-start fail REGR. vs. 178042
 test-arm64-arm64-xl-xsm 14 guest-start fail REGR. vs. 178042
 test-arm64-arm64-xl 14 guest-start fail REGR. vs. 178042
 test-arm64-arm64-libvirt-xsm 17 guest-stop fail REGR. vs. 178042
 test-amd64-amd64-xl-credit2  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 178042
 test-arm64-arm64-xl-thunderx 14 guest-start fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-ovmf-amd64  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-examine  8 reboot fail REGR. vs. 178042
 test-amd64-amd64-examine-bios  8 reboot fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-xl-multivcpu  8 xen-boot fail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-intel 14 guest-start fail REGR. vs. 178042
 test-arm64-arm64-xl-vhd 12 debian-di-install fail REGR. vs. 178042
 test-arm64-arm64-libvirt-raw 12 debian-di-install fail REGR. vs. 178042
 test-arm64-arm64-xl-credit2 18 guest-start/debian.repeat fail in 178910 REGR. vs. 178042

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-xl-credit2  14 guest-start fail pass in 178910

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds  8 xen-boot fail REGR. vs. 178042

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-credit2 15 migrate-support-check fail in 178910 never pass
 test-arm64-arm64-xl-credit2 16 saverestore-support-check fail in 178910 never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail   never pass
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  

Re: [PATCH v2 2/3] xen/riscv: initialize .bss section

2023-03-02 Thread Jan Beulich
On 02.03.2023 16:55, Oleksii wrote:
> On Thu, 2023-03-02 at 15:22 +0100, Jan Beulich wrote:
>> On 02.03.2023 14:23, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/riscv64/head.S
>>> +++ b/xen/arch/riscv/riscv64/head.S
>>> @@ -13,6 +13,15 @@ ENTRY(start)
>>>  lla a6, _dtb_base
>>>  REG_S   a1, (a6)
>>>  
>>> +    la  a3, __bss_start
>>> +    la  a4, __bss_end
>>> +    ble a4, a3, clear_bss_done
>>
>> While it may be that .bss is indeed empty right now, even short term
>> it won't be, and never will. I'd drop this conditional (and in
>> particular the label), inserting a transient item into .bss for the
>> time being. As soon as your patch introducing page tables has landed,
>> there will be multiple pages worth of .bss.
> If I understand you correctly you suggested declare some variable:
>int dummy_bss __attribute__((unused));
> 
> Then .bss won't be zero:
>$ riscv64-linux-gnu-objdump -x xen/xen-syms | grep -i dummy_bss
>80205000 g O .bss   0004 .hidden dummy_bss
> 
> And when page tables will be ready it will be needed to remove
> dummy_bss.
> 
> Another one option is to update linker script ( looks better then
> previous one ):
> --- a/xen/arch/riscv/xen.lds.S
> +++ b/xen/arch/riscv/xen.lds.S
> @@ -140,6 +140,7 @@ SECTIONS
>  . = ALIGN(SMP_CACHE_BYTES);
>  __per_cpu_data_end = .;
>  *(.bss .bss.*)
> +. = . + 1;
>  . = ALIGN(POINTER_ALIGN);
>  __bss_end = .;
>  } :text

Right, I did think of this as an alternative solution as well. Either
is fine with me.

> If one of the options is fine then to be honest I am not sure that I
> understand why it is better than have 3 instructions which will be
> unnecessary when first bss variable will be introduced. And actually
> the same will be with item in bss, it will become unnecessary when
> something from bss will be introduced.
> 
> I am OK with one of the mentioned above options but still would like
> to understand what are advantages.

You could also remove the branch and the label once .bss is no longer
empty. It'll just raise needless questions if that's left in long
term. Plus - I'm not a maintainer, I'm only voicing suggestions ...

Jan



Re: [PATCH v2 2/3] xen/riscv: initialize .bss section

2023-03-02 Thread Oleksii
On Thu, 2023-03-02 at 15:22 +0100, Jan Beulich wrote:
> On 02.03.2023 14:23, Oleksii Kurochko wrote:
> > --- a/xen/arch/riscv/riscv64/head.S
> > +++ b/xen/arch/riscv/riscv64/head.S
> > @@ -13,6 +13,15 @@ ENTRY(start)
> >  lla a6, _dtb_base
> >  REG_S   a1, (a6)
> >  
> > +    la  a3, __bss_start
> > +    la  a4, __bss_end
> > +    ble a4, a3, clear_bss_done
> 
> While it may be that .bss is indeed empty right now, even short term
> it won't be, and never will. I'd drop this conditional (and in
> particular the label), inserting a transient item into .bss for the
> time being. As soon as your patch introducing page tables has landed,
> there will be multiple pages worth of .bss.
If I understand you correctly, you suggest declaring some variable:
   int dummy_bss __attribute__((unused));

Then .bss won't be empty:
   $ riscv64-linux-gnu-objdump -x xen/xen-syms | grep -i dummy_bss
   80205000 g O .bss   0004 .hidden dummy_bss

And when the page tables are ready, dummy_bss will need to be removed.

Another option is to update the linker script (which looks better than
the previous one):
--- a/xen/arch/riscv/xen.lds.S
+++ b/xen/arch/riscv/xen.lds.S
@@ -140,6 +140,7 @@ SECTIONS
 . = ALIGN(SMP_CACHE_BYTES);
 __per_cpu_data_end = .;
 *(.bss .bss.*)
+. = . + 1;
 . = ALIGN(POINTER_ALIGN);
 __bss_end = .;
 } :text

If one of the options is fine, then to be honest I am not sure I
understand why it is better than having 3 instructions which will become
unnecessary once the first .bss variable is introduced. The same is
actually true of the dummy item in .bss: it will become unnecessary as
soon as something else lands in .bss.

I am OK with either of the options mentioned above, but I would still
like to understand what the advantages are.
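
To make the trade-off concrete, here is a minimal C model of the loop from the patch. It is only a sketch under stated assumptions: the function names are hypothetical, and it works on an ordinary array rather than on the real __bss_start/__bss_end symbols. The assembly stores first and compares afterwards, so without the initial ble guard an empty .bss would still get one word written past its end; a non-empty .bss makes the guard unnecessary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Models "clear_bss: REG_S; add; blt": store first, then compare.
 * Returns the number of words written. Hypothetical name.
 */
static size_t clear_words_unguarded(uintptr_t *start, uintptr_t *end)
{
    size_t written = 0;
    uintptr_t *p = start;

    assert(start != NULL && end != NULL);
    do {
        *p++ = 0;           /* always executes at least once */
        written++;
    } while (p < end);
    return written;
}

/* Same loop with the "ble a4, a3, clear_bss_done" guard in front. */
static size_t clear_words_guarded(uintptr_t *start, uintptr_t *end)
{
    if (end <= start)
        return 0;           /* empty region: nothing is touched */
    return clear_words_unguarded(start, end);
}
```

Calling clear_words_unguarded(buf, buf) on an empty range still zeroes buf[0], which is the case the guard (or a padded .bss) protects against.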

> 
> Also are this and ...
> 
> > +clear_bss:
> > +    REG_S   zero, (a3)
> > +    add a3, a3, RISCV_SZPTR
> > +    blt a3, a4, clear_bss
> 
> ... this branch actually the correct ones? I'd expect the unsigned
> flavors to be used when comparing addresses. It may not matter here
> and/or right now, but it'll set a bad precedent unless you expect
> to only ever work on addresses which have the sign bit clear.
I'll change blt to bltu.
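
A small C sketch of why the unsigned flavour matters for addresses. The helper names are hypothetical; the low address is borrowed from the objdump output above, and the high address in the usage note is an assumed example of a mapping with the top bit set.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models blt: both operands are interpreted as signed two's-complement.
 * The uint64_t -> int64_t conversion is implementation-defined before
 * C23, but wraps as expected on mainstream compilers.
 */
static bool lt_signed(uint64_t a, uint64_t b)
{
    return (int64_t)a < (int64_t)b;
}

/* Models bltu: both operands are interpreted as unsigned. */
static bool lt_unsigned(uint64_t a, uint64_t b)
{
    return a < b;
}
```

With a = 0x80200000 and b = 0xffffffc000000000, lt_unsigned(a, b) is true while lt_signed(a, b) is false, because b looks negative to the signed comparison. For two addresses with the sign bit clear both helpers agree, which is why the difference does not show up right now.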

~ Oleksii



Re: [PATCH v6 1/5] xen/arm32: head: Widen the use of the temporary mapping

2023-03-02 Thread Bertrand Marquis
Hi Julien,

> On 2 Mar 2023, at 15:59, Julien Grall  wrote:
> 
> From: Julien Grall 
> 
> At the moment, the temporary mapping is only used when the virtual
> runtime region of Xen is clashing with the physical region.
> 
> In follow-up patches, we will rework how secondary CPU bring-up works
> and it will be convenient to use the fixmap area for accessing
> the root page-table (it is per-cpu).
> 
> Rework the code to use temporary mapping when the Xen physical address
> is not overlapping with the temporary mapping.
> 
> This also has the advantage to simplify the logic to identity map
> Xen.
> 
> Signed-off-by: Julien Grall 
> Reviewed-by: Henry Wang 
> Tested-by: Henry Wang 
> Reviewed-by: Michal Orzel 
Reviewed-by: Bertrand Marquis 

Cheers
Bertrand

> 
> 
> 
> Even if this patch is rewriting part of the previous patch, I decided
> to keep them separated to help the review.
> 
> The "follow-up patches" are still in draft at the moment. I still haven't
> find a way to split them nicely and not require too much more work
> in the coloring side.
> 
> I have provided some medium-term goal in the cover letter.
> 
>Changes in v6:
>- Add Henry's reviewed-by and tested-by tag
>- Add Michal's reviewed-by
>- Add newline in remove_identity_mapping for clarity
> 
>Changes in v5:
>- Fix typo in a comment
>- No need to link boot_{second, third}_id again if we need to
>  create a temporary area.
> 
>Changes in v3:
>- Resolve conflicts after switching from "ldr rX, " to
>  "mov_w rX, " in a previous patch
> 
>Changes in v2:
>- Patch added
> ---
> xen/arch/arm/arm32/head.S | 86 ---
> 1 file changed, 16 insertions(+), 70 deletions(-)
> 
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index df51550baa8a..9befffd85079 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -459,7 +459,6 @@ ENDPROC(cpu_init)
> create_page_tables:
> /* Prepare the page-tables for mapping Xen */
> mov_w r0, XEN_VIRT_START
> -create_table_entry boot_pgtable, boot_second, r0, 1
> create_table_entry boot_second, boot_third, r0, 2
> 
> /* Setup boot_third: */
> @@ -479,70 +478,37 @@ create_page_tables:
> cmp   r1, #(XEN_PT_LPAE_ENTRIES<<3) /* 512*8-byte entries per page */
> blo   1b
> 
> -/*
> - * If Xen is loaded at exactly XEN_VIRT_START then we don't
> - * need an additional 1:1 mapping, the virtual mapping will
> - * suffice.
> - */
> -cmp   r9, #XEN_VIRT_START
> -moveq pc, lr
> -
> /*
>  * Setup the 1:1 mapping so we can turn the MMU on. Note that
>  * only the first page of Xen will be part of the 1:1 mapping.
> - *
> - * In all the cases, we will link boot_third_id. So create the
> - * mapping in advance.
>  */
> +create_table_entry boot_pgtable, boot_second_id, r9, 1
> +create_table_entry boot_second_id, boot_third_id, r9, 2
> create_mapping_entry boot_third_id, r9, r9
> 
> /*
> - * Find the first slot used. If the slot is not XEN_FIRST_SLOT,
> - * then the 1:1 mapping will use its own set of page-tables from
> - * the second level.
> + * Find the first slot used. If the slot is not the same
> + * as TEMPORARY_AREA_FIRST_SLOT, then we will want to switch
> + * to the temporary mapping before jumping to the runtime
> + * virtual mapping.
>  */
> get_table_slot r1, r9, 1 /* r1 := first slot */
> -cmp   r1, #XEN_FIRST_SLOT
> -beq   1f
> -create_table_entry boot_pgtable, boot_second_id, r9, 1
> -b link_from_second_id
> -
> -1:
> -/*
> - * Find the second slot used. If the slot is XEN_SECOND_SLOT, then the
> - * 1:1 mapping will use its own set of page-tables from the
> - * third level.
> - */
> -get_table_slot r1, r9, 2 /* r1 := second slot */
> -cmp   r1, #XEN_SECOND_SLOT
> -beq   virtphys_clash
> -create_table_entry boot_second, boot_third_id, r9, 2
> -b link_from_third_id
> +cmp   r1, #TEMPORARY_AREA_FIRST_SLOT
> +bne   use_temporary_mapping
> 
> -link_from_second_id:
> -create_table_entry boot_second_id, boot_third_id, r9, 2
> -link_from_third_id:
> -/* Good news, we are not clashing with Xen virtual mapping */
> +mov_w r0, XEN_VIRT_START
> +create_table_entry boot_pgtable, boot_second, r0, 1
> mov   r12, #0 /* r12 := temporary mapping not created */
> mov   pc, lr
> 
> -virtphys_clash:
> +use_temporary_mapping:
> /*
> - * The identity map clashes with boot_third. Link boot_first_id and
> - * map Xen to a temporary mapping. See switch_to_runtime_mapping
> - 

[RFC PATCH v1 13/25] hw/xen: Add xenstore operations to allow redirection to internal emulation

2023-03-02 Thread David Woodhouse
From: Paul Durrant 

Signed-off-by: Paul Durrant 
Signed-off-by: David Woodhouse 
---
 accel/xen/xen-all.c |  11 +-
 hw/char/xen_console.c   |   2 +-
 hw/i386/kvm/xen_xenstore.c  |   3 -
 hw/i386/kvm/xenstore_impl.h |   8 +-
 hw/xen/xen-bus-helper.c |  62 +++
 hw/xen/xen-bus.c| 261 
 hw/xen/xen-legacy-backend.c | 119 +++--
 hw/xen/xen-operations.c | 198 +
 hw/xen/xen_devconfig.c  |   4 +-
 hw/xen/xen_pt_graphics.c|   1 -
 hw/xen/xen_pvdev.c  |  49 +-
 include/hw/xen/xen-bus-helper.h |  26 +--
 include/hw/xen/xen-bus.h|  17 +-
 include/hw/xen/xen-legacy-backend.h |   6 +-
 include/hw/xen/xen_backend_ops.h| 163 +
 include/hw/xen/xen_common.h |   1 -
 include/hw/xen/xen_pvdev.h  |   2 +-
 softmmu/globals.c   |   1 +
 18 files changed, 525 insertions(+), 409 deletions(-)

diff --git a/accel/xen/xen-all.c b/accel/xen/xen-all.c
index e85e4aeba5..425216230f 100644
--- a/accel/xen/xen-all.c
+++ b/accel/xen/xen-all.c
@@ -90,12 +90,15 @@ void xenstore_store_pv_console_info(int i, Chardev *chr)
 }
 
 
-static void xenstore_record_dm_state(struct xs_handle *xs, const char *state)
+static void xenstore_record_dm_state(const char *state)
 {
+struct xs_handle *xs;
 char path[50];
 
+/* We now have everything we need to set the xenstore entry. */
+xs = xs_open(0);
 if (xs == NULL) {
-error_report("xenstore connection not initialized");
+fprintf(stderr, "Could not contact XenStore\n");
 exit(1);
 }
 
@@ -109,6 +112,8 @@ static void xenstore_record_dm_state(struct xs_handle *xs, const char *state)
 error_report("error recording dm state");
 exit(1);
 }
+
+xs_close(xs);
 }
 
 
@@ -117,7 +122,7 @@ static void xen_change_state_handler(void *opaque, bool running,
 {
 if (running) {
 /* record state running */
-xenstore_record_dm_state(xenstore, "running");
+xenstore_record_dm_state("running");
 }
 }
 
diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index e9cef3e1ef..ad8638a86d 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -181,7 +181,7 @@ static int con_init(struct XenLegacyDevice *xendev)
 const char *output;
 
 /* setup */
-dom = xs_get_domain_path(xenstore, con->xendev.dom);
+dom = qemu_xen_xs_get_domain_path(xenstore, con->xendev.dom);
 if (!xendev->dev) {
 snprintf(con->console, sizeof(con->console), "%s/console", dom);
 } else {
diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 5a8e38aae7..bab40d1a04 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -38,9 +38,6 @@
 #define TYPE_XEN_XENSTORE "xen-xenstore"
 OBJECT_DECLARE_SIMPLE_TYPE(XenXenstoreState, XEN_XENSTORE)
 
-#define XEN_PAGE_SHIFT 12
-#define XEN_PAGE_SIZE (1ULL << XEN_PAGE_SHIFT)
-
 #define ENTRIES_PER_FRAME_V1 (XEN_PAGE_SIZE / sizeof(grant_entry_v1_t))
 #define ENTRIES_PER_FRAME_V2 (XEN_PAGE_SIZE / sizeof(grant_entry_v2_t))
 
diff --git a/hw/i386/kvm/xenstore_impl.h b/hw/i386/kvm/xenstore_impl.h
index bbe2391e2e..0df2a91aae 100644
--- a/hw/i386/kvm/xenstore_impl.h
+++ b/hw/i386/kvm/xenstore_impl.h
@@ -12,13 +12,7 @@
 #ifndef QEMU_XENSTORE_IMPL_H
 #define QEMU_XENSTORE_IMPL_H
 
-typedef uint32_t xs_transaction_t;
-
-#define XBT_NULL 0
-
-#define XS_PERM_NONE  0x00
-#define XS_PERM_READ  0x01
-#define XS_PERM_WRITE 0x02
+#include "hw/xen/xen_backend_ops.h"
 
 typedef struct XenstoreImplState XenstoreImplState;
 
diff --git a/hw/xen/xen-bus-helper.c b/hw/xen/xen-bus-helper.c
index 5a1e12b374..b2b2cc9c5d 100644
--- a/hw/xen/xen-bus-helper.c
+++ b/hw/xen/xen-bus-helper.c
@@ -10,6 +10,7 @@
 #include "hw/xen/xen-bus.h"
 #include "hw/xen/xen-bus-helper.h"
 #include "qapi/error.h"
+#include "trace.h"
 
 #include 
 
@@ -46,34 +47,28 @@ const char *xs_strstate(enum xenbus_state state)
 return "INVALID";
 }
 
-void xs_node_create(struct xs_handle *xsh, xs_transaction_t tid,
-const char *node, struct xs_permissions perms[],
-unsigned int nr_perms, Error **errp)
+void xs_node_create(struct qemu_xs_handle *h, xs_transaction_t tid,
+const char *node, unsigned int owner, unsigned int domid,
+unsigned int perms, Error **errp)
 {
 trace_xs_node_create(node);
 
-if (!xs_write(xsh, tid, node, "", 0)) {
+if (!qemu_xen_xs_create(h, tid, owner, domid, perms, node)) {
 error_setg_errno(errp, errno, "failed to create node '%s'", node);
-return;
-}
-
-if (!xs_set_permissions(xsh, tid, node, perms, nr_perms)) {
-error_setg_errno(errp, errno, "failed to set node '%s' permissions",
- node);
 }
 }
 
-void xs_node_destroy(struct xs_handle *xsh, 

[RFC PATCH v1 22/25] hw/xen: Add emulated implementation of XenStore operations

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Now that we have an internal implementation of XenStore, we can populate
the xenstore_backend_ops to allow PV backends to talk to it.

Watches can't be processed with immediate callbacks because that would
call back into XenBus code recursively. Defer them to a QEMUBH to be run
as appropriate from the main loop. We use a QEMUBH per XS handle, and it
walks all the watches (there shouldn't be many per handle) to fire any
which have pending events. We *could* have done it differently but this
allows us to use the same struct watch_event as we have for the guest
side, and keeps things relatively simple.
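
The deferral idea can be sketched in plain C without any QEMU types. This is a simplified model, not the code from this patch: it uses one fixed-size queue per handle instead of per-watch event lists, a flag stands in for scheduling the QEMUBH, and every name below is hypothetical.

```c
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*watch_fn)(void *opaque, const char *path);

struct pending { watch_fn fn; void *opaque; const char *path; };

struct handle {
    struct pending queue[MAX_PENDING];
    size_t count;
    int bh_scheduled;       /* stands in for scheduling a bottom half */
};

/* Called from the store's write path: record the event, call nothing. */
static int watch_fire_deferred(struct handle *h, watch_fn fn,
                               void *opaque, const char *path)
{
    if (h->count == MAX_PENDING)
        return -1;
    h->queue[h->count++] = (struct pending){ fn, opaque, path };
    h->bh_scheduled = 1;
    return 0;
}

/* Called later from the main loop: now callbacks may safely re-enter. */
static size_t watch_bh_run(struct handle *h)
{
    struct pending batch[MAX_PENDING];
    size_t n = h->count;

    for (size_t i = 0; i < n; i++)
        batch[i] = h->queue[i];
    h->count = 0;           /* callbacks may queue new events */
    h->bh_scheduled = 0;

    for (size_t i = 0; i < n; i++)
        batch[i].fn(batch[i].opaque, batch[i].path);
    return n;
}

/* Example callback: counts how often it was invoked. */
static void count_cb(void *opaque, const char *path)
{
    (void)path;
    ++*(int *)opaque;
}
```

Because the queue is drained into a local batch before any callback runs, a callback that writes to the store and triggers another watch only queues a new event for the next run instead of recursing.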

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_xenstore.c | 273 -
 1 file changed, 269 insertions(+), 4 deletions(-)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index bab40d1a04..028f80499e 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -49,7 +49,7 @@ struct XenXenstoreState {
 /*< public >*/
 
 XenstoreImplState *impl;
-GList *watch_events;
+GList *watch_events; /* for the guest */
 
 MemoryRegion xenstore_page;
 struct xenstore_domain_interface *xs;
@@ -73,6 +73,8 @@ struct XenXenstoreState *xen_xenstore_singleton;
 static void xen_xenstore_event(void *opaque);
 static void fire_watch_cb(void *opaque, const char *path, const char *token);
 
+static struct xenstore_backend_ops emu_xenstore_backend_ops;
+
 static void G_GNUC_PRINTF (4, 5) relpath_printf(XenXenstoreState *s,
 GList *perms,
 const char *relpath,
@@ -169,6 +171,8 @@ static void xen_xenstore_realize(DeviceState *dev, Error **errp)
 relpath_printf(s, perms, "feature", "%s", "");
 
 g_list_free_full(perms, g_free);
+
+xen_xenstore_ops = &emu_xenstore_backend_ops;
 }
 
 static bool xen_xenstore_is_needed(void *opaque)
@@ -1305,6 +1309,15 @@ struct watch_event {
 char *token;
 };
 
+static void free_watch_event(struct watch_event *ev)
+{
+if (ev) {
+g_free(ev->path);
+g_free(ev->token);
+g_free(ev);
+}
+}
+
 static void queue_watch(XenXenstoreState *s, const char *path,
 const char *token)
 {
@@ -1351,9 +1364,7 @@ static void process_watch_events(XenXenstoreState *s)
 deliver_watch(s, ev->path, ev->token);
 
 s->watch_events = g_list_remove(s->watch_events, ev);
-g_free(ev->path);
-g_free(ev->token);
-g_free(ev);
+free_watch_event(ev);
 }
 
 static void xen_xenstore_event(void *opaque)
@@ -1443,3 +1454,257 @@ int xen_xenstore_reset(void)
 
 return 0;
 }
+
+struct qemu_xs_handle {
+XenstoreImplState *impl;
+GList *watches;
+QEMUBH *watch_bh;
+};
+
+struct qemu_xs_watch {
+struct qemu_xs_handle *h;
+char *path;
+xs_watch_fn fn;
+void *opaque;
+GList *events;
+};
+
+static char *xs_be_get_domain_path(struct qemu_xs_handle *h, unsigned int domid)
+{
+return g_strdup_printf("/local/domain/%u", domid);
+}
+
+static char **xs_be_directory(struct qemu_xs_handle *h, xs_transaction_t t,
+  const char *path, unsigned int *num)
+{
+GList *items = NULL, *l;
+unsigned int i = 0;
+char **items_ret;
+int err;
+
+err = xs_impl_directory(h->impl, DOMID_QEMU, t, path, NULL, &items);
+if (err) {
+errno = err;
+return NULL;
+}
+
+items_ret = g_new0(char *, g_list_length(items) + 1);
+*num = 0;
+for (l = items; l; l = l->next) {
+items_ret[i++] = l->data;
+(*num)++;
+}
+g_list_free(items);
+return items_ret;
+}
+
+static void *xs_be_read(struct qemu_xs_handle *h, xs_transaction_t t,
+const char *path, unsigned int *len)
+{
+GByteArray *data = g_byte_array_new();
+bool free_segment = false;
+int err;
+
+err = xs_impl_read(h->impl, DOMID_QEMU, t, path, data);
+if (err) {
+free_segment = true;
+errno = err;
+} else {
+if (len) {
+*len = data->len;
+}
+/* The xen-bus-helper code expects to get NUL terminated string! */
+g_byte_array_append(data, (void *)"", 1);
+}
+
+return g_byte_array_free(data, free_segment);
+}
+
+static bool xs_be_write(struct qemu_xs_handle *h, xs_transaction_t t,
+const char *path, const void *data, unsigned int len)
+{
+GByteArray *gdata = g_byte_array_new();
+int err;
+
+g_byte_array_append(gdata, data, len);
+err = xs_impl_write(h->impl, DOMID_QEMU, t, path, gdata);
+g_byte_array_unref(gdata);
+if (err) {
+errno = err;
+return false;
+}
+return true;
+}
+
+static bool xs_be_create(struct qemu_xs_handle *h, xs_transaction_t t,
+ unsigned int owner, unsigned int domid,
+ unsigned int perms, const char *path)
+{
+g_autoptr(GByteArray) data = 

[RFC PATCH v1 05/25] hw/xen: Watches on XenStore transactions

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Firing watches on the nodes that still exist is relatively easy; just
walk the tree and look at the nodes with refcount of one.

Firing watches on *deleted* nodes is more fun. We add 'modified_in_tx'
and 'deleted_in_tx' flags to each node. Nodes with those flags cannot
be shared, as they will always be unique to the transaction in which
they were created.

When xs_node_walk would need to *create* a node as scaffolding and it
encounters a deleted_in_tx node, it can resurrect it simply by clearing
its deleted_in_tx flag. If that node originally had any *data*, they're
gone, and the modified_in_tx flag will have been set when it was first
deleted.

We then attempt to send appropriate watches when the transaction is
committed, properly delete the deleted_in_tx nodes, and remove the
modified_in_tx flag from the others.
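
The flag handling described above can be sketched as follows. This is a simplified illustration, not the code from this patch: sk_node and the sk_* helpers are hypothetical names, the tree structure is left out, and the rule "fire a watch at commit if the node is deleted or modified" is an assumption drawn from the description rather than a copy of the real commit path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, flattened stand-in for a node inside a transaction. */
struct sk_node {
    bool has_content;
    bool deleted_in_tx;
    bool modified_in_tx;
};

/* Deleting inside a transaction marks the node instead of freeing it. */
static void sk_tx_delete(struct sk_node *n)
{
    n->deleted_in_tx = true;
    if (n->has_content) {
        /* Its data is gone, which counts as a modification. */
        n->modified_in_tx = true;
        n->has_content = false;
    }
}

/* Re-creating the node as scaffolding just clears the deleted flag. */
static void sk_tx_resurrect(struct sk_node *n)
{
    assert(n->deleted_in_tx);
    n->deleted_in_tx = false;
}

/* Assumption: at commit, a watch fires for deleted or modified nodes. */
static bool sk_commit_fires_watch(const struct sk_node *n)
{
    return n->deleted_in_tx || n->modified_in_tx;
}
```

A node that held data, was deleted and was then resurrected still reports a change, because it lost its content. An empty node that went through the same sequence ends up indistinguishable from its original state and stays silent.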

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xenstore_impl.c | 151 ++-
 tests/unit/test-xs-node.c   | 231 +++-
 2 files changed, 380 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvm/xenstore_impl.c b/hw/i386/kvm/xenstore_impl.c
index e5074ab1ec..380f8003ec 100644
--- a/hw/i386/kvm/xenstore_impl.c
+++ b/hw/i386/kvm/xenstore_impl.c
@@ -32,6 +32,8 @@ typedef struct XsNode {
 GByteArray *content;
 GHashTable *children;
 uint64_t gencnt;
+bool deleted_in_tx;
+bool modified_in_tx;
 #ifdef XS_NODE_UNIT_TEST
 gchar *name; /* debug only */
 #endif
@@ -153,6 +155,13 @@ static XsNode *xs_node_copy(XsNode *old)
 XsNode *n = xs_node_new();
 
 n->gencnt = old->gencnt;
+
+#ifdef XS_NODE_UNIT_TEST
+if (old->name) {
+n->name = g_strdup(old->name);
+}
+#endif
+
 if (old->children) {
 n->children = g_hash_table_new_full(g_str_hash, g_str_equal, g_free,
 (GDestroyNotify)xs_node_unref);
@@ -221,6 +230,9 @@ struct walk_op {
 bool mutating;
 bool create_dirs;
 bool in_transaction;
+
+/* Tracking during recursion so we know which is first. */
+bool deleted_in_tx;
 };
 
 static void fire_watches(struct walk_op *op, bool parents)
@@ -277,6 +289,9 @@ static int xs_node_add_content(XsNode **n, struct walk_op *op)
 g_byte_array_unref((*n)->content);
 }
 (*n)->content = g_byte_array_ref(data);
+if (op->tx_id != XBT_NULL) {
+(*n)->modified_in_tx = true;
+}
 return 0;
 }
 
@@ -333,10 +348,62 @@ static int node_rm_recurse(gpointer key, gpointer value, gpointer user_data)
 return this_inplace;
 }
 
+static XsNode *xs_node_copy_deleted(XsNode *old, struct walk_op *op);
+static void copy_deleted_recurse(gpointer key, gpointer value,
+ gpointer user_data)
+{
+struct walk_op *op = user_data;
+GHashTable *siblings = op->op_opaque2;
+XsNode *n = xs_node_copy_deleted(value, op);
+
+/*
+ * Reinsert the deleted_in_tx copy of the node into the parent's
+ * 'children' hash table. Having stashed it from op->op_opaque2
+ * before the recursive call to xs_node_copy_deleted() scribbled
+ * over it.
+ */
+g_hash_table_insert(siblings, g_strdup(key), n);
+}
+
+static XsNode *xs_node_copy_deleted(XsNode *old, struct walk_op *op)
+{
+XsNode *n = xs_node_new();
+
+n->gencnt = old->gencnt;
+
+#ifdef XS_NODE_UNIT_TEST
+if (old->name) {
+n->name = g_strdup(old->name);
+}
+#endif
+
+if (old->children) {
+n->children = g_hash_table_new_full(g_str_hash, g_str_equal, g_free,
+(GDestroyNotify)xs_node_unref);
+op->op_opaque2 = n->children;
+g_hash_table_foreach(old->children, copy_deleted_recurse, op);
+}
+n->deleted_in_tx = true;
+/* If it gets resurrected we only fire a watch if it lost its content */
+if (old->content) {
+n->modified_in_tx = true;
+}
+op->new_nr_nodes--;
+return n;
+}
+
 static int xs_node_rm(XsNode **n, struct walk_op *op)
 {
 bool this_inplace = op->inplace;
 
+if (op->tx_id != XBT_NULL) {
+/* It's not trivial to do inplace handling for this one */
+XsNode *old = *n;
+*n = xs_node_copy_deleted(old, op);
+xs_node_unref(old);
+return 0;
+}
+
 /* Fire watches for, and count, nodes in the subtree which get deleted */
 if ((*n)->children) {
 g_hash_table_foreach_remove((*n)->children, node_rm_recurse, op);
@@ -408,6 +475,10 @@ static int xs_node_walk(XsNode **n, struct walk_op *op)
 }
 
 if (child) {
+if (child->deleted_in_tx) {
+assert(child->ref == 1);
+/* Cannot actually set child->deleted_in_tx = false until later */
+}
 xs_node_ref(child);
 /*
  * Now we own it too. But if we can modify inplace, that's going to
@@ -475,6 +546,15 @@ static int xs_node_walk(XsNode **n, struct walk_op *op)
 xs_node_unref(old);
 }
 
+/*
+ * If we 

[RFC PATCH v1 24/25] hw/xen: Implement soft reset for emulated gnttab

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

This is only part of it; we will also need to get the PV back end drivers
to tear down their own mappings (or do it for them, but they kind of need
to stop using the pointers too).

Some more work on the actual PV back ends and xen-bus code is going to be
needed to really make soft reset and migration fully functional, and this
part is the basis for that.

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_gnttab.c  | 26 --
 hw/i386/kvm/xen_gnttab.h  |  1 +
 target/i386/kvm/xen-emu.c |  5 +
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c
index 2bf91d36c0..21c30e3659 100644
--- a/hw/i386/kvm/xen_gnttab.c
+++ b/hw/i386/kvm/xen_gnttab.c
@@ -72,13 +72,11 @@ static void xen_gnttab_realize(DeviceState *dev, Error **errp)
 error_setg(errp, "Xen grant table support is for Xen emulation");
 return;
 }
-s->nr_frames = 0;
 s->max_frames = kvm_xen_get_gnttab_max_frames();
 memory_region_init_ram(&s->gnt_frames, OBJECT(dev), "xen:grant_table",
XEN_PAGE_SIZE * s->max_frames, &error_abort);
 memory_region_set_enabled(&s->gnt_frames, true);
 s->entries.v1 = memory_region_get_ram_ptr(&s->gnt_frames);
-memset(s->entries.v1, 0, XEN_PAGE_SIZE * s->max_frames);
 
 /* Create individual page-sizes aliases for overlays */
 s->gnt_aliases = (void *)g_new0(MemoryRegion, s->max_frames);
@@ -90,8 +88,11 @@ static void xen_gnttab_realize(DeviceState *dev, Error **errp)
 s->gnt_frame_gpas[i] = INVALID_GPA;
 }
 
+s->nr_frames = 0;
+memset(s->entries.v1, 0, XEN_PAGE_SIZE * s->max_frames);
 s->entries.v1[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
 s->entries.v1[GNTTAB_RESERVED_XENSTORE].frame = XEN_SPECIAL_PFN(XENSTORE);
+
 qemu_mutex_init(&s->gnt_lock);
 
 xen_gnttab_singleton = s;
@@ -523,3 +524,24 @@ static struct gnttab_backend_ops emu_gnttab_backend_ops = {
 .unmap = xen_be_gnttab_unmap,
 };
 
+int xen_gnttab_reset(void)
+{
+XenGnttabState *s = xen_gnttab_singleton;
+
+if (!s) {
+return -ENOTSUP;
+}
+
+QEMU_LOCK_GUARD(&s->gnt_lock);
+
+s->nr_frames = 0;
+
+memset(s->entries.v1, 0, XEN_PAGE_SIZE * s->max_frames);
+
+s->entries.v1[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
+s->entries.v1[GNTTAB_RESERVED_XENSTORE].frame = XEN_SPECIAL_PFN(XENSTORE);
+
+memset(s->map_track, 0, s->max_frames * ENTRIES_PER_FRAME_V1);
+
+return 0;
+}
diff --git a/hw/i386/kvm/xen_gnttab.h b/hw/i386/kvm/xen_gnttab.h
index 3bdbe96191..ee215239b0 100644
--- a/hw/i386/kvm/xen_gnttab.h
+++ b/hw/i386/kvm/xen_gnttab.h
@@ -13,6 +13,7 @@
 #define QEMU_XEN_GNTTAB_H
 
 void xen_gnttab_create(void);
+int xen_gnttab_reset(void);
 int xen_gnttab_map_page(uint64_t idx, uint64_t gfn);
 
 struct gnttab_set_version;
diff --git a/target/i386/kvm/xen-emu.c b/target/i386/kvm/xen-emu.c
index bad3131d08..0bb6c601c9 100644
--- a/target/i386/kvm/xen-emu.c
+++ b/target/i386/kvm/xen-emu.c
@@ -1406,6 +1406,11 @@ int kvm_xen_soft_reset(void)
 return err;
 }
 
+err = xen_gnttab_reset();
+if (err) {
+return err;
+}
+
 err = xen_xenstore_reset();
 if (err) {
 return err;
-- 
2.39.0




[RFC PATCH v1 21/25] hw/xen: Add emulated implementation of grant table operations

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

This is limited to mapping a single grant at a time, because under Xen the
pages are mapped *contiguously* into qemu's address space, and that's very
hard to do when those pages actually come from anonymous mappings in qemu
in the first place.

Eventually perhaps we can look at using shared mappings of actual objects
for system RAM, and then we can make new mappings of the same backing
store (be it deleted files, shmem, whatever). But for now let's stick to
a page at a time.
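
Mapping even a single grant means marking its entry as in use while the guest may be changing it concurrently. The patch below does that in gnt_ref() with a bounded compare-and-swap retry loop. A minimal sketch of that idea in portable C11 follows. It is not the QEMU code: the SK_GTF_* constants are illustrative stand-ins for the GTF_* flags, sk_gnt_pin is a hypothetical name, and the GTF_sub_page and domid checks made by the real function are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values standing in for the GTF_* definitions. */
#define SK_GTF_PERMIT_ACCESS 0x01u
#define SK_GTF_TYPE_MASK     0x03u
#define SK_GTF_READONLY      0x04u
#define SK_GTF_READING       0x08u
#define SK_GTF_WRITING       0x10u

/*
 * Try to mark a grant entry as in use. Returns true on success. The
 * guest may legitimately flip the read-only flag while we look, so
 * retry a few times, but give up rather than spin forever.
 */
static bool sk_gnt_pin(_Atomic uint16_t *flags, bool want_write)
{
    uint16_t mask = SK_GTF_TYPE_MASK;

    assert(flags != NULL);

    if (want_write) {
        /* A writable mapping must fail if the entry is read-only. */
        mask |= SK_GTF_READONLY;
    }

    for (int retries = 0; retries < 5; retries++) {
        uint16_t old = atomic_load(flags);
        uint16_t new_flags;

        if ((old & mask) != SK_GTF_PERMIT_ACCESS) {
            return false;
        }

        new_flags = (uint16_t)(old | SK_GTF_READING);
        if (want_write) {
            new_flags = (uint16_t)(new_flags | SK_GTF_WRITING);
        }

        /* Only succeeds if nobody changed the flags since the load. */
        if (atomic_compare_exchange_strong(flags, &old, new_flags)) {
            return true;
        }
    }

    return false;
}
```

Capping the retries is the design point: a well-behaved guest changes the flags rarely, so a handful of attempts is enough, while a malicious guest toggling them continuously cannot livelock the backend.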

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_gnttab.c | 299 ++-
 1 file changed, 296 insertions(+), 3 deletions(-)

diff --git a/hw/i386/kvm/xen_gnttab.c b/hw/i386/kvm/xen_gnttab.c
index 1e691ded32..2bf91d36c0 100644
--- a/hw/i386/kvm/xen_gnttab.c
+++ b/hw/i386/kvm/xen_gnttab.c
@@ -22,6 +22,7 @@
 
 #include "hw/sysbus.h"
 #include "hw/xen/xen.h"
+#include "hw/xen/xen_backend_ops.h"
 #include "xen_overlay.h"
 #include "xen_gnttab.h"
 
@@ -34,11 +35,10 @@
 #define TYPE_XEN_GNTTAB "xen-gnttab"
 OBJECT_DECLARE_SIMPLE_TYPE(XenGnttabState, XEN_GNTTAB)
 
-#define XEN_PAGE_SHIFT 12
-#define XEN_PAGE_SIZE (1ULL << XEN_PAGE_SHIFT)
-
 #define ENTRIES_PER_FRAME_V1 (XEN_PAGE_SIZE / sizeof(grant_entry_v1_t))
 
+static struct gnttab_backend_ops emu_gnttab_backend_ops;
+
 struct XenGnttabState {
 /*< private >*/
 SysBusDevice busdev;
@@ -57,6 +57,8 @@ struct XenGnttabState {
 MemoryRegion gnt_frames;
 MemoryRegion *gnt_aliases;
 uint64_t *gnt_frame_gpas;
+
+uint8_t *map_track;
 };
 
 struct XenGnttabState *xen_gnttab_singleton;
@@ -88,9 +90,15 @@ static void xen_gnttab_realize(DeviceState *dev, Error **errp)
 s->gnt_frame_gpas[i] = INVALID_GPA;
 }
 
+s->entries.v1[GNTTAB_RESERVED_XENSTORE].flags = GTF_permit_access;
+s->entries.v1[GNTTAB_RESERVED_XENSTORE].frame = XEN_SPECIAL_PFN(XENSTORE);
 qemu_mutex_init(&s->gnt_lock);
 
 xen_gnttab_singleton = s;
+
+s->map_track = g_new0(uint8_t, s->max_frames * ENTRIES_PER_FRAME_V1);
+
+xen_gnttab_ops = &emu_gnttab_backend_ops;
 }
 
 static int xen_gnttab_post_load(void *opaque, int version_id)
@@ -230,3 +238,288 @@ int xen_gnttab_query_size_op(struct gnttab_query_size *size)
 size->max_nr_frames = s->max_frames;
 return 0;
 }
+
+/* Track per-open refs, to allow close() to clean up. */
+struct active_ref {
+MemoryRegionSection mrs;
+void *virtaddr;
+uint32_t refcnt;
+int prot;
+};
+
+static void gnt_unref(XenGnttabState *s, grant_ref_t ref,
+  MemoryRegionSection *mrs, int prot)
+{
+if (mrs && mrs->mr) {
+if (prot & PROT_WRITE) {
+memory_region_set_dirty(mrs->mr, mrs->offset_within_region,
+XEN_PAGE_SIZE);
+}
+memory_region_unref(mrs->mr);
+mrs->mr = NULL;
+}
+assert(s->map_track[ref] != 0);
+
+if (--s->map_track[ref] == 0) {
+grant_entry_v1_t *gnt_p = &s->entries.v1[ref];
+qatomic_and(&gnt_p->flags, (uint16_t)~(GTF_reading | GTF_writing));
+}
+}
+
+static uint64_t gnt_ref(XenGnttabState *s, grant_ref_t ref, int prot)
+{
+uint16_t mask = GTF_type_mask | GTF_sub_page;
+grant_entry_v1_t gnt, *gnt_p;
+int retries = 0;
+
+if (ref >= s->max_frames * ENTRIES_PER_FRAME_V1 ||
+s->map_track[ref] == UINT8_MAX) {
+return INVALID_GPA;
+}
+
+if (prot & PROT_WRITE) {
+mask |= GTF_readonly;
+}
+
+gnt_p = &s->entries.v1[ref];
+
+/*
+ * The guest can legitimately be changing the GTF_readonly flag. Allow
+ * that, but don't let a malicious guest cause a livelock.
+ */
+for (retries = 0; retries < 5; retries++) {
+uint16_t new_flags;
+
+/* Read the entry before an atomic operation on its flags */
+gnt = *(volatile grant_entry_v1_t *)gnt_p;
+
+if ((gnt.flags & mask) != GTF_permit_access ||
+gnt.domid != DOMID_QEMU) {
+return INVALID_GPA;
+}
+
+new_flags = gnt.flags | GTF_reading;
+if (prot & PROT_WRITE) {
+new_flags |= GTF_writing;
+}
+
+if (qatomic_cmpxchg(&gnt_p->flags, gnt.flags, new_flags) == gnt.flags) {
+return (uint64_t)gnt.frame << XEN_PAGE_SHIFT;
+}
+}
+
+return INVALID_GPA;
+}
+
+struct xengntdev_handle {
+GHashTable *active_maps;
+};
+
+static int xen_be_gnttab_set_max_grants(struct xengntdev_handle *xgt,
+uint32_t nr_grants)
+{
+return 0;
+}
+
+static void *xen_be_gnttab_map_refs(struct xengntdev_handle *xgt,
+uint32_t count, uint32_t domid,
+uint32_t *refs, int prot)
+{
+XenGnttabState *s = xen_gnttab_singleton;
+struct active_ref *act;
+
+if (!s) {
+errno = ENOTSUP;
+return NULL;
+}
+
+if (domid != xen_domid) {
+errno = EINVAL;
+return NULL;
+}
+
+if (!count || count 

[RFC PATCH v1 01/25] hw/xen: Add xenstore wire implementation and implementation stubs

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

This implements the basic wire protocol for the XenStore commands, punting
all the actual implementation to xs_impl_* functions which all just return
errors for now.
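
To illustrate the shape of a reply on the wire, here is a minimal sketch of building an error response: a fixed header followed by the error name as a NUL-terminated string. It is not the code from this patch. sk_msg_hdr, sk_errors, SK_MSG_ERROR and sk_build_error are hypothetical stand-ins for struct xsd_sockmsg, xsd_errors[], XS_ERROR and xs_error(), and where the patch asserts on an unknown error number this sketch returns 0 instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the four-field header used on the wire. */
struct sk_msg_hdr {
    uint32_t type;
    uint32_t req_id;
    uint32_t tx_id;
    uint32_t len;       /* payload length, including the trailing NUL */
};

#define SK_MSG_ERROR 16u   /* illustrative value for the error reply type */

/* Small errno-to-name table, in the spirit of xsd_errors[]. */
static const struct {
    int errnum;
    const char *errstring;
} sk_errors[] = {
    { EINVAL, "EINVAL" },
    { ENOENT, "ENOENT" },
    { EAGAIN, "EAGAIN" },
};

/*
 * Write an error reply (header followed by the error name) into buf.
 * Returns the total number of bytes written, or 0 if errnum is not in
 * the table or the buffer is too small.
 */
static size_t sk_build_error(uint8_t *buf, size_t buflen, uint32_t req_id,
                             uint32_t tx_id, int errnum)
{
    const char *errstr = NULL;
    struct sk_msg_hdr hdr;

    assert(buf != NULL);

    for (size_t i = 0; i < sizeof(sk_errors) / sizeof(sk_errors[0]); i++) {
        if (sk_errors[i].errnum == errnum) {
            errstr = sk_errors[i].errstring;
            break;
        }
    }
    if (!errstr) {
        return 0;
    }

    hdr.type = SK_MSG_ERROR;
    hdr.req_id = req_id;
    hdr.tx_id = tx_id;
    hdr.len = (uint32_t)strlen(errstr) + 1;

    if (buflen < sizeof(hdr) + hdr.len) {
        return 0;
    }

    /* The payload starts immediately after the header. */
    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), errstr, hdr.len);
    return sizeof(hdr) + hdr.len;
}
```

The request ID and transaction ID are echoed back unchanged so that the guest can match the reply to its request.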

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/meson.build |   1 +
 hw/i386/kvm/trace-events|  15 +
 hw/i386/kvm/xen_xenstore.c  | 871 +++-
 hw/i386/kvm/xenstore_impl.c | 117 +
 hw/i386/kvm/xenstore_impl.h |  58 +++
 5 files changed, 1054 insertions(+), 8 deletions(-)
 create mode 100644 hw/i386/kvm/xenstore_impl.c
 create mode 100644 hw/i386/kvm/xenstore_impl.h

diff --git a/hw/i386/kvm/meson.build b/hw/i386/kvm/meson.build
index 82dd6ae7c6..6621ba5cd7 100644
--- a/hw/i386/kvm/meson.build
+++ b/hw/i386/kvm/meson.build
@@ -9,6 +9,7 @@ i386_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files(
   'xen_evtchn.c',
   'xen_gnttab.c',
   'xen_xenstore.c',
+  'xenstore_impl.c',
   ))
 
 i386_ss.add_all(when: 'CONFIG_KVM', if_true: i386_kvm_ss)
diff --git a/hw/i386/kvm/trace-events b/hw/i386/kvm/trace-events
index b83c3eb965..e4c82de6f3 100644
--- a/hw/i386/kvm/trace-events
+++ b/hw/i386/kvm/trace-events
@@ -3,3 +3,18 @@ kvm_xen_unmap_pirq(int pirq, int gsi) "pirq %d gsi %d"
 kvm_xen_get_free_pirq(int pirq, int type) "pirq %d type %d"
 kvm_xen_bind_pirq(int pirq, int port) "pirq %d port %d"
 kvm_xen_unmask_pirq(int pirq, char *dev, int vector) "pirq %d dev %s vector %d"
+xenstore_error(unsigned int id, unsigned int tx_id, const char *err) "req %u tx %u err %s"
+xenstore_read(unsigned int tx_id, const char *path) "tx %u path %s"
+xenstore_write(unsigned int tx_id, const char *path) "tx %u path %s"
+xenstore_mkdir(unsigned int tx_id, const char *path) "tx %u path %s"
+xenstore_directory(unsigned int tx_id, const char *path) "tx %u path %s"
+xenstore_directory_part(unsigned int tx_id, const char *path, unsigned int offset) "tx %u path %s offset %u"
+xenstore_transaction_start(unsigned int new_tx) "new_tx %u"
+xenstore_transaction_end(unsigned int tx_id, bool commit) "tx %u commit %d"
+xenstore_rm(unsigned int tx_id, const char *path) "tx %u path %s"
+xenstore_get_perms(unsigned int tx_id, const char *path) "tx %u path %s"
+xenstore_set_perms(unsigned int tx_id, const char *path) "tx %u path %s"
+xenstore_watch(const char *path, const char *token) "path %s token %s"
+xenstore_unwatch(const char *path, const char *token) "path %s token %s"
+xenstore_reset_watches(void) ""
+xenstore_watch_event(const char *path, const char *token) "path %s token %s"
diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 14193ef3f9..64d8f1a38f 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -28,6 +28,10 @@
 #include "sysemu/kvm.h"
 #include "sysemu/kvm_xen.h"
 
+#include "trace.h"
+
+#include "xenstore_impl.h"
+
 #include "hw/xen/interface/io/xs_wire.h"
 #include "hw/xen/interface/event_channel.h"
 
@@ -47,6 +51,9 @@ struct XenXenstoreState {
 SysBusDevice busdev;
 /*< public >*/
 
+XenstoreImplState *impl;
+GList *watch_events;
+
 MemoryRegion xenstore_page;
 struct xenstore_domain_interface *xs;
 uint8_t req_data[XENSTORE_HEADER_SIZE + XENSTORE_PAYLOAD_MAX];
@@ -64,6 +71,7 @@ struct XenXenstoreState {
 struct XenXenstoreState *xen_xenstore_singleton;
 
 static void xen_xenstore_event(void *opaque);
+static void fire_watch_cb(void *opaque, const char *path, const char *token);
 
 static void xen_xenstore_realize(DeviceState *dev, Error **errp)
 {
@@ -89,6 +97,8 @@ static void xen_xenstore_realize(DeviceState *dev, Error **errp)
 }
 aio_set_fd_handler(qemu_get_aio_context(), xen_be_evtchn_fd(s->eh), true,
xen_xenstore_event, NULL, NULL, NULL, s);
+
+s->impl = xs_impl_create();
 }
 
 static bool xen_xenstore_is_needed(void *opaque)
@@ -213,20 +223,761 @@ static void reset_rsp(XenXenstoreState *s)
 s->rsp_offset = 0;
 }
 
+static void xs_error(XenXenstoreState *s, unsigned int id,
+ xs_transaction_t tx_id, int errnum)
+{
+struct xsd_sockmsg *rsp = (struct xsd_sockmsg *)s->rsp_data;
+const char *errstr = NULL;
+
+for (unsigned int i = 0; i < ARRAY_SIZE(xsd_errors); i++) {
+struct xsd_errors *xsd_error = &xsd_errors[i];
+
+if (xsd_error->errnum == errnum) {
+errstr = xsd_error->errstring;
+break;
+}
+}
+assert(errstr);
+
+trace_xenstore_error(id, tx_id, errstr);
+
+rsp->type = XS_ERROR;
+rsp->req_id = id;
+rsp->tx_id = tx_id;
+rsp->len = (uint32_t)strlen(errstr) + 1;
+
+memcpy(&rsp[1], errstr, rsp->len);
+}
+
+static void xs_ok(XenXenstoreState *s, unsigned int type, unsigned int req_id,
+  xs_transaction_t tx_id)
+{
+struct xsd_sockmsg *rsp = (struct xsd_sockmsg *)s->rsp_data;
+const char *okstr = "OK";
+
+rsp->type = type;
+rsp->req_id = req_id;
+rsp->tx_id = tx_id;
+rsp->len = (uint32_t)strlen(okstr) + 1;
+
+memcpy(&rsp[1], okstr, rsp->len);
+}
+

[RFC PATCH v1 07/25] hw/xen: Implement core serialize/deserialize methods for xenstore_impl

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

In fact I think we want to only serialize the contents of the domain's
path in /local/domain/${domid} and leave the rest to be recreated? Will
defer to Paul for that.

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_xenstore.c  |  25 +-
 hw/i386/kvm/xenstore_impl.c | 574 +++-
 hw/i386/kvm/xenstore_impl.h |   5 +
 tests/unit/test-xs-node.c   | 236 ++-
 4 files changed, 824 insertions(+), 16 deletions(-)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 3b409e3817..1b1358ad4c 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -66,6 +66,9 @@ struct XenXenstoreState {
 evtchn_port_t guest_port;
 evtchn_port_t be_port;
 struct xenevtchn_handle *eh;
+
+uint8_t *impl_state;
+uint32_t impl_state_size;
 };
 
 struct XenXenstoreState *xen_xenstore_singleton;
@@ -109,16 +112,26 @@ static bool xen_xenstore_is_needed(void *opaque)
 static int xen_xenstore_pre_save(void *opaque)
 {
 XenXenstoreState *s = opaque;
+GByteArray *save;
 
 if (s->eh) {
 s->guest_port = xen_be_evtchn_get_guest_port(s->eh);
 }
+
+g_free(s->impl_state);
+save = xs_impl_serialize(s->impl);
+s->impl_state = save->data;
+s->impl_state_size = save->len;
+g_byte_array_free(save, false);
+
 return 0;
 }
 
 static int xen_xenstore_post_load(void *opaque, int ver)
 {
 XenXenstoreState *s = opaque;
+GByteArray *save;
+int ret;
 
 /*
  * As qemu/dom0, rebind to the guest's port. The Windows drivers may
@@ -135,7 +148,13 @@ static int xen_xenstore_post_load(void *opaque, int ver)
 }
 s->be_port = be_port;
 }
-return 0;
+
+save = g_byte_array_new_take(s->impl_state, s->impl_state_size);
+s->impl_state = NULL;
+s->impl_state_size = 0;
+
+ret = xs_impl_deserialize(s->impl, save, xen_domid, fire_watch_cb, s);
+return ret;
 }
 
 static const VMStateDescription xen_xenstore_vmstate = {
@@ -155,6 +174,10 @@ static const VMStateDescription xen_xenstore_vmstate = {
 VMSTATE_BOOL(rsp_pending, XenXenstoreState),
 VMSTATE_UINT32(guest_port, XenXenstoreState),
 VMSTATE_BOOL(fatal_error, XenXenstoreState),
+VMSTATE_UINT32(impl_state_size, XenXenstoreState),
+VMSTATE_VARRAY_UINT32_ALLOC(impl_state, XenXenstoreState,
+impl_state_size, 0,
+vmstate_info_uint8, uint8_t),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/hw/i386/kvm/xenstore_impl.c b/hw/i386/kvm/xenstore_impl.c
index 7988bde88f..82e7ae06f5 100644
--- a/hw/i386/kvm/xenstore_impl.c
+++ b/hw/i386/kvm/xenstore_impl.c
@@ -37,6 +37,7 @@ typedef struct XsNode {
 uint64_t gencnt;
 bool deleted_in_tx;
 bool modified_in_tx;
+unsigned int serialized_tx;
 #ifdef XS_NODE_UNIT_TEST
 gchar *name; /* debug only */
 #endif
@@ -68,6 +69,7 @@ struct XenstoreImplState {
 unsigned int nr_domu_transactions;
 unsigned int root_tx;
 unsigned int last_tx;
+bool serialized;
 };
 
 
@@ -1156,8 +1158,10 @@ int xs_impl_set_perms(XenstoreImplState *s, unsigned int dom_id,
 return xs_node_walk(n, );
 }
 
-int xs_impl_watch(XenstoreImplState *s, unsigned int dom_id, const char *path,
-  const char *token, xs_impl_watch_fn fn, void *opaque)
+static int do_xs_impl_watch(XenstoreImplState *s, unsigned int dom_id,
+const char *path, const char *token,
+xs_impl_watch_fn fn, void *opaque)
+
 {
 char abspath[XENSTORE_ABS_PATH_MAX + 1];
 XsWatch *w, *l;
@@ -1200,12 +1204,22 @@ int xs_impl_watch(XenstoreImplState *s, unsigned int dom_id, const char *path,
 s->nr_domu_watches++;
 }
 
-/* A new watch should fire immediately */
-fn(opaque, path, token);
-
 return 0;
 }
 
+int xs_impl_watch(XenstoreImplState *s, unsigned int dom_id, const char *path,
+  const char *token, xs_impl_watch_fn fn, void *opaque)
+{
+int ret = do_xs_impl_watch(s, dom_id, path, token, fn, opaque);
+
+if (!ret) {
+/* A new watch should fire immediately */
+fn(opaque, path, token);
+}
+
+return ret;
+}
+
 static XsWatch *free_watch(XenstoreImplState *s, XsWatch *w)
 {
 XsWatch *next = w->next;
@@ -1361,3 +1375,553 @@ XenstoreImplState *xs_impl_create(unsigned int dom_id)
 s->root_tx = s->last_tx = 1;
 return s;
 }
+
+
+static void clear_serialized_tx(gpointer key, gpointer value, gpointer opaque)
+{
+XsNode *n = value;
+
+n->serialized_tx = XBT_NULL;
+if (n->children) {
+g_hash_table_foreach(n->children, clear_serialized_tx, NULL);
+}
+}
+
+static void clear_tx_serialized_tx(gpointer key, gpointer value,
+   gpointer opaque)
+{
+XsTransaction *t = value;
+
+clear_serialized_tx(NULL, t->root, NULL);
+}
+
+static void write_be32(GByteArray *save, 

[RFC PATCH v1 09/25] hw/xen: Add evtchn operations to allow redirection to internal emulation

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

The existing implementation calling into the real libxenevtchn moves to
a new file hw/xen/xen-operations.c, and is called via a function table
which in a subsequent commit will also be able to invoke the emulated
event channel support.
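
The indirection can be sketched as a table of function pointers behind thin wrappers. This is an illustration only: sk_evtchn_ops, sk_evtchn_notify and the toy backend are hypothetical names, the table in the patch has more entries than the single one shown, and returning -ENOSYS when no backend is registered is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical function table; the real one carries more operations. */
struct sk_evtchn_ops {
    int (*notify)(void *handle, unsigned int port);
};

/* Selected once at start-up: the real library or internal emulation. */
static const struct sk_evtchn_ops *sk_evtchn_ops;

/* Callers use a thin wrapper and never name a particular backend. */
static inline int sk_evtchn_notify(void *handle, unsigned int port)
{
    if (!sk_evtchn_ops) {
        return -ENOSYS;     /* assumption: no backend registered */
    }
    assert(sk_evtchn_ops->notify != NULL);
    return sk_evtchn_ops->notify(handle, port);
}

/* A toy "emulated" backend that just records the last port notified. */
static int sk_emu_notify(void *handle, unsigned int port)
{
    *(unsigned int *)handle = port;
    return 0;
}

static const struct sk_evtchn_ops sk_emu_ops = {
    .notify = sk_emu_notify,
};
```

With this arrangement a backend driver calls sk_evtchn_notify() in the same way whether it runs on real Xen or on the emulation. Only the assignment to sk_evtchn_ops differs.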

Signed-off-by: David Woodhouse 
Signed-off-by: Paul Durrant 
---
 hw/9pfs/xen-9p-backend.c|  24 +++---
 hw/i386/xen/xen-hvm.c   |  27 ---
 hw/xen/meson.build  |   1 +
 hw/xen/xen-bus.c|  22 +++---
 hw/xen/xen-legacy-backend.c |   8 +-
 hw/xen/xen-operations.c |  71 +
 hw/xen/xen_pvdev.c  |  12 +--
 include/hw/xen/xen-bus.h|   1 +
 include/hw/xen/xen-legacy-backend.h |   1 +
 include/hw/xen/xen_backend_ops.h| 118 
 include/hw/xen/xen_common.h |  12 ---
 include/hw/xen/xen_pvdev.h  |   1 +
 softmmu/globals.c   |   1 +
 13 files changed, 242 insertions(+), 57 deletions(-)
 create mode 100644 hw/xen/xen-operations.c
 create mode 100644 include/hw/xen/xen_backend_ops.h

diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 65c4979c3c..864bdaf952 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -241,7 +241,7 @@ static void xen_9pfs_push_and_notify(V9fsPDU *pdu)
 xen_wmb();
 
 ring->inprogress = false;
-xenevtchn_notify(ring->evtchndev, ring->local_port);
+qemu_xen_evtchn_notify(ring->evtchndev, ring->local_port);
 
 qemu_bh_schedule(ring->bh);
 }
@@ -324,8 +324,8 @@ static void xen_9pfs_evtchn_event(void *opaque)
 Xen9pfsRing *ring = opaque;
 evtchn_port_t port;
 
-port = xenevtchn_pending(ring->evtchndev);
-xenevtchn_unmask(ring->evtchndev, port);
+port = qemu_xen_evtchn_pending(ring->evtchndev);
+qemu_xen_evtchn_unmask(ring->evtchndev, port);
 
 qemu_bh_schedule(ring->bh);
 }
@@ -337,10 +337,10 @@ static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev)
 
 for (i = 0; i < xen_9pdev->num_rings; i++) {
 if (xen_9pdev->rings[i].evtchndev != NULL) {
-qemu_set_fd_handler(xenevtchn_fd(xen_9pdev->rings[i].evtchndev),
-NULL, NULL, NULL);
-xenevtchn_unbind(xen_9pdev->rings[i].evtchndev,
- xen_9pdev->rings[i].local_port);
+qemu_set_fd_handler(qemu_xen_evtchn_fd(xen_9pdev->rings[i].evtchndev),
+NULL, NULL, NULL);
+qemu_xen_evtchn_unbind(xen_9pdev->rings[i].evtchndev,
+   xen_9pdev->rings[i].local_port);
 xen_9pdev->rings[i].evtchndev = NULL;
 }
 }
@@ -447,12 +447,12 @@ static int xen_9pfs_connect(struct XenLegacyDevice *xendev)
 xen_9pdev->rings[i].inprogress = false;
 
 
-xen_9pdev->rings[i].evtchndev = xenevtchn_open(NULL, 0);
+xen_9pdev->rings[i].evtchndev = qemu_xen_evtchn_open();
 if (xen_9pdev->rings[i].evtchndev == NULL) {
 goto out;
 }
-qemu_set_cloexec(xenevtchn_fd(xen_9pdev->rings[i].evtchndev));
-xen_9pdev->rings[i].local_port = xenevtchn_bind_interdomain
+qemu_set_cloexec(qemu_xen_evtchn_fd(xen_9pdev->rings[i].evtchndev));
+xen_9pdev->rings[i].local_port = qemu_xen_evtchn_bind_interdomain
 (xen_9pdev->rings[i].evtchndev,
  xendev->dom,
  xen_9pdev->rings[i].evtchn);
@@ -463,8 +463,8 @@ static int xen_9pfs_connect(struct XenLegacyDevice *xendev)
 goto out;
 }
 xen_pv_printf(xendev, 2, "bind evtchn port %d\n", xendev->local_port);
-qemu_set_fd_handler(xenevtchn_fd(xen_9pdev->rings[i].evtchndev),
-xen_9pfs_evtchn_event, NULL, &xen_9pdev->rings[i]);
+qemu_set_fd_handler(qemu_xen_evtchn_fd(xen_9pdev->rings[i].evtchndev),
+xen_9pfs_evtchn_event, NULL, &xen_9pdev->rings[i]);
 }
 
 xen_9pdev->security_model = xenstore_read_be_str(xendev, "security_model");
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index e5a1dd19f4..cb1d24f592 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -761,7 +761,7 @@ static ioreq_t *cpu_get_ioreq(XenIOState *state)
 int i;
 evtchn_port_t port;
 
-port = xenevtchn_pending(state->xce_handle);
+port = qemu_xen_evtchn_pending(state->xce_handle);
 if (port == state->bufioreq_local_port) {
 timer_mod(state->buffered_io_timer,
 BUFFER_IO_MAX_DELAY + qemu_clock_get_ms(QEMU_CLOCK_REALTIME));
@@ -780,7 +780,7 @@ static ioreq_t *cpu_get_ioreq(XenIOState *state)
 }
 
 /* unmask the wanted port again */
-xenevtchn_unmask(state->xce_handle, port);
+qemu_xen_evtchn_unmask(state->xce_handle, port);
 
 /* get the io packet from shared memory */
 

[RFC PATCH v1 04/25] hw/xen: Implement XenStore transactions

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Given that the whole thing supported copy on write from the beginning,
transactions end up being fairly simple. On starting a transaction, just
take a ref of the existing root; swap it back in on a successful commit.

The main tree has a transaction ID too, and we keep a record of the last
transaction ID given out. If the main tree is ever modified when it isn't
the latest, it gets a new transaction ID.

A commit can only succeed if the main tree hasn't moved on since it was
forked. Strictly speaking, the XenStore protocol allows a transaction to
succeed as long as nothing *it* read or wrote has changed in the interim,
but no implementations do that; *any* change is sufficient to abort a
transaction.
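
The commit rule can be sketched as a comparison of transaction IDs. This is a simplified model rather than the code from this patch: sk_store, sk_tx and the sk_* helpers are hypothetical names, the tree itself is omitted, ID wrap-around is not handled, and both the use of EAGAIN for a conflict and the step where a committed transaction's ID becomes the new root ID are assumptions of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

struct sk_store {
    unsigned int root_tx;   /* ID of the current main tree */
    unsigned int last_tx;   /* last ID handed out */
};

struct sk_tx {
    unsigned int base_tx;   /* main tree ID at the time of the fork */
    unsigned int tx_id;
};

static unsigned int sk_next_tx(struct sk_store *s)
{
    assert(s->last_tx < UINT_MAX);  /* wrap-around is not handled here */
    return ++s->last_tx;
}

/* Starting a transaction remembers which main tree it forked from. */
static struct sk_tx sk_tx_start(struct sk_store *s)
{
    struct sk_tx tx = { .base_tx = s->root_tx };

    tx.tx_id = sk_next_tx(s);
    return tx;
}

/* A write outside any transaction: bump root_tx only if it matters. */
static void sk_main_tree_modified(struct sk_store *s)
{
    if (s->root_tx != s->last_tx) {
        s->root_tx = sk_next_tx(s);
    }
}

/* Commit succeeds only if the main tree has not moved on since the fork. */
static int sk_tx_commit(struct sk_store *s, const struct sk_tx *tx)
{
    if (tx->base_tx != s->root_tx) {
        return EAGAIN;
    }
    s->root_tx = tx->tx_id;     /* assumption: commit adopts the tx ID */
    return 0;
}
```

Two transactions forked from the same tree therefore cannot both commit: the first one moves root_tx, and the second then fails the base_tx comparison. The same happens when a direct write to the main tree lands between start and commit.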

This does not yet fire watches on the changed nodes on a commit. That bit
is more fun and will come in a follow-on commit.

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xenstore_impl.c | 150 ++--
 tests/unit/test-xs-node.c   | 118 
 2 files changed, 262 insertions(+), 6 deletions(-)

diff --git a/hw/i386/kvm/xenstore_impl.c b/hw/i386/kvm/xenstore_impl.c
index 2e464af93a..e5074ab1ec 100644
--- a/hw/i386/kvm/xenstore_impl.c
+++ b/hw/i386/kvm/xenstore_impl.c
@@ -46,13 +46,56 @@ typedef struct XsWatch {
 int rel_prefix;
 } XsWatch;
 
+typedef struct XsTransaction {
+XsNode *root;
+unsigned int nr_nodes;
+unsigned int base_tx;
+unsigned int tx_id;
+unsigned int dom_id;
+} XsTransaction;
+
 struct XenstoreImplState {
 XsNode *root;
 unsigned int nr_nodes;
 GHashTable *watches;
 unsigned int nr_domu_watches;
+GHashTable *transactions;
+unsigned int nr_domu_transactions;
+unsigned int root_tx;
+unsigned int last_tx;
 };
 
+
+static void nobble_tx(gpointer key, gpointer value, gpointer user_data)
+{
+unsigned int *new_tx_id = user_data;
+XsTransaction *tx = value;
+
+if (tx->base_tx == *new_tx_id) {
+/* Transactions based on XBT_NULL will always fail */
+tx->base_tx = XBT_NULL;
+}
+}
+
+static inline unsigned int next_tx(struct XenstoreImplState *s)
+{
+unsigned int tx_id;
+
+/* Find the next TX id which isn't either XBT_NULL or in use. */
+do {
+tx_id = ++s->last_tx;
+} while (tx_id == XBT_NULL || tx_id == s->root_tx ||
+ g_hash_table_lookup(s->transactions, GINT_TO_POINTER(tx_id)));
+
+/*
+ * It is vanishingly unlikely, but ensure that no outstanding transaction
+ * is based on the (previous incarnation of the) newly-allocated TX id.
+ */
+g_hash_table_foreach(s->transactions, nobble_tx, &tx_id);
+
+return tx_id;
+}
+
 static inline XsNode *xs_node_new(void)
 {
 XsNode *n = g_new0(XsNode, 1);
@@ -159,6 +202,7 @@ struct walk_op {
 
 GList *watches;
 unsigned int dom_id;
+unsigned int tx_id;
 
 /* The number of nodes which will exist in the tree if this op succeeds. */
 unsigned int new_nr_nodes;
@@ -176,6 +220,7 @@ struct walk_op {
 bool inplace;
 bool mutating;
 bool create_dirs;
+bool in_transaction;
 };
 
 static void fire_watches(struct walk_op *op, bool parents)
@@ -183,7 +228,7 @@ static void fire_watches(struct walk_op *op, bool parents)
 GList *l = NULL;
 XsWatch *w;
 
-if (!op->mutating) {
+if (!op->mutating || op->in_transaction) {
 return;
 }
 
@@ -450,10 +495,23 @@ static int xs_node_walk(XsNode **n, struct walk_op *op)
 assert(!op->watches);
 /*
  * On completing the recursion back up the path walk and reaching the
- * top, assign the new node count if the operation was successful.
+ * top, assign the new node count if the operation was successful. If
+ * the main tree was changed, bump its tx ID so that outstanding
+ * transactions correctly fail. But don't bump it every time; only
+ * if it makes a difference.
  */
 if (!err && op->mutating) {
-op->s->nr_nodes = op->new_nr_nodes;
+if (!op->in_transaction) {
+if (op->s->root_tx != op->s->last_tx) {
+op->s->root_tx = next_tx(op->s);
+}
+op->s->nr_nodes = op->new_nr_nodes;
+} else {
+XsTransaction *tx = g_hash_table_lookup(op->s->transactions,
+ GINT_TO_POINTER(op->tx_id));
+assert(tx);
+tx->nr_nodes = op->new_nr_nodes;
+}
 }
 }
 return err;
@@ -535,14 +593,23 @@ static int init_walk_op(XenstoreImplState *s, struct walk_op *op,
 op->inplace = true;
 op->mutating = false;
 op->create_dirs = false;
+op->in_transaction = false;
 op->dom_id = dom_id;
+op->tx_id = tx_id;
 op->s = s;
 
 if (tx_id == XBT_NULL) {
 *rootp = &s->root;
 op->new_nr_nodes = s->nr_nodes;
 } else {
-return ENOENT;
+

[RFC PATCH v1 25/25] i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Now that all the work is done to enable the PV backends to work without
actual Xen, instantiate the bus from pc_basic_device_init() for emulated
mode.

This allows us finally to launch an emulated Xen guest with PV disk.

   qemu-system-x86_64 -serial mon:stdio -M q35 -cpu host -display none \
 -m 1G -smp 2 -accel kvm,xen-version=0x4000a,kernel-irqchip=split \
 -kernel bzImage -append "console=ttyS0 root=/dev/xvda1" \
 -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
 -device xen-disk,drive=disk,vdev=xvda

If we use -M pc instead of q35, we can even add an IDE disk and boot a
guest image normally through grub. But q35 gives us AHCI and that isn't
unplugged by the Xen magic, so the guest ends up seeing "both" disks.

Signed-off-by: David Woodhouse 
---
 hw/i386/pc.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index fd17ce7a94..3fe028c86c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -102,6 +102,11 @@
 #include "trace.h"
 #include CONFIG_DEVICES
 
+#ifdef CONFIG_XEN_EMU
+#include "hw/xen/xen-legacy-backend.h"
+#include "hw/xen/xen-bus.h"
+#endif
+
 /*
  * Helper for setting model-id for CPU models that changed model-id
  * depending on QEMU versions up to QEMU 2.4.
@@ -1318,6 +1323,8 @@ void pc_basic_device_init(struct PCMachineState *pcms,
 if (pcms->bus) {
 pci_create_simple(pcms->bus, -1, "xen-platform");
 }
+xen_bus_init();
+xen_be_init();
 }
 #endif
 
-- 
2.39.0




[RFC PATCH v1 23/25] hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_xenstore.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 028f80499e..f9b7387024 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -21,6 +21,7 @@
 
 #include "hw/sysbus.h"
 #include "hw/xen/xen.h"
+#include "hw/xen/xen_backend_ops.h"
 #include "xen_overlay.h"
 #include "xen_evtchn.h"
 #include "xen_xenstore.h"
@@ -34,6 +35,7 @@
 
 #include "hw/xen/interface/io/xs_wire.h"
 #include "hw/xen/interface/event_channel.h"
+#include "hw/xen/interface/grant_table.h"
 
 #define TYPE_XEN_XENSTORE "xen-xenstore"
 OBJECT_DECLARE_SIMPLE_TYPE(XenXenstoreState, XEN_XENSTORE)
@@ -66,6 +68,9 @@ struct XenXenstoreState {
 
 uint8_t *impl_state;
 uint32_t impl_state_size;
+
+struct xengntdev_handle *gt;
+void *granted_xs;
 };
 
 struct XenXenstoreState *xen_xenstore_singleton;
@@ -1452,6 +1457,17 @@ int xen_xenstore_reset(void)
 }
 s->be_port = err;
 
+/*
+ * We don't actually access the guest's page through the grant, because
+ * this isn't real Xen, and we can just use the page we gave it in the
+ * first place. Map the grant anyway, mostly for cosmetic purposes so
+ * it *looks* like it's in use in the guest-visible grant table.
+ */
+s->gt = qemu_xen_gnttab_open();
+uint32_t xs_gntref = GNTTAB_RESERVED_XENSTORE;
+s->granted_xs = qemu_xen_gnttab_map_refs(s->gt, 1, xen_domid, &xs_gntref,
+ PROT_READ | PROT_WRITE);
+
 return 0;
 }
 
-- 
2.39.0




[RFC PATCH v1 00/25] Enable PV backends with Xen/KVM emulation

2023-03-02 Thread David Woodhouse
Now that the basic platform support is hopefully on the cusp of being 
merged, here's phase 2 which wires up the XenBus and PV back ends.

It starts with a basic single-tenant internal implementation of a 
XenStore, with a copy-on-write tree, watches, transactions, quotas.

Then we introduce operations tables for the grant table, event channel,
foreignmem and xenstore operations so that in addition to using the Xen
libraries for those, QEMU can use its internal emulated versions.

A little bit of cleaning up of header files, and we can enable the build
of xen-bus in the CONFIG_XEN_EMU build, and run a Xen guest with an
actual PV disk...

   qemu-system-x86_64 -serial mon:stdio -M q35 -display none -m 1G -smp 2 \
  -accel kvm,xen-version=0x4000e,kernel-irqchip=split \
  -kernel bzImage -append "console=ttyS0 root=/dev/xvda1 selinux=0" \
  -drive file=/var/lib/libvirt/images/fedora28.qcow2,if=none,id=disk \
  -device xen-disk,drive=disk,vdev=xvda

The main thing that isn't working here is migration. I've implemented it 
for the internal xenstore and the unit tests exercise it, but the 
existing PV back ends don't support it, perhaps partly because 
guest-transparent live migration support isn't upstream in Xen yet. 
So the disk doesn't come back correctly after migration. I'm content 
with that for 8.0 though.

The other pre-existing constraint is that only the block back end has
yet been ported to the "new" XenBus infrastructure, and is actually
capable of creating its own backend nodes. Again, I can live with
that for 8.0. Maybe this will motivate us to finally get round to
converting the rest off XenLegacyBackend and killing it.

We also don't have a simple way to perform grant mapping of multiple
guest pages to contiguous addresses, as we can under real Xen. So we
don't advertise max-ring-page-order for xen-disk in the emulated mode.
Fixing that — if we actually want to — would probably require mapping
RAM from an actual backing store object, so that it can be mapped again
at a different location for the PV back end to see.

David Woodhouse (21):
  hw/xen: Add xenstore wire implementation and implementation stubs
  hw/xen: Add basic XenStore tree walk and write/read/directory support
  hw/xen: Implement XenStore watches
  hw/xen: Implement XenStore transactions
  hw/xen: Watches on XenStore transactions
  hw/xen: Implement core serialize/deserialize methods for xenstore_impl
  hw/xen: Add evtchn operations to allow redirection to internal emulation
  hw/xen: Add gnttab operations to allow redirection to internal emulation
  hw/xen: Pass grant ref to gnttab unmap operation
  hw/xen: Add foreignmem operations to allow redirection to internal emulation
  hw/xen: Move xenstore_store_pv_console_info to xen_console.c
  hw/xen: Use XEN_PAGE_SIZE in PV backend drivers
  hw/xen: Rename xen_common.h to xen_native.h
  hw/xen: Build PV backend drivers for CONFIG_XEN_BUS
  hw/xen: Only advertise ring-page-order for xen-block if gnttab supports it
  hw/xen: Hook up emulated implementation for event channel operations
  hw/xen: Add emulated implementation of grant table operations
  hw/xen: Add emulated implementation of XenStore operations
  hw/xen: Map guest XENSTORE_PFN grant in emulated Xenstore
  hw/xen: Implement soft reset for emulated gnttab
  i386/xen: Initialize Xen backends from pc_basic_device_init() for emulation

Paul Durrant (4):
  hw/xen: Implement XenStore permissions
  hw/xen: Create initial XenStore nodes
  hw/xen: Add xenstore operations to allow redirection to internal emulation
  hw/xen: Avoid crash when backend watch fires too early

 accel/xen/xen-all.c   |   69 +-
 hw/9pfs/meson.build   |2 +-
 hw/9pfs/xen-9p-backend.c  |   32 +-
 hw/block/dataplane/meson.build|2 +-
 hw/block/dataplane/xen-block.c|   12 +-
 hw/block/meson.build  |2 +-
 hw/block/xen-block.c  |   12 +-
 hw/char/meson.build   |2 +-
 hw/char/xen_console.c |   57 +-
 hw/display/meson.build|2 +-
 hw/display/xenfb.c|   32 +-
 hw/i386/kvm/meson.build   |1 +
 hw/i386/kvm/trace-events  |   15 +
 hw/i386/kvm/xen_evtchn.c  |   15 +
 hw/i386/kvm/xen_gnttab.c  |  325 -
 hw/i386/kvm/xen_gnttab.h  |1 +
 hw/i386/kvm/xen_xenstore.c| 1250 +++-
 hw/i386/kvm/xenstore_impl.c   | 1927 +
 hw/i386/kvm/xenstore_impl.h   |   63 +
 hw/i386/pc.c  |7 +
 hw/i386/pc_piix.c |4 +-
 hw/i386/xen/xen-hvm.c  

[RFC PATCH v1 10/25] hw/xen: Add gnttab operations to allow redirection to internal emulation

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Move the existing code using libxengnttab to xen-operations.c and allow
the operations to be redirected so that we can add emulation of grant
table mapping for backend drivers.

In emulation, mapping more than one grant ref to be virtually contiguous
would be fairly difficult. The best way to do it might be to make the
ram_block mappings actually backed by a file (shmem or a deleted file,
perhaps) so that we can have multiple *shared* mappings of it. But that
would be fairly intrusive.

Making the backend drivers cope with page *lists* instead of expecting
the mapping to be contiguous is also non-trivial, since some structures
would actually *cross* page boundaries (e.g. the 32-bit blkif responses
which are 12 bytes).

So for now, we'll support only single-page mappings in emulation. Add a
XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE flag to indicate that the native Xen
implementation *does* support multi-page maps, and a helper function to
query it.
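
As a rough illustration of the flag-plus-helper idea described above, here is a
minimal sketch. Only the name XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE comes from this
commit message; the flag's value, the cut-down struct gnttab_ops_sketch, the
stand-in map functions and the helper name gnttab_can_map_multi() are all
assumptions made for the example, not the definitions in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value; the commit message only names the flag. */
#define XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE (1u << 0)

/* Hypothetical, cut-down operations table (not the real struct). */
struct gnttab_ops_sketch {
    uint32_t features;
    void *(*map_refs)(uint32_t count, const uint32_t *refs);
};

static uint8_t fake_page[4096];

/* Stand-in for the native path: any non-zero number of refs is accepted. */
static void *native_map_refs(uint32_t count, const uint32_t *refs)
{
    (void)refs;
    return count ? fake_page : NULL;
}

/* Stand-in for the emulated path: single-page mappings only. */
static void *emu_map_refs(uint32_t count, const uint32_t *refs)
{
    (void)refs;
    return count == 1 ? fake_page : NULL;
}

static const struct gnttab_ops_sketch native_ops = {
    .features = XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE,
    .map_refs = native_map_refs,
};

static const struct gnttab_ops_sketch emu_ops = {
    .features = 0,
    .map_refs = emu_map_refs,
};

/* Helper a backend could call before advertising multi-page rings. */
static bool gnttab_can_map_multi(const struct gnttab_ops_sketch *ops)
{
    return ops && (ops->features & XEN_GNTTAB_OP_FEATURE_MAP_MULTIPLE);
}
```

A backend would query the helper once at connect time and simply not offer a
multi-page ring when it returns false, rather than trying the mapping and
handling the failure.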

Signed-off-by: David Woodhouse 
Signed-off-by: Paul Durrant 
---
 hw/xen/xen-bus.c| 112 ++--
 hw/xen/xen-legacy-backend.c | 125 ++
 hw/xen/xen-operations.c | 157 
 hw/xen/xen_pvdev.c  |   2 +-
 include/hw/xen/xen-bus.h|   3 +-
 include/hw/xen/xen-legacy-backend.h |  13 +--
 include/hw/xen/xen_backend_ops.h| 100 ++
 include/hw/xen/xen_common.h |  39 ---
 softmmu/globals.c   |   1 +
 9 files changed, 280 insertions(+), 272 deletions(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index d0b1ae93da..b247e86f28 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -947,7 +947,7 @@ static void xen_device_frontend_destroy(XenDevice *xendev)
 void xen_device_set_max_grant_refs(XenDevice *xendev, unsigned int nr_refs,
Error **errp)
 {
-if (xengnttab_set_max_grants(xendev->xgth, nr_refs)) {
+if (qemu_xen_gnttab_set_max_grants(xendev->xgth, nr_refs)) {
 error_setg_errno(errp, errno, "xengnttab_set_max_grants failed");
 }
 }
@@ -956,9 +956,8 @@ void *xen_device_map_grant_refs(XenDevice *xendev, uint32_t *refs,
 unsigned int nr_refs, int prot,
 Error **errp)
 {
-void *map = xengnttab_map_domain_grant_refs(xendev->xgth, nr_refs,
-xendev->frontend_id, refs,
-prot);
+void *map = qemu_xen_gnttab_map_refs(xendev->xgth, nr_refs,
+ xendev->frontend_id, refs, prot);
 
 if (!map) {
 error_setg_errno(errp, errno,
@@ -971,109 +970,17 @@ void *xen_device_map_grant_refs(XenDevice *xendev, uint32_t *refs,
 void xen_device_unmap_grant_refs(XenDevice *xendev, void *map,
  unsigned int nr_refs, Error **errp)
 {
-if (xengnttab_unmap(xendev->xgth, map, nr_refs)) {
+if (qemu_xen_gnttab_unmap(xendev->xgth, map, nr_refs)) {
 error_setg_errno(errp, errno, "xengnttab_unmap failed");
 }
 }
 
-static void compat_copy_grant_refs(XenDevice *xendev, bool to_domain,
-   XenDeviceGrantCopySegment segs[],
-   unsigned int nr_segs, Error **errp)
-{
-uint32_t *refs = g_new(uint32_t, nr_segs);
-int prot = to_domain ? PROT_WRITE : PROT_READ;
-void *map;
-unsigned int i;
-
-for (i = 0; i < nr_segs; i++) {
-XenDeviceGrantCopySegment *seg = &segs[i];
-
-refs[i] = to_domain ? seg->dest.foreign.ref :
-seg->source.foreign.ref;
-}
-
-map = xengnttab_map_domain_grant_refs(xendev->xgth, nr_segs,
-  xendev->frontend_id, refs,
-  prot);
-if (!map) {
-error_setg_errno(errp, errno,
- "xengnttab_map_domain_grant_refs failed");
-goto done;
-}
-
-for (i = 0; i < nr_segs; i++) {
-XenDeviceGrantCopySegment *seg = &segs[i];
-void *page = map + (i * XC_PAGE_SIZE);
-
-if (to_domain) {
-memcpy(page + seg->dest.foreign.offset, seg->source.virt,
-   seg->len);
-} else {
-memcpy(seg->dest.virt, page + seg->source.foreign.offset,
-   seg->len);
-}
-}
-
-if (xengnttab_unmap(xendev->xgth, map, nr_segs)) {
-error_setg_errno(errp, errno, "xengnttab_unmap failed");
-}
-
-done:
-g_free(refs);
-}
-
 void xen_device_copy_grant_refs(XenDevice *xendev, bool to_domain,
 XenDeviceGrantCopySegment segs[],
 unsigned int nr_segs, Error **errp)
 {
-xengnttab_grant_copy_segment_t *xengnttab_segs;
-unsigned int i;
-
-if (!xendev->feature_grant_copy) {
-compat_copy_grant_refs(xendev, 

[RFC PATCH v1 11/25] hw/xen: Pass grant ref to gnttab unmap operation

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

The previous commit introduced redirectable gnttab operations fairly
much like-for-like, with the exception of the extra arguments to the
->open() call which were always NULL/0 anyway.

This *changes* the arguments to the ->unmap() operation to include the
original ref# that was mapped. Under real Xen it isn't necessary; all we
need to do from QEMU is munmap(), then the kernel will release the grant,
and Xen does the tracking/refcounting for the guest.

When we have emulated grant tables though, we need to do all that for
ourselves. So let's have the back ends keep track of what they mapped
and pass it in to the ->unmap() method for us.
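
To make the reason for the extra argument concrete, here is a minimal sketch of
the bookkeeping an emulated grant table has to do for itself. Everything in it
(the table size, map_count[], sketch_map_ref() and sketch_unmap_ref()) is
hypothetical and invented for the example; in particular the real emulation maps
guest RAM rather than allocating a page.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NR_GRANTS 16    /* hypothetical table size */
#define SKETCH_PAGE_SIZE 4096

/* Per-ref usage counts: the tracking that real Xen would do for the guest. */
static unsigned int map_count[SKETCH_NR_GRANTS];

/* Map one ref: hand out a page and note that the ref is now in use. */
static void *sketch_map_ref(uint32_t ref)
{
    void *page;

    if (ref >= SKETCH_NR_GRANTS) {
        errno = EINVAL;
        return NULL;
    }
    /* Stand-in for locating the granted guest page. */
    page = calloc(1, SKETCH_PAGE_SIZE);
    if (page) {
        map_count[ref]++;
    }
    return page;
}

/*
 * Unmap needs the ref as well as the address: the address alone does not
 * say which grant entry should have its usage count dropped.
 */
static int sketch_unmap_ref(void *page, uint32_t ref)
{
    if (ref >= SKETCH_NR_GRANTS || !map_count[ref]) {
        errno = EINVAL;
        return -1;
    }
    map_count[ref]--;
    free(page);
    return 0;
}
```

With only the mapped address, sketch_unmap_ref() would need a reverse lookup
from address to ref; having the caller pass the ref back avoids that.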

Signed-off-by: David Woodhouse 
---
 hw/9pfs/xen-9p-backend.c|  7 ---
 hw/block/dataplane/xen-block.c  |  1 +
 hw/char/xen_console.c   |  2 +-
 hw/net/xen_nic.c| 13 -
 hw/usb/xen-usb.c| 21 -
 hw/xen/xen-bus.c|  4 ++--
 hw/xen/xen-legacy-backend.c |  4 ++--
 hw/xen/xen-operations.c |  9 -
 include/hw/xen/xen-bus.h|  2 +-
 include/hw/xen/xen-legacy-backend.h |  6 +++---
 include/hw/xen/xen_backend_ops.h|  7 ---
 11 files changed, 50 insertions(+), 26 deletions(-)

diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 864bdaf952..d8bb0e847c 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -359,12 +359,13 @@ static int xen_9pfs_free(struct XenLegacyDevice *xendev)
 if (xen_9pdev->rings[i].data != NULL) {
 xen_be_unmap_grant_refs(&xen_9pdev->xendev,
 xen_9pdev->rings[i].data,
+xen_9pdev->rings[i].intf->ref,
 (1 << xen_9pdev->rings[i].ring_order));
 }
 if (xen_9pdev->rings[i].intf != NULL) {
-xen_be_unmap_grant_refs(&xen_9pdev->xendev,
-xen_9pdev->rings[i].intf,
-1);
+xen_be_unmap_grant_ref(&xen_9pdev->xendev,
+   xen_9pdev->rings[i].intf,
+   xen_9pdev->rings[i].ref);
 }
 if (xen_9pdev->rings[i].bh != NULL) {
 qemu_bh_delete(xen_9pdev->rings[i].bh);
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 2785b9e849..e55b713002 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -705,6 +705,7 @@ void xen_block_dataplane_stop(XenBlockDataPlane *dataplane)
 Error *local_err = NULL;
 
 xen_device_unmap_grant_refs(xendev, dataplane->sring,
+dataplane->ring_ref,
 dataplane->nr_ring_ref, &local_err);
 dataplane->sring = NULL;
 
diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index 63153dfde4..19ad6c946a 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -271,7 +271,7 @@ static void con_disconnect(struct XenLegacyDevice *xendev)
 if (!xendev->dev) {
 xenforeignmemory_unmap(xen_fmem, con->sring, 1);
 } else {
-xen_be_unmap_grant_ref(xendev, con->sring);
+xen_be_unmap_grant_ref(xendev, con->sring, con->ring_ref);
 }
 con->sring = NULL;
 }
diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
index 7d92c2d022..166d03787d 100644
--- a/hw/net/xen_nic.c
+++ b/hw/net/xen_nic.c
@@ -181,7 +181,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
 qemu_send_packet(qemu_get_queue(netdev->nic),
  page + txreq.offset, txreq.size);
 }
-xen_be_unmap_grant_ref(&netdev->xendev, page);
+xen_be_unmap_grant_ref(&netdev->xendev, page, txreq.gref);
 net_tx_response(netdev, &txreq, NETIF_RSP_OKAY);
 }
 if (!netdev->tx_work) {
@@ -261,7 +261,7 @@ static ssize_t net_rx_packet(NetClientState *nc, const uint8_t *buf, size_t size
 return -1;
 }
 memcpy(page + NET_IP_ALIGN, buf, size);
-xen_be_unmap_grant_ref(&netdev->xendev, page);
+xen_be_unmap_grant_ref(&netdev->xendev, page, rxreq.gref);
 net_rx_response(netdev, &rxreq, NETIF_RSP_OKAY, NET_IP_ALIGN, size, 0);
 
 return size;
@@ -343,7 +343,8 @@ static int net_connect(struct XenLegacyDevice *xendev)
netdev->rx_ring_ref,
PROT_READ | PROT_WRITE);
 if (!netdev->rxs) {
-xen_be_unmap_grant_ref(&netdev->xendev, netdev->txs);
+xen_be_unmap_grant_ref(&netdev->xendev, netdev->txs,
+   netdev->tx_ring_ref);
 netdev->txs = NULL;
 return -1;
 }
@@ -368,11 +369,13 @@ static void net_disconnect(struct XenLegacyDevice *xendev)
 xen_pv_unbind_evtchn(&netdev->xendev);
 
 if (netdev->txs) {
-xen_be_unmap_grant_ref(&netdev->xendev, netdev->txs);
+

[RFC PATCH v1 03/25] hw/xen: Implement XenStore watches

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Starts out fairly simple: a hash table of watches based on the path.

Except there can be multiple watches on the same path, so the watch ends
up being a simple linked list, and the head of that list is in the hash
table. Which makes removal a bit of a PITA but it's not so bad; we just
special-case "I had to remove the head of the list and now I have to
replace it in / remove it from the hash table". And if we don't remove
the head, it's a simple linked-list operation.

We do need to fire watches on *deleted* nodes, so instead of just a simple
xs_node_unref() on the topmost victim, we need to recurse down and fire
watches on them all.
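
The head-of-list special case described above can be sketched without GLib as
follows. This is only an illustration: the fixed array table[] stands in for the
hash table, paths are stored by pointer and must outlive the table, and the
names struct slot, find_slot(), add_watch() and remove_watch() are invented for
the example rather than taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PATHS 8    /* hypothetical; stands in for the hash table */

struct watch {
    struct watch *next;
    const char *token;
};

/* One slot per watched path; 'head' is what the hash table would store. */
struct slot {
    const char *path;    /* NULL when the slot is free */
    struct watch *head;
};

static struct slot table[MAX_PATHS];

static struct slot *find_slot(const char *path)
{
    for (int i = 0; i < MAX_PATHS; i++) {
        if (table[i].path && !strcmp(table[i].path, path)) {
            return &table[i];
        }
    }
    return NULL;
}

static int add_watch(const char *path, const char *token)
{
    struct slot *s = find_slot(path);
    struct watch *w, **p;

    /* No entry for this path yet: claim the first free slot. */
    for (int i = 0; !s && i < MAX_PATHS; i++) {
        if (!table[i].path) {
            s = &table[i];
        }
    }
    if (!s) {
        return -1;    /* table full */
    }
    w = calloc(1, sizeof(*w));
    if (!w) {
        return -1;
    }
    w->token = token;
    s->path = path;
    /* Append at the tail so watches stay in registration order. */
    for (p = &s->head; *p; p = &(*p)->next) {
    }
    *p = w;
    return 0;
}

static int remove_watch(const char *path, const char *token)
{
    struct slot *s = find_slot(path);
    struct watch *w, *prev = NULL;

    if (!s) {
        return -1;
    }
    for (w = s->head; w; prev = w, w = w->next) {
        if (!strcmp(w->token, token)) {
            break;
        }
    }
    if (!w) {
        return -1;
    }
    if (prev) {
        /* Not the head: a plain linked-list unlink. */
        prev->next = w->next;
    } else if (w->next) {
        /* Removed the head: replace it in the table. */
        s->head = w->next;
    } else {
        /* Removed the only watch: drop the path from the table. */
        s->head = NULL;
        s->path = NULL;
    }
    free(w);
    return 0;
}
```

The three branches at the end of remove_watch() correspond to the three cases in
the text: a simple unlink, replacing the head in the table, and removing the
path from the table altogether.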

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xenstore_impl.c | 253 +---
 tests/unit/test-xs-node.c   |  85 
 2 files changed, 323 insertions(+), 15 deletions(-)

diff --git a/hw/i386/kvm/xenstore_impl.c b/hw/i386/kvm/xenstore_impl.c
index 9e10a31bea..2e464af93a 100644
--- a/hw/i386/kvm/xenstore_impl.c
+++ b/hw/i386/kvm/xenstore_impl.c
@@ -37,9 +37,20 @@ typedef struct XsNode {
 #endif
 } XsNode;
 
+typedef struct XsWatch {
+struct XsWatch *next;
+xs_impl_watch_fn *cb;
+void *cb_opaque;
+char *token;
+unsigned int dom_id;
+int rel_prefix;
+} XsWatch;
+
 struct XenstoreImplState {
 XsNode *root;
 unsigned int nr_nodes;
+GHashTable *watches;
+unsigned int nr_domu_watches;
 };
 
 static inline XsNode *xs_node_new(void)
@@ -146,6 +157,7 @@ struct walk_op {
 void *op_opaque;
 void *op_opaque2;
 
+GList *watches;
 unsigned int dom_id;
 
 /* The number of nodes which will exist in the tree if this op succeeds. */
@@ -166,6 +178,35 @@ struct walk_op {
 bool create_dirs;
 };
 
+static void fire_watches(struct walk_op *op, bool parents)
+{
+GList *l = NULL;
+XsWatch *w;
+
+if (!op->mutating) {
+return;
+}
+
+if (parents) {
+l = op->watches;
+}
+
+w = g_hash_table_lookup(op->s->watches, op->path);
+while (w || l) {
+if (!w) {
+/* Fire the parent nodes from 'op' if asked to */
+w = l->data;
+l = l->next;
+continue;
+}
+
+assert(strlen(op->path) > w->rel_prefix);
+w->cb(w->cb_opaque, op->path + w->rel_prefix, w->token);
+
+w = w->next;
+}
+}
+
 static int xs_node_add_content(XsNode **n, struct walk_op *op)
 {
 GByteArray *data = op->op_opaque;
@@ -213,6 +254,8 @@ static int xs_node_get_content(XsNode **n, struct walk_op *op)
 static int node_rm_recurse(gpointer key, gpointer value, gpointer user_data)
 {
 struct walk_op *op = user_data;
+int path_len = strlen(op->path);
+int key_len = strlen(key);
 XsNode *n = value;
 bool this_inplace = op->inplace;
 
@@ -220,11 +263,22 @@ static int node_rm_recurse(gpointer key, gpointer value, gpointer user_data)
 op->inplace = 0;
 }
 
+assert(key_len + path_len + 2 <= sizeof(op->path));
+op->path[path_len] = '/';
+memcpy(op->path + path_len + 1, key, key_len + 1);
+
 if (n->children) {
 g_hash_table_foreach_remove(n->children, node_rm_recurse, op);
 }
 op->new_nr_nodes--;
 
+/*
+ * Fire watches on *this* node but not the parents because they are
+ * going to be deleted too, so the watch will fire for them anyway.
+ */
+fire_watches(op, false);
+op->path[path_len] = '\0';
+
 /*
  * Actually deleting the child here is just an optimisation; if we
  * don't then the final unref on the topmost victim will just have
@@ -238,7 +292,7 @@ static int xs_node_rm(XsNode **n, struct walk_op *op)
 {
 bool this_inplace = op->inplace;
 
-/* Keep count of the nodes in the subtree which gets deleted. */
+/* Fire watches for, and count, nodes in the subtree which get deleted */
 if ((*n)->children) {
 g_hash_table_foreach_remove((*n)->children, node_rm_recurse, op);
 }
@@ -269,9 +323,11 @@ static int xs_node_walk(XsNode **n, struct walk_op *op)
 XsNode *old = *n, *child = NULL;
 bool stole_child = false;
 bool this_inplace;
+XsWatch *watch;
 int err;
 
 namelen = strlen(op->path);
+watch = g_hash_table_lookup(op->s->watches, op->path);
 
 /* Is there a child, or do we hit the double-NUL termination? */
 if (op->path[namelen + 1]) {
@@ -292,6 +348,9 @@ static int xs_node_walk(XsNode **n, struct walk_op *op)
 if (!child_name) {
 /* This is the actual node on which the operation shall be performed */
 err = op->op_fn(n, op);
+if (!err) {
+fire_watches(op, true);
+}
 goto out;
 }
 
@@ -333,11 +392,24 @@ static int xs_node_walk(XsNode **n, struct walk_op *op)
 goto out;
 }
 
+/*
+ * If there's a watch on this node, add it to the list to be fired
+ * (with the correct full pathname for the modified node) at the end.
+ */
+if (watch) {
+

[RFC PATCH v1 12/25] hw/xen: Add foreignmem operations to allow redirection to internal emulation

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Signed-off-by: David Woodhouse 
Signed-off-by: Paul Durrant 
---
 hw/char/xen_console.c|  8 +++---
 hw/display/xenfb.c   | 20 +++---
 hw/xen/xen-operations.c  | 45 
 include/hw/xen/xen_backend_ops.h | 26 ++
 include/hw/xen/xen_common.h  | 13 -
 softmmu/globals.c|  1 +
 tests/unit/test-xs-node.c|  1 +
 7 files changed, 88 insertions(+), 26 deletions(-)

diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index 19ad6c946a..e9cef3e1ef 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -237,9 +237,9 @@ static int con_initialise(struct XenLegacyDevice *xendev)
 
 if (!xendev->dev) {
 xen_pfn_t mfn = con->ring_ref;
-con->sring = xenforeignmemory_map(xen_fmem, con->xendev.dom,
-  PROT_READ | PROT_WRITE,
-  1, &mfn, NULL);
+con->sring = qemu_xen_foreignmem_map(con->xendev.dom, NULL,
+ PROT_READ | PROT_WRITE,
+ 1, &mfn, NULL);
 } else {
 con->sring = xen_be_map_grant_ref(xendev, con->ring_ref,
   PROT_READ | PROT_WRITE);
@@ -269,7 +269,7 @@ static void con_disconnect(struct XenLegacyDevice *xendev)
 
 if (con->sring) {
 if (!xendev->dev) {
-xenforeignmemory_unmap(xen_fmem, con->sring, 1);
+qemu_xen_foreignmem_unmap(con->sring, 1);
 } else {
 xen_be_unmap_grant_ref(xendev, con->sring, con->ring_ref);
 }
diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 260eb38a76..2c4016fcbd 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -98,8 +98,9 @@ static int common_bind(struct common *c)
 if (xenstore_read_fe_int(&c->xendev, "event-channel", &c->xendev.remote_port) == -1)
 return -1;
 
-c->page = xenforeignmemory_map(xen_fmem, c->xendev.dom,
-   PROT_READ | PROT_WRITE, 1, &mfn, NULL);
+c->page = qemu_xen_foreignmem_map(c->xendev.dom, NULL,
+  PROT_READ | PROT_WRITE, 1, &mfn,
+  NULL);
 if (c->page == NULL)
 return -1;
 
@@ -115,7 +116,7 @@ static void common_unbind(struct common *c)
 {
 xen_pv_unbind_evtchn(&c->xendev);
 if (c->page) {
-xenforeignmemory_unmap(xen_fmem, c->page, 1);
+qemu_xen_foreignmem_unmap(c->page, 1);
 c->page = NULL;
 }
 }
@@ -500,15 +501,16 @@ static int xenfb_map_fb(struct XenFB *xenfb)
 fbmfns = g_new0(xen_pfn_t, xenfb->fbpages);
 
 xenfb_copy_mfns(mode, n_fbdirs, pgmfns, pd);
-map = xenforeignmemory_map(xen_fmem, xenfb->c.xendev.dom,
-   PROT_READ, n_fbdirs, pgmfns, NULL);
+map = qemu_xen_foreignmem_map(xenfb->c.xendev.dom, NULL, PROT_READ,
+  n_fbdirs, pgmfns, NULL);
 if (map == NULL)
 goto out;
 xenfb_copy_mfns(mode, xenfb->fbpages, fbmfns, map);
-xenforeignmemory_unmap(xen_fmem, map, n_fbdirs);
+qemu_xen_foreignmem_unmap(map, n_fbdirs);
 
-xenfb->pixels = xenforeignmemory_map(xen_fmem, xenfb->c.xendev.dom,
-PROT_READ, xenfb->fbpages, fbmfns, NULL);
+xenfb->pixels = qemu_xen_foreignmem_map(xenfb->c.xendev.dom, NULL,
+PROT_READ, xenfb->fbpages,
+fbmfns, NULL);
 if (xenfb->pixels == NULL)
 goto out;
 
@@ -927,7 +929,7 @@ static void fb_disconnect(struct XenLegacyDevice *xendev)
  *   Replacing the framebuffer with anonymous shared memory
  *   instead.  This releases the guest pages and keeps qemu happy.
  */
-xenforeignmemory_unmap(xen_fmem, fb->pixels, fb->fbpages);
+qemu_xen_foreignmem_unmap(fb->pixels, fb->fbpages);
 fb->pixels = mmap(fb->pixels, fb->fbpages * XC_PAGE_SIZE,
   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON,
   -1, 0);
diff --git a/hw/xen/xen-operations.c b/hw/xen/xen-operations.c
index 73dabac8e5..61e56a7abe 100644
--- a/hw/xen/xen-operations.c
+++ b/hw/xen/xen-operations.c
@@ -22,6 +22,7 @@
  */
 #undef XC_WANT_COMPAT_EVTCHN_API
 #undef XC_WANT_COMPAT_GNTTAB_API
+#undef XC_WANT_COMPAT_MAP_FOREIGN_API
 
 #include 
 
@@ -56,10 +57,13 @@ typedef xc_gnttab xengnttab_handle;
 #define xengnttab_map_domain_grant_refs(h, c, d, r, p) \
 xc_gnttab_map_domain_grant_refs(h, c, d, r, p)
 
+typedef xc_interface xenforeignmemory_handle;
+
 #else /* CONFIG_XEN_CTRL_INTERFACE_VERSION >= 40701 */
 
 #include 
 #include 
+#include 
 
 #endif
 
@@ -218,6 +222,46 @@ static struct gnttab_backend_ops libxengnttab_backend_ops = {
 .unmap = libxengnttab_backend_unmap,
 };
 
+#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 40701
+
+static void *libxenforeignmem_backend_map(uint32_t dom, 

[RFC PATCH v1 14/25] hw/xen: Move xenstore_store_pv_console_info to xen_console.c

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

There's no need for this to be in the Xen accel code, and as we want to
use the Xen console support with KVM-emulated Xen we'll want to have a
platform-agnostic version of it. Make it use GString to build up the
path while we're at it.

Signed-off-by: David Woodhouse 
---
 accel/xen/xen-all.c   | 61 ---
 hw/char/xen_console.c | 45 +--
 include/hw/xen/xen.h  |  2 --
 3 files changed, 43 insertions(+), 65 deletions(-)

diff --git a/accel/xen/xen-all.c b/accel/xen/xen-all.c
index 425216230f..2d51c41e40 100644
--- a/accel/xen/xen-all.c
+++ b/accel/xen/xen-all.c
@@ -29,67 +29,6 @@ xc_interface *xen_xc;
 xenforeignmemory_handle *xen_fmem;
 xendevicemodel_handle *xen_dmod;
 
-static int store_dev_info(int domid, Chardev *cs, const char *string)
-{
-struct xs_handle *xs = NULL;
-char *path = NULL;
-char *newpath = NULL;
-char *pts = NULL;
-int ret = -1;
-
-/* Only continue if we're talking to a pty. */
-if (!CHARDEV_IS_PTY(cs)) {
-return 0;
-}
-pts = cs->filename + 4;
-
-/* We now have everything we need to set the xenstore entry. */
-xs = xs_open(0);
-if (xs == NULL) {
-fprintf(stderr, "Could not contact XenStore\n");
-goto out;
-}
-
-path = xs_get_domain_path(xs, domid);
-if (path == NULL) {
-fprintf(stderr, "xs_get_domain_path() error\n");
-goto out;
-}
-newpath = realloc(path, (strlen(path) + strlen(string) +
-strlen("/tty") + 1));
-if (newpath == NULL) {
-fprintf(stderr, "realloc error\n");
-goto out;
-}
-path = newpath;
-
-strcat(path, string);
-strcat(path, "/tty");
-if (!xs_write(xs, XBT_NULL, path, pts, strlen(pts))) {
-fprintf(stderr, "xs_write for '%s' fail", string);
-goto out;
-}
-ret = 0;
-
-out:
-free(path);
-xs_close(xs);
-
-return ret;
-}
-
-void xenstore_store_pv_console_info(int i, Chardev *chr)
-{
-if (i == 0) {
-store_dev_info(xen_domid, chr, "/console");
-} else {
-char buf[32];
-snprintf(buf, sizeof(buf), "/device/console/%d", i);
-store_dev_info(xen_domid, chr, buf);
-}
-}
-
-
 static void xenstore_record_dm_state(const char *state)
 {
 struct xs_handle *xs;
diff --git a/hw/char/xen_console.c b/hw/char/xen_console.c
index ad8638a86d..c7a19c0e7c 100644
--- a/hw/char/xen_console.c
+++ b/hw/char/xen_console.c
@@ -173,6 +173,48 @@ static void xencons_send(struct XenConsole *con)
 
 /*  */
 
+static int store_con_info(struct XenConsole *con)
+{
+Chardev *cs = qemu_chr_fe_get_driver(&con->chr);
+char *pts = NULL;
+char *dom_path;
+GString *path;
+int ret = -1;
+
+/* Only continue if we're talking to a pty. */
+if (!CHARDEV_IS_PTY(cs)) {
+return 0;
+}
+pts = cs->filename + 4;
+
+dom_path = qemu_xen_xs_get_domain_path(xenstore, xen_domid);
+if (!dom_path) {
+return 0;
+}
+
+path = g_string_new(dom_path);
+free(dom_path);
+
+if (con->xendev.dev) {
+g_string_append_printf(path, "/device/console/%d", con->xendev.dev);
+} else {
+g_string_append(path, "/console");
+}
+g_string_append(path, "/tty");
+
+if (xenstore_write_str(con->console, path->str, pts)) {
+fprintf(stderr, "xenstore_write_str for '%s' fail", path->str);
+goto out;
+}
+ret = 0;
+
+out:
+g_string_free(path, true);
+free(path);
+
+return ret;
+}
+
 static int con_init(struct XenLegacyDevice *xendev)
 {
 struct XenConsole *con = container_of(xendev, struct XenConsole, xendev);
@@ -215,8 +257,7 @@ static int con_init(struct XenLegacyDevice *xendev)
  &error_abort);
 }
 
-xenstore_store_pv_console_info(con->xendev.dev,
-   qemu_chr_fe_get_driver(&con->chr));
+store_con_info(con);
 
 out:
 g_free(type);
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 03983939f9..56b1c2a827 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -39,8 +39,6 @@ int xen_is_pirq_msi(uint32_t msi_data);
 
 qemu_irq *xen_interrupt_controller_init(void);
 
-void xenstore_store_pv_console_info(int i, Chardev *chr);
-
 void xen_register_framebuffer(struct MemoryRegion *mr);
 
 #endif /* QEMU_HW_XEN_H */
-- 
2.39.0




[RFC PATCH v1 02/25] hw/xen: Add basic XenStore tree walk and write/read/directory support

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

This is a fairly simple implementation of a copy-on-write tree.

The node walk function starts off at the root, with 'inplace == true'.
If it ever encounters a node with a refcount greater than one (including
the root node), then that node is shared with other trees, and cannot
be modified in place, so the inplace flag is cleared and we copy on
write from there on down.
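
The refcount test behind that decision can be shown on a single node. The sketch
below is deliberately reduced: struct node, node_write() and friends are
hypothetical names, and an int stands in for a node's content and children. It
does not attempt to model the walk itself or the inplace flag carried down it.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    unsigned int ref;
    int value;    /* stands in for content and children */
};

static struct node *node_new(int value)
{
    struct node *n = calloc(1, sizeof(*n));

    if (n) {
        n->ref = 1;
        n->value = value;
    }
    return n;
}

static struct node *node_ref(struct node *n)
{
    n->ref++;
    return n;
}

static void node_unref(struct node *n)
{
    if (n && --n->ref == 0) {
        free(n);
    }
}

/*
 * Write through a slot. A node with ref > 1 is shared with another tree,
 * so it is copied first and the slot is pointed at the private copy.
 */
static int node_write(struct node **slot, int value)
{
    if ((*slot)->ref > 1) {
        struct node *copy = node_new((*slot)->value);

        if (!copy) {
            return -1;
        }
        node_unref(*slot);
        *slot = copy;
    }
    (*slot)->value = value;
    return 0;
}
```

Taking a reference on a root and then writing through it leaves the other
holder's view untouched, which is the property the tree relies on.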

Xenstore write has 'mkdir -p' semantics and will create the intermediate
nodes if they don't already exist, so in that case we flip the inplace
flag back to true as we populate the newly-created nodes.

We put a copy of the absolute path into the buffer in the struct walk_op,
with *two* NUL terminators at the end. As xs_node_walk() goes down the
tree, it replaces the next '/' separator with a NUL so that it can use
the 'child name' in place. The next recursion down then puts the '/'
back and repeats the exercise for the next path element... if it doesn't
hit that *second* NUL termination which indicates the true end of the
path.
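
The in-place splitting can be sketched as a loop rather than a recursion. This
is an approximation written for illustration only: walk_path() and
SKETCH_PATH_MAX are hypothetical, and the assumption that the leading '/' is
overwritten with a NUL so that the root starts with an empty path is mine, not
something this message states. There is no tree here; the function just counts
the nodes it would visit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PATH_MAX 64    /* hypothetical limit */

/*
 * Visit every node on an absolute path. Returns the number of nodes visited
 * (root included) and leaves the last child name seen in 'last'.
 */
static int walk_path(const char *abs_path, char *last, size_t last_len)
{
    char buf[SKETCH_PATH_MAX + 2];    /* room for two NUL terminators */
    size_t len = strlen(abs_path);
    int depth = 0;

    if (abs_path[0] != '/' || len > SKETCH_PATH_MAX) {
        return -1;
    }

    memcpy(buf, abs_path, len + 1);
    buf[len + 1] = '\0';    /* the second terminator */
    buf[0] = '\0';          /* at the root, the path so far is empty */

    for (;;) {
        size_t namelen = strlen(buf);    /* path of the current node */
        char *child, *slash;

        depth++;
        if (!buf[namelen + 1]) {
            break;    /* hit the double NUL: true end of the path */
        }

        child = buf + namelen + 1;
        slash = strchr(child, '/');
        if (slash) {
            *slash = '\0';    /* isolate the child name in place */
        }
        snprintf(last, last_len, "%s", child);

        buf[namelen] = '/';    /* put the previous separator back */
    }
    return depth;
}
```

At each step buf holds the path of the node being visited, and whatever follows
its terminator is the part of the path still to be walked.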

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xenstore_impl.c | 527 +++-
 tests/unit/meson.build  |   1 +
 tests/unit/test-xs-node.c   | 197 ++
 3 files changed, 718 insertions(+), 7 deletions(-)
 create mode 100644 tests/unit/test-xs-node.c

diff --git a/hw/i386/kvm/xenstore_impl.c b/hw/i386/kvm/xenstore_impl.c
index 31dbc98fe0..9e10a31bea 100644
--- a/hw/i386/kvm/xenstore_impl.c
+++ b/hw/i386/kvm/xenstore_impl.c
@@ -10,13 +10,470 @@
  */
 
 #include "qemu/osdep.h"
+#include "qom/object.h"
 
 #include "xen_xenstore.h"
 #include "xenstore_impl.h"
 
+#include "hw/xen/interface/io/xs_wire.h"
+
+#define XS_MAX_WATCHES  128
+#define XS_MAX_DOMAIN_NODES 1000
+#define XS_MAX_NODE_SIZE2048
+#define XS_MAX_TRANSACTIONS 10
+#define XS_MAX_PERMS_PER_NODE   5
+
+#define XS_VALID_CHARS "abcdefghijklmnopqrstuvwxyz" \
+   "ABCDEFGHIJKLMNOPQRSTUVWXYZ" \
+   "0123456789-/_"
+
+typedef struct XsNode {
+uint32_t ref;
+GByteArray *content;
+GHashTable *children;
+uint64_t gencnt;
+#ifdef XS_NODE_UNIT_TEST
+gchar *name; /* debug only */
+#endif
+} XsNode;
+
 struct XenstoreImplState {
+XsNode *root;
+unsigned int nr_nodes;
 };
 
+static inline XsNode *xs_node_new(void)
+{
+XsNode *n = g_new0(XsNode, 1);
+n->ref = 1;
+
+#ifdef XS_NODE_UNIT_TEST
+nr_xs_nodes++;
+xs_node_list = g_list_prepend(xs_node_list, n);
+#endif
+return n;
+}
+
+static inline XsNode *xs_node_ref(XsNode *n)
+{
+/* With just 10 transactions, it can never get anywhere near this. */
+g_assert(n->ref < INT_MAX);
+
+g_assert(n->ref);
+n->ref++;
+return n;
+}
+
+static inline void xs_node_unref(XsNode *n)
+{
+if (!n) {
+return;
+}
+g_assert(n->ref);
+if (--n->ref) {
+return;
+}
+
+if (n->content) {
+g_byte_array_unref(n->content);
+}
+if (n->children) {
+g_hash_table_unref(n->children);
+}
+#ifdef XS_NODE_UNIT_TEST
+g_free(n->name);
+nr_xs_nodes--;
+xs_node_list = g_list_remove(xs_node_list, n);
+#endif
+g_free(n);
+}
+
+/* For copying from one hash table to another using g_hash_table_foreach() */
+static void do_insert(gpointer key, gpointer value, gpointer user_data)
+{
+g_hash_table_insert(user_data, g_strdup(key), xs_node_ref(value));
+}
+
+static XsNode *xs_node_copy(XsNode *old)
+{
+XsNode *n = xs_node_new();
+
+n->gencnt = old->gencnt;
+if (old->children) {
+n->children = g_hash_table_new_full(g_str_hash, g_str_equal, g_free,
+(GDestroyNotify)xs_node_unref);
+g_hash_table_foreach(old->children, do_insert, n->children);
+}
+if (old && old->content) {
+n->content = g_byte_array_ref(old->content);
+}
+return n;
+}
+
+/* Returns true if it made a change to the hash table */
+static bool xs_node_add_child(XsNode *n, const char *path_elem, XsNode *child)
+{
+assert(!strchr(path_elem, '/'));
+
+if (!child) {
+assert(n->children);
+return g_hash_table_remove(n->children, path_elem);
+}
+
+#ifdef XS_NODE_UNIT_TEST
+g_free(child->name);
+child->name = g_strdup(path_elem);
+#endif
+if (!n->children) {
+n->children = g_hash_table_new_full(g_str_hash, g_str_equal, g_free,
+(GDestroyNotify)xs_node_unref);
+}
+
+/*
+ * The documentation for g_hash_table_insert() says that it "returns a
+ * boolean value to indicate whether the newly added value was already
+ * in the hash table or not."
+ *
+ * It could perhaps be clearer that returning TRUE means it wasn't,
+ */
+return g_hash_table_insert(n->children, g_strdup(path_elem), child);
+}
+
+struct walk_op {
+struct XenstoreImplState *s;
+char path[XENSTORE_ABS_PATH_MAX + 2]; /* Two NUL terminators */
+int 

[RFC PATCH v1 17/25] hw/xen: Build PV backend drivers for CONFIG_XEN_BUS

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

Now that we have the redirectable Xen backend operations we can build the
PV backends even without the Xen libraries.

Signed-off-by: David Woodhouse 
---
 hw/9pfs/meson.build| 2 +-
 hw/block/dataplane/meson.build | 2 +-
 hw/block/meson.build   | 2 +-
 hw/char/meson.build| 2 +-
 hw/display/meson.build | 2 +-
 hw/usb/meson.build | 2 +-
 hw/xen/meson.build | 5 -
 7 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/hw/9pfs/meson.build b/hw/9pfs/meson.build
index 12443b6ad5..fd37b7a02d 100644
--- a/hw/9pfs/meson.build
+++ b/hw/9pfs/meson.build
@@ -15,7 +15,7 @@ fs_ss.add(files(
 ))
 fs_ss.add(when: 'CONFIG_LINUX', if_true: files('9p-util-linux.c'))
 fs_ss.add(when: 'CONFIG_DARWIN', if_true: files('9p-util-darwin.c'))
-fs_ss.add(when: 'CONFIG_XEN', if_true: files('xen-9p-backend.c'))
+fs_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen-9p-backend.c'))
 softmmu_ss.add_all(when: 'CONFIG_FSDEV_9P', if_true: fs_ss)
 
 specific_ss.add(when: 'CONFIG_VIRTIO_9P', if_true: files('virtio-9p-device.c'))
diff --git a/hw/block/dataplane/meson.build b/hw/block/dataplane/meson.build
index 12c6a264f1..78d7ac1a11 100644
--- a/hw/block/dataplane/meson.build
+++ b/hw/block/dataplane/meson.build
@@ -1,2 +1,2 @@
 specific_ss.add(when: 'CONFIG_VIRTIO_BLK', if_true: files('virtio-blk.c'))
-specific_ss.add(when: 'CONFIG_XEN', if_true: files('xen-block.c'))
+specific_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen-block.c'))
diff --git a/hw/block/meson.build b/hw/block/meson.build
index b434d5654c..cc2a75cc50 100644
--- a/hw/block/meson.build
+++ b/hw/block/meson.build
@@ -14,7 +14,7 @@ softmmu_ss.add(when: 'CONFIG_PFLASH_CFI02', if_true: files('pflash_cfi02.c'))
 softmmu_ss.add(when: 'CONFIG_SSI_M25P80', if_true: files('m25p80.c'))
 softmmu_ss.add(when: 'CONFIG_SSI_M25P80', if_true: files('m25p80_sfdp.c'))
 softmmu_ss.add(when: 'CONFIG_SWIM', if_true: files('swim.c'))
-softmmu_ss.add(when: 'CONFIG_XEN', if_true: files('xen-block.c'))
+softmmu_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen-block.c'))
 softmmu_ss.add(when: 'CONFIG_TC58128', if_true: files('tc58128.c'))
 
 specific_ss.add(when: 'CONFIG_VIRTIO_BLK', if_true: files('virtio-blk.c', 'virtio-blk-common.c'))
diff --git a/hw/char/meson.build b/hw/char/meson.build
index 7b594f51b8..e02c60dd54 100644
--- a/hw/char/meson.build
+++ b/hw/char/meson.build
@@ -18,7 +18,7 @@ softmmu_ss.add(when: 'CONFIG_SERIAL_PCI', if_true: files('serial-pci.c'))
 softmmu_ss.add(when: 'CONFIG_SERIAL_PCI_MULTI', if_true: files('serial-pci-multi.c'))
 softmmu_ss.add(when: 'CONFIG_SHAKTI_UART', if_true: files('shakti_uart.c'))
 softmmu_ss.add(when: 'CONFIG_VIRTIO_SERIAL', if_true: files('virtio-console.c'))
-softmmu_ss.add(when: 'CONFIG_XEN', if_true: files('xen_console.c'))
+softmmu_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xen_console.c'))
 softmmu_ss.add(when: 'CONFIG_XILINX', if_true: files('xilinx_uartlite.c'))
 
 softmmu_ss.add(when: 'CONFIG_AVR_USART', if_true: files('avr_usart.c'))
diff --git a/hw/display/meson.build b/hw/display/meson.build
index f470179122..4191694380 100644
--- a/hw/display/meson.build
+++ b/hw/display/meson.build
@@ -14,7 +14,7 @@ softmmu_ss.add(when: 'CONFIG_PL110', if_true: files('pl110.c'))
 softmmu_ss.add(when: 'CONFIG_SII9022', if_true: files('sii9022.c'))
 softmmu_ss.add(when: 'CONFIG_SSD0303', if_true: files('ssd0303.c'))
 softmmu_ss.add(when: 'CONFIG_SSD0323', if_true: files('ssd0323.c'))
-softmmu_ss.add(when: 'CONFIG_XEN', if_true: files('xenfb.c'))
+softmmu_ss.add(when: 'CONFIG_XEN_BUS', if_true: files('xenfb.c'))
 
 softmmu_ss.add(when: 'CONFIG_VGA_PCI', if_true: files('vga-pci.c'))
 softmmu_ss.add(when: 'CONFIG_VGA_ISA', if_true: files('vga-isa.c'))
diff --git a/hw/usb/meson.build b/hw/usb/meson.build
index bdf34cbd3e..599dc24f0d 100644
--- a/hw/usb/meson.build
+++ b/hw/usb/meson.build
@@ -84,6 +84,6 @@ if libusb.found()
   hw_usb_modules += {'host': usbhost_ss}
 endif
 
-softmmu_ss.add(when: ['CONFIG_USB', 'CONFIG_XEN', libusb], if_true: files('xen-usb.c'))
+softmmu_ss.add(when: ['CONFIG_USB', 'CONFIG_XEN_BUS', libusb], if_true: files('xen-usb.c'))
 
 modules += { 'hw-usb': hw_usb_modules }
diff --git a/hw/xen/meson.build b/hw/xen/meson.build
index f195bbd25c..19c6aabc7c 100644
--- a/hw/xen/meson.build
+++ b/hw/xen/meson.build
@@ -1,10 +1,13 @@
-softmmu_ss.add(when: ['CONFIG_XEN', xen], if_true: files(
+softmmu_ss.add(when: ['CONFIG_XEN_BUS'], if_true: files(
   'xen-backend.c',
   'xen-bus-helper.c',
   'xen-bus.c',
   'xen-legacy-backend.c',
   'xen_devconfig.c',
   'xen_pvdev.c',
+))
+
+softmmu_ss.add(when: ['CONFIG_XEN', xen], if_true: files(
   'xen-operations.c',
 ))
 
-- 
2.39.0




[RFC PATCH v1 16/25] hw/xen: Rename xen_common.h to xen_native.h

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

This header is now only for native Xen code, not PV backends that may be
used in Xen emulation. Since the toolstack libraries may depend on the
specific version of Xen headers that they pull in (and will set the
__XEN_TOOLS__ macro to enable internal definitions that they depend on),
the rule is that xen_native.h (and thus the toolstack library headers)
must be included *before* any of the headers in include/hw/xen/interface.

Signed-off-by: David Woodhouse 
---
 accel/xen/xen-all.c   |  1 +
 hw/9pfs/xen-9p-backend.c  |  1 +
 hw/block/dataplane/xen-block.c|  3 ++-
 hw/block/xen-block.c  |  1 -
 hw/i386/pc_piix.c |  4 ++--
 hw/i386/xen/xen-hvm.c | 11 +-
 hw/i386/xen/xen-mapcache.c|  2 +-
 hw/i386/xen/xen_platform.c|  7 +++---
 hw/xen/trace-events   |  2 +-
 hw/xen/xen-operations.c   |  2 +-
 hw/xen/xen_pt.c   |  2 +-
 hw/xen/xen_pt.h   |  2 +-
 hw/xen/xen_pt_config_init.c   |  2 +-
 hw/xen/xen_pt_msi.c   |  4 ++--
 include/hw/xen/xen.h  | 22 ---
 include/hw/xen/{xen_common.h => xen_native.h} | 10 ++---
 include/hw/xen/xen_pvdev.h|  3 ++-
 17 files changed, 47 insertions(+), 32 deletions(-)
 rename include/hw/xen/{xen_common.h => xen_native.h} (98%)

diff --git a/accel/xen/xen-all.c b/accel/xen/xen-all.c
index 2d51c41e40..00221e23c5 100644
--- a/accel/xen/xen-all.c
+++ b/accel/xen/xen-all.c
@@ -12,6 +12,7 @@
 #include "qemu/error-report.h"
 #include "qemu/module.h"
 #include "qapi/error.h"
+#include "hw/xen/xen_native.h"
 #include "hw/xen/xen-legacy-backend.h"
 #include "hw/xen/xen_pt.h"
 #include "chardev/char.h"
diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index d8bb0e847c..74f3a05f88 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -22,6 +22,7 @@
 #include "qemu/config-file.h"
 #include "qemu/main-loop.h"
 #include "qemu/option.h"
+#include "qemu/iov.h"
 #include "fsdev/qemu-fsdev.h"
 
 #define VERSIONS "1"
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 8322a1de82..734da42ea7 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -23,8 +23,9 @@
 #include "qemu/main-loop.h"
 #include "qemu/memalign.h"
 #include "qapi/error.h"
-#include "hw/xen/xen_common.h"
+#include "hw/xen/xen.h"
 #include "hw/block/xen_blkif.h"
+#include "hw/xen/interface/io/ring.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/iothread.h"
 #include "xen-block.h"
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 345b284d70..87299615e3 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -19,7 +19,6 @@
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qstring.h"
 #include "qom/object_interfaces.h"
-#include "hw/xen/xen_common.h"
 #include "hw/block/xen_blkif.h"
 #include "hw/qdev-properties.h"
 #include "hw/xen/xen-block.h"
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 126b6c11df..e0f768b664 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -47,8 +47,6 @@
 #include "hw/kvm/clock.h"
 #include "hw/sysbus.h"
 #include "hw/i2c/smbus_eeprom.h"
-#include "hw/xen/xen-x86.h"
-#include "hw/xen/xen.h"
 #include "exec/memory.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/piix4.h"
@@ -60,6 +58,8 @@
 #include 
 #include "hw/xen/xen_pt.h"
 #endif
+#include "hw/xen/xen-x86.h"
+#include "hw/xen/xen.h"
 #include "migration/global_state.h"
 #include "migration/misc.h"
 #include "sysemu/numa.h"
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index cb1d24f592..56641a550e 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -18,7 +18,7 @@
 #include "hw/irq.h"
 #include "hw/hw.h"
 #include "hw/i386/apic-msidef.h"
-#include "hw/xen/xen_common.h"
+#include "hw/xen/xen_native.h"
 #include "hw/xen/xen-legacy-backend.h"
 #include "hw/xen/xen-bus.h"
 #include "hw/xen/xen-x86.h"
@@ -52,10 +52,11 @@ static bool xen_in_migration;
 
 /* Compatibility with older version */
 
-/* This allows QEMU to build on a system that has Xen 4.5 or earlier
- * installed.  This here (not in hw/xen/xen_common.h) because xen/hvm/ioreq.h
- * needs to be included before this block and hw/xen/xen_common.h needs to
- * be included before xen/hvm/ioreq.h
+/*
+ * This allows QEMU to build on a system that has Xen 4.5 or earlier installed.
+ * This is here (not in hw/xen/xen_native.h) because xen/hvm/ioreq.h needs to
+ * be included before this block and hw/xen/xen_native.h needs to be included
+ * before xen/hvm/ioreq.h
  */
 #ifndef IOREQ_TYPE_VMWARE_PORT
 #define IOREQ_TYPE_VMWARE_PORT  3
diff --git a/hw/i386/xen/xen-mapcache.c b/hw/i386/xen/xen-mapcache.c
index 1d0879d234..f7d974677d 100644
--- 

[RFC PATCH v1 08/25] hw/xen: Create initial XenStore nodes

2023-03-02 Thread David Woodhouse
From: Paul Durrant 

Signed-off-by: Paul Durrant 
Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_xenstore.c | 70 ++
 1 file changed, 70 insertions(+)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 1b1358ad4c..5a8e38aae7 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -76,9 +76,39 @@ struct XenXenstoreState *xen_xenstore_singleton;
 static void xen_xenstore_event(void *opaque);
 static void fire_watch_cb(void *opaque, const char *path, const char *token);
 
+static void G_GNUC_PRINTF (4, 5) relpath_printf(XenXenstoreState *s,
+GList *perms,
+const char *relpath,
+const char *fmt, ...)
+{
+gchar *abspath;
+gchar *value;
+va_list args;
+GByteArray *data;
+int err;
+
+abspath = g_strdup_printf("/local/domain/%u/%s", xen_domid, relpath);
+va_start(args, fmt);
+value = g_strdup_vprintf(fmt, args);
+va_end(args);
+
+data = g_byte_array_new_take((void *)value, strlen(value));
+
+err = xs_impl_write(s->impl, DOMID_QEMU, XBT_NULL, abspath, data);
+assert(!err);
+
+g_byte_array_unref(data);
+
+err = xs_impl_set_perms(s->impl, DOMID_QEMU, XBT_NULL, abspath, perms);
+assert(!err);
+
+g_free(abspath);
+}
+
 static void xen_xenstore_realize(DeviceState *dev, Error **errp)
 {
 XenXenstoreState *s = XEN_XENSTORE(dev);
+GList *perms;
 
 if (xen_mode != XEN_EMULATE) {
 error_setg(errp, "Xen xenstore support is for Xen emulation");
@@ -102,6 +132,46 @@ static void xen_xenstore_realize(DeviceState *dev, Error **errp)
xen_xenstore_event, NULL, NULL, NULL, s);
 
 s->impl = xs_impl_create(xen_domid);
+
+/* Populate the default nodes */
+
+/* Nodes owned by 'dom0' but readable by the guest */
+perms = g_list_append(NULL, xs_perm_as_string(XS_PERM_NONE, DOMID_QEMU));
+perms = g_list_append(perms, xs_perm_as_string(XS_PERM_READ, xen_domid));
+
+relpath_printf(s, perms, "", "%s", "");
+
+relpath_printf(s, perms, "domid", "%u", xen_domid);
+
+relpath_printf(s, perms, "control/platform-feature-xs_reset_watches", "%u", 1);
+relpath_printf(s, perms, "control/platform-feature-multiprocessor-suspend", "%u", 1);
+
+relpath_printf(s, perms, "platform/acpi", "%u", 1);
+relpath_printf(s, perms, "platform/acpi_s3", "%u", 1);
+relpath_printf(s, perms, "platform/acpi_s4", "%u", 1);
+relpath_printf(s, perms, "platform/acpi_laptop_slate", "%u", 0);
+
+g_list_free_full(perms, g_free);
+
+/* Nodes owned by the guest */
+perms = g_list_append(NULL, xs_perm_as_string(XS_PERM_NONE, xen_domid));
+
+relpath_printf(s, perms, "attr", "%s", "");
+
+relpath_printf(s, perms, "control/shutdown", "%s", "");
+relpath_printf(s, perms, "control/feature-poweroff", "%u", 1);
+relpath_printf(s, perms, "control/feature-reboot", "%u", 1);
+relpath_printf(s, perms, "control/feature-suspend", "%u", 1);
+relpath_printf(s, perms, "control/feature-s3", "%u", 1);
+relpath_printf(s, perms, "control/feature-s4", "%u", 1);
+
+relpath_printf(s, perms, "data", "%s", "");
+relpath_printf(s, perms, "device", "%s", "");
+relpath_printf(s, perms, "drivers", "%s", "");
+relpath_printf(s, perms, "error", "%s", "");
+relpath_printf(s, perms, "feature", "%s", "");
+
+g_list_free_full(perms, g_free);
 }
 
 static bool xen_xenstore_is_needed(void *opaque)
-- 
2.39.0




[RFC PATCH v1 15/25] hw/xen: Use XEN_PAGE_SIZE in PV backend drivers

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

XC_PAGE_SIZE comes from the actual Xen libraries, while XEN_PAGE_SIZE is
provided by QEMU itself in xen_backend_ops.h. For backends which may be
built for emulation mode, use the latter.
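
As a rough illustration of the kind of check these backends perform against
the page size, here is a standalone sketch. The SKETCH_* constants are
hypothetical stand-ins defined locally (assuming the usual 4 KiB Xen page);
they are not the definitions from xen_backend_ops.h, and the helper names are
made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local stand-ins, assuming 4 KiB Xen pages. */
#define SKETCH_XEN_PAGE_SHIFT 12
#define SKETCH_XEN_PAGE_SIZE  (1u << SKETCH_XEN_PAGE_SHIFT)

/* True if [offset, offset + size) stays inside a single page. */
static bool sketch_fits_in_page(uint32_t offset, uint32_t size)
{
    /* Compare without adding, so that offset + size cannot wrap around. */
    return offset <= SKETCH_XEN_PAGE_SIZE &&
           size <= SKETCH_XEN_PAGE_SIZE - offset;
}

/* Number of pages needed to hold len bytes (a DIV_ROUND_UP equivalent). */
static uint64_t sketch_pages_for(uint64_t len)
{
    return (len + SKETCH_XEN_PAGE_SIZE - 1) >> SKETCH_XEN_PAGE_SHIFT;
}
```

Because the constant is provided independently of the Xen libraries, checks
like these can be compiled whether or not libxenctrl is available.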

Signed-off-by: David Woodhouse 
---
 hw/block/dataplane/xen-block.c |  8 
 hw/display/xenfb.c | 12 ++--
 hw/net/xen_nic.c   | 12 ++--
 hw/usb/xen-usb.c   |  8 
 4 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index e55b713002..8322a1de82 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -101,9 +101,9 @@ static XenBlockRequest *xen_block_start_request(XenBlockDataPlane *dataplane)
  * re-use requests, allocate the memory once here. It will be freed
  * xen_block_dataplane_destroy() when the request list is freed.
  */
-request->buf = qemu_memalign(XC_PAGE_SIZE,
+request->buf = qemu_memalign(XEN_PAGE_SIZE,
  BLKIF_MAX_SEGMENTS_PER_REQUEST *
- XC_PAGE_SIZE);
+ XEN_PAGE_SIZE);
 dataplane->requests_total++;
 qemu_iovec_init(&request->v, 1);
 } else {
@@ -185,7 +185,7 @@ static int xen_block_parse_request(XenBlockRequest *request)
 goto err;
 }
 if (request->req.seg[i].last_sect * dataplane->sector_size >=
-XC_PAGE_SIZE) {
+XEN_PAGE_SIZE) {
 error_report("error: page crossing");
 goto err;
 }
@@ -740,7 +740,7 @@ void xen_block_dataplane_start(XenBlockDataPlane *dataplane,
 
 dataplane->protocol = protocol;
 
-ring_size = XC_PAGE_SIZE * dataplane->nr_ring_ref;
+ring_size = XEN_PAGE_SIZE * dataplane->nr_ring_ref;
 switch (dataplane->protocol) {
 case BLKIF_PROTOCOL_NATIVE:
 {
diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 2c4016fcbd..0074a9b6f8 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -489,13 +489,13 @@ static int xenfb_map_fb(struct XenFB *xenfb)
 }
 
 if (xenfb->pixels) {
-munmap(xenfb->pixels, xenfb->fbpages * XC_PAGE_SIZE);
+munmap(xenfb->pixels, xenfb->fbpages * XEN_PAGE_SIZE);
 xenfb->pixels = NULL;
 }
 
-xenfb->fbpages = DIV_ROUND_UP(xenfb->fb_len, XC_PAGE_SIZE);
+xenfb->fbpages = DIV_ROUND_UP(xenfb->fb_len, XEN_PAGE_SIZE);
 n_fbdirs = xenfb->fbpages * mode / 8;
-n_fbdirs = DIV_ROUND_UP(n_fbdirs, XC_PAGE_SIZE);
+n_fbdirs = DIV_ROUND_UP(n_fbdirs, XEN_PAGE_SIZE);
 
 pgmfns = g_new0(xen_pfn_t, n_fbdirs);
 fbmfns = g_new0(xen_pfn_t, xenfb->fbpages);
@@ -528,8 +528,8 @@ static int xenfb_configure_fb(struct XenFB *xenfb, size_t fb_len_lim,
 {
 size_t mfn_sz = sizeof_field(struct xenfb_page, pd[0]);
 size_t pd_len = sizeof_field(struct xenfb_page, pd) / mfn_sz;
-size_t fb_pages = pd_len * XC_PAGE_SIZE / mfn_sz;
-size_t fb_len_max = fb_pages * XC_PAGE_SIZE;
+size_t fb_pages = pd_len * XEN_PAGE_SIZE / mfn_sz;
+size_t fb_len_max = fb_pages * XEN_PAGE_SIZE;
 int max_width, max_height;
 
 if (fb_len_lim > fb_len_max) {
@@ -930,7 +930,7 @@ static void fb_disconnect(struct XenLegacyDevice *xendev)
  *   instead.  This releases the guest pages and keeps qemu happy.
  */
 qemu_xen_foreignmem_unmap(fb->pixels, fb->fbpages);
-fb->pixels = mmap(fb->pixels, fb->fbpages * XC_PAGE_SIZE,
+fb->pixels = mmap(fb->pixels, fb->fbpages * XEN_PAGE_SIZE,
   PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON,
   -1, 0);
 if (fb->pixels == MAP_FAILED) {
diff --git a/hw/net/xen_nic.c b/hw/net/xen_nic.c
index 166d03787d..9bbf6599fc 100644
--- a/hw/net/xen_nic.c
+++ b/hw/net/xen_nic.c
@@ -145,7 +145,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
 continue;
 }
 
-if ((txreq.offset + txreq.size) > XC_PAGE_SIZE) {
+if ((txreq.offset + txreq.size) > XEN_PAGE_SIZE) {
 xen_pv_printf(&netdev->xendev, 0, "error: page crossing\n");
 net_tx_error(netdev, &txreq, rc);
 continue;
@@ -171,7 +171,7 @@ static void net_tx_packets(struct XenNetDev *netdev)
 if (txreq.flags & NETTXF_csum_blank) {
 /* have read-only mapping -> can't fill checksum in-place */
 if (!tmpbuf) {
-tmpbuf = g_malloc(XC_PAGE_SIZE);
+tmpbuf = g_malloc(XEN_PAGE_SIZE);
 }
 memcpy(tmpbuf, page + txreq.offset, txreq.size);
 net_checksum_calculate(tmpbuf, txreq.size, CSUM_ALL);
@@ -243,9 +243,9 @@ static ssize_t net_rx_packet(NetClientState *nc, const uint8_t *buf, size_t size
 if (rc == rp || RING_REQUEST_CONS_OVERFLOW(&netdev->rx_ring, rc)) {
 return 0;
 }
-if (size > XC_PAGE_SIZE - NET_IP_ALIGN) {
+if (size > 

[RFC PATCH v1 06/25] hw/xen: Implement XenStore permissions

2023-03-02 Thread David Woodhouse
From: Paul Durrant 

Store perms as a GList of strings, check permissions.
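
For illustration, the string form used for each permission entry (one letter
followed by a domain ID, e.g. "r1" or "n0") can be produced and parsed as in
the standalone sketch below. The sk_* names and SK_PERM_* values are
hypothetical and defined locally; the sketch avoids GLib so that it can be
compiled on its own, and it covers only the per-entry letter check, not the
owner and dom0 rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical local permission bits for this sketch. */
enum { SK_PERM_NONE = 0, SK_PERM_READ = 1, SK_PERM_WRITE = 2 };

/* Format a permission as "<letter><domid>", e.g. "r1". Returns 0 on success. */
static int sk_perm_format(char *buf, size_t len, unsigned int perm,
                          unsigned int domid)
{
    char letter;
    int n;

    switch (perm) {
    case SK_PERM_READ | SK_PERM_WRITE:
        letter = 'b';                 /* both read and write */
        break;
    case SK_PERM_READ:
        letter = 'r';
        break;
    case SK_PERM_WRITE:
        letter = 'w';
        break;
    default:
        letter = 'n';                 /* no access */
        break;
    }
    n = snprintf(buf, len, "%c%u", letter, domid);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Parse "<letter><domid>"; returns 0 only if both fields were present. */
static int sk_perm_parse(const char *s, char *letter, unsigned int *domid)
{
    return sscanf(s, "%c%u", letter, domid) == 2 ? 0 : -1;
}

/* Does a permission letter grant the wanted access ('r' or 'w')? */
static bool sk_perm_allows(char letter, char wanted)
{
    return letter == 'b' || letter == wanted;
}
```

Unlike parse_perm() in the patch, which asserts on malformed input, this
sketch reports a parse failure to the caller.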

Signed-off-by: Paul Durrant 
Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_xenstore.c  |   2 +-
 hw/i386/kvm/xenstore_impl.c | 259 +---
 hw/i386/kvm/xenstore_impl.h |   8 +-
 tests/unit/test-xs-node.c   |  27 +++-
 4 files changed, 275 insertions(+), 21 deletions(-)

diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c
index 64d8f1a38f..3b409e3817 100644
--- a/hw/i386/kvm/xen_xenstore.c
+++ b/hw/i386/kvm/xen_xenstore.c
@@ -98,7 +98,7 @@ static void xen_xenstore_realize(DeviceState *dev, Error **errp)
 aio_set_fd_handler(qemu_get_aio_context(), xen_be_evtchn_fd(s->eh), true,
xen_xenstore_event, NULL, NULL, NULL, s);
 
-s->impl = xs_impl_create();
+s->impl = xs_impl_create(xen_domid);
 }
 
 static bool xen_xenstore_is_needed(void *opaque)
diff --git a/hw/i386/kvm/xenstore_impl.c b/hw/i386/kvm/xenstore_impl.c
index 380f8003ec..7988bde88f 100644
--- a/hw/i386/kvm/xenstore_impl.c
+++ b/hw/i386/kvm/xenstore_impl.c
@@ -12,6 +12,8 @@
 #include "qemu/osdep.h"
 #include "qom/object.h"
 
+#include "hw/xen/xen.h"
+
 #include "xen_xenstore.h"
 #include "xenstore_impl.h"
 
@@ -30,6 +32,7 @@
 typedef struct XsNode {
 uint32_t ref;
 GByteArray *content;
+GList *perms;
 GHashTable *children;
 uint64_t gencnt;
 bool deleted_in_tx;
@@ -133,6 +136,9 @@ static inline void xs_node_unref(XsNode *n)
 if (n->content) {
 g_byte_array_unref(n->content);
 }
+if (n->perms) {
+g_list_free_full(n->perms, g_free);
+}
 if (n->children) {
 g_hash_table_unref(n->children);
 }
@@ -144,8 +150,51 @@ static inline void xs_node_unref(XsNode *n)
 g_free(n);
 }
 
+char *xs_perm_as_string(unsigned int perm, unsigned int domid)
+{
+char letter;
+
+switch (perm) {
+case XS_PERM_READ | XS_PERM_WRITE:
+letter = 'b';
+break;
+case XS_PERM_READ:
+letter = 'r';
+break;
+case XS_PERM_WRITE:
+letter = 'w';
+break;
+case XS_PERM_NONE:
+default:
+letter = 'n';
+break;
+}
+
+return g_strdup_printf("%c%u", letter, domid);
+}
+
+static gpointer do_perm_copy(gconstpointer src, gpointer user_data)
+{
+return g_strdup(src);
+}
+
+static XsNode *xs_node_create(const char *name, GList *perms)
+{
+XsNode *n = xs_node_new();
+
+#ifdef XS_NODE_UNIT_TEST
+if (name) {
+n->name = g_strdup(name);
+}
+#endif
+
+n->perms = g_list_copy_deep(perms, do_perm_copy, NULL);
+
+return n;
+}
+
 /* For copying from one hash table to another using g_hash_table_foreach() */
-static void do_insert(gpointer key, gpointer value, gpointer user_data)
+static void do_child_insert(gpointer key, gpointer value, gpointer user_data)
 {
 g_hash_table_insert(user_data, g_strdup(key), xs_node_ref(value));
 }
@@ -162,12 +211,16 @@ static XsNode *xs_node_copy(XsNode *old)
 }
 #endif
 
+assert(old);
 if (old->children) {
 n->children = g_hash_table_new_full(g_str_hash, g_str_equal, g_free,
 (GDestroyNotify)xs_node_unref);
-g_hash_table_foreach(old->children, do_insert, n->children);
+g_hash_table_foreach(old->children, do_child_insert, n->children);
 }
-if (old && old->content) {
+if (old->perms) {
+n->perms = g_list_copy_deep(old->perms, do_perm_copy, NULL);
+}
+if (old->content) {
 n->content = g_byte_array_ref(old->content);
 }
 return n;
@@ -383,6 +436,9 @@ static XsNode *xs_node_copy_deleted(XsNode *old, struct walk_op *op)
 op->op_opaque2 = n->children;
 g_hash_table_foreach(old->children, copy_deleted_recurse, op);
 }
+if (old->perms) {
+n->perms = g_list_copy_deep(old->perms, do_perm_copy, NULL);
+}
 n->deleted_in_tx = true;
 /* If it gets resurrected we only fire a watch if it lost its content */
 if (old->content) {
@@ -417,6 +473,104 @@ static int xs_node_rm(XsNode **n, struct walk_op *op)
 return 0;
 }
 
+static int xs_node_get_perms(XsNode **n, struct walk_op *op)
+{
+GList **perms = op->op_opaque;
+
+assert(op->inplace);
+assert(*n);
+
+*perms = g_list_copy_deep((*n)->perms, do_perm_copy, NULL);
+return 0;
+}
+
+static void parse_perm(const char *perm, char *letter, unsigned int *dom_id)
+{
+unsigned int n = sscanf(perm, "%c%u", letter, dom_id);
+
+assert(n == 2);
+}
+
+static bool can_access(unsigned int dom_id, GList *perms, const char *letters)
+{
+unsigned int i, n;
+char perm_letter;
+unsigned int perm_dom_id;
+bool access;
+
+if (dom_id == 0) {
+return true;
+}
+
+n = g_list_length(perms);
+assert(n >= 1);
+
+/*
+ * The dom_id of the first perm is the owner, and the owner always has
+ * read-write access.
+ */
+parse_perm(g_list_nth_data(perms, 0), 

[RFC PATCH v1 20/25] hw/xen: Hook up emulated implementation for event channel operations

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

We provided the backend-facing evtchn functions very early on as part of
the core Xen platform support, since things like timers and xenstore need
to use them.

By what may or may not be an astonishing coincidence, those functions
just *happen* all to have exactly the right function prototypes to slot
into the evtchn_backend_ops table and be called by the PV backends.
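
The pattern relied on here, a table of function pointers that is filled in
with existing functions whose prototypes already match, can be sketched on
its own as follows. All names (sk_evtchn_ops, sk_emu_notify and so on) are
hypothetical and the table is reduced to two operations; it is not the
evtchn_backend_ops definition from QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down operations table for this sketch. */
struct sk_evtchn_ops {
    int (*notify)(void *handle, unsigned int port);
    int (*unmask)(void *handle, unsigned int port);
};

/* State touched by the "emulated" implementation below. */
struct sk_emu_state {
    unsigned int last_notified;
    unsigned int last_unmasked;
};

static int sk_emu_notify(void *handle, unsigned int port)
{
    ((struct sk_emu_state *)handle)->last_notified = port;
    return 0;
}

static int sk_emu_unmask(void *handle, unsigned int port)
{
    ((struct sk_emu_state *)handle)->last_unmasked = port;
    return 0;
}

/* The prototypes match the table members, so no casts or shims are needed. */
static const struct sk_evtchn_ops sk_emu_ops = {
    .notify = sk_emu_notify,
    .unmask = sk_emu_unmask,
};

/* Callers go through this pointer; it stays NULL until a backend is set. */
static const struct sk_evtchn_ops *sk_ops;

static int sk_evtchn_notify(void *handle, unsigned int port)
{
    if (!sk_ops) {
        return -1;                    /* no implementation installed */
    }
    return sk_ops->notify(handle, port);
}
```

Installing the emulated implementation is then a single pointer assignment,
sk_ops = &sk_emu_ops, which mirrors what xen_evtchn_create() does in the
hunk below.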

Signed-off-by: David Woodhouse 
---
 hw/i386/kvm/xen_evtchn.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 886fbf6b3b..98a7b85047 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -34,6 +34,7 @@
 #include "hw/pci/msi.h"
 #include "hw/pci/msix.h"
 #include "hw/irq.h"
+#include "hw/xen/xen_backend_ops.h"
 
 #include "xen_evtchn.h"
 #include "xen_overlay.h"
@@ -278,6 +279,17 @@ static const TypeInfo xen_evtchn_info = {
 .class_init= xen_evtchn_class_init,
 };
 
+static struct evtchn_backend_ops emu_evtchn_backend_ops = {
+.open = xen_be_evtchn_open,
+.bind_interdomain = xen_be_evtchn_bind_interdomain,
+.unbind = xen_be_evtchn_unbind,
+.close = xen_be_evtchn_close,
+.get_fd = xen_be_evtchn_fd,
+.notify = xen_be_evtchn_notify,
+.unmask = xen_be_evtchn_unmask,
+.pending = xen_be_evtchn_pending,
+};
+
 static void gsi_assert_bh(void *opaque)
 {
 struct vcpu_info *vi = kvm_xen_get_vcpu_info_hva(0);
@@ -318,6 +330,9 @@ void xen_evtchn_create(void)
 s->nr_pirq_inuse_words = DIV_ROUND_UP(s->nr_pirqs, 64);
 s->pirq_inuse_bitmap = g_new0(uint64_t, s->nr_pirq_inuse_words);
 s->pirq = g_new0(struct pirq_info, s->nr_pirqs);
+
+/* Set event channel functions for backend drivers to use */
+xen_evtchn_ops = &emu_evtchn_backend_ops;
 }
 
 void xen_evtchn_connect_gsis(qemu_irq *system_gsis)
-- 
2.39.0




[RFC PATCH v1 19/25] hw/xen: Only advertise ring-page-order for xen-block if gnttab supports it

2023-03-02 Thread David Woodhouse
From: David Woodhouse 

When emulating Xen, multi-page grants are distinctly non-trivial and we
have elected not to support them for the time being. Don't advertise
them to the guest.
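
A simplified sketch of the resulting decision on the connect path is shown
below. The function name and parameters are hypothetical, and the real
xen_block_connect() reads these values from xenstore rather than taking them
as arguments; the sketch only shows how the number of ring references depends
on the order and on multi-page mapping support.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Work out how many ring references to map.
 *   has_order     : did the frontend supply a ring-page-order at all?
 *   order         : the order it supplied (ignored if !has_order)
 *   can_map_multi : can the grant table backend map several pages at once?
 * Returns 0 and sets *nr_ring_ref on success, -1 if the request is refused.
 */
static int sk_ring_ref_count(bool has_order, bool can_map_multi,
                             unsigned int order, unsigned int max_order,
                             unsigned int *nr_ring_ref)
{
    if (!has_order) {
        *nr_ring_ref = 1;             /* legacy single-page ring */
        return 0;
    }
    /* The order < 32 guard keeps the shift below well defined. */
    if (can_map_multi && order <= max_order && order < 32) {
        *nr_ring_ref = 1u << order;
        return 0;
    }
    return -1;                        /* multi-page ring not acceptable */
}
```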

Signed-off-by: David Woodhouse 
---
 hw/block/xen-block.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 87299615e3..f5a744589d 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -83,7 +83,8 @@ static void xen_block_connect(XenDevice *xendev, Error **errp)
 g_free(ring_ref);
 return;
 }
-} else if (order <= blockdev->props.max_ring_page_order) {
+} else if (qemu_xen_gnttab_can_map_multi() &&
+   order <= blockdev->props.max_ring_page_order) {
 unsigned int i;
 
 nr_ring_ref = 1 << order;
@@ -255,8 +256,12 @@ static void xen_block_realize(XenDevice *xendev, Error **errp)
 }
 
 xen_device_backend_printf(xendev, "feature-flush-cache", "%u", 1);
-xen_device_backend_printf(xendev, "max-ring-page-order", "%u",
-  blockdev->props.max_ring_page_order);
+
+if (qemu_xen_gnttab_can_map_multi()) {
+xen_device_backend_printf(xendev, "max-ring-page-order", "%u",
+  blockdev->props.max_ring_page_order);
+}
+
 xen_device_backend_printf(xendev, "info", "%u", blockdev->info);
 
 xen_device_frontend_printf(xendev, "virtual-device", "%lu",
-- 
2.39.0




[RFC PATCH v1 18/25] hw/xen: Avoid crash when backend watch fires too early

2023-03-02 Thread David Woodhouse
From: Paul Durrant 

The xen-block code ends up calling aio_poll() through blkconf_geometry(),
which means we see watch events during the indirect call to
xendev_class->realize() in xen_device_realize(). Unfortunately this call
is made before populating the initial frontend and backend device nodes
in xenstore and hence xen_block_frontend_changed() (which is called from
a watch event) fails to read the frontend's 'state' node, and hence
believes the device is being torn down. This in turn sets the backend
state to XenbusStateClosed and causes the device to be deleted before it
is fully set up, leading to the crash.
By simply moving the call to xendev_class->realize() after the initial
xenstore nodes are populated, this sorry state of affairs is avoided.

Reported-by: David Woodhouse 
Signed-off-by: Paul Durrant 
Signed-off-by: David Woodhouse 
---
 hw/xen/xen-bus.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/xen/xen-bus.c b/hw/xen/xen-bus.c
index 9fe54967d4..c59850b1de 100644
--- a/hw/xen/xen-bus.c
+++ b/hw/xen/xen-bus.c
@@ -1034,13 +1034,6 @@ static void xen_device_realize(DeviceState *dev, Error **errp)
 goto unrealize;
 }
 
-if (xendev_class->realize) {
-xendev_class->realize(xendev, errp);
-if (*errp) {
-goto unrealize;
-}
-}
-
 xen_device_backend_printf(xendev, "frontend", "%s",
   xendev->frontend_path);
 xen_device_backend_printf(xendev, "frontend-id", "%u",
@@ -1059,6 +1052,13 @@ static void xen_device_realize(DeviceState *dev, Error **errp)
 xen_device_frontend_set_state(xendev, XenbusStateInitialising, true);
 }
 
+if (xendev_class->realize) {
+xendev_class->realize(xendev, errp);
+if (*errp) {
+goto unrealize;
+}
+}
+
 xendev->exit.notify = xen_device_exit;
 qemu_add_exit_notifier(&xendev->exit);
 return;
-- 
2.39.0




RE: [PATCH v6 3/5] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping

2023-03-02 Thread Henry Wang
Hi Julien,

> -Original Message-
> Subject: [PATCH v6 3/5] xen/arm64: mm: Introduce helpers to
> prepare/enable/disable the identity mapping
> 
> From: Julien Grall 
> 
> In follow-up patches we will need to have part of Xen identity mapped in
> order to safely switch the TTBR.
> 
> On some platform, the identity mapping may have to start at 0. If we always
> keep the identity region mapped, NULL pointer dereference would lead to
> access to valid mapping.
> 
> It would be possible to relocate Xen to avoid clashing with address 0.
> However the identity mapping is only meant to be used in very limited
> places. Therefore it would be better to keep the identity region invalid
> for most of the time.
> 
> Two new external helpers are introduced:
> - arch_setup_page_tables() will setup the page-tables so it is
>   easy to create the mapping afterwards.
> - update_identity_mapping() will create/remove the identity mapping
> 
> Signed-off-by: Julien Grall 

Reviewed-by: Henry Wang 

Kind regards,
Henry



Re: [XEN PATCH v7 13/20] xen/arm: ffa: support mapping guest RX/TX buffers

2023-03-02 Thread Bertrand Marquis
Hi Jens,

> On 22 Feb 2023, at 16:33, Jens Wiklander  wrote:
> 
> Adds support in the mediator to map and unmap the RX and TX buffers
> provided by the guest using the two FF-A functions FFA_RXTX_MAP and
> FFA_RXTX_UNMAP.
> 
> These buffers are later used to transmit data that cannot be passed in
> registers only.
> 
> Signed-off-by: Jens Wiklander 
> ---
> xen/arch/arm/tee/ffa.c | 127 +
> 1 file changed, 127 insertions(+)
> 
> diff --git a/xen/arch/arm/tee/ffa.c b/xen/arch/arm/tee/ffa.c
> index f1b014b6c7f4..953b6dfd5eca 100644
> --- a/xen/arch/arm/tee/ffa.c
> +++ b/xen/arch/arm/tee/ffa.c
> @@ -149,10 +149,17 @@ struct ffa_partition_info_1_1 {
> };
> 
> struct ffa_ctx {
> +void *rx;
> +const void *tx;
> +struct page_info *rx_pg;
> +struct page_info *tx_pg;
> +unsigned int page_count;
> uint32_t guest_vers;
> +bool tx_is_mine;
> bool interrupted;
> };
> 
> +
Newline probably added by mistake.

> /* Negotiated FF-A version to use with the SPMC */
> static uint32_t ffa_version __ro_after_init;
> 
> @@ -337,6 +344,11 @@ static void set_regs(struct cpu_user_regs *regs, register_t v0, register_t v1,
> set_user_reg(regs, 7, v7);
> }
> 
> +static void set_regs_error(struct cpu_user_regs *regs, uint32_t error_code)
> +{
> +set_regs(regs, FFA_ERROR, 0, error_code, 0, 0, 0, 0, 0);
> +}
> +
> static void set_regs_success(struct cpu_user_regs *regs, uint32_t w2,
>  uint32_t w3)
> {
> @@ -358,6 +370,99 @@ static void handle_version(struct cpu_user_regs *regs)
> set_regs(regs, vers, 0, 0, 0, 0, 0, 0, 0);
> }
> 
> +static uint32_t handle_rxtx_map(uint32_t fid, register_t tx_addr,
> +register_t rx_addr, uint32_t page_count)
> +{
> +uint32_t ret = FFA_RET_INVALID_PARAMETERS;
> +struct domain *d = current->domain;
> +struct ffa_ctx *ctx = d->arch.tee;
> +struct page_info *tx_pg;
> +struct page_info *rx_pg;
> +p2m_type_t t;
> +void *rx;
> +void *tx;
> +
> +if ( !smccc_is_conv_64(fid) )
> +{
> +tx_addr &= UINT32_MAX;
> +rx_addr &= UINT32_MAX;
> +}

I am a bit unsure what we should do here:
- we could just say that the 32bit version of the call is not allowed for
  non 32bit guests
- we could check that the highest bits are 0 for 64bit guests and return an
  error if not
- we could just mask, hoping that if there was a mistake the address after
  the mask does not exist in the guest space

In the end nothing in the spec prevents a 64bit guest from using the 32bit
call, so it might be a good idea to return an error here if the highest
32 bits are not 0?
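
As an illustration only, the second option (reject rather than mask) could be sketched as below. This is a standalone sketch, not the mediator code: check_rxtx_addr() is a hypothetical helper, the is_conv_64 flag stands in for the result of smccc_is_conv_64(fid), and the two return codes are local placeholders for the FFA_RET_* values used in ffa.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local placeholders for the FFA_RET_* codes used by the mediator. */
#define FFA_RET_OK                  0
#define FFA_RET_INVALID_PARAMETERS  (-2)

/*
 * Hypothetical helper: validate an RX/TX address received from the guest.
 * With the 32-bit calling convention, the upper 32 bits must be zero;
 * otherwise the call is rejected instead of silently masking the address.
 */
static int32_t check_rxtx_addr(bool is_conv_64, uint64_t addr)
{
    if ( !is_conv_64 && (addr >> 32) != 0 )
        return FFA_RET_INVALID_PARAMETERS;

    return FFA_RET_OK;
}
```

With this shape, a 64bit guest using the 32bit call with a clean address still works, while a stray upper half is reported to the caller.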

> +
> +/* For now to keep things simple, only deal with a single page */
> +if ( page_count != 1 )
> +return FFA_RET_NOT_SUPPORTED;

Please add a TODO here and a print as this is a limitation we will probably
have to work on soon.


> +
> +/* Already mapped */
> +if ( ctx->rx )
> +return FFA_RET_DENIED;
> +
> +tx_pg = get_page_from_gfn(d, gfn_x(gaddr_to_gfn(tx_addr)), &t, P2M_ALLOC);
> +if ( !tx_pg )
> +return FFA_RET_INVALID_PARAMETERS;
> +/* Only normal RAM for now */
> +if ( !p2m_is_ram(t) )
> +goto err_put_tx_pg;
> +
> +rx_pg = get_page_from_gfn(d, gfn_x(gaddr_to_gfn(rx_addr)), &t, P2M_ALLOC);
> +if ( !rx_pg )
> +goto err_put_tx_pg;
> +/* Only normal RAM for now */
> +if ( !p2m_is_ram(t) )
> +goto err_put_rx_pg;
> +
> +tx = __map_domain_page_global(tx_pg);
> +if ( !tx )
> +goto err_put_rx_pg;
> +
> +rx = __map_domain_page_global(rx_pg);
> +if ( !rx )
> +goto err_unmap_tx;
> +
> +ctx->rx = rx;
> +ctx->tx = tx;
> +ctx->rx_pg = rx_pg;
> +ctx->tx_pg = tx_pg;
> +ctx->page_count = 1;

Please use page_count here instead of 1 so that this is not forgotten once
we add support for more pages.


Cheers
Bertrand

> +ctx->tx_is_mine = true;
> +return FFA_RET_OK;
> +
> +err_unmap_tx:
> +unmap_domain_page_global(tx);
> +err_put_rx_pg:
> +put_page(rx_pg);
> +err_put_tx_pg:
> +put_page(tx_pg);
> +
> +return ret;
> +}
> +
> +static void rxtx_unmap(struct ffa_ctx *ctx)
> +{
> +unmap_domain_page_global(ctx->rx);
> +unmap_domain_page_global(ctx->tx);
> +put_page(ctx->rx_pg);
> +put_page(ctx->tx_pg);
> +ctx->rx = NULL;
> +ctx->tx = NULL;
> +ctx->rx_pg = NULL;
> +ctx->tx_pg = NULL;
> +ctx->page_count = 0;
> +ctx->tx_is_mine = false;
> +}
> +
> +static uint32_t handle_rxtx_unmap(void)
> +{
> +struct domain *d = current->domain;
> +struct ffa_ctx *ctx = d->arch.tee;
> +
> +if ( !ctx->rx )
> +return FFA_RET_INVALID_PARAMETERS;
> +
> +rxtx_unmap(ctx);
> +
> +return FFA_RET_OK;
> +}
> +
> static void handle_msg_send_direct_req(struct cpu_user_regs *regs, uint32_t 
> fid)
> {
> struct arm_smccc_1_2_regs arg = { .a0 = fid, };
> @@ -423,6 +528,7 @@ static bool 

[PATCH v6 5/5] xen/arm64: smpboot: Directly switch to the runtime page-tables

2023-03-02 Thread Julien Grall
From: Julien Grall 

Switching TTBR while the MMU is on is not safe. Now that the identity
mapping will not clash with the rest of the memory layout, we can avoid
creating temporary page-tables every time a CPU is brought up.

The arm32 code will use a different approach. So this issue is for now
only resolved on arm64.
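
As a rough model of the arm64 bring-up flow in this patch: the identity mapping is enabled before the CPU is kicked, removed again immediately if the kick fails, and otherwise removed once bring-up has finished. The sketch below is standalone C with hypothetical stubs; identity_mapped, prepare_cpu_rc and prepare_cpu() are stand-ins for the real page-table update and for smp_enable_ops[cpu].prepare_cpu().

```c
#include <stdbool.h>

/* Hypothetical stand-in: the real helper edits Xen's page-tables. */
static bool identity_mapped;

static void update_identity_mapping(bool enable)
{
    identity_mapped = enable;
}

/* Hypothetical stand-in for smp_enable_ops[cpu].prepare_cpu(). */
static int prepare_cpu_rc;

static int prepare_cpu(int cpu)
{
    (void)cpu;
    return prepare_cpu_rc;
}

static int arch_cpu_up(int cpu)
{
    int rc;

    /* The secondary CPU enables its MMU while running from the 1:1 map. */
    update_identity_mapping(true);

    rc = prepare_cpu(cpu);
    if ( rc )
        update_identity_mapping(false); /* Nothing will use it: drop it. */

    return rc;
}

static void arch_cpu_up_finish(void)
{
    /* The CPU is up (or timed out): the 1:1 map is no longer needed. */
    update_identity_mapping(false);
}
```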

Signed-off-by: Julien Grall 
Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 
Reviewed-by: Bertrand Marquis 


Changes in v6:
- Add Bertrand's reviewed-by

Changes in v5:
- Add Luca's reviewed-by and tested-by tags.

Changes in v4:
- Somehow I forgot to send it in v3. So re-include it.

Changes in v2:
- Remove arm32 code
---
 xen/arch/arm/arm32/smpboot.c   |  4 
 xen/arch/arm/arm64/head.S  | 29 +
 xen/arch/arm/arm64/smpboot.c   | 15 ++-
 xen/arch/arm/include/asm/smp.h |  1 +
 xen/arch/arm/smpboot.c |  1 +
 5 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/xen/arch/arm/arm32/smpboot.c b/xen/arch/arm/arm32/smpboot.c
index e7368665d50d..518e9f9c7e70 100644
--- a/xen/arch/arm/arm32/smpboot.c
+++ b/xen/arch/arm/arm32/smpboot.c
@@ -21,6 +21,10 @@ int arch_cpu_up(int cpu)
 return platform_cpu_up(cpu);
 }
 
+void arch_cpu_up_finish(void)
+{
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 5efd442b24af..a61b4d3c2738 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -308,6 +308,7 @@ real_start_efi:
 bl    check_cpu_mode
 bl    cpu_init
 bl    create_page_tables
+load_paddr x0, boot_pgtable
 bl    enable_mmu
 
 /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
@@ -365,29 +366,14 @@ GLOBAL(init_secondary)
 #endif
 bl    check_cpu_mode
 bl    cpu_init
-bl    create_page_tables
+load_paddr x0, init_ttbr
+ldr   x0, [x0]
 bl    enable_mmu
 
 /* We are still in the 1:1 mapping. Jump to the runtime Virtual Address. */
 ldr   x0, =secondary_switched
 br    x0
 secondary_switched:
-/*
- * Non-boot CPUs need to move on to the proper pagetables, which were
- * setup in init_secondary_pagetables.
- *
- * XXX: This is not compliant with the Arm Arm.
- */
-ldr   x4, =init_ttbr /* VA of TTBR0_EL2 stashed by CPU 0 */
-ldr   x4, [x4]   /* Actual value */
-dsb   sy
-msr   TTBR0_EL2, x4
-dsb   sy
-isb
-tlbi  alle2
-dsb   sy /* Ensure completion of TLB flush */
-isb
-
 #ifdef CONFIG_EARLY_PRINTK
 /* Use a virtual address to access the UART. */
 ldr   x23, =EARLY_UART_VIRTUAL_ADDRESS
@@ -672,9 +658,13 @@ ENDPROC(create_page_tables)
  * mapping. In other word, the caller is responsible to switch to the runtime
  * mapping.
  *
- * Clobbers x0 - x3
+ * Inputs:
+ *   x0 : Physical address of the page tables.
+ *
+ * Clobbers x0 - x4
  */
 enable_mmu:
+mov   x4, x0
 PRINT("- Turning on paging -\r\n")
 
 /*
@@ -685,8 +675,7 @@ enable_mmu:
 dsb   nsh
 
 /* Write Xen's PT's paddr into TTBR0_EL2 */
-load_paddr x0, boot_pgtable
-msr   TTBR0_EL2, x0
+msr   TTBR0_EL2, x4
 isb
 
 mrs   x0, SCTLR_EL2
diff --git a/xen/arch/arm/arm64/smpboot.c b/xen/arch/arm/arm64/smpboot.c
index 694fbf67e62a..9637f424699e 100644
--- a/xen/arch/arm/arm64/smpboot.c
+++ b/xen/arch/arm/arm64/smpboot.c
@@ -106,10 +106,23 @@ int __init arch_cpu_init(int cpu, struct dt_device_node 
*dn)
 
 int arch_cpu_up(int cpu)
 {
+int rc;
+
 if ( !smp_enable_ops[cpu].prepare_cpu )
 return -ENODEV;
 
-return smp_enable_ops[cpu].prepare_cpu(cpu);
+update_identity_mapping(true);
+
+rc = smp_enable_ops[cpu].prepare_cpu(cpu);
+if ( rc )
+update_identity_mapping(false);
+
+return rc;
+}
+
+void arch_cpu_up_finish(void)
+{
+update_identity_mapping(false);
 }
 
 /*
diff --git a/xen/arch/arm/include/asm/smp.h b/xen/arch/arm/include/asm/smp.h
index 8133d5c29572..a37ca55bff2c 100644
--- a/xen/arch/arm/include/asm/smp.h
+++ b/xen/arch/arm/include/asm/smp.h
@@ -25,6 +25,7 @@ extern void noreturn stop_cpu(void);
 extern int arch_smp_init(void);
 extern int arch_cpu_init(int cpu, struct dt_device_node *dn);
 extern int arch_cpu_up(int cpu);
+extern void arch_cpu_up_finish(void);
 
 int cpu_up_send_sgi(int cpu);
 
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 412ae2286906..4a89b3a8345b 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -500,6 +500,7 @@ int __cpu_up(unsigned int cpu)
 init_data.cpuid = ~0;
 smp_up_cpu = MPIDR_INVALID;
 clean_dcache(smp_up_cpu);
+arch_cpu_up_finish();
 
 if ( !cpu_online(cpu) )
 {
-- 
2.39.1




[PATCH v6 2/5] xen/arm64: Rework the memory layout

2023-03-02 Thread Julien Grall
From: Julien Grall 

Xen is currently not fully compliant with the Arm Arm because it will
switch the TTBR with the MMU on.

In order to be compliant, we need to disable the MMU before
switching the TTBR. The implication is the page-tables should
contain an identity mapping of the code switching the TTBR.

In most cases we expect Xen to be loaded in low memory. I am aware
of one platform (i.e. AMD Seattle) where the memory starts above 512GB.
To give us some slack, consider that Xen may be loaded in the first 2TB
of the physical address space.

The memory layout is reshuffled to keep the first four slots of the zeroth
level free. Xen will now be loaded at (2TB + 2MB). This requires a slight
tweak of the boot code because XEN_VIRT_START cannot be used as an
immediate.

This reshuffle will make it trivial to create a 1:1 mapping when Xen is
loaded below 2TB.
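
The arithmetic behind the new layout can be checked with a small standalone sketch. SLOT0_ENTRY_BITS, SLOT0() and XEN_VIRT_START mirror the definitions in the diff below; MB() is redefined locally and slot0_index() is a hypothetical helper added only for illustration.

```c
#include <stdint.h>

/* Each zeroth-level (L0) entry covers 2^39 bytes, i.e. 512GB. */
#define SLOT0_ENTRY_BITS  39
#define SLOT0(slot)       ((uint64_t)(slot) << SLOT0_ENTRY_BITS)
#define MB(x)             ((uint64_t)(x) << 20)

/* Four reserved slots (2TB) for the identity map, then 2MB of padding. */
#define XEN_VIRT_START    (SLOT0(4) + MB(2))

/* Hypothetical helper: which L0 slot does a virtual address fall in? */
static unsigned int slot0_index(uint64_t va)
{
    return (va >> SLOT0_ENTRY_BITS) & 0x1ff;
}
```

The resulting value is too large for the immediate field of an arm64 "cmp", which is why the boot code has to load it into a register first.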

Signed-off-by: Julien Grall 
Tested-by: Luca Fancellu 
Reviewed-by: Michal Orzel 
Reviewed-by: Bertrand Marquis 


Changes in v6:
- Correct the BUILD_BUG_ON(), Xen virtual address should be
  above 2TB (i.e. slot0 > 4).
- Add Bertrand's reviewed-by

Changes in v5:
- We are reserving 4 slots rather than 2.
- Fix the addresses in the layout comment.
- Fix the size of the region in the layout comment
- Add Luca's tested-by (the reviewed-by was not added
  because of the changes requested by Michal)
- Add Michal's reviewed-by

Changes in v4:
- Correct the documentation
- The start address is 2TB, so slot0 is 4 not 2.

Changes in v2:
- Reword the commit message
- Load Xen at 2TB + 2MB
- Update the documentation to reflect the new layout
---
 xen/arch/arm/arm64/head.S |  3 ++-
 xen/arch/arm/include/asm/config.h | 34 +--
 xen/arch/arm/mm.c | 11 +-
 3 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 4a3f87117c83..663f5813b12e 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -607,7 +607,8 @@ create_page_tables:
  * need an additional 1:1 mapping, the virtual mapping will
  * suffice.
  */
-cmp   x19, #XEN_VIRT_START
+ldr   x0, =XEN_VIRT_START
+cmp   x19, x0
 bne   1f
 ret
 1:
diff --git a/xen/arch/arm/include/asm/config.h 
b/xen/arch/arm/include/asm/config.h
index 5df0e4c4959b..e388462c23d1 100644
--- a/xen/arch/arm/include/asm/config.h
+++ b/xen/arch/arm/include/asm/config.h
@@ -72,16 +72,13 @@
 #include 
 
 /*
- * Common ARM32 and ARM64 layout:
+ * ARM32 layout:
  *   0  -   2M   Unmapped
  *   2M -   4M   Xen text, data, bss
  *   4M -   6M   Fixmap: special-purpose 4K mapping slots
  *   6M -  10M   Early boot mapping of FDT
  *   10M - 12M   Livepatch vmap (if compiled in)
  *
- * ARM32 layout:
- *   0  -  12M   <COMMON>
- *
  *  32M - 128M   Frametable: 32 bytes per page for 12GB of RAM
  * 256M -   1G   VMAP: ioremap and early_ioremap use this virtual address
  *space
@@ -90,14 +87,23 @@
  *   2G -   4G   Domheap: on-demand-mapped
  *
  * ARM64 layout:
- * 0x0000000000000000 - 0x0000007fffffffff (512GB, L0 slot [0])
- *   0  -  12M   <COMMON>
+ * 0x0000000000000000 - 0x000001ffffffffff (2TB, L0 slots [0..3])
+ *
+ *  Reserved to identity map Xen
+ *
+ * 0x0000020000000000 - 0x0000027fffffffff (512GB, L0 slot [4]
+ *  (Relative offsets)
+ *   0  -   2M   Unmapped
+ *   2M -   4M   Xen text, data, bss
+ *   4M -   6M   Fixmap: special-purpose 4K mapping slots
+ *   6M -  10M   Early boot mapping of FDT
+ *  10M -  12M   Livepatch vmap (if compiled in)
  *
  *   1G -   2G   VMAP: ioremap and early_ioremap
  *
  *  32G -  64G   Frametable: 56 bytes per page for 2TB of RAM
  *
- * 0x0000008000000000 - 0x00007fffffffffff (127.5TB, L0 slots [1..255])
+ * 0x0000028000000000 - 0x00007fffffffffff (125TB, L0 slots [5..255])
  *  Unused
  *
  * 0x0000800000000000 - 0x000084ffffffffff (5TB, L0 slots [256..265])
@@ -107,7 +113,17 @@
  *  Unused
  */
 
+#ifdef CONFIG_ARM_32
 #define XEN_VIRT_START  _AT(vaddr_t, MB(2))
+#else
+
+#define SLOT0_ENTRY_BITS  39
+#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
+#define SLOT0_ENTRY_SIZE  SLOT0(1)
+
+#define XEN_VIRT_START  (SLOT0(4) + _AT(vaddr_t, MB(2)))
+#endif
+
 #define XEN_VIRT_SIZE   _AT(vaddr_t, MB(2))
 
 #define FIXMAP_VIRT_START   (XEN_VIRT_START + XEN_VIRT_SIZE)
@@ -163,10 +179,6 @@
 
 #else /* ARM_64 */
 
-#define SLOT0_ENTRY_BITS  39
-#define SLOT0(slot) (_AT(vaddr_t,slot) << SLOT0_ENTRY_BITS)
-#define SLOT0_ENTRY_SIZE  SLOT0(1)
-
 #define VMAP_VIRT_START  GB(1)
 #define VMAP_VIRT_SIZE   GB(1)
 
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index f758cad545fa..9263fedc3b7d 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -153,7 +153,7 @@ static void __init __maybe_unused build_assertions(void)
 #endif
 

[PATCH v6 4/5] xen/arm64: mm: Rework switch_ttbr()

2023-03-02 Thread Julien Grall
From: Julien Grall 

At the moment, switch_ttbr() is switching the TTBR whilst the MMU is
still on.

Switching TTBR is like replacing existing mappings with new ones. So
we need to follow the break-before-make sequence.

In this case, it means the MMU needs to be switched off while the
TTBR is updated. In order to disable the MMU, we need to first
jump to an identity mapping.

Rename switch_ttbr() to switch_ttbr_id() and create a helper on
top to temporarily map the identity mapping and call switch_ttbr()
via the identity address.

switch_ttbr_id() is now reworked to temporarily turn off the MMU
before updating the TTBR.

We also need to make sure the helper switch_ttbr() is part of the
identity mapping. So move _end_boot past it.

The arm32 code will use a different approach. So this issue is for now
only resolved on arm64.

Signed-off-by: Julien Grall 
Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 
Reviewed-by: Michal Orzel 
Reviewed-by: Bertrand Marquis 


Changes in v6:
- Add Michal's reviewed-by tag
- Add Bertrand's reviewed-by tag

Changes in v5:
- Add a newline in switch_ttbr()
- Add Luca's reviewed-by and tested-by

Changes in v4:
- Don't modify setup_pagetables() as we don't handle arm32.
- Move the clearing of the boot page tables in an earlier patch
- Fix the numbering

Changes in v2:
- Remove the arm32 changes. This will be addressed differently
- Re-instate the instruction cache flush. This is not strictly
  necessary but it is kept for safety.
- Use "dsb ish"  rather than "dsb sy".


TODO:
* Handle the case where the runtime Xen is loaded at a different
  position for cache coloring. This will be dealt with separately.
---
 xen/arch/arm/arm64/head.S | 50 +++
 xen/arch/arm/arm64/mm.c   | 31 ++
 xen/arch/arm/include/asm/mm.h |  2 ++
 xen/arch/arm/mm.c |  2 --
 4 files changed, 66 insertions(+), 19 deletions(-)

diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 663f5813b12e..5efd442b24af 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -816,30 +816,46 @@ ENDPROC(fail)
  * Switch TTBR
  *
  * x0    ttbr
- *
- * TODO: This code does not comply with break-before-make.
  */
-ENTRY(switch_ttbr)
-dsb   sy /* Ensure the flushes happen before
-  * continuing */
-isb  /* Ensure synchronization with previous
-  * changes to text */
-tlbi   alle2 /* Flush hypervisor TLB */
-ic iallu /* Flush I-cache */
-dsb    sy /* Ensure completion of TLB flush */
+ENTRY(switch_ttbr_id)
+/* 1) Ensure any previous read/write have completed */
+dsb   ish
+isb
+
+/* 2) Turn off MMU */
+mrs   x1, SCTLR_EL2
+bic   x1, x1, #SCTLR_Axx_ELx_M
+msr   SCTLR_EL2, x1
+isb
+
+/*
+ * 3) Flush the TLBs.
+ * See asm/arm64/flushtlb.h for the explanation of the sequence.
+ */
+dsb   nshst
+tlbi  alle2
+dsb   nsh
+isb
+
+/* 4) Update the TTBR */
+msr   TTBR0_EL2, x0
 isb
 
-msr    TTBR0_EL2, x0
+/*
+ * 5) Flush I-cache
+ * This should not be necessary but it is kept for safety.
+ */
+ic iallu
+isb
 
-isb  /* Ensure synchronization with previous
-  * changes to text */
-tlbi   alle2 /* Flush hypervisor TLB */
-ic iallu /* Flush I-cache */
-dsb    sy /* Ensure completion of TLB flush */
+/* 6) Turn on the MMU */
+mrs   x1, SCTLR_EL2
+orr   x1, x1, #SCTLR_Axx_ELx_M  /* Enable MMU */
+msr   SCTLR_EL2, x1
 isb
 
 ret
-ENDPROC(switch_ttbr)
+ENDPROC(switch_ttbr_id)
 
 #ifdef CONFIG_EARLY_PRINTK
 /*
diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
index 56b9e9b8d3ef..78b7c7eb004f 100644
--- a/xen/arch/arm/arm64/mm.c
+++ b/xen/arch/arm/arm64/mm.c
@@ -120,6 +120,37 @@ void update_identity_mapping(bool enable)
 BUG_ON(rc);
 }
 
+extern void switch_ttbr_id(uint64_t ttbr);
+
+typedef void (switch_ttbr_fn)(uint64_t ttbr);
+
+void __init switch_ttbr(uint64_t ttbr)
+{
+vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
+switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
+lpae_t pte;
+
+/* Enable the identity mapping in the boot page tables */
+update_identity_mapping(true);
+
+/* Enable the identity mapping in the runtime page tables */
+pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
+pte.pt.table = 1;
+pte.pt.xn = 0;
+pte.pt.ro = 1;
+

[PATCH v6 3/5] xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping

2023-03-02 Thread Julien Grall
From: Julien Grall 

In follow-up patches we will need to have part of Xen identity mapped in
order to safely switch the TTBR.

On some platforms, the identity mapping may have to start at 0. If we always
keep the identity region mapped, a NULL pointer dereference would lead to
an access to a valid mapping.

It would be possible to relocate Xen to avoid clashing with address 0.
However the identity mapping is only meant to be used in very limited
places. Therefore it would be better to keep the identity region invalid
for most of the time.

Two new external helpers are introduced:
- arch_setup_page_tables() will set up the page-tables so it is
  easy to create the mapping afterwards.
- update_identity_mapping() will create/remove the identity mapping
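
The placement check these helpers depend on ("Cannot handle ID mapping above 2TB") can be sketched as below. This is a standalone sketch assuming a 4KB granule with 9 index bits per level; table_offset() and id_addr_is_supported() are hypothetical names, and IDENTITY_MAPPING_AREA_NR_L0 is set to 4 to match the four reserved slots mentioned in the v5 changelog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4KB-granule, 4-level layout: 9 bits of index per level. */
#define PAGE_SHIFT       12
#define LPAE_SHIFT       9
#define LPAE_ENTRY_MASK  ((1u << LPAE_SHIFT) - 1)

/* Four L0 slots of 512GB each (2TB) are reserved for the identity map. */
#define IDENTITY_MAPPING_AREA_NR_L0  4

/* Index into the table at 'level' (0 = root, 3 = leaf) for 'addr'. */
static unsigned int table_offset(uint64_t addr, unsigned int level)
{
    unsigned int shift = PAGE_SHIFT + LPAE_SHIFT * (3 - level);

    return (addr >> shift) & LPAE_ENTRY_MASK;
}

/* True if the physical address falls inside the reserved L0 slots. */
static bool id_addr_is_supported(uint64_t id_addr)
{
    return table_offset(id_addr, 0) < IDENTITY_MAPPING_AREA_NR_L0;
}
```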

Signed-off-by: Julien Grall 


Changes in v6:
- Correctly check the placement of the identity mapping (take
  2).
- Fix typos

Changes in v5:
- The reserved area for the identity mapping is 2TB (so 4 slots)
  rather than 512GB.

Changes in v4:
- Fix typo in a comment
- Clarify which page-tables are updated

Changes in v2:
- Remove the arm32 part
- Use a different logic for the boot page tables and runtime
  one because Xen may be running in a different place.
---
 xen/arch/arm/arm64/Makefile |   1 +
 xen/arch/arm/arm64/mm.c | 130 
 xen/arch/arm/include/asm/arm32/mm.h |   4 +
 xen/arch/arm/include/asm/arm64/mm.h |  13 +++
 xen/arch/arm/include/asm/config.h   |   2 +
 xen/arch/arm/include/asm/setup.h|  11 +++
 xen/arch/arm/mm.c   |   6 +-
 7 files changed, 165 insertions(+), 2 deletions(-)
 create mode 100644 xen/arch/arm/arm64/mm.c

diff --git a/xen/arch/arm/arm64/Makefile b/xen/arch/arm/arm64/Makefile
index 6d507da0d44d..28481393e98f 100644
--- a/xen/arch/arm/arm64/Makefile
+++ b/xen/arch/arm/arm64/Makefile
@@ -10,6 +10,7 @@ obj-y += entry.o
 obj-y += head.o
 obj-y += insn.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
+obj-y += mm.o
 obj-y += smc.o
 obj-y += smpboot.o
 obj-y += traps.o
diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
new file mode 100644
index ..56b9e9b8d3ef
--- /dev/null
+++ b/xen/arch/arm/arm64/mm.c
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include 
+#include 
+
+#include 
+
+/* Override macros from asm/page.h to make them work with mfn_t */
+#undef virt_to_mfn
+#define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
+
+static DEFINE_PAGE_TABLE(xen_first_id);
+static DEFINE_PAGE_TABLE(xen_second_id);
+static DEFINE_PAGE_TABLE(xen_third_id);
+
+/*
+ * The identity mapping may start at physical address 0. So we don't want
+ * to keep it mapped longer than necessary.
+ *
+ * When this is called, we are still using the boot_pgtable.
+ *
+ * We need to prepare the identity mapping for both the boot page tables
+ * and runtime page tables.
+ *
+ * The logic to create the entry is slightly different because Xen may
+ * be running at a different location at runtime.
+ */
+static void __init prepare_boot_identity_mapping(void)
+{
+paddr_t id_addr = virt_to_maddr(_start);
+lpae_t pte;
+DECLARE_OFFSETS(id_offsets, id_addr);
+
+/*
+ * We will be re-using the boot ID tables. They may not have been
+ * zeroed but they should be unlinked. So it is fine to use
+ * clear_page().
+ */
+clear_page(boot_first_id);
+clear_page(boot_second_id);
+clear_page(boot_third_id);
+
+if ( id_offsets[0] >= IDENTITY_MAPPING_AREA_NR_L0 )
+panic("Cannot handle ID mapping above 2TB\n");
+
+/* Link first ID table */
+pte = mfn_to_xen_entry(virt_to_mfn(boot_first_id), MT_NORMAL);
+pte.pt.table = 1;
+pte.pt.xn = 0;
+
+write_pte(&boot_pgtable[id_offsets[0]], pte);
+
+/* Link second ID table */
+pte = mfn_to_xen_entry(virt_to_mfn(boot_second_id), MT_NORMAL);
+pte.pt.table = 1;
+pte.pt.xn = 0;
+
+write_pte(&boot_first_id[id_offsets[1]], pte);
+
+/* Link third ID table */
+pte = mfn_to_xen_entry(virt_to_mfn(boot_third_id), MT_NORMAL);
+pte.pt.table = 1;
+pte.pt.xn = 0;
+
+write_pte(&boot_second_id[id_offsets[2]], pte);
+
+/* The mapping in the third table will be created at a later stage */
+}
+
+static void __init prepare_runtime_identity_mapping(void)
+{
+paddr_t id_addr = virt_to_maddr(_start);
+lpae_t pte;
+DECLARE_OFFSETS(id_offsets, id_addr);
+
+if ( id_offsets[0] >= IDENTITY_MAPPING_AREA_NR_L0 )
+panic("Cannot handle ID mapping above 2TB\n");
+
+/* Link first ID table */
+pte = pte_of_xenaddr((vaddr_t)xen_first_id);
+pte.pt.table = 1;
+pte.pt.xn = 0;
+
+write_pte(&xen_pgtable[id_offsets[0]], pte);
+
+/* Link second ID table */
+pte = pte_of_xenaddr((vaddr_t)xen_second_id);
+pte.pt.table = 1;
+pte.pt.xn = 0;
+
+write_pte(&xen_first_id[id_offsets[1]], pte);
+
+/* Link third ID table */
+pte = 

[PATCH v6 0/5] xen/arm: Don't switch TTBR while the MMU is on

2023-03-02 Thread Julien Grall
From: Julien Grall 

Hi all,

Currently, Xen on Arm will switch TTBR whilst the MMU is on. This is
similar to replacing existing mappings with new ones. So we need to
follow a break-before-make sequence.

When switching the TTBR, we need to temporarily disable the MMU
before updating the TTBR. This means the page-tables must contain an
identity mapping.

The current memory layout is not very flexible and has a higher chance
of clashing with the identity mapping.

On Arm64, we have plenty of unused virtual address space. Therefore, we can
simply reshuffle the layout to leave the first part of the virtual
address space empty.

On Arm32, the virtual address space is already quite full. Even if we
find space, it would be necessary to have a dynamic layout. So a
different approach will be necessary. The chosen one is to have
a temporary mapping that will be used to jump from the ID mapping
to the runtime mapping (or vice versa). The temporary mapping will
be overlapping with the domheap area as it should not be used when
switching on/off the MMU.

The Arm32 part is not yet addressed and will be handled in a follow-up
series.

After this series, most of Xen page-table code should be compliant
with the Arm Arm. The last two issues I am aware of are:
 - domheap: Mappings are replaced without using the Break-Before-Make
   approach.
 - The cache is not cleaned/invalidated when updating the page-tables
   with Data cache off (like during early boot).

The long term plan is to get rid of boot_* page tables and then
directly use the runtime pages. This means for coloring, we will
need to build the pages in the relocated Xen rather than the current
Xen.

For convenience, I pushed a branch with everything applied:

https://xenbits.xen.org/git-http/people/julieng/xen-unstable.git
branch boot-pt-rework-v6

Cheers,

Julien Grall (5):
  xen/arm32: head: Widen the use of the temporary mapping
  xen/arm64: Rework the memory layout
  xen/arm64: mm: Introduce helpers to prepare/enable/disable the
identity mapping
  xen/arm64: mm: Rework switch_ttbr()
  xen/arm64: smpboot: Directly switch to the runtime page-tables

 xen/arch/arm/arm32/head.S   |  86 +++
 xen/arch/arm/arm32/smpboot.c|   4 +
 xen/arch/arm/arm64/Makefile |   1 +
 xen/arch/arm/arm64/head.S   |  82 +++---
 xen/arch/arm/arm64/mm.c | 161 
 xen/arch/arm/arm64/smpboot.c|  15 ++-
 xen/arch/arm/include/asm/arm32/mm.h |   4 +
 xen/arch/arm/include/asm/arm64/mm.h |  13 +++
 xen/arch/arm/include/asm/config.h   |  34 --
 xen/arch/arm/include/asm/mm.h   |   2 +
 xen/arch/arm/include/asm/setup.h|  11 ++
 xen/arch/arm/include/asm/smp.h  |   1 +
 xen/arch/arm/mm.c   |  19 ++--
 xen/arch/arm/smpboot.c  |   1 +
 14 files changed, 306 insertions(+), 128 deletions(-)
 create mode 100644 xen/arch/arm/arm64/mm.c

-- 
2.39.1




[PATCH v6 1/5] xen/arm32: head: Widen the use of the temporary mapping

2023-03-02 Thread Julien Grall
From: Julien Grall 

At the moment, the temporary mapping is only used when the virtual
runtime region of Xen is clashing with the physical region.

In follow-up patches, we will rework how secondary CPU bring-up works
and it will be convenient to use the fixmap area for accessing
the root page-table (it is per-cpu).

Rework the code to use temporary mapping when the Xen physical address
is not overlapping with the temporary mapping.

This also has the advantage of simplifying the logic to identity map
Xen.

Signed-off-by: Julien Grall 
Reviewed-by: Henry Wang 
Tested-by: Henry Wang 
Reviewed-by: Michal Orzel 



Even if this patch is rewriting part of the previous patch, I decided
to keep them separated to help the review.

The "follow-up patches" are still in draft at the moment. I still haven't
found a way to split them nicely and not require too much more work
on the coloring side.

I have provided some medium-term goals in the cover letter.

Changes in v6:
- Add Henry's reviewed-by and tested-by tag
- Add Michal's reviewed-by
- Add newline in remove_identity_mapping for clarity

Changes in v5:
- Fix typo in a comment
- No need to link boot_{second, third}_id again if we need to
  create a temporary area.

Changes in v3:
- Resolve conflicts after switching from "ldr rX, " to
  "mov_w rX, " in a previous patch

Changes in v2:
- Patch added
---
 xen/arch/arm/arm32/head.S | 86 ---
 1 file changed, 16 insertions(+), 70 deletions(-)

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index df51550baa8a..9befffd85079 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -459,7 +459,6 @@ ENDPROC(cpu_init)
 create_page_tables:
 /* Prepare the page-tables for mapping Xen */
 mov_w r0, XEN_VIRT_START
-create_table_entry boot_pgtable, boot_second, r0, 1
 create_table_entry boot_second, boot_third, r0, 2
 
 /* Setup boot_third: */
@@ -479,70 +478,37 @@ create_page_tables:
 cmp   r1, #(XEN_PT_LPAE_ENTRIES<<3) /* 512*8-byte entries per page */
 blo   1b
 
-/*
- * If Xen is loaded at exactly XEN_VIRT_START then we don't
- * need an additional 1:1 mapping, the virtual mapping will
- * suffice.
- */
-cmp   r9, #XEN_VIRT_START
-moveq pc, lr
-
 /*
  * Setup the 1:1 mapping so we can turn the MMU on. Note that
  * only the first page of Xen will be part of the 1:1 mapping.
- *
- * In all the cases, we will link boot_third_id. So create the
- * mapping in advance.
  */
+create_table_entry boot_pgtable, boot_second_id, r9, 1
+create_table_entry boot_second_id, boot_third_id, r9, 2
 create_mapping_entry boot_third_id, r9, r9
 
 /*
- * Find the first slot used. If the slot is not XEN_FIRST_SLOT,
- * then the 1:1 mapping will use its own set of page-tables from
- * the second level.
+ * Find the first slot used. If the slot is not the same
+ * as TEMPORARY_AREA_FIRST_SLOT, then we will want to switch
+ * to the temporary mapping before jumping to the runtime
+ * virtual mapping.
  */
 get_table_slot r1, r9, 1 /* r1 := first slot */
-cmp   r1, #XEN_FIRST_SLOT
-beq   1f
-create_table_entry boot_pgtable, boot_second_id, r9, 1
-b link_from_second_id
-
-1:
-/*
- * Find the second slot used. If the slot is XEN_SECOND_SLOT, then the
- * 1:1 mapping will use its own set of page-tables from the
- * third level.
- */
-get_table_slot r1, r9, 2 /* r1 := second slot */
-cmp   r1, #XEN_SECOND_SLOT
-beq   virtphys_clash
-create_table_entry boot_second, boot_third_id, r9, 2
-b link_from_third_id
+cmp   r1, #TEMPORARY_AREA_FIRST_SLOT
+bne   use_temporary_mapping
 
-link_from_second_id:
-create_table_entry boot_second_id, boot_third_id, r9, 2
-link_from_third_id:
-/* Good news, we are not clashing with Xen virtual mapping */
+mov_w r0, XEN_VIRT_START
+create_table_entry boot_pgtable, boot_second, r0, 1
 mov   r12, #0/* r12 := temporary mapping not created */
 mov   pc, lr
 
-virtphys_clash:
+use_temporary_mapping:
 /*
- * The identity map clashes with boot_third. Link boot_first_id and
- * map Xen to a temporary mapping. See switch_to_runtime_mapping
- * for more details.
+ * The identity mapping is not using the first slot
+ * TEMPORARY_AREA_FIRST_SLOT. Create a temporary mapping.
+ * See switch_to_runtime_mapping for more details.
  */
-PRINT("- Virt and Phys addresses clash  -\r\n")
 PRINT("- Create temporary mapping -\r\n")
 
-

Re: [PATCH v2 2/3] xen/riscv: initialize .bss section

2023-03-02 Thread Oleksii
On Thu, 2023-03-02 at 14:12 +, Andrew Cooper wrote:
> On 02/03/2023 1:23 pm, Oleksii Kurochko wrote:
> > Signed-off-by: Oleksii Kurochko 
> > ---
> > Changes since v1:
> >  * initialization of .bss was moved to head.S
> > ---
> >  xen/arch/riscv/include/asm/asm.h | 4 
> >  xen/arch/riscv/riscv64/head.S    | 9 +
> >  2 files changed, 13 insertions(+)
> > 
> > diff --git a/xen/arch/riscv/include/asm/asm.h
> > b/xen/arch/riscv/include/asm/asm.h
> > index 6d426ecea7..5208529cb4 100644
> > --- a/xen/arch/riscv/include/asm/asm.h
> > +++ b/xen/arch/riscv/include/asm/asm.h
> > @@ -26,14 +26,18 @@
> >  #if __SIZEOF_POINTER__ == 8
> >  #ifdef __ASSEMBLY__
> >  #define RISCV_PTR  .dword
> > +#define RISCV_SZPTR  8
> >  #else
> >  #define RISCV_PTR  ".dword"
> > +#define RISCV_SZPTR  8
> >  #endif
> >  #elif __SIZEOF_POINTER__ == 4
> >  #ifdef __ASSEMBLY__
> >  #define RISCV_PTR  .word
> > +#define RISCV_SZPTR  4
> >  #else
> >  #define RISCV_PTR  ".word"
> > +#define RISCV_SZPTR  4
> 
> This is an extremely verbose way of saying that __SIZEOF_POINTER__ is the
> right value to use...
> 
> Just drop the change here.  The code is better without this indirection.
> 
> >  #endif
> >  #else
> >  #error "Unexpected __SIZEOF_POINTER__"
> > diff --git a/xen/arch/riscv/riscv64/head.S
> > b/xen/arch/riscv/riscv64/head.S
> > index 851b4691a5..b139976b7a 100644
> > --- a/xen/arch/riscv/riscv64/head.S
> > +++ b/xen/arch/riscv/riscv64/head.S
> > @@ -13,6 +13,15 @@ ENTRY(start)
> >  lla a6, _dtb_base
> >  REG_S   a1, (a6)
> >  
> 
> /* Clear the BSS */
> 
> The comments (even just oneliners) will become increasingly useful as
> the logic here grows.
> 
> > +    la  a3, __bss_start
> > +    la  a4, __bss_end
> > +    ble a4, a3, clear_bss_done
> > +clear_bss:
> > +    REG_S   zero, (a3)
> > +    add a3, a3, RISCV_SZPTR
> > +    blt a3, a4, clear_bss
> > +clear_bss_done:
> 
> You should use t's here, not a's.  What we are doing here is temporary
> and not constructing arguments to a function.  Furthermore we want to
> preserve the a's where possible to avoid spilling the parameters.
> 
> Finally, the symbols should have an .L_ prefix to make them local
> symbols.
> 
> It really doesn't matter now, but will when you're retrofitting elf
> metadata to asm code to make livepatching work.  (I'm doing this on x86
> and it would have been easier if people had written the code nicely the
> first time around.)
Thanks. I'll update the code.
> 
> ~Andrew
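
For reference, the logic of the .bss clearing loop discussed above can be modelled in C as below. This is only a sketch of the control flow (skip when the range is empty, otherwise store one pointer-sized zero per iteration); the function name and the way the bounds are passed are assumptions, whereas the real code works on __bss_start and __bss_end in head.S.

```c
#include <stdint.h>

/*
 * Zero the range [start, end) one pointer-sized word at a time.
 * The stride is sizeof(uintptr_t), which is what __SIZEOF_POINTER__
 * expresses on the assembly side. An empty or inverted range is a no-op.
 */
static void clear_bss(uintptr_t *start, uintptr_t *end)
{
    for ( uintptr_t *p = start; p < end; p++ )
        *p = 0;
}
```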




Re: [PATCH v2 1/3] xen/riscv: read/save hart_id and dtb_base passed by bootloader

2023-03-02 Thread Oleksii
On Thu, 2023-03-02 at 14:02 +, Andrew Cooper wrote:
> On 02/03/2023 1:23 pm, Oleksii Kurochko wrote:
> > diff --git a/xen/arch/riscv/riscv64/head.S
> > b/xen/arch/riscv/riscv64/head.S
> > index ffd95f9f89..851b4691a5 100644
> > --- a/xen/arch/riscv/riscv64/head.S
> > +++ b/xen/arch/riscv/riscv64/head.S
> > @@ -6,8 +7,31 @@ ENTRY(start)
> >  /* Mask all interrupts */
> >  csrw    CSR_SIE, zero
> >  
> > +    /* Save HART ID and DTB base */
> > +    lla a6, _bootcpu_id
> > +    REG_S   a0, (a6)
> > +    lla a6, _dtb_base
> > +    REG_S   a1, (a6)
> > +
> >  la  sp, cpu0_boot_stack
> >  li  t0, STACK_SIZE
> >  add sp, sp, t0
> >  
> > +    lla a6, _bootcpu_id
> > +    REG_L   a0, (a6)
> > +    lla a6, _dtb_base
> > +    REG_L   a1, (a6)
> 
> This is overkill.
> 
> Put a comment at start identifying which parameters are in which
> registers, and just make sure not to clobber them - RISCV has plenty of
> registers.
> 
> Right now, we don't touch the a registers at all, which is why your v1
> patch worked.  (a0 and a1 still have the same value when we get into C).
> 
> If we do need to start calling more complex things here (and I'm not
> sure that we do), we could either store the parameters in s2-11, or
> spill them onto the stack; both of which are preferable to spilling to
> global variables like this.
> 
> > +
> >  tail    start_xen
> > +
> > +    /*
> > + * Boot cpu id is passed by a bootloader
> > + */
> > +_bootcpu_id:
> > +    RISCV_PTR 0x0
> 
> Just a note (as you want to delete this anyway), this isn't a PTR, it's
> a LONG.
> 
> > +
> > +    /*
> > + * DTB base is passed by a bootloader
> > + */
> > +_dtb_base:
> > +    RISCV_PTR 0x0
> > diff --git a/xen/arch/riscv/setup.c b/xen/arch/riscv/setup.c
> > index 1c87899e8e..d9723fe1c0 100644
> > --- a/xen/arch/riscv/setup.c
> > +++ b/xen/arch/riscv/setup.c
> > @@ -7,7 +7,8 @@
> >  unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
> >  __aligned(STACK_SIZE);
> >  
> > -void __init noreturn start_xen(void)
> > +void __init noreturn start_xen(unsigned long bootcpu_id,
> > +   unsigned long dtb_base)
> 
> To be clear, this change should be this hunk exactly as it is, and a
> comment immediately ahead of ENTRY(start) describing the entry ABI.
> 
> There is no need currently to change any of the asm code.
I think that I'll use s2 and s3 to save bootcpu_id and dtb_base.

But I am not sure I understand why the asm code shouldn't be changed.

I mean that a0-7 are used for function arguments and a0-1 for return
values, so they are expected to change across calls. That is why we
have to save them somewhere.

If I understand you correctly, I can write in a comment ahead of
ENTRY(start) that a0 and a1 are reserved for hart_id and dtb_base,
which are passed from the bootloader, but that will work only if
start_xen is the only C function called from head.S.

I probably misunderstand you...

~ Oleksii
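
To make the register question above concrete, here is a toy C model of the point under discussion. It is only a sketch: `struct regs`, `model_c_call` and `model_entry` are hypothetical names invented for illustration and are not part of Xen. It relies on the standard RISC-V calling convention, in which a0-a7 are caller-saved argument registers and s0-s11 are callee-saved, so a value parked in s2/s3 survives an intermediate C call while a0/a1 may not.

```c
#include <stdint.h>

/* Toy model of the integer registers relevant to the discussion. */
struct regs {
    uint64_t a[8];   /* a0-a7: arguments / return values, caller-saved */
    uint64_t s[12];  /* s0-s11: callee-saved */
};

/*
 * Hypothetical callee: a C function is free to overwrite the a registers,
 * but must leave the s registers as it found them.
 */
void model_c_call(struct regs *r)
{
    for (int i = 0; i < 8; i++)
        r->a[i] = 0xdeadbeef;   /* simulate clobbering */
}

/*
 * Stash the boot parameters in s2/s3 across an intermediate call, then
 * restore them into a0/a1 before the (not modelled) tail call.
 */
void model_entry(struct regs *r)
{
    r->s[2] = r->a[0];   /* hart_id  */
    r->s[3] = r->a[1];   /* dtb_base */
    model_c_call(r);
    r->a[0] = r->s[2];
    r->a[1] = r->s[3];
}
```

If nothing is called before start_xen, as in the patch under review, the stash and restore steps are unnecessary, which is the reviewer's point: a0 and a1 simply arrive in C unchanged.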




Re: [PATCH v1] xen/arm: align *(.proc.info) in the linker script

2023-03-02 Thread Julien Grall

Hi Oleksii,

On 02/03/2023 07:34, Oleksii wrote:

Hi Julien,

On Wed, 2023-03-01 at 16:21 +, Julien Grall wrote:

Hi Oleksii,

On 01/03/2023 16:14, Oleksii Kurochko wrote:

During testing of the generic implementation of bug.h's macros, the
yocto-qemuarm job crashed with a data abort:


The commit message is lacking some information. You are telling us that
there is an error when building with your series, but this doesn't tell
me why this is the correct fix.

This is also why I asked to have the xen binary, because I want to check
whether this was a latent bug in Xen or your series effectively
introduces a bug.

Note that, regardless of what I just wrote, it is a good idea to align
__proc_info_start. I will try to have a closer look later and propose a
commit message and/or any action for your other series.

Regarding binaries please take a look here:
https://lore.kernel.org/xen-devel/aa2862eacccfb0574859bf4cda8f4992baa5d2e1.ca...@gmail.com/

I am not sure if you got my answer, as I had a message from the delivery
server that it was blocked for some reason.


I got the answer. The problem now is that gitlab only keeps the artifact
for the latest build and it only provides a zImage (having the ELF would
be easier).

I will try to reproduce the error on my end.


I managed to reproduce it. It looks like, after your bug patch,
*(.rodata.*) no longer ends on a 4-byte boundary. Before your patch,
all the messages were in .rodata.str. Now they are in .bug_frames.*,
so there is some difference in .rodata.*.

That said, it is not entirely clear why we have never seen the issue
before, because my guess is that there is no guarantee that .rodata.*
will be suitably aligned.

Anyway, here is a proposal for the commit message:

"
xen/arm: Ensure the start of *(.proc.info) is 4-byte aligned

The entries in *(.proc.info) are expected to be 4-byte aligned and the
compiler will access them using 4-byte load instructions. On Arm32, the
alignment is strictly enforced by the processor and will result in a
data abort if it is not correct.

However, the linker script doesn't encode this requirement. So we are at
the mercy of the compiler/linker to have aligned the previous sections
suitably.

This was spotted when trying to use the upcoming generic bug
infrastructure with the compiler provided by Yocto.

Link:
https://lore.kernel.org/xen-devel/6735859208c6dcb7320f89664ae298005f70827b.ca...@gmail.com/
"

If you are happy with the proposed commit message, then I can update it
while committing.

I am happy with the proposed commit message.


Thanks. With that:

Reviewed-by: Julien Grall 

I have addressed Jan's comment and committed the patch.

Cheers,

--
Julien Grall
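
As an illustration of the alignment requirement discussed in this thread, the sketch below shows in C the arithmetic that a linker-script alignment directive such as GNU ld's ALIGN(4) performs on the location counter. The helper names `align_up` and `is_aligned` are assumptions made for this example and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* True when addr is a multiple of align (a power of two). */
bool is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

For example, if the preceding section happened to end at 0x1002, a 4-byte load from that address would fault on Arm32, whereas align_up(0x1002, 4) yields 0x1004, which is safe to use as the start of the table.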



[PATCH] x86emul: rework compiler probing in the test harness

2023-03-02 Thread Jan Beulich
Checking for what $(SIMD) contains was initially right, but already the
addition of $(FMA) wasn't. Later categories (correctly) weren't added.
Instead what is of interest is anything the main harness source file
uses outside of suitable #if and without resorting to .byte, as that's
the one file (containing actual tests) which has to succeed in building.
The auxiliary binary blobs we utilize may fail to build; the resulting
empty blobs are recognized and reported as "n/a" when the harness is
run.

Note that strictly speaking we'd need to probe the assembler. We assume
that a compiler knowing of a certain ISA extension is backed by an
equally capable assembler.

Signed-off-by: Jan Beulich 
---
A little while ago this would probably have enabled osstest to actually
build the harness. Luckily meanwhile a new enough gcc is in use there
to be unaffected by the inappropriate checking.

--- a/tools/tests/x86_emulator/Makefile
+++ b/tools/tests/x86_emulator/Makefile
@@ -104,11 +104,13 @@ TARGET-y := $(TARGET)
 
 ifeq ($(filter run%,$(MAKECMDGOALS)),)
 
-define simd-check-cc
+define isa-check-cc
 TARGET-$(shell echo 'int i;' | $(CC) -x c -c -o /dev/null -m$(1) - || echo y) :=
 endef
 
-$(foreach flavor,$(SIMD) $(FMA),$(eval $(call simd-check-cc,$(flavor))))
+ISA := bmi bmi2 tbm sse4.1 sse4.2 sse4a avx avx2 f16c
+ISA += $(addprefix avx512,f bw dq 4fmaps)
+$(foreach isa,$(ISA),$(eval $(call isa-check-cc,$(isa))))
 
 # Also explicitly check for {evex} pseudo-prefix support, which got introduced
 # only after AVX512F and some of its extensions.
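
The probe in the patch above pipes a trivial translation unit into the compiler with one -m<isa> flag. On failure the shell prints "y", so the make line becomes "TARGET-y :=" and clears the target list. On success it becomes the harmless "TARGET- :=". The C sketch below models that logic outside of make. The function names `format_probe_cmd` and `probe_suffix` are hypothetical, and the character whitelist is an extra precaution added for this example that the Makefile itself does not have.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build the probe command for one -m<isa> flag; returns snprintf's result. */
int format_probe_cmd(char *buf, size_t size, const char *cc, const char *isa)
{
    return snprintf(buf, size,
                    "echo 'int i;' | %s -x c -c -o /dev/null -m%s -",
                    cc, isa);
}

/*
 * Mirror of "|| echo y": returns "y" when the compiler rejects the flag
 * and "" when it accepts it.
 */
const char *probe_suffix(const char *cc, const char *isa)
{
    char cmd[256];
    int n;

    /* Only allow names like "avx2" or "sse4.1" to reach the shell. */
    if (strspn(isa, "abcdefghijklmnopqrstuvwxyz0123456789.") != strlen(isa))
        return "y";

    n = format_probe_cmd(cmd, sizeof(cmd), cc, isa);
    if (n < 0 || (size_t)n >= sizeof(cmd))
        return "y";   /* treat an unbuildable command as a failed probe */

    return system(cmd) == 0 ? "" : "y";
}
```

A caller would loop over the ISA list from the patch (bmi, bmi2, ..., avx512f, avx512bw, avx512dq, avx5124fmaps) and skip building the harness as soon as any probe returns "y".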



Re: [PATCH v2 2/3] xen/riscv: initialize .bss section

2023-03-02 Thread Jan Beulich
On 02.03.2023 14:23, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/riscv64/head.S
> +++ b/xen/arch/riscv/riscv64/head.S
> @@ -13,6 +13,15 @@ ENTRY(start)
>  lla a6, _dtb_base
>  REG_S   a1, (a6)
>  
> +la  a3, __bss_start
> +la  a4, __bss_end
> +ble a4, a3, clear_bss_done

While it may be that .bss is indeed empty right now, even short term
it won't be, and never will. I'd drop this conditional (and in
particular the label), inserting a transient item into .bss for the
time being. As soon as your patch introducing page tables has landed,
there will be multiple pages worth of .bss.

Also are this and ...

> +clear_bss:
> +REG_S   zero, (a3)
> +add a3, a3, RISCV_SZPTR
> +blt a3, a4, clear_bss

... this branch actually the correct ones? I'd expect the unsigned
flavors to be used when comparing addresses. It may not matter here
and/or right now, but it'll set a bad precedent unless you expect
to only ever work on addresses which have the sign bit clear.

Jan

> +clear_bss_done:
> +
>  la  sp, cpu0_boot_stack
>  li  t0, STACK_SIZE
>  add sp, sp, t0
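
The signed versus unsigned concern raised above can be shown with a small C sketch. It assumes a 64-bit two's-complement target, and the function names are invented for this example. `lt_signed` models what blt computes, `lt_unsigned` models what bltu computes, and `clear_words` is a C rendering of the clearing loop using the unsigned comparison that pointer ordering provides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed "less than" on raw addresses, as blt would evaluate it. */
bool lt_signed(uint64_t a, uint64_t b)
{
    /* Reinterpreting as int64_t assumes two's complement. */
    return (int64_t)a < (int64_t)b;
}

/* Unsigned "less than", as bltu would evaluate it. */
bool lt_unsigned(uint64_t a, uint64_t b)
{
    return a < b;
}

/* Clear [start, end) one word at a time; both pointers in one object. */
void clear_words(unsigned long *start, unsigned long *end)
{
    while (start < end)
        *start++ = 0;
}
```

With a start address of 0x7fffffffffffffff and an end address of 0x8000000000000000, the unsigned comparison says start is below end, while the signed one says the opposite, so a loop built on blt would stop immediately once a region crosses into addresses with the sign bit set.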



