Re: [PATCH] tools/oxenstored: Re-format

2024-03-27 Thread Edwin Torok
On Mon, Feb 26, 2024 at 10:48 AM Andrew Cooper
 wrote:
>
> Rerun make format.

Looks good, although not sure whether whitespace will be correctly
preserved in email, recommend using git to push the changes.
Reviewed-by: Edwin Török 

>
> Signed-off-by: Andrew Cooper 
> ---
> CC: Christian Lindig 
> CC: Edwin Török 
> CC: Rob Hoes 
> ---
>  tools/ocaml/xenstored/quota.ml | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/tools/ocaml/xenstored/quota.ml b/tools/ocaml/xenstored/quota.ml
> index 1f652040d898..082cd25f26fc 100644
> --- a/tools/ocaml/xenstored/quota.ml
> +++ b/tools/ocaml/xenstored/quota.ml
> @@ -55,13 +55,13 @@ let _check quota id size =
>  raise Data_too_big
>);
>if id > 0 then
> -  try
> -let entry = DomidMap.find id quota.cur in
> -if entry >= quota.maxent then (
> -  warn "domain %u cannot create entry: quota reached" id;
> -  raise Limit_reached
> -)
> -  with Not_found -> ()
> +try
> +  let entry = DomidMap.find id quota.cur in
> +  if entry >= quota.maxent then (
> +warn "domain %u cannot create entry: quota reached" id;
> +raise Limit_reached
> +  )
> +with Not_found -> ()
>
>  let check quota id size =
>if !activate then
> @@ -88,4 +88,4 @@ let merge orig_quota mod_quota dest_quota =
>  | diff -> update_entry dest id diff (* update with [x=x+diff] *)
>in
>{dest_quota with cur = DomidMap.fold fold_merge mod_quota.cur 
> dest_quota.cur}
> -  (* dest_quota = dest_quota + (mod_quota - orig_quota) *)
> +(* dest_quota = dest_quota + (mod_quota - orig_quota) *)
>
> base-commit: 8de3afc0b402bc17f65093a53e5870862707a8c7
> --
> 2.30.2
>



Re: [PATCH v1 2/2] oxenstored: make Quota.t pure

2024-02-23 Thread Edwin Torok


> On 31 Jan 2024, at 16:27, Edwin Torok  wrote:
> 
> 
> 
>> On 31 Jan 2024, at 11:17, Christian Lindig  
>> wrote:
>> 
>> 
>> 
>>> On 31 Jan 2024, at 10:52, Edwin Török  wrote:
>>> 
>>> Now that we no longer have a hashtable inside we can make Quota.t pure,
>>> and push the mutable update to its callers.
>>> Store.t already had a mutable Quota.t field.
>>> 
>>> No functional change.
>> 
>> Acked-by: Christian Lindig 
>> 
>> This is shifting copying working to GC work, at least potentially. I would 
>> agree that this is a good trade-off and the code looks correct to me. But I 
>> think we should see more testing and benchmarking before merging this unless 
>> we are fine merging speculative improvements.
>> 
>> — C
>> 
>> 
> 
> 
> I’ve written this [1] microbenchmark which creates ~300_000 entries in 
> xenstore, assigns quota to 1000 domains and then measure how long it takes to 
> list xenstore
> (It doesn’t matter whether these domains exist or not).
> It shows a measurable improvement with this patch, once I’ve run a more 
> realistic test on the original machine with 1000 real VMs I’ll post those 
> results too:

The machine that can run this test is offline now due to a lab move, but I 
managed to get this data before it went away, and I think this patch series is 
ready to be committed.

Flamegraph without my changes, where Hashtbl.copy takes up a significant amount 
of oxenstored time: 
https://cdn.jsdelivr.net/gh/edwintorok/xen@oxenstored-coverletter/docs/original.svg?x=153.0=1269
Flamegraph with this patch series, where Hashtbl.copy does not show up at all: 
https://cdn.jsdelivr.net/gh/edwintorok/xen@oxenstored-coverletter/docs/oxenstored_no_hashtbl_copy.svg?x=294.3=1301
(There are of course still hashtbl in the flame graph, due to the well-known 
inefficient poll_select implementation, and we see hashtbl iteration as a 
parent caller, which is fine)

IMHO this means the patch series is a worthwhile improvement: it removes a 
codepath that was previously a hotspot completely from oxenstored.

The timings on the test also show improvements with this patch:

Booting 575 VMs:
* without this patch series: 1099s
* with this patch series: 604s

Booting 626 VMs:
* without this patch series: 4027s
* with this patch series: 2115s

Booting 627 VMs:
* without this patch series: 4038s
* with this patch series: 4120s

This shows that *with* the patch series the system scales better until it hits 
a breaking point around 627 VMs where everything is ~equally slow

Not everything is ideal, there is also a slowdown around the 789 booted VM mark:
* without this patch series: 168s
* with this patch series: 394s
I wouldn’t worry about this result, because by this point some VMs have already 
crashed, and I’d consider the test to have failed by this point. What results 
you get at this point depends on how much CPU oxenstored gets compared to the 
qemus ioreq handler that is spinning handling the crash on the serial port.
To actually reach 1000 VMs it will require more fixes outsides of oxenstored 
(e.g. wrt to using correct groups with tapdisk, qemu, etc., or making qemu 
better cope with flooding on serial port from the guest), and probably some 
fixes to the poll_select in oxenstored that I may address in a future patch 
series.



Best regards,
—Edwin

> 
> On an Intel Xeon Gold 6354 @ 3.0 Ghz:
> * without my patch: 12.756 +- 0.152 seconds time elapsed  ( +-  1.19% )
> * with my patch: 5.6051 +- 0.0467 seconds time elapsed  ( +-  0.83% )
> 
> [1]
> # cat >create.py < #!/usr/bin/env python3
> from xen.lowlevel.xs import xs
> 
> xenstore = xs()
> 
> for i in range(1,1000):
>   txn = xenstore.transaction_start()
>   for j in range(1,10):
> for k in range(1,30):
> path=f"/quotatest/{i}/{j}/{k}/x"
> xenstore.write(txn, path, "val")
> # assign quota to domid i
> xenstore.set_permissions(txn, path, [{"dom": i}])
>   xenstore.transaction_end(txn)
> EOF
> # python3 create.py
> # perf stat -r 10 sh -c 'xenstore-ls $(for i in $(seq 1 100); do echo 
> "/quotatest/$i"; done) >/dev/null’
> 
> Best regards,
> —Edwin



Re: [PATCH v1 0/2] tools/ocaml: support OCaml 5.x, drop support for <=4.05

2024-02-01 Thread Edwin Torok



> On 31 Jan 2024, at 10:55, Andrew Cooper  wrote:
> 
> On 31/01/2024 10:44 am, Christian Lindig wrote:
>>> On 31 Jan 2024, at 10:42, Edwin Török  wrote:
>>> 
>>> Fix building oxenstored with OCaml 5.x.
>>> OCaml 5.x has removed some functions that have been deprecated for many 
>>> years,
>>> in order to support OCaml 5.x we need to drop support for OCaml 4.02.
>>> 
>>> Tested in gitlab CI (together with my other series):
>>> https://gitlab.com/xen-project/people/edwintorok/xen/-/pipelines/1158302827
>>> 
>>> Edwin Török (2):
>>> oxenstored: fix build on OCaml 5.x
>>> tools/ocaml: bump minimum version to OCaml 4.05
>>> 
>>> tools/configure   | 2 +-
>>> tools/configure.ac| 2 +-
>>> tools/ocaml/xenstored/disk.ml | 2 +-
>>> 3 files changed, 3 insertions(+), 3 deletions(-)
>>> 
>>> -- 
>>> 2.43.0
>>> 
>> Acked-by: Christian Lindig 
> 
> It occurs to me that this is the kind of thing which should get a
> CHANGELOG.md entry these days.  Something like:
> 
> diff --git a/CHANGELOG.md b/CHANGELOG.md
> index 1f55c9c72d10..fd7c8f5c6b82 100644
> --- a/CHANGELOG.md
> +++ b/CHANGELOG.md
> @@ -9,6 +9,7 @@ The format is based on [Keep a
> Changelog](https://keepachangelog.com/en/1.0.0/)
>  ### Changed
>   - Changed flexible array definitions in public I/O interface headers
> to not
> use "1" as the number of array elements.
> + - The minimum supported Ocaml toolchain version is now 4.05
>   - On x86:
> - HVM PIRQs are disabled by default.
> - Reduce IOMMU setup time for hardware domain.

Sounds good.

Should this be mentioned in 
https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches#Sending_a_Patch_Series?

Best regards,
—Edwin

> 
> 
> ought to do.
> 
> Have we checked to see whether this drops Ocaml from any of the build
> containers ?


I can look into this later, haven’t tried rebuilding the containers (the gitlab 
CI passed though)

Best regards,
—Edwin

> 
> ~Andrew




Re: [PATCH v1 2/2] tools/ocaml: bump minimum version to OCaml 4.05

2024-01-31 Thread Edwin Torok


> On 31 Jan 2024, at 17:17, Andrew Cooper  wrote:
> 
> On 31/01/2024 4:36 pm, Anthony PERARD wrote:
>> On Wed, Jan 31, 2024 at 10:42:49AM +, Edwin Török wrote:
>>> We tried bumping to 4.06.1 [1] previously, but OSSTest was holding us
>>> back.
>>> So bump to OCaml 4.05 instead, which should match the version on
>>> OSSTest?
>> Yes, it's looks that's the version osstest can currently use.
>> I've started an osstest flight with this patch series and your other
>> ocaml patch series, and so far osstest seems happy with it. The flight
>> isn't finished but all build jobs succeed, and a lot of the tests jobs
>> as well.
>> 
>> So:
>> Acked-by: Anthony PERARD 
> 
> A question, while I think about it.
> 
> I understand why we want patch 1.  The 4.02 -> 4.03 bump is necessary to
> also compile with 5.0
> 
> But why this 4.03 -> 4.05 bump?  There is no other change in this patch.


The oldest supported Debian has 4.05, and I can’t find a non-EOL distro with 
4.03 or 4.04 here: https://repology.org/project/ocaml/versions
I also have another series (that I haven’t sent out yet) which would use Dune 
1.x in an attempt to use Dune in a way that works on OSSTest, and the oldest 
release I can test this on is Debian 10.

We could keep the minimum at 4.03, but would anything in the CI actually be 
able to test that? 

Best regards,
—Edwin

> 
> If it's "just because", then why should we take it?  All it's doing is
> moving a baseline which doesn't need appear to need to move.
> 
> ~Andrew



Re: [PATCH v1 2/2] oxenstored: make Quota.t pure

2024-01-31 Thread Edwin Torok


> On 31 Jan 2024, at 11:17, Christian Lindig  wrote:
> 
> 
> 
>> On 31 Jan 2024, at 10:52, Edwin Török  wrote:
>> 
>> Now that we no longer have a hashtable inside we can make Quota.t pure,
>> and push the mutable update to its callers.
>> Store.t already had a mutable Quota.t field.
>> 
>> No functional change.
> 
> Acked-by: Christian Lindig 
> 
> This is shifting copying working to GC work, at least potentially. I would 
> agree that this is a good trade-off and the code looks correct to me. But I 
> think we should see more testing and benchmarking before merging this unless 
> we are fine merging speculative improvements.
> 
> — C
> 
> 


I’ve written this [1] microbenchmark which creates ~300_000 entries in 
xenstore, assigns quota to 1000 domains and then measure how long it takes to 
list xenstore
(It doesn’t matter whether these domains exist or not).
It shows a measurable improvement with this patch, once I’ve run a more 
realistic test on the original machine with 1000 real VMs I’ll post those 
results too:

On an Intel Xeon Gold 6354 @ 3.0 Ghz:
* without my patch: 12.756 +- 0.152 seconds time elapsed  ( +-  1.19% )
* with my patch: 5.6051 +- 0.0467 seconds time elapsed  ( +-  0.83% )

[1]
# cat >create.py 

Re: [PATCH v2] stubdom: remove caml-stubdom

2023-11-22 Thread Edwin Torok



> On 22 Nov 2023, at 10:06, Andrew Cooper  wrote:
> 
> On 22/11/2023 7:21 am, Juergen Gross wrote:
>> diff --git a/CHANGELOG.md b/CHANGELOG.md
>> index c341c9d0bf..bbb3cd5beb 100644
>> --- a/CHANGELOG.md
>> +++ b/CHANGELOG.md
>> @@ -14,6 +14,8 @@ The format is based on [Keep a 
>> Changelog](https://keepachangelog.com/en/1.0.0/)
>>  for IPIs and Physical addressing mode for external interrupts.
>> 
>> ### Removed
>> +- caml-stubdom has been removed. It didn't build since 2014, so nobody seems
>> +  to care.
> 
> May I suggest some different phrasing.
> 
> "caml-stubdom.  It hasn't built since 2014, was pinned to Ocaml 4.02,
> and has been superseded by the MirageOS/SOLO5 projects."
> 
> Also, one last ping to anyone Ocaml folks for any comment whatsoever.
> 

I'm fine with removing it, seems to predate the Lwt based Mirage, and the way 
to cross-compile OCaml to unikernels is completely different now.

I think the latest way to use Mirage under Xen is to use Solo5 (which has Xen 
support and gives you a PVH domain).
https://mirage.io/blog/announcing-mirage-39-release
"The Xen backend has been re-written from scratch to be based on Solo5, and now 
supports PVHv2 on Xen 4.10 or higher, and QubesOS 4.0."

The mirage version of xenstore is here (a functor that can be instantiated on 
both Unix and Unikernel implementations and doesn't depend on Irmin): 
https://github.com/mirage/ocaml-xenstore
And I think the Unikernel instantiation of the Xenstore Server functor is here 
https://github.com/djs55/ocaml-xenstore-xen/blob/master/xen/src/server_xen.ml, 
but probably needs to be updated to work with newer versions of Mirage. I 
haven't got around to trying it out yet.

I never tried to use the unikernel build inside the Xen tree, it seems to be 
completely undocumented so I wouldn't know where to start.

The new way is shown here, and dune has some builtin support to make this 
easier: https://mirage.io/blog/2022-03-30.cross-compilation

Best regards,
--Edwin


Re: [PATCH 4/5] tools/xenstored: remove "-N" command line option

2023-11-15 Thread Edwin Torok



> On 14 Nov 2023, at 22:11, Andrew Cooper  wrote:
> 
> On 13/11/2023 12:43 pm, Juergen Gross wrote:
>> The "-N" (do not daemonize) command line option is of questionable use:
>> its sole purpose seems to be to aid debugging of xenstored by making it
>> easier to start xenstored under gdb, or to see any debug messages
>> easily.
>> 
>> Debug messages can as well be sent to syslog(), while gdb can be
>> attached to the daemon easily. The only not covered case is an error
>> while initializing xenstored, but this could be handled e.g. by saving
>> a core dump, which can be analyzed later.
>> 
>> The call of talloc_enable_leak_report_full() done only with "-N"
>> specified is no longer needed, as the same can be achieved via
>> "xenstore-control memreport".
>> 
>> Signed-off-by: Juergen Gross 
> 
> CC Edvin.
> 
> There's a patch out to specifically use this option (under systemd) to
> fix a bug we found.
> 
> I can't recall the details, but I seem to recall it wasn't specific to
> oxenstored.
> 


The patch is here 
https://lore.kernel.org/xen-devel/ecfa15a7-9dc8-4476-8d0b-44a6d1219...@cloud.com/

It is about not losing stderr when run under systemd (well you can use syslog 
but what if your connection to syslog fails or you fail before getting to the 
point of parsing which syslog to use). 
Alternatively we could have a "don't redirect stderr to /dev/null" flag,
but such redirections are usually expected when daemonizing.

It'd be good if the --no-fork flag could be retained for C xenstored, currently 
there is a CI failure on a non-systemd system that I'm trying to fix (a bit 
blindly because I don't have such a system myself), but from my testing the 
flag does work on a systemd system with C xenstored.

The patch is motivated by having some undebuggable issues in oxenstored where 
it just exits without writing any messages and without dumping core, so by 
retaining the stderr path we should have an output of last resort in case 
something goes so wrong that the syslog error handler cannot function.

Best regards,
--Edwin


Re: [RFC PATCH 02/22] x86/msr: implement MSR_SMI_COUNT for Dom0 on Intel

2023-11-01 Thread Edwin Torok



> On 31 Oct 2023, at 09:57, Jan Beulich  wrote:
> 
> On 31.10.2023 10:42, Edwin Torok wrote:
>>> On 30 Oct 2023, at 16:20, Jan Beulich  wrote:
>>> On 25.10.2023 21:29, Edwin Török wrote:
>>>> Dom0 should always be able to read this MSR: it is useful when
>>>> investigating performance issues in production.
>>> 
>>> While I'm not outright opposed, I'm also not convinced. At the very least
>>> ...
>>> 
>>>> Although the count is Thread scoped, in practice all cores were observed
>>>> to return the same count (perhaps due to implementation details of SMM),
>>>> so do not require the cpu to be pinned in order to read it.
>>> 
>>> ... this, even if matching your observations, is going to be properly
>>> misleading in case counts end up diverging.
>>> 
>>>> This MSR exists on Intel since Nehalem.
>>>> 
>>>> Backport: 4.15+
>>> 
>>> If this was a backporting candidate, I think a Fixes: tag would need
>>> to indicate what's being fixed here.
>> 
>> 
>> I used the Backport tag to indicate what is the oldest release that it is 
>> backportable to.
>> IIRC the choices were:
>> * 4.0+ for issues that were present for a long time (didn't look further 
>> back than that in history), so there isn't any particular commit that 
>> introduces the problem, it was like that from the very beginning, i.e. not a 
>> regression.
>> 
>> * 4.13+ for issues affecting only new CPU support (I think that is the 
>> release that added Icelake support). I can attempt to find the commit that 
>> added Icelake support and mark them as Fixes: that commit (although there 
>> might be several of them)
>> 
>> * 4.15+ for bugs introduced by the default read-write msr changes
>> 
>> 
>>> Otherwise this is merely a new
>>> feature.
>>> 
>> 
>> Prior to the default rdwrmsr changes it was possible to read any MSR, so I 
>> consider it a bug that after the default rdwrmsr changes you can no longer 
>> do that, it takes away a valuable debugging tool.
> 
> As said elsewhere, making MSRs generally inaccessible was a deliberate change.
> I don't think any of us x86 maintainers has so far considered that as 
> introducing
> a bug. MSRs being accessible as a debugging tool may be worthwhile to have as 
> an
> optional feature (see my suggestion elsewhere as to a possible way to approach
> this), but I don't think this can be taken as an indication that we should 
> revert
> back to "blind" exposure.
> 


This particular patch (unlike the other one that is more general for all MSRs) 
is about one specific MSR, and makes it accessible to Dom0 only in a read-only 
way.
That is very different from blindly exposing it to all guests.
Anyway let me get something ready for master first, and then distributions that 
package Xen can decide whether to backport this patch or not, I'll replace 
Backport: 4.X with Backportable: 4.X (and add a Fixes: line) to indicate that 
this patch is recommended to be backported (for anyone who uses that version of 
Xen in production).

[I'm not saying that this patch is complete, as you've indicated there is more 
work required to make it work correctly wrt to pinning vs non-pinning, and I'll 
try to require a pinned Dom0 core in my next version]

Best regards,
--Edwin



Re: [RFC PATCH 04/22] x86/msr-index: add references to vendor manuals

2023-10-31 Thread Edwin Torok



> On 31 Oct 2023, at 11:34, Andrew Cooper  wrote:
> 
> On 30/10/2023 4:15 pm, Jan Beulich wrote:
>> 
>>> --- a/xen/arch/x86/include/asm/msr-index.h
>>> +++ b/xen/arch/x86/include/asm/msr-index.h
>>> @@ -13,6 +13,16 @@
>>> * Blocks of related constants should be sorted by MSR index. The constant
>>> * names should be as concise as possible, and the bit names may have an
>>> * abbreviated name. Exceptions will be considered on a case-by-case basis.
>>> + *
>>> + * References:
>>> + * - 
>>> https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html
>>> + * Intel(R) 64 and IA-32 architectures SDM volume 4: Model-specific 
>>> registers
>>> + * Chapter 2, "Model-Specific Registers (MSRs)"
>>> 
>> ... at least Intel's URL has changed several times over the years. Volume
>> and chapter numbers change even more frequently. Any such is liable to go
>> stale at some point.
> 
> https://intel.com/sdm
> 
> This one has been valid for roughly the lifetime of intel.com, and is 
> committed to stay so.

That is useful to know, I'll update the URL.

> 
>> 
>> Jan
>> 
>> 
>>> + * - https://developer.amd.com/resources/developer-guides-manuals/
> 
> whereas AMD really have broken this one, and don't seem to be showing any 
> urgency in unbreaking it...  Right now there is no landing page at all for 
> manuals.
> 


Linux commits appear to reference a certain bugzilla that has the manuals 
uploaded: https://bugzilla.kernel.org/show_bug.cgi?id=206537
(although they will go stale in another way, e.g. I see no 2023 manuals there, 
but at least you know which manual a given commit referenced).
Although referencing someone else's bugzilla in the Xen codebase wouldn't be a 
nice thing to do, so if we do this it'd probably have to be something hosted on 
Xen infra.

For now I'll probably drop the URL and just keep the name (so at least you'd 
know what to search for).


Best regards,
--Edwin

> ~Andrew




Re: [RFC PATCH 02/22] x86/msr: implement MSR_SMI_COUNT for Dom0 on Intel

2023-10-31 Thread Edwin Torok



> On 30 Oct 2023, at 16:20, Jan Beulich  wrote:
> 
> On 25.10.2023 21:29, Edwin Török wrote:
>> Dom0 should always be able to read this MSR: it is useful when
>> investigating performance issues in production.
> 
> While I'm not outright opposed, I'm also not convinced. At the very least
> ...
> 
>> Although the count is Thread scoped, in practice all cores were observed
>> to return the same count (perhaps due to implementation details of SMM),
>> so do not require the cpu to be pinned in order to read it.
> 
> ... this, even if matching your observations, is going to be properly
> misleading in case counts end up diverging.
> 
>> This MSR exists on Intel since Nehalem.
>> 
>> Backport: 4.15+
> 
> If this was a backporting candidate, I think a Fixes: tag would need
> to indicate what's being fixed here.


I used the Backport tag to indicate what is the oldest release that it is 
backportable to.
IIRC the choices were:
* 4.0+ for issues that were present for a long time (didn't look further back 
than that in history), so there isn't any particular commit that introduces the 
problem, it was like that from the very beginning, i.e. not a regression.

* 4.13+ for issues affecting only new CPU support (I think that is the release 
that added Icelake support). I can attempt to find the commit that added 
Icelake support and mark them as Fixes: that commit (although there might be 
several of them)

* 4.15+ for bugs introduced by the default read-write msr changes


> Otherwise this is merely a new
> feature.
> 

Prior to the default rdwrmsr changes it was possible to read any MSR, so I 
consider it a bug that after the default rdwrmsr changes you can no longer do 
that, it takes away a valuable debugging tool.
I can attempt to find a more specific commit to indicate with Fixes:

>> --- a/xen/arch/x86/include/asm/msr-index.h
>> +++ b/xen/arch/x86/include/asm/msr-index.h
>> @@ -641,6 +641,9 @@
>> #define MSR_NHL_LBR_SELECT 0x01c8
>> #define MSR_NHL_LASTBRANCH_TOS 0x01c9
>> 
>> +/* Nehalem and newer other MSRs */
>> +#define MSR_SMI_COUNT   0x0034
> 
> See my comment on the other patch regarding additions here.
> 
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -139,6 +139,13 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t 
>> *val)
>> *val = msrs->misc_features_enables.raw;
>> break;
>> 
>> +case MSR_SMI_COUNT:
>> +if ( cp->x86_vendor != X86_VENDOR_INTEL )
>> +goto gp_fault;
>> +if ( is_hardware_domain(d) && !rdmsr_safe(msr, *val) )
>> +break;
>> +return X86EMUL_UNHANDLEABLE;
> 
> Why #GP for non-Intel but UNHANDLEABLE for DomU?

I wanted to match the behaviour of the 'default:' case statement, although 
looking at the other MSR handling code in this file they just usually gp_fault,
with the exception of the MSR_K8* code that returns UNHANDLEABLE, so if 
condition here could be simplified (e.g. follow how MSR_AMD_PATCHLEVEL does it).

Best regards,
--Edwin 

> 
> Jan




Re: [RFC PATCH 03/22] x86/msr: always allow a pinned Dom0 to read any unknown MSR

2023-10-31 Thread Edwin Torok



> On 30 Oct 2023, at 16:29, Jan Beulich  wrote:
> 
> On 25.10.2023 21:29, Edwin Török wrote:
>> This can be useful if you realize you have to inspect the value of an
>> MSR in production, without having to change into a new Xen first that
>> handles the MSR.
> 
> Yet on a non-pinned Dom0 you'd still be lost. Since iirc we generally
> advise against pinning,

You can temporarily pin while debugging the issue (e.g. pin just 1 CPU from 
Dom0, and "walk" all your physical CPUs with it if you have to,
so that you query them all), e.g. with 'xl vcpu-pin'.
Although that is more invasive than reading a value.
 
Or alternatively have another (privileged) interface to read the MSR for a 
given core without exposing it to any guests, that way you don't affect the 
running system at all
(which would be preferable in a production environment), i.e. a Xen equivalent 
of 'rdmsr'.

> I wonder of how much use such a change would be,
> when it effectively undoes what we deliberately did a while ago.
> 
>> --- a/xen/arch/x86/hvm/svm/svm.c
>> +++ b/xen/arch/x86/hvm/svm/svm.c
>> @@ -1933,6 +1933,9 @@ static int cf_check svm_msr_read_intercept(
>> break;
>> 
>> default:
>> +if ( is_hwdom_pinned_vcpu(v) && !rdmsr_safe(msr, *msr_content) )
>> +break;
>> +
>> if ( d->arch.msr_relaxed && !rdmsr_safe(msr, tmp) )
>> {
>> *msr_content = 0;
> 
> If we went as far as undoing some of what was done, I'd then wonder
> whether instead we should mandate relaxed mode to be enabled on such a
> Dom0. Then, instead of returning fake 0 here, the actual value could
> be returned in the specific case of (pinned?) Dom0.


Can relaxed mode be enabled at runtime? I'd be happy with either solution, but 
it should be something that can be enabled at runtime
(if you have to reboot Xen then you may lose the bug repro that you want to 
gather more information on).
Although changing such a setting in a production environment may still be 
risky, because the guest will then become very confused that it has previously 
read some 0s, now there are some real values, and later when you flip the 
switch off it gets 0s again.

Best regards,
--Edwin


Re: [PATCH] xenstored: do not redirect stderr to /dev/null

2023-10-30 Thread Edwin Torok



> On 25 Oct 2023, at 14:50, Edwin Török  wrote:
> 
> From: Edwin Török 
> 
> By default stderr gets redirected to /dev/null because oxenstored daemonizes 
> itself.
> This must be a left-over from pre-systemd days.
> 
> In ee7815f49f ("tools/oxenstored: Set uncaught exception handler") a 
> workaround was added to log exceptions
> directly to syslog to cope with standard error being lost.
> 
> However it is better to not lose standard error (what if the connection to 
> syslog itself fails, how'd we log that?),
> and use the '--no-fork' flag to do that.
> This flag is supported by both C and O versions of xenstored.
> 
> Both versions also call sd_notify so there is no need for forking.
> 
> Leave the default daemonize as is so that xenstored keeps working on 
> non-Linux systems as before.
> 
> Signed-off-by: Edwin Török 
> ---
> tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in 
> b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
> index 433e4849af..09a1230cee 100644
> --- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
> +++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
> @@ -52,7 +52,7 @@
> # like "--trace-file @XEN_LOG_DIR@/xenstored-trace.log"
> # See "@sbindir@/xenstored --help" for possible options.
> # Only evaluated if XENSTORETYPE is "daemon".
> -XENSTORED_ARGS=
> +XENSTORED_ARGS=--no-fork


I think the CI failure is due to this patch, and it only happens on Linux 
systems that do not use systemd.
In that case we do need to fork, because that is the only way not to tie up the 
boot sequence.

I'll try to make '--no-fork' depend on having systemd present, because 
otherwise I tested both C and O xenstored and they do start up with --no-fork.

Best regards,
--Edwin


Re: [PATCH] tools/ocaml/Makefile.rules: use correct C flags when compiling OCaml C stubs

2023-10-26 Thread Edwin Torok



> On 25 Oct 2023, at 15:02, Christian Lindig  wrote:
> 
> 
> 
>> On 25 Oct 2023, at 14:52, Edwin Török  wrote:
>> 
>> From: Edwin Török 
>> 
>> The code currently uses GCC to compile OCaml C stubs directly,
>> and although in most cases this works, it is not entirely correct.
>> 
>> This will fail if the OCaml runtime has been recompiled to use and link with 
>> ASAN for example
>> (or other situations where a flag needs to be used consistently in 
>> everything that is linked into the same binary).
>> 
>> Use the OCaml compiler instead, which knows how to invoke the correct C 
>> compiler with the correct flags,
>> and append the Xen specific CFLAGS to that instead.
>> 
>> Drop the explicit -fPIC and -I$(ocamlc -where): these will now be provided 
>> by the compiler as needed.
>> 
>> Use -verbose so we see the actuall full C compiler command line invocation 
>> done by the OCaml compiler.
>> 
>> Signed-off-by: Edwin Török 
> 
> Acked-by: Christian Lindig 
> 
> I like using the OCaml compiler to compile stubs as it knows how to handle C 
> files and will invoke the C compiler with the correct flags. However, this is 
> the kind of change that would be good to have tested on all supported 
> platforms. I therefore invite comments from those who maintain the build 
> process.


There was a CI failure. I've pushed an updated version of this patch here:
https://gitlab.com/xen-project/people/edwintorok/xen/-/commit/137ffad324eb82884e7ac6f46f618459d365693e

If it passes the CI this time I'll send out a V2, looks like the -fPIC flag is 
needed on some platforms and is not automatically added by the OCaml compiler.


Best regards,
--Edwin


Re: [PATCH] tools/ocaml/Makefile.rules: use correct C flags when compiling OCaml C stubs

2023-10-26 Thread Edwin Torok



> On 25 Oct 2023, at 15:04, Jan Beulich  wrote:
> 
> On 25.10.2023 15:52, Edwin Török wrote:
>> --- a/tools/ocaml/Makefile.rules
>> +++ b/tools/ocaml/Makefile.rules
>> @@ -37,7 +37,7 @@ ALL_OCAML_OBJS ?= $(OBJS)
>> $(call quiet-command, $(OCAMLYACC) -q $<,MLYACC,$@)
>> 
>> %.o: %.c
>> - $(call quiet-command, $(CC) $(CFLAGS) -c -o $@ $<,CC,$@)
>> + $(call quiet-command, $(OCAMLOPT) -verbose $(addprefix -ccopt ,$(CFLAGS)) 
>> -c -o $@ $<,CC,$@)
> 
> Wouldn't -verbose better be passed only if the build isn't a quiet one?

Only the OCaml files (and the hypervisor itself) are compiled in quiet mode. It 
looks like tools/ and the C files in tools/ocaml were not,
so the patch as is preserves the existing behaviour.

I think there were some patches to switch to a non-recursive make, I hope 
that'll make quiet mode more consistent throughout the tree, until then I'd 
keep it as is
instead of complicating the macro more.

Thanks,
--Edwin


Re: [RFC PATCH 01/22] x86/msr: MSR_PLATFORM_INFO shouldn't claim that turbo is programmable

2023-10-26 Thread Edwin Torok



> On 25 Oct 2023, at 21:33, Andrew Cooper  wrote:
> 
> On 25/10/2023 8:29 pm, Edwin Török wrote:
>> From: Edwin Török 
>> 
>> Xen forbids writes to the various turbo control MSRs, however 
>> MSR_PLATFORM_INFO claims that these MSRs are writable.
>> Override MSR_PLATFORM_INFO bits to indicate lack of support.
>> 
>> See Intel SDM Volume 4, 2.17.6 "MSRs Introduced in the Intel Xeon Scaslable 
>> Processor Family",
>> which describes that MSR_PLATFORM_INFO.[28] = 1 implies that 
>> MSR_TURBO_RATIO_LIMIT is R/W,
>> and similarly bit 29 for TDP control, and bit 30 for MSR_TEMPERATURE_TARGET.
>> 
>> These bits were not all present on earlier processors, however where missing 
>> the bits were reserved,
>> and when present they are always present in the same bits.
>> 
>> (Curiously bit 31 that Xen uses is not documented anywhere in this manual 
>> but a separate one).
>> 
>> Backport: 4.0+
>> 
>> Signed-off-by: Edwin Török 
> 
> p->platform_info never has any bit other than cpuid_faulting set in it. 
> We still don't even report the proper raw value, because we don't (yet)
> have clean MSR derivation logic.
> 
> I'm confused as to how you managed to find these set.  Even back in Xen
> 4.13, PLATFORM_INFO was covered by the msr_policy (later merged into
> cpu_policy).  Furthermore, even patch 3 oughtn't to have such an effect.
> 
> Sadly, the whole of this MSR is model specific.  Vol4 2.17 is only for
> one SKX/CLX/ICX/SPR.  Technically its wrong to treat the cpuid_faulting
> in the way we do, but it is enumerated separately, and we intentionally
> don't have an Intel check because we need to emulate CPUID faulting on
> AMD hardware to make PVShim work.
> 

xen/lib/x86/msr.c:COPY_MSR(MSR_INTEL_PLATFORM_INFO, p->platform_info.raw);

This code made me believe that the underlying MSR value was copied, and only 
specific values from it were overwritten, I didn't spot any zeroing.
Although running rdmsr now (on 4.13.5) does show that all the other bits are 
zero, so the zeroing must happen somewhere else in the code, making this patch 
obsolete.
I'll drop it.

Thanks,
--Edwin

> ~Andrew




Re: [RFC PATCH 22/22] x86/AMD: add IRPerf support

2023-10-26 Thread Edwin Torok



> On 25 Oct 2023, at 21:59, Andrew Cooper  wrote:
> 
> On 25/10/2023 8:29 pm, Edwin Török wrote:
>> diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
>> index 483b5e4f70..b3cd851d9d 100644
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -584,6 +584,13 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t 
>> val)
>> }
>> break;
>> 
>> +case MSR_K8_HWCR:
>> +if ( !(cp->x86_vendor & X86_VENDOR_AMD) ||
>> + (val & ~K8_HWCR_IRPERF_EN) ||
>> + wrmsr_safe(msr, val) != 0 )
>> +goto gp_fault;
>> +break;
>> +
>> case MSR_AMD64_DE_CFG:
>> /*
>>  * OpenBSD 6.7 will panic if writing to DE_CFG triggers a #GP:
>> diff --git a/xen/include/public/arch-x86/cpufeatureset.h 
>> b/xen/include/public/arch-x86/cpufeatureset.h
>> index 5faca0bf7a..40f74cd5e8 100644
>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>> @@ -241,7 +241,7 @@ XEN_CPUFEATURE(EFRO,  7*32+10) /*   APERF/MPERF 
>> Read Only interface */
>> 
>> /* AMD-defined CPU features, CPUID level 0x8008.ebx, word 8 */
>> XEN_CPUFEATURE(CLZERO,8*32+ 0) /*A  CLZERO instruction */
>> -XEN_CPUFEATURE(IRPERF,8*32+ 1) /* Instruction Retired Performance 
>> Counter */
>> +XEN_CPUFEATURE(IRPERF,8*32+ 1) /*A! Instruction Retired Performance 
>> Counter */
>> XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A  (F)X{SAVE,RSTOR} always 
>> saves/restores FPU Error pointers */
>> XEN_CPUFEATURE(WBNOINVD,  8*32+ 9) /*   WBNOINVD instruction */
>> XEN_CPUFEATURE(IBPB,  8*32+12) /*A  IBPB support only (no IBRS, used 
>> by AMD) */
> 
> Last things first.  You can advertise this bit to guests, but I'm
> looking at the AMD manuals and woefully little information on what this
> is an how it works.
> 
> It does have an architectural enumeration, but there isn't even a a
> description of what HWCR.IPERF_EN does.
> 
> The HWCR documentation says "enables Core::X86::Msr::IRPerfCount", but
> the IRPERF MSR says no such think,

> noting only that EFRO_LOCK makes the
> MSR non-writeable (which in turn means we can't always offer it to
> guests anyway.  See below).

> At a guess, the HWCR bit activates the counter, but nothing states
> this.


My version 
(https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24593.pdf)
 says:
"This is a dedicated counter that is always counting instructions retired. It 
exists at MSR address C000_00E9. It is enabled by setting a 1 to HWCR[30] and 
its support is indicated by CPUID Fn8000_0008_EBX[1]."

Although digging a bit more there are some erratas around deep C states on some 
chips and the HWCR register itself is not global but per CCD (which is neatly 
buried in the _ccd[7:0]_ smallprint at the top:
https://github.com/cyring/CoreFreq/issues/206#issuecomment-722713147


>   At a guess, it's a free-running, reset-able counter, but again
> nothing states this.  The manual just says "is a dedicated counter" and
> doesn't even state whether it is (or is not) tied into the other core
> fixed counters and whether it is integrated with PMI or not.

There is also a LWP 'Instructions Retired' described in 13.4.2.2, which is 
user-mode only and per-core, but I assume that is completely different from 
this MSR
and part of the features that got dropped in newer cores.

> 
> Which brings us on to the middle hunk.  Putting it there short circuits
> the PV-specific handling, but you can't do that in general without
> arranging for HWCR to be context switched properly for vCPUs, nor in
> this case IPERF_COUNT itself.
> 
> Unless you context switch HWCR and IPERF_COUNT, a guest will see it not
> counting, or the count going backwards, 50% of the time as the vCPU is
> scheduled around.
> 
> So while in principle we could let guests use this facility (it would
> have to be off by default, because there doesn't appear to be any
> filtering capability in it at all, so we can't restrict it to
> instructions retired in guest context), it will need a far larger patch
> than this to do it safely.

Thanks, I'll move this patch into the 'needs more work' category.


Thanks,
--Edwin




Re: [PATCH] tools/ocaml/Makefile.rules: use correct C flags when compiling OCaml C stubs

2023-10-25 Thread Edwin Torok
On Wed, Oct 25, 2023 at 2:58 PM Edwin Török  wrote:
> [...]
> Signed-off-by: Edwin Török 

Sorry about the duplicate emails to the list, my 'git send-email'
isn't working quite right.



xen summit 2023 design session: Using only stable interfaces for (o)xenstored

2023-06-27 Thread Edwin Torok
Background: 
https://design-sessions.xenproject.org/uid/discussion/disc_MEXyUIXV6clI8n1kgzQ9/view

Possible solutions:
Andrew Cooper: use get domain info list instead of querying domains one-by-one 
to reduce number of hypercalls, even though Alejandro's patch makes the 
get_domainfo lookups O(1).
To avoid a hypercall a shared read-only memory page between Xen and Dom0, 
containing a bitmap of domain existence and shutdown state.
We have 32k domains max and this fits into a small number of 4K pages.
Hypervisor can update it using atomic set/clear-bit, Dom0 will memcpy and then 
check for differences
Bernhardt/Andrew: discussion about efficiency and find first bit set, XOR. A 
*lot* more efficient than using hypercalls anyway
Edwin: do I have to implement that for various architectures?
Andrew Cooper: already implemented for all arches. Some architectures can only 
do atomic updates on natural register width (RISC-V), but that is already 
handled
Edwin: how would the stable interface look like? Memory layout definition in 
the header and ..?
Andrew Cooper: use acquire_resource with a new resource type, already have 
stable xen libs for resource mapping

Christopher Clarke/Daniel Smith (virtually from chatroom): alternatives to 
bitmap, how is it indexed?
Andrew Coooper: indexed by domid, we only have 32k
Marcus Granado: what if we want to increase that?
Andrew Coooper: 16-bit domid is baked into so many ABIs that if we want to 
increase that then this API won't be the blocker. No need to worry about more 
than 32k domids.
Christopher Clark: how about using an Argo ring
Andrew Cooper: Xen must NOT block when delivering the notification and 
notification delivery cannot be lossy, cannot afford to drop events: 
toolstacks/xenstored critically rely on this information.
Christopher Clark: bip buffers: 
https://www.codeproject.com/Articles/3479/The-Bip-Buffer-The-Circular-Buffer-with-a-Twist
 as possible solution to avoid running out of space
Andrew Cooper: bitmap also avoids having to take the domctl lock and reduce 
background idle CPU usage
Edwin: on that topic: we should also fix xcp-rrdd so it doesn't shell out to 
xenpm every 5s to collect statistics...
Andrew Cooper: should be possible to use libxenctrl directly for that
Edwin Torok: there are other interfacs that would need to be stabilized (see 
xenopsd), but lets start small and fix just this one for now, which will result 
in an immediate improvement for (o)xenstored since this was the final blocker 
to getting off stable interfaces (there is one other usage, but that already 
has a patch posted/solution)

Edwin Torok: how about release cadence? Doesn't necessarily have to go into 
immediately next Xen release, but don't know in which phase the release is in 
right now
Andrew Cooper: probably about 6 weeks time to get into next release





xen summit 2023 design session: distributed tracing

2023-06-25 Thread Edwin Torok
design session: distributed tracing

Marcusg: proof of concept
* where to start, simplest parts
* how to attempt to collect the context (unique/random id),begin/end events
* e.g. domain create

George: pass in id to hypervisor
Andy: maybe userspace first, libraries/xenguest
 more difficult in calls like add_to_physmap
 continuable hypercall indistinguishable from userspace calling it N times
 e.g. new event becomes pending

george: dom0 jumps into Xen, long running hypercall finishes, if it wasn't done 
it stores how far it got,
and pushes the IP back one instruction so when VM runs again it will reexecute 
same hypercall

mark: you called first one with a context, that continuation might've been 
stashed
andy: you'll need large trace buffers
george: write to disk might be limit, might be minutes worth of events
optimization
edwin: could filter on dump if needed, also hypercall could store a bit saying 
we started so we don't emit repeated traces
andy: we guarantee forward progress in hypercalls, but limited room to store 
continuation info

edwin: current context?
andy: hypervisor can't tell which process is current, continuation might be on 
different cpu,
  stashing domain-related hypercalls we can stash state in struct  domain/cpu, 
except for first hypercall
george: multicall like wrapper for tracing, Xen is not preemptable, store 
start/end
other ops inbetween may get tagged incorrectly when interrupted
andy: xentrace doesn't distinguish sync and async events, to ignore interrupts
andy: how big, what format?
marcusg/edwin: unique ids, 128-bit for context ,span 64-bit, but has 
parent/child and if child is unique it has a link to full parent

edwin: spanid passed in, hypervisor
george: postprocess and generate opentracing format there, hypervisor is not 
preemptible, can infer parents

andy/george: how to gen id? random? host unique: vcpu index + per-cpu counter. 
rdtsc? not reliable

pcpu bits + rdtsc bits --> unique
pcpu + incrementing counter
andy: tracing interrupts? not associated with anything
george: sometimes you can tie it back to a high-level task, e.g. global tlb 
flush
andy: properties of timestamps?
marcusg: begin/end event, translatable back
george: xentrace has (optional) time information, correlates across pcpus, uses 
scheduling as a lamport clock.
TSC drift not really a problem anymore
andy: invariant TSC, assume
andy: host mode tsc: no scaling/offset so timestamps in guest can be compared 
to Dom0
timestamps: convert to ns
andy: clocks change frequency with temp, etc.
Dom0 clocksource: tsc as clocksource doesn't quite work, but kernel could have 
NTP adjustment info

andy: what information would we want?
marcusg: who/what features are being used?
george: add tracing as needed
marcusg: tracing what happens with nested virt
andy: that would trap too often
edwin: pick one op as  PoC, but more than one optimization
george: get ready for tracing "in anger", do a proof of concept on a particular 
op to figure out the mechanism
domain create, get xenalyze ready to translate


Re: [PATCH v1 1/2] xc_core_arch_map_p2m_tree_rw: fix memory leak

2023-02-27 Thread Edwin Torok



> On 24 Feb 2023, at 14:56, Andrew Cooper  wrote:
> 
> On 24/02/2023 1:36 pm, Edwin Török wrote:
>> From: Edwin Török 
>> 
>> Prior to bd7a29c3d0 'out' would've always been executed and memory
>> freed, but that commit changed it such that it returns early and leaks.
>> 
>> Found using gcc 12.2.1 `-fanalyzer`:
>> ```
>> xg_core_x86.c: In function ‘xc_core_arch_map_p2m_tree_rw’:
>> xg_core_x86.c:300:5: error: leak of ‘p2m_frame_list_list’ [CWE-401] 
>> [-Werror=analyzer-malloc-leak]
>>  300 | return p2m_frame_list;
>>  | ^~
>>  ‘xc_core_arch_map_p2m_writable’: events 1-2
>>|
>>|  378 | xc_core_arch_map_p2m_writable(xc_interface *xch, struct 
>> domain_info_context *dinfo, xc_dominfo_t *info,
>>|  | ^
>>|  | |
>>|  | (1) entry to ‘xc_core_arch_map_p2m_writable’
>>|..
>>|  381 | return xc_core_arch_map_p2m_rw(xch, dinfo, info, 
>> live_shinfo, live_p2m, 1);
>>|  |
>> ~~~
>>|  ||
>>|  |(2) calling ‘xc_core_arch_map_p2m_rw’ from 
>> ‘xc_core_arch_map_p2m_writable’
>>|
>>+--> ‘xc_core_arch_map_p2m_rw’: events 3-10
>>   |
>>   |  319 | xc_core_arch_map_p2m_rw(xc_interface *xch, struct 
>> domain_info_context *dinfo, xc_dominfo_t *info,
>>   |  | ^~~
>>   |  | |
>>   |  | (3) entry to ‘xc_core_arch_map_p2m_rw’
>>   |..
>>   |  328 | if ( xc_domain_nr_gpfns(xch, info->domid, 
>> >p2m_size) < 0 )
>>   |  |~
>>   |  ||
>>   |  |(4) following ‘false’ branch...
>>   |..
>>   |  334 | if ( dinfo->p2m_size < info->nr_pages  )
>>   |  | ~~ ~
>>   |  | |  |
>>   |  | |  (6) following ‘false’ branch...
>>   |  | (5) ...to here
>>   |..
>>   |  340 | p2m_cr3 = GET_FIELD(live_shinfo, arch.p2m_cr3, 
>> dinfo->guest_width);
>>   |  | ~~~
>>   |  | |
>>   |  | (7) ...to here
>>   |  341 |
>>   |  342 | p2m_frame_list = p2m_cr3 ? 
>> xc_core_arch_map_p2m_list_rw(xch, dinfo, dom, live_shinfo, p2m_cr3)
>>   |  |  
>> ~
>>   |  343 |  : 
>> xc_core_arch_map_p2m_tree_rw(xch, dinfo, dom, live_shinfo);
>>   |  |  
>> 
>>   |  |  | |
>>   |  |  | (9) ...to here
>>   |  |  | (10) calling 
>> ‘xc_core_arch_map_p2m_tree_rw’ from ‘xc_core_arch_map_p2m_rw’
>>   |  |  (8) following ‘false’ 
>> branch...
>>   |
>>   +--> ‘xc_core_arch_map_p2m_tree_rw’: events 11-24
>>  |
>>  |  228 | xc_core_arch_map_p2m_tree_rw(xc_interface *xch, 
>> struct domain_info_context *dinfo,
>>  |  | ^~~~
>>  |  | |
>>  |  | (11) entry to ‘xc_core_arch_map_p2m_tree_rw’
>>  |..
>>  |  245 | if ( !live_p2m_frame_list_list )
>>  |  |~
>>  |  ||
>>  |  |(12) following ‘false’ branch (when 
>> ‘live_p2m_frame_list_list’ is non-NULL)...
>>  |..
>>  |  252 | if ( !(p2m_frame_list_list = 
>> malloc(PAGE_SIZE)) )
>>  |  | ~~ ~ ~
>>  |  | |  | |
>>  |  | |  | (14) allocated 
>> here
>>  |  | |  (15) assuming ‘p2m_frame_list_list’ is 
>> non-NULL
>>  |  | |  (16) following ‘false’ branch (when 
>> ‘p2m_frame_list_list’ is non-NULL)...
>>  |  | (13) ...to here
>>  |..
>>  |  257 | memcpy(p2m_frame_list_list, 
>> live_p2m_frame_list_list, PAGE_SIZE);
>>  |  | ~~
>>  |  | |
>>  |  | (17) ...to here
>>  |..
>>  |  266 | else if ( dinfo->guest_width < sizeof(unsigned 
>> long) )
>>  |  | ~
>>  |  | |
>>  |  | (18) following ‘false’ branch...
>>  |..
>>  |  270 | live_p2m_frame_list =
>>  |  | 

Re: [PATCH] docs: clarify xenstore permission documentation

2023-02-09 Thread Edwin Torok



> On 9 Feb 2023, at 15:15, Andrew Cooper  wrote:
> 
> On 09/02/2023 2:41 pm, Juergen Gross wrote:
>> In docs/misc/xenstore.txt the description of the Xenstore node access
>> permissions is missing one important aspect, which can be found only
>> in the code or in the wiki [1]:
>> 
>> The first permission entry is defining the owner of the node via the
>> domid, and the access rights for all domains NOT having a dedicated
>> permission entry.
>> 
>> Make that aspect clear in the official documentation.
>> 
>> [1]: https://wiki.xenproject.org/wiki/XenBus#Permissions
>> 
>> Signed-off-by: Juergen Gross 
> 
> I feel as if Edvin deserves some kind of credit here, seeing as it was
> his observation...
> 
> Also, CC to double check the wording.
> 
> ~Andrew
> 
>> ---
>> docs/misc/xenstore.txt | 17 ++---
>> 1 file changed, 10 insertions(+), 7 deletions(-)
>> 
>> diff --git a/docs/misc/xenstore.txt b/docs/misc/xenstore.txt
>> index 8887e7df88..d807ef0709 100644
>> --- a/docs/misc/xenstore.txt
>> +++ b/docs/misc/xenstore.txt
>> @@ -45,13 +45,16 @@ them to within 2048 bytes.  (See XENSTORE_*_PATH_MAX in 
>> xs_wire.h.)
>> 
>> Each node has one or multiple permission entries.  Permissions are
>> granted by domain-id, the first permission entry of each node specifies
>> -the owner of the node.  Permissions of a node can be changed by the
>> -owner of the node, the owner can only be modified by the control
>> -domain (usually domain id 0).  The owner always has the right to read
>> -and write the node, while other permissions can be setup to allow
>> -read and/or write access.  When a domain is being removed from Xenstore
>> -nodes owned by that domain will be removed together with all of those
>> -nodes' children.
>> +the owner of the node, who always has full access to the node (read and
>> +write permission).

>>  The access rights of the first entry specify the
>> +allowed access for all domains not having a dedicated permission entry
>> +(the default is "n", removing access for all domains not explicitly
>> +added via additional permission entries).  Permissions of a node can be
>> +changed by the owner of the node, the owner can only be modified by the
>> +control domain (usually domain id 0).

This looks good in general, one small nitpick, maybe we need something like 
this too:
", or domains equivalent to the owner or control domain (domain equivalence can 
be established with the privileged SET_TARGET operation)"

Thanks,
--Edwin

>>  Other permissions can be setup to
>> +allow read and/or write access for other domains.  When a domain is
>> +being removed from Xenstore nodes owned by that domain will be removed
>> +together with all of those nodes' children.
>> 
>> 
>> Communication with xenstore is via either sockets, or event channel
> 




Re: [PATCH v4 03/11] CODING_STYLE(tools/ocaml): add 'make format' and remove tabs

2022-12-19 Thread Edwin Torok



> On 19 Dec 2022, at 11:52, Anthony PERARD  wrote:
> 
> On Fri, Dec 16, 2022 at 06:25:12PM +, Edwin Török wrote:
>> + git ls-files '*.c' '*.h' | xargs -n1 sed -ie 's/\t//g'
> 
> Seen as there's a patch adding .clang-format, what the point of this
> command?

The diff to change tabs to spaces (and the equivalent one from ocp-indent) can 
be proven
to introduce 0 changes by looking at the diff with ignore-whitespace.
Proving the same with ocamlformat or clang-format is more difficult (and in 
particular if you keep rebasing commits from after the reformat commit to 
before it
you risk losing the change if you don't redo the format commit correctly).

So I intended to do this gradually: first get indentation to be consistent, and 
then get formatting to be consistent later once
the former has been accepted and committed.

> 
> "-ie" means to ask sed to change file in-place an keep a copy of the
> original file with "e" as suffix. So please replace that by "-i~" or
> "-i -e" or just "-i", with the first one create a "~" backup, the last
> two not creating a backup at all and probably what you wanted. ("-e" is
> optional as there's only a single command)
> 

Thanks for pointing it out, what I wanted is 'sed -i -e', and you can usually 
merge multiple single char flags into a single one,
except that is not the case for -i, and I keep making this mistake.

.PHONY: format
format:
git ls-files '*.ml' '*.mli' | xargs -n1 ocp-indent -i
git ls-files '*.c' '*.h' | xargs -n1 sed -i 's/\t//g'


--Edwin


Re: [PATCH v4 06/11] tools/ocaml: add .clang-format

2022-12-19 Thread Edwin Torok



> On 19 Dec 2022, at 12:03, Anthony PERARD  wrote:
> 
> On Fri, Dec 16, 2022 at 06:25:15PM +, Edwin Török wrote:
>> Add a .clang-format configuration that tries to match CODING_STYLE where
>> possible.
> 
> Is there going to be a patch applying this to "tools/ocaml", like you
> did with "make format"?

Long term probably, not yet.

For now I just want to experiment with it to see how well and consistently it 
works (especially with different versions of the tool).
(I've been using ocp-indent and ocamlformat for years now, clang-format would 
be a brand new addition.
 There are some solutions for ocamlformat that might be useful for backports, 
e.g. there is a git merge driver
that reformats all 3 sides of a merge and then attempt to solve conflict/do the 
merge on the resulting formatted files,
so even if old code was formatted differently or with a different version it 
might be able to apply the changes.
I've yet to try it though, and I'm not yet sure how easy it would be to 
integrate something like that with 'guilt' or 'stgit'
which is what Xen uses to manage its patch-queue for backports, at least 
internally.

If there was an ocp-indent equivalent for C that just re-indents that might be 
something we could apply right now,
but I'm not aware of one.
)


>> I was not able to expr
>> ess the special casing of braces after 'do'
>> though, this can only be controlled generally for all control
>> statements.
>> It is imperfect, but should be better than the existing bindings, which
>> do not follow Xen coding style.
> 
> There isn't a single CODING_STYLE for all code in the repo so it isn't
> an issue if the binding where having a different coding style. But
> having as few coding style as possible in the repo is probably helpful.

Indeed, having *a* coding style (and automated checking/formatting) is what I'm 
more interested in rather than which particular one.
The current one used by the bindings is not documented, and although I can 
guess at based on surrounding code,
that surrounding code is subtly different based on when it was written, so it 
seemed better to adopt what is already documented
in CODING_STYLE.


> 
> You could maybe add a CODING_STYLE in "tools/ocaml" to say that the
> coding style is running `make format` or `clang-format ...`. And if
> there other rules like how to name certain variable, that could go in
> this file as well.
> 


Having a way to easily run these formatters would be a plus. 
Perhaps extending one of the container images that Xen already has to include 
(a recent enough version of) ocp-indent and clang-format, and similarly for the 
.spec file it might be useful
to update its BuildRequires to bring in the necessary formatters to make it 
easier to use this in backports.

And then all that can be documented in a CODING_STYLE or 
CONTRIBUTING.txt/CONTRIBUTING.md file in tools/ocaml.

Best regards,
--Edwin




Re: [PATCH v4 10/11] tools/ocaml/xenstored: validate config file before live update

2022-12-19 Thread Edwin Torok



> On 19 Dec 2022, at 09:37, Christian Lindig  
> wrote:
> 
> 
> 
>> On 16 Dec 2022, at 18:25, Edwin Török  wrote:
>> 
>> The configuration file can contain typos or various errors that could prevent
>> live update from succeeding (e.g. a flag only valid on a different version).
>> Unknown entries in the config file would be ignored on startup normally,
>> add a strict --config-test that live-update can use to check that the config 
>> file
>> is valid *for the new binary*.
> 
> Is the configuration tested, checked, or validated? If feel “check" or 
> “validate" would convey better what is happening.

The rest of the code talks about validation, so I've renamed this to 
config-validate.

> 
> 
>> For compatibility with running old code during live update recognize
>> --live --help as an equivalent to --config-test.
>> 
>> Signed-off-by: Edwin Török  
> Acked-by: Christian Lindig 


Thanks, I've pushed an update commit with this change here: 
https://github.com/edwintorok/xen/compare/staging...private/edvint/review5,
in particular 
https://github.com/edwintorok/xen/commit/f1a9153bb867bbb5df0f5e17b1ed3348e7ea79f8

Best regards,
--Edwin
> 
>>> 
>> ---
>> Changes since v2:
>> * repost of lost patch from 2021: 
>> https://patchwork.kernel.org/project/xen-devel/patch/a53934dfa8ef984bffa858cc573cc7a6445bbdc0.1620755942.git.edvin.to...@citrix.com/
>> ---
>> tools/ocaml/xenstored/parse_arg.ml | 26 ++
>> tools/ocaml/xenstored/xenstored.ml | 11 +--
>> 2 files changed, 35 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tools/ocaml/xenstored/parse_arg.ml 
>> b/tools/ocaml/xenstored/parse_arg.ml
>> index 1a85b14ef5..b159b91f00 100644
>> --- a/tools/ocaml/xenstored/parse_arg.ml
>> +++ b/tools/ocaml/xenstored/parse_arg.ml
>> @@ -26,8 +26,14 @@ type config =
>>restart: bool;
>>live_reload: bool;
>>disable_socket: bool;
>> +config_test: bool;
>>  }
>> 
>> +let get_config_filename config_file =
>> +  match config_file with
>> +  | Some name -> name
>> +  | None  -> Define.default_config_dir ^ "/oxenstored.conf"
> 
> I’d use Filename.concat.
> 
>> +
>> let do_argv =
>>  let pidfile = ref "" and tracefile = ref "" (* old xenstored compatibility 
>> *)
>>  and domain_init = ref true
>> @@ -38,6 +44,8 @@ let do_argv =
>>  and restart = ref false
>>  and live_reload = ref false
>>  and disable_socket = ref false
>> +  and config_test = ref false
>> +  and help = ref false
>>  in
>> 
>>  let speclist =
>> @@ -55,10 +63,27 @@ let do_argv =
>>  ("-T", Arg.Set_string tracefile, ""); (* for compatibility *)
>>  ("--restart", Arg.Set restart, "Read database on starting");
>>  ("--live", Arg.Set live_reload, "Read live dump on startup");
>> +  ("--config-test", Arg.Set config_test, "Test validity of config 
>> file");
> 
> I see the logic how this flag was named but feel starting with a verb 
> (“validate”, “check”, “test”) leads to a clearer invocation pattern.
> 
>>  ("--disable-socket", Arg.Unit (fun () -> disable_socket := true), 
>> "Disable socket");
>> +  ("--help", Arg.Set help, "Display this list of options")
>>] in
>>  let usage_msg = "usage : xenstored [--config-file ] 
>> [--no-domain-init] [--help] [--no-fork] [--reraise-top-level] [--restart] 
>> [--disable-socket]" in
>>  Arg.parse speclist (fun _ -> ()) usage_msg;
>> +  let () =
>> +if !help then begin
>> +  if !live_reload then
>> +(*
>> +  Transform --live --help into --config-test for backward compat 
>> with
>> +  running code during live update.
>> +  Caller will validate config and exit
>> +*)
>> +config_test := true
>> +  else begin
>> +Arg.usage_string speclist usage_msg |> print_endline;
>> +exit 0
>> +  end
>> +end
>> +  in
>>  {
>>domain_init = !domain_init;
>>activate_access_log = !activate_access_log;
>> @@ -70,4 +95,5 @@ let do_argv =
>>restart = !restart;
>>live_reload = !live_reload;
>>disable_socket = !disable_socket;
>> +config_test = !config_test;
>>  }
>> diff --git a/tools/ocaml/xenstored/xenstored.ml 
>> b/tools/ocaml/xenstored/xenstored.ml
>> index 366437b396..1aaa3e995e 100644
>> --- a/tools/ocaml/xenstored/xenstored.ml
>> +++ b/tools/ocaml/xenstored/xenstored.ml
>> @@ -88,7 +88,7 @@ let default_pidfile = Paths.xen_run_dir ^ "/xenstored.pid"
>> 
>> let ring_scan_interval = ref 20
>> 
>> -let parse_config filename =
>> +let parse_config ?(strict=false) filename =
>>  let pidfile = ref default_pidfile in
>>  let options = [
>>("merge-activate", Config.Set_bool Transaction.do_coalesce);
>> @@ -129,11 +129,12 @@ let parse_config filename =
>>("xenstored-port", Config.Set_string Domains.xenstored_port); ] in
>>  begin try Config.read filename options (fun _ _ -> raise Not_found)
>>with
>> -| Config.Error err -> List.iter (fun (k, e) ->
>> +| Config.Error err as e -> List.iter (fun (k, e) ->
>>match e with
>>| "unknown key" -> 

Re: [PATCH v4 01/11] tools/ocaml/libs/{xb, mmap}: use Data_abstract_val wrapper

2022-12-19 Thread Edwin Torok



> On 16 Dec 2022, at 22:40, Andrew Cooper  wrote:
> 
> On 16/12/2022 6:25 pm, Edwin Török wrote:
>> diff --git a/tools/ocaml/libs/mmap/mmap_stubs.h 
>> b/tools/ocaml/libs/mmap/mmap_stubs.h
>> index 65e4239890..66f18d4406 100644
>> --- a/tools/ocaml/libs/mmap/mmap_stubs.h
>> +++ b/tools/ocaml/libs/mmap/mmap_stubs.h
>> @@ -30,4 +30,8 @@ struct mmap_interface
>> int len;
>> };
>> 
>> +#ifndef Data_abstract_val
>> +#define Data_abstract_val(x) ((void*)(value*)(x))
> 
> ((void *)(x))
> 
> I take it this has come from the Ocaml headers?  The cast to (value *)
> in the middle is entirely cancelled out.
> 
> Can be fixed on commit.
> 


In the OCaml headers it looks like:
+#define Data_abstract_val(v) ((void*) Op_val(v))

where Op_val is:
+#define Op_val(x) ((value *) (x))

However I haven't realized that Op_val has been in OCaml for a long time (at 
least 1995), so we can safely use it, and do this instead:

diff --git a/tools/ocaml/libs/mmap/mmap_stubs.h 
b/tools/ocaml/libs/mmap/mmap_stubs.h
index 65e4239890..6c33f14138 100644
--- a/tools/ocaml/libs/mmap/mmap_stubs.h
+++ b/tools/ocaml/libs/mmap/mmap_stubs.h
@@ -30,4 +30,8 @@ struct mmap_interface
int len;
 };

+#ifndef Data_abstract_val
+#define Data_abstract_val(v) ((void*) Op_val(v))
+#endif
+
 #endif


There is an updated commit here for convenience: 
https://github.com/edwintorok/xen/commit/9855ed237f3f85ac30972bfd0a601c6746ba2353

Best regards,
--Edwin


Re: [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn

2022-12-02 Thread Edwin Torok



> On 1 Dec 2022, at 12:10, Andrew Cooper  wrote:
> 
> On 01/12/2022 11:20, Christian Lindig wrote:
>> 
>>> On 30 Nov 2022, at 16:54, Andrew Cooper  wrote:
>>> 
>>> Generally speaking, the event channel local/remote port is fixed for the
>>> lifetime of the associated domain object.  The exception to this is a
>>> secondary XS_INTRODUCE (defined to re-bind to a new event channel) which 
>>> pokes
>>> around at the domain object's internal state.
>>> 
>>> We need to refactor the evtchn handling to support live update, so start by
>>> moving the relevant manipulation into Domain.
>>> 
>>> No practical change.
>>> 
>>> Signed-off-by: Andrew Cooper 
>>> ---
>>> CC: Christian Lindig 
>>> CC: David Scott 
>>> CC: Edwin Torok 
>>> CC: Rob Hoes 
>> Acked-by: Christian Lindig 
> 
> Thanks.
> 
>> The code makes changes around if-expressions and it is easy to get mislead 
>> by indentation which parts are covered by an if and which are not in the 
>> presence of sequential code. I would be more confident about this with 
>> automatic formatting (but also believe this is correct).
> 
> I can keep the being/end if you'd prefer.
> 
> Looking at the end result, it would actually shrink the patch, so is
> probably worth doing anyway for clarity.  The net result is:
> 
> diff --git a/tools/ocaml/xenstored/process.ml
> b/tools/ocaml/xenstored/process.ml
> index b2973aca2a82..1c80e7198dbe 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -569,8 +569,7 @@ let do_introduce con t domains cons data =
> let edom = Domains.find domains domid in
> if (Domain.get_mfn edom) = mfn &&
> (Connections.find_domain cons domid) != con then begin
> (* Use XS_INTRODUCE for recreating the
> xenbus event-channel. *)
> -   edom.remote_port <- remote_port;
> -   Domain.bind_interdomain edom;
> +   Domain.rebind_evtchn edom remote_port;
> end;
> edom
> else try
> 
> I'm happy to adjust on commit.
> 
> When I started this, I tried rearranging things to avoid the "if exists
> then find" pattern, but quickly got into a mess, then realised that this
> is (almost) a dead logic path... I've got no idea why this is supported;
> looking through history, I can't find a case where a redundant
> XS_INTRODUCE was ever used, but its a common behaviour between C and O
> so there was clearly some reason...


Currently the soft reset code in xenopsd uses it, but as you say there must've 
been another reason too (the soft reset code is a lot more recent than this).

Best regards,
--Edwin


Re: [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair

2022-12-01 Thread Edwin Torok



> On 1 Dec 2022, at 14:22, Andrew Cooper  wrote:
> 
> On 01/12/2022 11:59, Christian Lindig wrote:
>>> On 30 Nov 2022, at 16:54, Andrew Cooper  wrote:
>>> 
>>> Inter-domain event channels are always a pair of local and remote ports.
>>> Right now the handling is asymmetric, caused by the fact that the evtchn is
>>> bound after the associated Domain object is constructed.
>>> 
>>> First, move binding of the event channel into the Domain.make() constructor.
>>> This means the local port no longer needs to be an option.  It also removes
>>> the final callers of Domain.bind_interdomain.
>>> 
>>> Next, introduce a new port_pair type to encapsulate the fact that these two
>>> should be updated together, and replace the previous port and remote_port
>>> fields.  This refactoring also changes the Domain.get_port interface 
>>> (removing
>>> an option) so take the opportunity to name it get_local_port instead.
>>> 
>>> Also, this fixes a use-after-free risk with Domain.close.  Once the evtchn 
>>> has
>>> been unbound, the same local port number can be reused for a different
>>> purpose, so explicitly invalidate the ports to prevent their accidental 
>>> misuse
>>> in the future.
>>> 
>>> This also cleans up some of the debugging, to always print a port pair.
>>> 
>>> Signed-off-by: Andrew Cooper 
>>> ---
>>> CC: Christian Lindig 
>>> CC: David Scott 
>>> CC: Edwin Torok 
>>> CC: Rob Hoes 
>> Acked-by: Christian Lindig 
> 
> Thanks.
> 
>> 
>>> v2:
>>> * New
>>> ---
>>> tools/ocaml/xenstored/connections.ml |  9 +
>>> tools/ocaml/xenstored/domain.ml  | 75 
>>> ++--
>>> tools/ocaml/xenstored/domains.ml |  2 -
>>> 3 files changed, 39 insertions(+), 47 deletions(-)
>>> 
>>> diff --git a/tools/ocaml/xenstored/connections.ml 
>>> b/tools/ocaml/xenstored/connections.ml
>>> index 7d68c583b43a..a80ae0bed2ce 100644
>>> --- a/tools/ocaml/xenstored/connections.ml
>>> +++ b/tools/ocaml/xenstored/connections.ml
>>> @@ -48,9 +48,7 @@ let add_domain cons dom =
>>> let xbcon = Xenbus.Xb.open_mmap ~capacity (Domain.get_interface dom) (fun 
>>> () -> Domain.notify dom) in
>>> let con = Connection.create xbcon (Some dom) in
>>> Hashtbl.add cons.domains (Domain.get_id dom) con;
>>> - match Domain.get_port dom with
>>> - | Some p -> Hashtbl.add cons.ports p con;
>>> - | None -> ()
>>> + Hashtbl.add cons.ports (Domain.get_local_port dom) con
>> I would prefer Hashtbl.replace. Hashtbl.add shadows an existing binding 
>> which becomes visible again after Hashtabl.remove. When we are sure that we 
>> only have one binding per key, we should use replace instead of add.
> 
> That's surprising behaviour.  Presumably the add->replace suggestion
> applies the other hashtable here (cons.domains)?  And possibly elsewhere
> too.


Yes:
* Hashtbl.add -> Hashtbl.replace
* Hashtbl.clear -> Hashtbl.reset

Using anything on the left is almost always an indication of a subtle bug (e.g. 
Hashtbl.clear won't release the memory used by the buckets, and the only time 
that is useful is if you'd immediately fill the hashtable with lots of elements 
again, really code should always use Hashtbl.reset but that only got introduced 
in OCaml 4.0.0, so older code won't have it).

And the use of Hashtbl.add can lead to "space leaks" (eventually OOM) unless 
one really knows what they are doing (i.e. there are only a finite number of 
add calls ever).

In XAPI we have a "quality gate" that counts the number of problematic 
functions/etc, and makes it a hard build time failure if any new usages are 
introduced (and we strive to reduce that to 0).
I don't think these 2 Hashtbl calls are there yet, but they probably should be.

Best regards,
--Edwin




Re: [PATCH v1 3/5] tools/ocaml/libs/xc: add hvm_param_get binding

2022-12-01 Thread Edwin Torok



> On 1 Dec 2022, at 14:22, Christian Lindig  wrote:
> 
> 
> 
>> On 1 Dec 2022, at 14:16, Edwin Torok  wrote:
>> 
>> The disadvantage is that we can't pattern match on it anymore.
>> 
>> Although we could have some OCaml code that does the pattern matching on 
>> another type and maps it to these private integer types.
>> However at that point we've manually reinvented what ctypes would already 
>> do, and we have an additional mapping step (which may not matter from a 
>> performance point of view but would be nice if we could avoid it).
> 
> I agree that this is a severe disadvantage. My method is only useful if they 
> are mostly passed around but not when they have an algebra defined over them 
> where you want to combine and match them.


It might be possible to use your method to implement a pure-OCaml ABI checker 
though.

Consider:
```
external constant_A : unit -> int = "caml_constant_A"
external constant_B : unit -> int = "caml_constant_B"
external constant_C : unit -> int = "caml_constant_C"

let check_match = List.iter @@ fun (ocaml_variant, c_constant) ->
   let r = Obj.repr ocaml_variant in
   assert (Obj.is_int r);
   assert ((Obj.magic r : int) = c_constant ())

type t = A | B | C
let () =
   [A, constant_A
   ;B, constant_B
   ;C, constant_C]
   |> check_match
```

(although perhaps with the 2nd assertion replaced by something that includes 
the constant in the exception raised to aid debugging)

This'd only detect the ABI mismatch at runtime (although it would detect it 
right on startup), so it'd need to be accompanied by a small unit test
that would link just this module to make any mismatches a build time failure.

Then there would be no runtime overhead when using these variants in function 
calls, and we'd know they match 1:1.
Although there is still a possibility of mismatching on names (the bug I 
originally had in my patch, which although wouldn't cause any runtime issues,
would be good to catch at build time too).

I'll keep thinking about this to see whether there is another easy approach we 
can use that doesn't require using Cstubs and doesn't rely on manually keeping
code in different files in sync.

Best regards,
--Edwin




Re: [PATCH v1 2/5] tools/ocaml/libs/xc: add binding to xc_evtchn_status

2022-12-01 Thread Edwin Torok


> On 1 Dec 2022, at 13:59, Andrew Cooper  wrote:
> 
> On 01/12/2022 13:35, Edwin Torok wrote:
>>> On 1 Dec 2022, at 11:34, Andrew Cooper  wrote:
>>> 
>>> On 30/11/2022 17:32, Edwin Török wrote:
>>>> +
>>>> +caml_enter_blocking_section();
>>>> +rc = xc_evtchn_status(_H(xch), );
>>>> +caml_leave_blocking_section();
>>>> +
>>>> +if ( rc < 0 )
>>>> +failwith_xc(_H(xch));
>>>> +
>>>> +if ( status.status == EVTCHNSTAT_closed )
>>>> +result = Val_none;
>>>> +else
>>>> +{
>>> This is actually one example where using a second CAMLreturn would
>>> simply things substantially.
>>> 
>>> switch ( status.status )
>>> {
>>> case EVTCHNSTAT_closed:
>>>CAMLreturn(Val_none);
>>> 
>>> case EVTCHNSTAT_unbound:
>>>...
>>> 
>>> Would remove the need for the outer if/else.
>> 
>> CAMLreturn has some macro magic to ensure it gets paired with the toplevel 
>> CAMLparam correctly (one of them opens a { scope and the other closes it, or 
>> something like that),
>> so I'd avoid putting it into the middle of other syntactic elements, it 
>> might just cause the build to fail (either now or in the future).
> 
> From the manual:
> 
> "The macros CAMLreturn, CAMLreturn0, and CAMLreturnT are used to replace
> the C keyword return. Every occurrence of return x must be replaced by ..."
> 
> It is common in C to have multiple returns, and the manual does say
> "Every occurrence" without stating any requirement for there to be a
> single occurrence.
> 
> CAMLreturn can't syntactically work splitting a scope with CAMLparam,
> because then this wouldn't compile:
> 
> CAMLprim value stub_xc_evtchn_status(value foo)
> {
> CAMLparam1(foo);
> int bar = 0;
> 
> retry:
> if ( bar )
> CAMLreturn(foo);
> 
> bar++
> goto retry;
> }
> 


I wouldn't expect it to, or at least I've never seen a C binding written that 
way (with CAMLreturn not as last statement),
but indeed nothing in the manual states that it wouldn't work.

> CAMLreturn does use a normal "do { ... } while (0)" construct simply for
> being a macro, and forces paring with CAMLparamX by referencing the
> caml__frame local variable by name.
> 
> 
> So I really do think that multiple CAMLreturns are fine and cannot
> reasonably be broken in the future.
> 
> But if we do really still want to keep a single return, then a "goto
> done" would be acceptable here and still better than the if/else.

I almost always used to use do/while(0) and break even in C as a replacement 
for 'goto',
especially if there are multiple nested resources that need cleaning up, 
do/while ensures you
unwind them in the correct order and don't accidentally skip one.
I think most code that is written using 'goto' can be rewritten not to use it, 
and might avoid some bugs in the process
(e.g. using goto might leave some local variables uninitialised).
I'm reluctant to introduce a goto just to decrease nesting level.

There might be another way to rewrite the code:
```
switch(status.status)
{
case EVTCHNSTAT_closed:
 stat = Val_none;
 break;
... other code that assigns to stat something other than None ...
}

if (Val_none == stat) {
   result = Val_none;
} else {
   .. code as it was before to construct a some ...
}

CAMLreturn(result);
```

This would then follow the logical order of how the values are constructed, and 
avoids the deep nesting of the switch.
(reading the code backwards from exit will show you how each piece is 
nested/constructed without jumps that alter control flow)

Val_none == is used instead of == Val_none to catch a typo where stat = 
Val_none would change stat, whereas Val_none = stat would be a compile error.

What do you think?

(might be slightly less efficient than the original version, but a reasonable C 
compiler should produce almost equal optimized code for both).

> 
>>> caml_alloc_some() is perhaps less interesting as it only appeared in
>>> Ocaml 4.12 AFAICT, but we could also have some ifdefary for that at the
>>> top of the file.
>>> 
>>> I don't know whether we have opencoded options elsewhere in the
>>> bindings, but it certainly would be reduce the amount opencoding that
>>> exists for standard patterns.
>> 
>> perhaps we can look into doing that cleanup as a separate patch.
> 
> Probably best, yes.
> 
> ~Andrew



Re: [PATCH v1 3/5] tools/ocaml/libs/xc: add hvm_param_get binding

2022-12-01 Thread Edwin Torok


> On 1 Dec 2022, at 13:57, Christian Lindig  wrote:
> 
> 
> 
>> On 1 Dec 2022, at 13:50, Edwin Torok  wrote:
>> 
>> Should we instead switch to using ctypes to generate these constants?
> 
> I would not advocate this. Ctypes is the kind of meta programming that is 
> great when it works but hell if it does not and it adds more dependencies. 

Perhaps use it to just generate the constant mappings?
Here is an example of how I used it elsewhere:
https://github.com/xapi-project/ocaml-dlm/blob/master/lib_gen/types/bindings_structs.ml#L30-L55

> 
> I just had a discussion with Andrew about other tricks how to bring C 
> constants to the ML side in order to decouple them. I’m using it in my Polly 
> library - it might not be the solution for Xen but worth knowing.
> 
> https://github.com/lindig/polly/blob/master/lib/polly_stubs.c#L23-L39


The disadvantage is that we can't pattern match on it anymore.

Although we could have some OCaml code that does the pattern matching on 
another type and maps it to these private integer types.
However at that point we've manually reinvented what ctypes would already do, 
and we have an additional mapping step (which may not matter from a performance 
point of view but would be nice if we could avoid it).

I'll have to do some experiments to see how the code generated by ctypes looks 
like in this case (actually the 'cstubs' part of it), and how different it 
would be from manually writing it
(i.e. can we still reasonably review the generated code, and would it look like 
something that we'd write ourselves?)

Best regards,
--Edwin



Re: [PATCH v1 3/5] tools/ocaml/libs/xc: add hvm_param_get binding

2022-12-01 Thread Edwin Torok


> On 1 Dec 2022, at 11:51, Andrew Cooper  wrote:
> 
> On 30/11/2022 17:32, Edwin Török wrote:
>> diff --git a/tools/ocaml/libs/xc/xenctrl.mli 
>> b/tools/ocaml/libs/xc/xenctrl.mli
>> index 60e7902e66..f6c7e5b553 100644
>> --- a/tools/ocaml/libs/xc/xenctrl.mli
>> +++ b/tools/ocaml/libs/xc/xenctrl.mli
>> @@ -236,6 +236,51 @@ external map_foreign_range :
>>   handle -> domid -> int -> nativeint -> Xenmmap.mmap_interface
>>   = "stub_map_foreign_range"
>> 
>> +(* needs to be sorted according to its numeric value, watch out for gaps! *)
>> +type hvm_param =
>> +  | HVM_PARAM_CALLBACK_IRQ
>> +  | HVM_PARAM_STORE_PFN
>> +  | HVM_PARAM_STORE_EVTCHN
>> +  | HVM_PARAM_UNDEFINED_3
> 
> Can we perhaps use
> 
> | _HVM_PARAM_UNDEF_3
> 
> with a leading underscore to highlight that its just a placeholder for a
> hole ?

I tried this, but I get a compile error if I attempt to start a variant name 
with and underscore.

> 
>> +  | HVM_PARAM_PAE_ENABLED
>> +  | HVM_PARAM_IOREQ_PFN
>> +  | HVM_PARAM_BUFIOREQ_PFN
>> +  | HVM_PARAM_UNDEFINED_7
>> +  | HVM_PARAM_UNDEFINED_8
>> +  | HVM_PARAM_VIRIDIAN
>> +  | HVM_PARAM_TIMER_MODE0
> 
> From TIMER_MODE onwards, you appear to have a trailing digit on each
> constant name.  It appears to be the final digit of the params numeric
> value.
> 

I think I see how that happened (I had the numbers side by side to check that I 
filled in all the wholes, and then used the wrong regex to remove them,
which worked on single digit numbers, but not double).
I'm fixing this up in my tree now.

>> +  | HVM_PARAM_HPET_ENABLED1
>> +  | HVM_PARAM_IDENT_PT2
>> +  | HVM_PARAM_UNDEFINED_13
>> +  | HVM_PARAM_ACPI_S_STATE4
>> +  | HVM_PARAM_VM86_TSS5
>> +  | HVM_PARAM_VPT_ALIGN6
>> +  | HVM_PARAM_CONSOLE_PFN7
>> +  | HVM_PARAM_CONSOLE_EVTCHN8
>> +  | HVM_PARAM_ACPI_IOPORTS_LOCATION9
>> +  | HVM_PARAM_MEMORY_EVENT_CR00
>> +  | HVM_PARAM_MEMORY_EVENT_CR31
>> +  | HVM_PARAM_MEMORY_EVENT_CR42
>> +  | HVM_PARAM_MEMORY_EVENT_INT33
>> +  | HVM_PARAM_NESTEDHVM4
>> +  | HVM_PARAM_MEMORY_EVENT_SINGLE_STEP5
>> +  | HVM_PARAM_UNDEFINED_26
>> +  | HVM_PARAM_PAGING_RING_PFN7
>> +  | HVM_PARAM_MONITOR_RING_PFN8
>> +  | HVM_PARAM_SHARING_RING_PFN9
>> +  | HVM_PARAM_MEMORY_EVENT_MSR0
>> +  | HVM_PARAM_TRIPLE_FAULT_REASON1
>> +  | HVM_PARAM_IOREQ_SERVER_PFN2
>> +  | HVM_PARAM_NR_IOREQ_SERVER_PAGES3
>> +  | HVM_PARAM_VM_GENERATION_ID_ADDR4
>> +  | HVM_PARAM_ALTP2M5
>> +  | HVM_PARAM_X87_FIP_WIDTH6
>> +  | HVM_PARAM_VM86_TSS_SIZED7
>> +  | HVM_PARAM_MCA_CAP8
> 
> Similarly with EVENTSTAT_*, we ought to engage the build time ABI check.


It looks like we'd need to write a new ABI check, but I'm not familiar with the 
ABI checker script here,
and relying on a script to parse the OCaml source code and check the ABI seems 
brittle (but no less brittle than not having a check at all).

Should we instead switch to using ctypes to generate these constants? It can 
run at build time and produce a .ml file based on a build time test
by including the actual C header and getting the correct constant value. But 
it'd make cross-compilation (if Xen supports that?) more difficult.
Or we could run it ourselves by hand, and commit the result so that end-users 
do not need to have or run ctypes, just developers who change these bindings
(similar to how it is usual to commit the output from autoconf and automake 
into git to not require end-users to rerun these).

However a move to ctypes would require quite a lot of build time changes that 
I'd rather start only once we switched to Dune, it is not worthwhile doing in 
the current Makefile based
build system.

> 
> But there isn't a suitable end delimiter, and these are only ever an
> input into a binding (never a return), so it's not the end of the world
> if new constants get missed.  (Not that new constants are likely. 
> HVM_PARAMs are a gross bodge which I'm trying to phase out.)
> 
>> +
>> +external hvm_param_get: handle -> domid -> hvm_param -> int64
>> +  = "stub_xc_hvm_param_get"
> 
> IMO we should bind set at the same time.  It's trivial to do, and far
> easier to do now than at some point in the future when we first need it.
> 
>> +
>> external domain_assign_device: handle -> domid -> (int * int * int * int) -> 
>> unit
>>   = "stub_xc_domain_assign_device"
>> external domain_deassign_device: handle -> domid -> (int * int * int * int) 
>> -> unit
>> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
>> b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> index 67f3648391..b2df93d4f8 100644
>> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
>> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> @@ -1176,6 +1176,22 @@ CAMLprim value stub_xc_domain_irq_permission(value 
>> xch, value domid,
>> CAMLreturn(Val_unit);
>> }
>> 
>> +CAMLprim value stub_xc_hvm_param_get(value xch, value domid, value param)
>> +{
>> +CAMLparam3(xch, domid, param);
>> +uint64_t result;
> 
> result is commonly a value in these bindings.  'val' would be the more
> common name to use here.
> 
> ~Andrew



Re: [PATCH v1 2/5] tools/ocaml/libs/xc: add binding to xc_evtchn_status

2022-12-01 Thread Edwin Torok


> On 1 Dec 2022, at 11:34, Andrew Cooper  wrote:
> 
> On 30/11/2022 17:32, Edwin Török wrote:
>> There is no API or ioctl to query event channel status, it is only
>> present in xenctrl.h
> 
> Yeah, this is very unfortunate, because it really wanted to be part of
> the xenevtchn stable API/ABI.
> 
>> The C union is mapped to an OCaml variant exposing just the value from the
>> correct union tag.
>> 
>> Querying event channel status is useful when analyzing Windows VMs that
>> may have reset and changed the xenstore event channel port number from
>> what it initially got booted with.
> 
> This paragraph is why we need it now, but it's not really relevant for
> the upstream commit.  I'd drop this sentence, and simply how the lower
> one noting the similarity to lsevtchn.
> 
>> The information provided here is similar to 'lstevtchn', but rather than
> 
> "lsevtchn".
> 
>> parsing its output it queries the underlying API directly.
>> 
>> Signed-off-by: Edwin Török 
>> ---
>> tools/ocaml/libs/xc/xenctrl.ml  | 14 +++
>> tools/ocaml/libs/xc/xenctrl.mli | 15 +++
>> tools/ocaml/libs/xc/xenctrl_stubs.c | 65 +
>> 3 files changed, 94 insertions(+)
>> 
>> diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
>> index 2ed7454b16..c21e391f98 100644
>> --- a/tools/ocaml/libs/xc/xenctrl.ml
>> +++ b/tools/ocaml/libs/xc/xenctrl.ml
>> @@ -267,6 +267,20 @@ external evtchn_alloc_unbound: handle -> domid -> domid 
>> -> int
>>   = "stub_xc_evtchn_alloc_unbound"
>> external evtchn_reset: handle -> domid -> unit = "stub_xc_evtchn_reset"
>> 
>> +type evtchn_interdomain = { dom: domid; port: int}
> 
> Strictly speaking, port needs to be int32.
> 
> ABI-wise, it can be configured as large as 2^32-2 during domain creation.
> 
> However, FIFO currently tops out at 2^17 and has a theoretical maximum
> at 2^28, so perhaps int ought to enough for now.
> 
>> +
>> +type evtchn_stat =
>> +  | EVTCHNSTAT_unbound of domid
>> +  | EVTCHNSTAT_interdomain of evtchn_interdomain
>> +  | EVTCHNSTAT_pirq of int
>> +  | EVTCHNSTAT_virq of int
> 
> Similar comment.  A vcpu id should in principle be int32
> 
>> +  | EVTCHNSTAT_ipi
> 
> Normally when having an enumeration like this, we want to hook up the
> build-time ABI check.
> 
> But in this case, it's produced by the bindings (not consumed by them),
> and there's an exception raised in the default case, so I don't think we
> need the build-time ABI check for any kind of safety (and therefore
> shouldn't go to the reasonably-invasive effort of adding the check).

Yes, I was looking for how to add an ABI check there, but other places like the 
featureset enum doesn't have it either.
The ABI check only seems to exist for the case where the values are used as bit 
flags.

> 
>> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
>> b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> index d30585f21c..67f3648391 100644
>> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
>> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> @@ -641,6 +641,71 @@ CAMLprim value stub_xc_evtchn_reset(value xch, value 
>> domid)
>> CAMLreturn(Val_unit);
>> }
>> 
>> +CAMLprim value stub_xc_evtchn_status(value xch, value domid, value port)
>> +{
>> +CAMLparam3(xch, domid, port);
>> +CAMLlocal4(result, result_status, stat, interdomain);
>> +xc_evtchn_status_t status;
>> +int rc;
>> +
>> +memset(, 0, sizeof(status));
>> +status.dom = _D(domid);
>> +status.port = Int_val(port);
> 
> xc_evtchn_status_t status = {
> .dom = _D(domid),
> .port = Int_val(port),
> };
> 
> is the marginally preferred way of doing this.  It removes potential
> issues with typo-ing the memset().
> 
>> +
>> +caml_enter_blocking_section();
>> +rc = xc_evtchn_status(_H(xch), );
>> +caml_leave_blocking_section();
>> +
>> +if ( rc < 0 )
>> +failwith_xc(_H(xch));
>> +
>> +if ( status.status == EVTCHNSTAT_closed )
>> +result = Val_none;
>> +else
>> +{
> 
> This is actually one example where using a second CAMLreturn would
> simply things substantially.
> 
> switch ( status.status )
> {
> case EVTCHNSTAT_closed:
> CAMLreturn(Val_none);
> 
> case EVTCHNSTAT_unbound:
> ...
> 
> Would remove the need for the outer if/else.


CAMLreturn has some macro magic to ensure it gets paired with the toplevel 
CAMLparam correctly (one of them opens a { scope and the other closes it, or 
something like that),
so I'd avoid putting it into the middle of other syntactic elements, it might 
just cause the build to fail (either now or in the future).

> 
> 
>> +switch ( status.status )
>> +{
>> +case EVTCHNSTAT_unbound:
>> +stat = caml_alloc(1, 0); /* 1st non-constant constructor */
>> +Store_field(stat, 0, Val_int(status.u.unbound.dom));
>> +break;
>> +
>> +case EVTCHNSTAT_interdomain:
>> +interdomain = caml_alloc_tuple(2);
>> +Store_field(interdomain, 0, 

Re: [PATCH v1 5/5] CODING_STYLE: add .clang-format

2022-12-01 Thread Edwin Torok


> On 1 Dec 2022, at 09:11, Julien Grall  wrote:
> 
> Hi Edwin,
> 
> The title should have "OCaml" to clarify that .clang-format is not added at 
> the root level.
> 

Sure, I'll update that when I resend.

> On 30/11/2022 17:32, Edwin Török wrote:
>> Add a .clang-format configuration that tries to match CODING_STYLE where
>> possible.
>> I was not able to express the special casing of braces after 'do'
>> though, this can only be controlled generally for all control
>> statements.
>> It is imperfect, but should be better than the existing bindings, which
>> do not follow Xen coding style.
> 
> Right, from previous discussion, I was under the impression that it requires 
> some work to write a clang-format file for Xen.
> 
> I am hopeful that some day we will have a proper one. In fact, we have been 
> discussing about this as part of MISRA (+ Stefano).
> 
>> Add this to tools/ocaml first because:
>> * there are relatively few C files here, and it is a good place to start with
>> * it'd be useful to make these follow Xen's CODING_STYLE
>> (which they currently do not because they use tabs for example)
>> * they change relatively infrequently, so shouldn't cause issues with
>>   backporting security fixes (could either backport the reindentation
>>   patch too, or use git cherry-pick with `-Xignore-space-change`)
>> Once this is used it'll need inserting some '#include ', otherwise 
>> xs_wire.h
>> fails to compile due to the missing uint32_t define.
> 
> At first I was a bit concerned with this paragraph because a coding style 
> should not impact compilation. But I guess that's because the format will 
> convert u32 to uint32_t. Is that correct?
> 
> If so, I would expand the paragraph to explicit say that, per the coding 
> styel, u32 will be converted to uint32_t.


clang-format rearranges the order of '#include' in C files, it shouldn't 
convert types.
But rearranging (sorting) includes can sometimes reveal bugs where the code 
only happened to compile because the includes were done in a certain order
(e.g. we included something that included stdint.h, therefore the next include 
line worked, but if you swap the include order that no longer works), i.e.:

#include "c.h"
#include "b.h"

vs

/* post formatting */
#include "b.h"
#include "c.h"

Where c.h includes a.h, and b.h depends on declarations from a.h, then prior to 
reformat the code compiles, and afterwards it doesn't.

Which can be fixed by adding this to the C file (and then regardless of include 
order of the other 2 it compiles):
#include 

Or by fixing b.h to include a.h itself it it depends on it.


Perhaps this'd better be fixed in xs_wire.h itself to include all its 
dependencies, but that is a publicly installed header, and there might be a 
reason why it doesn't include stdint.h.
(neither u32, nor uint32_t is standard C, both require a header to be included 
for them to be available)

Best regards,
--Edwin
> 
> Cheers,
> 
> -- 
> Julien Grall



Re: [PATCH v2 5/6] tools/oxenstored: Rework Domain evtchn handling to use port_pair

2022-11-30 Thread Edwin Torok


> On 30 Nov 2022, at 16:54, Andrew Cooper  wrote:
> 
> Inter-domain event channels are always a pair of local and remote ports.
> Right now the handling is asymmetric, caused by the fact that the evtchn is
> bound after the associated Domain object is constructed.
> 
> First, move binding of the event channel into the Domain.make() constructor.
> This means the local port no longer needs to be an option.  It also removes
> the final callers of Domain.bind_interdomain.
> 
> Next, introduce a new port_pair type to encapsulate the fact that these two
> should be updated together, and replace the previous port and remote_port
> fields.  This refactoring also changes the Domain.get_port interface (removing
> an option) so take the opportunity to name it get_local_port instead.
> 
> Also, this fixes a use-after-free risk with Domain.close.  Once the evtchn has
> been unbound, the same local port number can be reused for a different
> purpose, so explicitly invalidate the ports to prevent their accidental misuse
> in the future.
> 
> This also cleans up some of the debugging, to always print a port pair.

Reviewed in-person, I've suggested to use explicit labeled arguments for the 
case where multiple integers with very close semantic meaning are passed as 
arguments,
e.g. local vs remote port, it'd be quite easy to accidentally swap them in the 
caller, leading to bugs.


Reviewed-by: Edwin Török 

> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Christian Lindig 
> CC: David Scott 
> CC: Edwin Torok 
> CC: Rob Hoes 
> 
> v2:
> * New
> ---
> tools/ocaml/xenstored/connections.ml |  9 +
> tools/ocaml/xenstored/domain.ml  | 75 ++--
> tools/ocaml/xenstored/domains.ml |  2 -
> 3 files changed, 39 insertions(+), 47 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/connections.ml 
> b/tools/ocaml/xenstored/connections.ml
> index 7d68c583b43a..a80ae0bed2ce 100644
> --- a/tools/ocaml/xenstored/connections.ml
> +++ b/tools/ocaml/xenstored/connections.ml
> @@ -48,9 +48,7 @@ let add_domain cons dom =
> let xbcon = Xenbus.Xb.open_mmap ~capacity (Domain.get_interface dom) (fun () 
> -> Domain.notify dom) in
> let con = Connection.create xbcon (Some dom) in
> Hashtbl.add cons.domains (Domain.get_id dom) con;
> - match Domain.get_port dom with
> - | Some p -> Hashtbl.add cons.ports p con;
> - | None -> ()
> + Hashtbl.add cons.ports (Domain.get_local_port dom) con
> 
> let select ?(only_if = (fun _ -> true)) cons =
> Hashtbl.fold (fun _ con (ins, outs) ->
> @@ -97,10 +95,7 @@ let del_domain cons id =
> let con = find_domain cons id in
> Hashtbl.remove cons.domains id;
> (match Connection.get_domain con with
> - | Some d ->
> -   (match Domain.get_port d with
> -| Some p -> Hashtbl.remove cons.ports p
> -| None -> ())
> + | Some d -> Hashtbl.remove cons.ports (Domain.get_local_port d)
> | None -> ());
> del_watches cons con;
> Connection.close con
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index d59a9401e211..ecdd65f3209a 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -19,14 +19,31 @@ open Printf
> let debug fmt = Logging.debug "domain" fmt
> let warn  fmt = Logging.warn  "domain" fmt
> 
> +(* An event channel port pair.  The remote port, and the local port it is
> +   bound to. *)
> +type port_pair =
> +{
> + local: Xeneventchn.t;
> + remote: int;
> +}
> +
> +(* Sentinal port_pair with both set to EVTCHN_INVALID *)
> +let invalid_ports =
> +{
> + local = Xeneventchn.of_int 0;
> + remote = 0
> +}
> +
> +let string_of_port_pair p =
> + sprintf "(l %d, r %d)" (Xeneventchn.to_int p.local) p.remote
> +
> type t =
> {
> id: Xenctrl.domid;
> mfn: nativeint;
> interface: Xenmmap.mmap_interface;
> eventchn: Event.t;
> - mutable remote_port: int;
> - mutable port: Xeneventchn.t option;
> + mutable ports: port_pair;
> mutable bad_client: bool;
> mutable io_credit: int; (* the rounds of ring process left to do, default is 
> 0,
>   usually set to 1 when there is work detected, could
> @@ -41,8 +58,8 @@ let is_dom0 d = d.id = 0
> let get_id domain = domain.id
> let get_interface d = d.interface
> let get_mfn d = d.mfn
> -let get_remote_port d = d.remote_port
> -let get_port d = d.port
> +let get_remote_port d = d.ports.remote
> +let get_local_port d = d.ports.local
> 
> let is_bad_domain domain = domain.bad_client
> let mark_as_bad domain = domain.bad_client <- true
> @@ -56,54 +73,36 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 
> 0.0
> 
&

Re: [PATCH v2 2/6] tools/oxenstored: Bind the DOM_EXC VIRQ in in Event.init()

2022-11-30 Thread Edwin Torok


> On 30 Nov 2022, at 16:54, Andrew Cooper  wrote:
> 
> Xenstored always needs to bind the DOM_EXC VIRQ.
> 
> Instead of doing it shortly after the call to Event.init(), do it in the
> init() call itself.  This removes the need for the field to be a mutable
> option.
> 
> It will also simplify a future change to restore both parts from the live
> update record, rather than re-initialising them from scratch.
> 
> Rename the field from virq_port (which could be any VIRQ) to it's proper name.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Christian Lindig 
> CC: David Scott 
> CC: Edwin Torok 
> CC: Rob Hoes 


reviewd in person:

Reviewed-by: Edwin Török 

> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/event.ml | 9 ++---
> tools/ocaml/xenstored/xenstored.ml | 4 +---
> 2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/event.ml b/tools/ocaml/xenstored/event.ml
> index ccca90b6fc4f..a3be296374ff 100644
> --- a/tools/ocaml/xenstored/event.ml
> +++ b/tools/ocaml/xenstored/event.ml
> @@ -17,12 +17,15 @@
> ( high level binding )
> type t = {
> handle: Xeneventchn.handle;
> - mutable virq_port: Xeneventchn.t option;
> + domexc: Xeneventchn.t;
> }
> 
> -let init () = { handle = Xeneventchn.init (); virq_port = None; }
> +let init () =
> + let handle = Xeneventchn.init () in
> + let domexc = Xeneventchn.bind_dom_exc_virq handle in
> + { handle; domexc }
> +
> let fd eventchn = Xeneventchn.fd eventchn.handle
> -let bind_dom_exc_virq eventchn = eventchn.virq_port <- Some 
> (Xeneventchn.bind_dom_exc_virq eventchn.handle)
> let bind_interdomain eventchn domid port = Xeneventchn.bind_interdomain 
> eventchn.handle domid port
> let unbind eventchn port = Xeneventchn.unbind eventchn.handle port
> let notify eventchn port = Xeneventchn.notify eventchn.handle port
> diff --git a/tools/ocaml/xenstored/xenstored.ml 
> b/tools/ocaml/xenstored/xenstored.ml
> index c5dc7a28d082..55071b49eccb 100644
> --- a/tools/ocaml/xenstored/xenstored.ml
> +++ b/tools/ocaml/xenstored/xenstored.ml
> @@ -397,7 +397,6 @@ let _ =
> if cf.restart && Sys.file_exists Disk.xs_daemon_database then (
> let rwro = DB.from_file store domains cons Disk.xs_daemon_database in
> info "Live reload: database loaded";
> - Event.bind_dom_exc_virq eventchn;
> Process.LiveUpdate.completed ();
> rwro
> ) else (
> @@ -413,7 +412,6 @@ let _ =
> 
> if cf.domain_init then (
> Connections.add_domain cons (Domains.create0 domains);
> - Event.bind_dom_exc_virq eventchn
> );
> rw_sock
> ) in
> @@ -451,7 +449,7 @@ let _ =
> let port = Event.pending eventchn in
> debug "pending port %d" (Xeneventchn.to_int port);
> finally (fun () ->
> - if Some port = eventchn.Event.virq_port then (
> + if port = eventchn.Event.domexc then (
> let (notify, deaddom) = Domains.cleanup domains in
> List.iter (Store.reset_permissions store) deaddom;
> List.iter (Connections.del_domain cons) deaddom;
> -- 
> 2.11.0
> 



Re: [PATCH v2 3/6] tools/oxenstored: Rename some 'port' variables to 'remote_port'

2022-11-30 Thread Edwin Torok


> On 30 Nov 2022, at 16:54, Andrew Cooper  wrote:
> 
> This will make the logic clearer when we plumb local_port through these
> functions.
> 
> While changing this, simplify the construct in Domains.create0 to separate the
> remote port handling from the interface.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Christian Lindig 
> CC: David Scott 
> CC: Edwin Torok 
> CC: Rob Hoes 

We've reviewed this change in-person:
Reviewed-by: Edwin Török 


> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/domains.ml   | 26 --
> tools/ocaml/xenstored/process.ml   | 12 ++--
> tools/ocaml/xenstored/xenstored.ml |  8 
> 3 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domains.ml 
> b/tools/ocaml/xenstored/domains.ml
> index 17fe2fa25772..26018ac0dd3d 100644
> --- a/tools/ocaml/xenstored/domains.ml
> +++ b/tools/ocaml/xenstored/domains.ml
> @@ -122,9 +122,9 @@ let cleanup doms =
> let resume _doms _domid =
> ()
> 
> -let create doms domid mfn port =
> +let create doms domid mfn remote_port =
> let interface = Xenctrl.map_foreign_range xc domid (Xenmmap.getpagesize()) 
> mfn in
> - let dom = Domain.make domid mfn port interface doms.eventchn in
> + let dom = Domain.make domid mfn remote_port interface doms.eventchn in
> Hashtbl.add doms.table domid dom;
> Domain.bind_interdomain dom;
> dom
> @@ -133,18 +133,16 @@ let xenstored_kva = ref ""
> let xenstored_port = ref ""
> 
> let create0 doms =
> - let port, interface =
> - (
> - let port = Utils.read_file_single_integer !xenstored_port
> - and fd = Unix.openfile !xenstored_kva
> -   [ Unix.O_RDWR ] 0o600 in
> - let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED
> -  (Xenmmap.getpagesize()) 0 in
> - Unix.close fd;
> - port, interface
> - )
> - in
> - let dom = Domain.make 0 Nativeint.zero port interface doms.eventchn in
> + let remote_port = Utils.read_file_single_integer !xenstored_port in
> +
> + let interface =
> + let fd = Unix.openfile !xenstored_kva [ Unix.O_RDWR ] 0o600 in
> + let interface = Xenmmap.mmap fd Xenmmap.RDWR Xenmmap.SHARED 
> (Xenmmap.getpagesize()) 0 in
> + Unix.close fd;
> + interface
> + in
> +
> + let dom = Domain.make 0 Nativeint.zero remote_port interface doms.eventchn 
> in
> Hashtbl.add doms.table 0 dom;
> Domain.bind_interdomain dom;
> Domain.notify dom;
> diff --git a/tools/ocaml/xenstored/process.ml 
> b/tools/ocaml/xenstored/process.ml
> index 72a79e9328dd..b2973aca2a82 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -558,10 +558,10 @@ let do_transaction_end con t domains cons data =
> let do_introduce con t domains cons data =
> if not (Connection.is_dom0 con)
> then raise Define.Permission_denied;
> - let (domid, mfn, port) =
> + let (domid, mfn, remote_port) =
> match (split None '\000' data) with
> - | domid :: mfn :: port :: _ ->
> - int_of_string domid, Nativeint.of_string mfn, int_of_string port
> + | domid :: mfn :: remote_port :: _ ->
> + int_of_string domid, Nativeint.of_string mfn, int_of_string remote_port
> | _ -> raise Invalid_Cmd_Args;
> in
> let dom =
> @@ -569,18 +569,18 @@ let do_introduce con t domains cons data =
> let edom = Domains.find domains domid in
> if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != con 
> then begin
> (* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
> - edom.remote_port <- port;
> + edom.remote_port <- remote_port;
> Domain.bind_interdomain edom;
> end;
> edom
> else try
> - let ndom = Domains.create domains domid mfn port in
> + let ndom = Domains.create domains domid mfn remote_port in
> Connections.add_domain cons ndom;
> Connections.fire_spec_watches (Transaction.get_root t) cons 
> Store.Path.introduce_domain;
> ndom
> with _ -> raise Invalid_Cmd_Args
> in
> - if (Domain.get_remote_port dom) <> port || (Domain.get_mfn dom) <> mfn then
> + if (Domain.get_remote_port dom) <> remote_port || (Domain.get_mfn dom) <> 
> mfn then
> raise Domain_not_match
> 
> let do_release con t domains cons data =
> diff --git a/tools/ocaml/xenstored/xenstored.ml 
> b/tools/ocaml/xenstored/xenstored.ml
> index 55071b49eccb..1f11f576b515 100644
> --- a/tools/ocaml/xenstored/xenstored.ml
> +++ b/tools/ocaml/xenstored/xenstored.ml
> @@ -167,10 +167,10 @@ let from_channel_f chan global_f socket_f domain_f 
> watch_f store_f =
> global_f ~rw
> | "socket" :: fd :: [] ->
> socket_f ~fd:(int_of_string fd)
> - | "dom" :: domid :: mfn :: port :: []-&

Re: [PATCH v2 4/6] tools/oxenstored: Implement Domain.rebind_evtchn

2022-11-30 Thread Edwin Torok


> On 30 Nov 2022, at 16:54, Andrew Cooper  wrote:
> 
> Generally speaking, the event channel local/remote port is fixed for the
> lifetime of the associated domain object.  The exception to this is a
> secondary XS_INTRODUCE (defined to re-bind to a new event channel) which pokes
> around at the domain object's internal state.
> 
> We need to refactor the evtchn handling to support live update, so start by
> moving the relevant manipulation into Domain.
> 
> No practical change.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Christian Lindig 
> CC: David Scott 
> CC: Edwin Torok 
> CC: Rob Hoes 
> 
> Note: This change deliberately doesn't reuse Domain.bind_interdomain, which is
> removed by the end of the refactoring.


Reviewed-by: Edwin Török 

> 
> v2:
> * New.
> ---
> tools/ocaml/xenstored/domain.ml  | 12 
> tools/ocaml/xenstored/process.ml |  6 ++
> 2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index ab08dcf37f62..d59a9401e211 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -63,6 +63,18 @@ let string_of_port = function
> let dump d chan =
> fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
> 
> +let rebind_evtchn d remote_port =
> + begin match d.port with
> + | None -> ()
> + | Some p -> Event.unbind d.eventchn p
> + end;
> + let local = Event.bind_interdomain d.eventchn d.id remote_port in
> + debug "domain %d rebind (l %s, r %d) => (l %d, r %d)"
> +  d.id (string_of_port d.port) d.remote_port
> +  (Xeneventchn.to_int local) remote_port;
> + d.remote_port <- remote_port;
> + d.port <- Some (local)
> +
> let notify dom =
> match dom.port with
> | None -> warn "domain %d: attempt to notify on unknown port" dom.id
> diff --git a/tools/ocaml/xenstored/process.ml 
> b/tools/ocaml/xenstored/process.ml
> index b2973aca2a82..2ea940d7e2d5 100644
> --- a/tools/ocaml/xenstored/process.ml
> +++ b/tools/ocaml/xenstored/process.ml
> @@ -567,11 +567,9 @@ let do_introduce con t domains cons data =
> let dom =
> if Domains.exist domains domid then
> let edom = Domains.find domains domid in
> - if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != 
> con then begin
> + if (Domain.get_mfn edom) = mfn && (Connections.find_domain cons domid) != 
> con then
> (* Use XS_INTRODUCE for recreating the xenbus event-channel. *)
> - edom.remote_port <- remote_port;
> - Domain.bind_interdomain edom;
> - end;
> + Domain.rebind_evtchn edom remote_port;
> edom
> else try
> let ndom = Domains.create domains domid mfn remote_port in
> -- 
> 2.11.0
> 



Re: [PATCH v2 1/6] tools/oxenstored: Style fixes to Domain

2022-11-30 Thread Edwin Torok


> On 30 Nov 2022, at 16:54, Andrew Cooper  wrote:
> 
> This file has some style problems so severe that they interfere with the
> readability of the subsequent bugfix patches.
> 
> Fix these issues ahead of time, to make the subsequent changes more readable.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Christian Lindig 
> CC: David Scott 
> CC: Edwin Torok 
> CC: Rob Hoes 


Reviewed-by: Edwin Török 

> 
> v2:
> * New
> ---
> tools/ocaml/xenstored/domain.ml | 16 +++-
> 1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/domain.ml b/tools/ocaml/xenstored/domain.ml
> index 81cb59b8f1a2..ab08dcf37f62 100644
> --- a/tools/ocaml/xenstored/domain.ml
> +++ b/tools/ocaml/xenstored/domain.ml
> @@ -57,17 +57,16 @@ let is_paused_for_conflict dom = dom.conflict_credit <= 
> 0.0
> let is_free_to_conflict = is_dom0
> 
> let string_of_port = function
> -| None -> "None"
> -| Some x -> string_of_int (Xeneventchn.to_int x)
> + | None -> "None"
> + | Some x -> string_of_int (Xeneventchn.to_int x)

I would've expected ocp-indent to already do the right thing on this part.

> 
> let dump d chan =
> fprintf chan "dom,%d,%nd,%d\n" d.id d.mfn d.remote_port
> 
> -let notify dom = match dom.port with
> -| None ->
> - warn "domain %d: attempt to notify on unknown port" dom.id
> -| Some port ->
> - Event.notify dom.eventchn port
> +let notify dom =
> + match dom.port with
> + | None -> warn "domain %d: attempt to notify on unknown port" dom.id
> + | Some port -> Event.notify dom.eventchn port

but yes for this we'd need ocamlformat, not ocp-indent.

> 
> let bind_interdomain dom =
> begin match dom.port with
> @@ -84,8 +83,7 @@ let close dom =
> | None -> ()
> | Some port -> Event.unbind dom.eventchn port
> end;
> - Xenmmap.unmap dom.interface;
> - ()
> + Xenmmap.unmap dom.interface
> 
> let make id mfn remote_port interface eventchn = {
> id = id;
> -- 
> 2.11.0
> 



Re: [PATCH] tools/ocaml/xenctrl: OCaml 5 support, fix use-after-free

2022-11-24 Thread Edwin Torok



> On 24 Nov 2022, at 13:43, Andrew Cooper  wrote:
> 
> On 24/11/2022 09:03, Edwin Torok wrote:
>>> On 23 Nov 2022, at 22:25, Andrew Cooper  wrote:
>>> 
>>> The binding for xc_interface_close() free the underlying handle while 
>>> leaving
>>> the Ocaml object still in scope and usable.  This would make it easy to 
>>> suffer
>>> a use-after-free, if it weren't for the fact that the typical usage is as a
>>> singleton that lives for the lifetime of the program.
>>> 
>>> Ocaml 5 no longer permits storing a naked C pointer in an Ocaml value.
>>> 
>>> Therefore, use a Custom block.  This allows us to use the finaliser callback
>>> to call xc_interface_close(), if the Ocaml object goes out of scope.
>>> 
>>> Signed-off-by: Andrew Cooper 
>>> ---
>>> CC: Christian Lindig 
>>> CC: David Scott 
>>> CC: Edwin Torok 
>>> CC: Rob Hoes 
>>> 
>>> I've confirmed that Xenctrl.close_handle does cause the finaliser to be
>>> called, simply by dropping the handle reference.
>> 
>> Thanks, a good way to test this is with OCAMLRUNPARAM=c, possible under 
>> valgrind, which causes all finalisers to be called on exit
>> (normally they are not because the program is exiting anyway)
> 
> I do that anyway, but it's not relevant here.
> 
> What matters is checking that calling close_handle releases the object
> (albeit with a forced GC sweep) before the program ends.
> 
>>> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
>>> b/tools/ocaml/libs/xc/xenctrl_stubs.c
>>> index f37848ae0bb3..4e1204085422 100644
>>> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
>>> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
>>> @@ -37,13 +37,28 @@
>>> 
>>> #include "mmap_stubs.h"
>>> 
>>> -#define _H(__h) ((xc_interface *)(__h))
>>> +#define _H(__h) (*((xc_interface **)Data_custom_val(__h)))
>>> #define _D(__d) ((uint32_t)Int_val(__d))
>> 
>> I think this requires an update in xenopsd too to match, otherwise it'll 
>> crash:
>> https://github.com/xapi-project/xenopsd/blob/master/c_stubs/xenctrlext_stubs.c#L32
> 
> Urgh.  I'll take a note to do that when bringing in the change.
> 
>> This wasn't an issue with the original patch which used Data_abstract_val 
>> here, because
>> that (currently) happens to boil down to just a cast (with some GC metadata 
>> *before* it),
>> so the old way of just casting OCaml value to C pointer still worked.
>> 
>> However Data_custom_val boils down to accessing a value at +sizeof(value) 
>> offset,
>> so xenopsd would now read the wrong pointer.
>> Perhaps it would've been better to have this _H defined in some header, 
>> otherwise extending Xenctrl the way xenopsd does it is quite brittle.
> 
> Exporting _H won't help because everything is statically built. 


As long as you don't rebuilt xenopsd you will keep using the old C stubs that 
xenopsd got compiled with, the change in Xen will have no effect until xenopsd 
is recompiled,
at which point it could pick up the new _H if available, but I agree with your 
point below.


> It's
> brittle because xenopsd has got a local piece of C playing around with
> the internals of someone else's library.  This violates more rules than
> I care to list.
> 
> We (XenServer) should definitely work to improve things, but this is
> entirely a xenopsd problem, not an upstream Xen problem.


It is a lot easier to add new xenctrl bindings and test them out in xenopsd 
than it is to add them to Xen.
We should try to either upstream all xenopsd xenctrl bindings to Xen, and make 
it easier to add them to Xen going forward;
or move all the Xenctrl bindings to xenopsd, then at least you only need to 
rebuild the one piece you are changing.

But to do the latter we first need to get everything that relies on xenctrl to 
move to more stable interfaces (including oxenstored).
There are some Mirage libraries as well that use xenctrl, when a more suitable 
stable interface exists in newer versions of Xen that they should use instead.

Perhaps a compromise between the 2 extremes would be for xenopsd to open and 
have its own xenctrl handle, even if that leads to some code duplication, it 
would at least not rely on an undocumented and unstable internal detail of an 
already unstable ABI. And that would still allow xenopsd to extend xenctrl with 
bindings that are not (yet) present in Xen.
What do you think?

Best regards,
--Edwin






Re: [PATCH] tools/ocaml/xenctrl: OCaml 5 support, fix use-after-free

2022-11-24 Thread Edwin Torok



> On 23 Nov 2022, at 22:25, Andrew Cooper  wrote:
> 
> The binding for xc_interface_close() free the underlying handle while leaving
> the Ocaml object still in scope and usable.  This would make it easy to suffer
> a use-after-free, if it weren't for the fact that the typical usage is as a
> singleton that lives for the lifetime of the program.
> 
> Ocaml 5 no longer permits storing a naked C pointer in an Ocaml value.
> 
> Therefore, use a Custom block.  This allows us to use the finaliser callback
> to call xc_interface_close(), if the Ocaml object goes out of scope.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Christian Lindig 
> CC: David Scott 
> CC: Edwin Torok 
> CC: Rob Hoes 
> 
> I've confirmed that Xenctrl.close_handle does cause the finaliser to be
> called, simply by dropping the handle reference.


Thanks, a good way to test this is with OCAMLRUNPARAM=c, possible under 
valgrind, which causes all finalisers to be called on exit
(normally they are not because the program is exiting anyway)

> ---
> tools/ocaml/libs/xc/xenctrl.ml  |  3 +--
> tools/ocaml/libs/xc/xenctrl.mli |  1 -
> tools/ocaml/libs/xc/xenctrl_stubs.c | 43 ++---
> 3 files changed, 27 insertions(+), 20 deletions(-)
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
> index aa650533f718..4b74e31c75cb 100644
> --- a/tools/ocaml/libs/xc/xenctrl.ml
> +++ b/tools/ocaml/libs/xc/xenctrl.ml
> @@ -175,7 +175,6 @@ exception Error of string
> type handle
> 
> external interface_open: unit -> handle = "stub_xc_interface_open"
> -external interface_close: handle -> unit = "stub_xc_interface_close"
> 
> let handle = ref None
> 
> @@ -183,7 +182,7 @@ let get_handle () = !handle
> 
> let close_handle () =
> match !handle with
> - | Some h -> handle := None; interface_close h
> + | Some h -> handle := None
> | None -> ()
> 
> let with_intf f =
> diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
> index 5bf5f5dfea36..ddfe84dc22a9 100644
> --- a/tools/ocaml/libs/xc/xenctrl.mli
> +++ b/tools/ocaml/libs/xc/xenctrl.mli
> @@ -146,7 +146,6 @@ type shutdown_reason = Poweroff | Reboot | Suspend | 
> Crash | Watchdog | Soft_res
> exception Error of string
> type handle
> external interface_open : unit -> handle = "stub_xc_interface_open"
> -external interface_close : handle -> unit = "stub_xc_interface_close"
> 
> (** [with_intf f] runs [f] with a global handle that is opened on demand
>  * and kept open. Conceptually, a client should use either
> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
> b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index f37848ae0bb3..4e1204085422 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -37,13 +37,28 @@
> 
> #include "mmap_stubs.h"
> 
> -#define _H(__h) ((xc_interface *)(__h))
> +#define _H(__h) (*((xc_interface **)Data_custom_val(__h)))
> #define _D(__d) ((uint32_t)Int_val(__d))


I think this requires an update in xenopsd too to match, otherwise it'll crash:
https://github.com/xapi-project/xenopsd/blob/master/c_stubs/xenctrlext_stubs.c#L32

This wasn't an issue with the original patch which used Data_abstract_val here, 
because
that (currently) happens to boil down to just a cast (with some GC metadata 
*before* it),
so the old way of just casting OCaml value to C pointer still worked.

However Data_custom_val boils down to accessing a value at +sizeof(value) 
offset,
so xenopsd would now read the wrong pointer.
Perhaps it would've been better to have this _H defined in some header, 
otherwise extending Xenctrl the way xenopsd does it is quite brittle.

Best regards,
--Edwin


> 
> #ifndef Val_none
> #define Val_none (Val_int(0))
> #endif
> 
> +static void stub_xenctrl_finalize(value v)
> +{
> + xc_interface_close(_H(v));
> +}
> +
> +static struct custom_operations xenctrl_ops = {
> + .identifier  = "xenctrl",
> + .finalize= stub_xenctrl_finalize,
> + .compare = custom_compare_default, /* Can't compare */
> + .hash= custom_hash_default,/* Can't hash*/
> + .serialize   = custom_serialize_default,   /* Can't serialize   */
> + .deserialize = custom_deserialize_default, /* Can't deserialize */
> + .compare_ext = custom_compare_ext_default, /* Can't compare */
> +};
> +
> #define string_of_option_array(array, index) \
> ((Field(array, index) == Val_none) ? NULL : String_val(Field(Field(array, 
> index), 0)))
> 
> @@ -70,26 +85,20 @@ static void Noreturn failwith_xc(xc_interface *xch)
> CAMLprim value stub_xc_interface_open(void)
> {
> CAML

Re: Summary: Re: [PATCH for-4.17 v3 00/15] OCaml fixes for Xen 4.17

2022-11-13 Thread Edwin Torok



> On 11 Nov 2022, at 20:46, Andrew Cooper  wrote:
> 
> Nothing here is critical enough to go into 4.17 at this juncture.

Agreed

> 
> Various notes/observations from having spent a day trying to untangle
> things.

Thanks.


> 
> 1) Patches 5/6 are a single bugfix and need merging.  Except there was
> also an error when taking feedback from the list, and the net result
> regresses the original optimisation.  I have a fix sorted in my local queue.
> 
> 2) The indentation fix (not attached to this series) should scope the
> logic, not delete a debug line which was presumably added for a good
> reason.  I've got a fix to this effect in my local queue, and we can
> discuss the pros/cons of the approach in due course.


I deleted the debug line to avoid reindenting code, which was frowned upon.
Either way it is fine for me.


> 
> 3) Patch 1, evtchn Ocaml 5.0 compat, is still missing some corrections
> which I gave on earlier postings.  I've fixed it up locally in my queue.
> 
> I also notice, while reviewing the whole, that stub_eventchn_init()
> passes NULL as a logger, which has the side effect of libxenevtchn
> instantiating a default logger which takes control of stdout/stderr. 
> Without starting the fight over toxic library behaviour yet again, it
> occurs to me in the context of Patch 13, uncaught exception handler,
> that in oxenstored, any logging from the C level needs to end up elsewhere.
> 
> While we do have ocaml bindings for xentoollog, nothing uses it, and
> none of the other libraries (save xl, which isn't used) have a way of
> passing the Ocaml Xentoollog down.  This probably wants rethinking, one
> way or another.

We should probably start by reviewing the xentoollog bindings, if they never 
got used they're probably buggy.
I think Pau might have some xentoollog bindings though.

> 
> 4) Patches 2/3.  All these libraries have far worse problems than
> evtchn, because they can easily use-after-free.  They all need to be
> Custom with a finaliser.

Indeed, the bindings aren't very safe, and that should be fixed separately.
I've got some patches somewhere to stop the mmap bindings from segfaulting on 
invalid data,
but I lost track whether that got commited or still in one of my local branches.

> 
> 5) Patch 4.  The commit message says "A better solution is being worked
> on for master", but this is master.  Also, it's not a prerequisite for a
> security fix; merely something to make a developers life easier.

It is to avoid having to add Makefile changes in each security patch commit 
(potentially).
Perhaps the commit message can be changed to say this is the minimal change to 
unbreak the build system,
and a more comprehensive solution is being worked on (using Dune, or dune 
generated makefiles).

> 
> 6) The re-indent patch.  Policies of when to do it aside, having tried
> using it, the format adjustment is incomplete (running ocp-indent gets
> me deltas in files I haven't touched),

Perhaps we need to use the strict_comments setting, I think it tries to leave 
comments untouched,
but that also means the outcome depends on the starting state.
And I hope it doesn't depend on ocp-indent version.

> and there needs to be some
> .gitignore changes.
> 
> That said, it is usually frowned upon to have logic depending on being
> in a git tree.  This was perhaps a bigger deal back when we used hg by
> default and mirrored into multiple SCMs, but it's still expected not to
> rely on this.

We could use 'find' instead.

> 
> 7) Patch 8, evtchn fdopen, is two separate patches.  One adding fdopen,
> and one adding a NOCLOEXEC argument to the existing init.
> 
> They want splitting in two.  fdopen() ought to pass flags so we don't
> have to break the ABI again

The ABI changes everytime a new variant is added (the interface hash will 
change, and you need to rebuild/relink),
so using a single flag variant doesn't avoid ABI changes like it would in C.

> when there is a flag needing passing, and
> cloexec probably shouldn't be a boolean.

The preferred way to bind CLOEXEC in OCaml is to use a boolean, see
Unix.html , in particular
`val socket : ?cloexec:bool ->
socket_domain -> socket_type -> int -> file_descr`
Perhaps I should've spelled this out in the commit message.

>   We should either pass a raw
> int32, or a list-of-enums like we do in the xenctrl stubs.  Also, this
> patch has inherited errors from patch 1.
> 
> 9) Patches 8 thru 15 need to be the other side of the intent patch,
> because they need backporting to branches which will never get it.

This was done on purpose like this to ensure that indentation is backported in 
some way,
because the lack of indentation has previously broken backports/rebases (see 
the debug line introduced in the wrong place in the live update patch).
Without a comprehensive testsuite (which is being worked on, but not ready yet) 
it is then impossible to tell whether a backport is correct or not,
even if it compiles, it 

Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add 'make format' and remove tabs

2022-11-09 Thread Edwin Torok


> On 9 Nov 2022, at 16:26, Edwin Török  wrote:
> 
> 
> 
>> On 9 Nov 2022, at 14:18, Edwin Torok  wrote:
>> 
>> 
>> 
>>> On 9 Nov 2022, at 02:40, Henry Wang  wrote:
>>> 
>>> Hi Julien,
>>> 
>>>> -Original Message-
>>>> From: Julien Grall 
>>>> Subject: Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add
>>>> 'make format' and remove tabs
>>>> While I understand the goal and support, this seems to be a bit too late
>>>> to do it in Xen 4.17 (we are only a couple of weeks away). At this stage
>>>> of the release we should only do bug fix.
>>>> 
>>>> This is clearly only a comesmetic change and there I would argue this
>>>> should be deferred to 4.18. That said the last call is from the RM.
>>> 
>>> I agree with your point. I think maybe defer the patch to 4.18 is better,
>>> given the deep freeze state we are currently in.
>> 
>> 
>> Indentation can't really be trusted to humans :)
>> 
> 
> 
> It might be better to consider oxenstored unsupported in 4.17 at this point 
> and try again for 4.17.1 or 4.18

Ah I see that 'Live Update' for oxenstored is already marked as 'not 
functional' (I indeed remember seeing a patch to that effect),
so looks like nothing else needs to be done here for 4.17.
https://xenbits.xen.org/docs/unstable/support-matrix.html

I can try fixing that up for master, 4.17.1 or 4.18, whenever that opens.

Best regards,
--Edwin



Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add 'make format' and remove tabs

2022-11-09 Thread Edwin Torok



> On 9 Nov 2022, at 14:18, Edwin Torok  wrote:
> 
> 
> 
>> On 9 Nov 2022, at 02:40, Henry Wang  wrote:
>> 
>> Hi Julien,
>> 
>>> -Original Message-
>>> From: Julien Grall 
>>> Subject: Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add
>>> 'make format' and remove tabs
>>> While I understand the goal and support, this seems to be a bit too late
>>> to do it in Xen 4.17 (we are only a couple of weeks away). At this stage
>>> of the release we should only do bug fix.
>>> 
>>> This is clearly only a comesmetic change and there I would argue this
>>> should be deferred to 4.18. That said the last call is from the RM.
>> 
>> I agree with your point. I think maybe defer the patch to 4.18 is better,
>> given the deep freeze state we are currently in.
> 
> 
> Indentation can't really be trusted to humans :)
> 


It might be better to consider oxenstored unsupported in 4.17 at this point and 
try again for 4.17.1 or 4.18, it is probably too late to fix up all the bugs 
that I keep finding in it in a way in which it can be (security) supported, 
although I thought the goal of a release candidate is to try and find bugs and 
fix them before release.

But doing that while also trying to keep whitespace and indentation changes out 
of the patches (or trying to keep the patches small), is more likely to 
introduce more bugs into it at this point rather than fix things.

I was not able to do or send any of these patches sooner because the XSA-326 
and XSA-115/etc. work in the past years has prevented any other kind of work 
from being sent (it'd have made rebasing the XSA series more difficult, and/or 
reveal the XSA as part of various other changes and commits), it is unfortunate 
that there is a release quite so close after an oxenstored XSA

I tried fixing up the rest of the series to not depend on this patch, but 
without an actual 'make format' rule to give it consistent indentation
it is just too error-prone or risky to fix it up at this point (I don't really 
mind whether indentation is tabs vs spaces, it is the inconsistency and 
non-automated nature of it that is the problem),
e.g. it is quite easy to accidentally duplicate code, or get the code in the 
wrong scope, or introduce bugs like the one I just mentioned before when a new 
statement gets inserted and the code seemingly looks ok but its semantics is 
entirely changed (and that is hidden by the inconsistent/non-automated 
indentation).


Perhaps by trying to merge some of the changes/fixes from 
https://github.com/mirage/ocaml-xenstore and getting that production ready, 
which is a much better codebase to start from than
the current one in the Xen tree.:
* it has actual unit tests (which I tried to introduce as part of a security 
fix for the intree version of oxenstored but got repeatedly discouraged from 
including it to not make the security fix too large)
* it was not vulnerable XSA-353
* it has fixed part of XSA-326 together with a unit test nearly 10 years before 
Xen project rediscovered the same bug and realized it is a security bug in the 
in-tree version: 
https://github.com/mirage/ocaml-xenstore/commit/21e96654c27c01cf52e5d7aabc5ee53e07f2cbb7
* (of course mirage version has never been used in production so it is entirely 
likely it has *different* bugs)
* in-tree Xen is difficult to contribute to: 
  * it has a broken build system that keeps throwing 'inconsistent assumptions'
  * it has inconsistent and as we can see sometimes wrong indentation
  * patches get posted to the mailing list and sometimes lost (e.g. 
https://patchwork.kernel.org/project/xen-devel/list/?series=339731=both 
is still not committed), and I'm fairly sure I've seen an ack somewhere, but 
patchwork can't find it now

So I think oxenstored in 4.18+ will likely be sufficiently different than 4.17 
that direct backports won't be possible anyway (indentation changes or not), so 
having this indentation patch in 4.17 wouldn't really help much.
The disadvantage is that we lose all the bugfixes in the patch series after the 
indentation change, but if we consider oxenstored unsupported in 4.17 that may 
not matter.

I can resend this patch series for master without a 'for-4.17 tag' and keep 
doing development there.

Best regards,
--Edwin




Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add 'make format' and remove tabs

2022-11-09 Thread Edwin Torok



> On 9 Nov 2022, at 02:40, Henry Wang  wrote:
> 
> Hi Julien,
> 
>> -Original Message-
>> From: Julien Grall 
>> Subject: Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add
>> 'make format' and remove tabs
>> While I understand the goal and support, this seems to be a bit too late
>> to do it in Xen 4.17 (we are only a couple of weeks away). At this stage
>> of the release we should only do bug fix.
>> 
>> This is clearly only a comesmetic change and there I would argue this
>> should be deferred to 4.18. That said the last call is from the RM.
> 
> I agree with your point. I think maybe defer the patch to 4.18 is better,
> given the deep freeze state we are currently in.

On the other hand here is a bug I just spotted when looking at indentation 
changes (or rather reading the code after the indentation change),
and there are probably more:

```
if not (Domain.get_io_credit dom > 0) then
debug "Looking up domid %d" (Domain.get_id dom);
-   let con = Connections.find_domain cons (Domain.get_id dom) in
-   if not (Connection.has_more_work con) then (
-   Process.do_output store cons domains con;
-   Process.do_input store cons domains con;
-   if Connection.has_more_work con then
-   (* Previously thought as no work, but detect some after 
scan (as
-  processing a new message involves multiple steps.) It's 
very
-  likely to be a "lazy" client, bump its credit. It could 
be false
-  positive though (due to time window), but it's no harm 
to give a
-  domain extra credit. *)
-   let n = 32 + 2 * (Domains.number domains) in
-   info "found lazy domain %d, credit %d" (Domain.get_id dom) 
n;
-   Domain.set_io_credit ~n dom
-   ) in
+   let con = Connections.find_domain cons (Domain.get_id dom) in
+   if not (Connection.has_more_work con) then (
+   Process.do_output store cons domains con;
+   Process.do_input store cons domains con;
+   if Connection.has_more_work con then
+   (* Previously thought as no work, but detect some after scan (as
+  processing a new message involves multiple steps.) It's very
+  likely to be a "lazy" client, bump its credit. It could be 
false
+  positive though (due to time window), but it's no harm to 
give a
+  domain extra credit. *)
+   let n = 32 + 2 * (Domains.number domains) in
+   info "found lazy domain %d, credit %d" (Domain.get_id dom) n;
+   Domain.set_io_credit ~n dom
+   ) in
```

Notice how all that code "seems" to be inside the if unless you read really 
closely, but in fact it isn't, just the debug statement is.
Which means whenever I reviewed this code (to look for performance or security 
bugs) I've been reading it wrong the same way the original author got it wrong 
when indenting it.
In this case the original author being me, as I've introduced this bug in 
42f0581a91d4340ae66768a29fd779f83415bdfe back in 2021, where prior to the 
change in that commit indentation was correct,
but the patch added the 'debug' line in the wrong place (before the let instead 
of after it, and had I had my usual tools available to indent the file correctly
this problem would've been detected and corrected before commiting the bug into 
the codebase...
And was probably a side-effect of trying not to reindent the code to reduce the 
patch size for the security fix, and by doing so introducing an actual 
functional bug
)
(And I've recently fixed a similar bug elsewhere in XAPI, in which case I 
wasn't the original author of such a bug)
Indentation can't really be trusted to humans :)

(It means that even if a domain already has IO credit we still scan its ring 
for more work)

So some indentation changes will probably come in as bugfixes for 4.17.1 (well 
maybe not reindenting the whole file, just the problematic region of 
code/function).

Best regards,
--Edwin


Re: [PATCH for-4.17 v3 15/15] tools/ocaml/libs/xc: fix use of uninitialized memory in shadow_allocation_get

2022-11-09 Thread Edwin Torok
On 8 Nov 2022, at 15:34, Edwin Török  wrote:
> 
> It has been noticed in 2013 that shadow allocation sometimes returns the
> wrong value, which got worked around by adding a limit to the shadow
> multiplier of 1000 and ignoring the value from Xen in that case
> to avoid a shadow multiplier causing a VM to request 6PB of memory for
> example:
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fxapi-project%2Fxen-api%2Fpull%2F1215%2Fcommits%2Fbe55a8c30b41d1cd7596fc100ab1cfd3539f74ebdata=05%7C01%7Cedvin.torok%40citrix.com%7C54fa199055674737536f08dac19f7026%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C638035187781870066%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=l3%2BSDQinoqZ9CZvWsAcXgFl5vEbJf7hjVzBPLKoVYp4%3Dreserved=0
> 
> However that is just a workaround, and I've just reproduced this by
> killing a VM mid migration, which resulted in a shadow multiplier of
> 629.42, rendering the VM unbootable even after a host reboot.


After some more discussion it looks like this is getting fixed in the 
hypervisor, so this workaround wouldn't be needed,
might want to hold off on this patch until the domctl discussion is settled at:
https://lore.kernel.org/xen-devel/20221108113850.61619-1-roger@citrix.com/


Best regards,
--Edwin

> 
> The real bug is in Xen: when a VM is dying it will return '0' for paging
> op domctls and log a message at info level
> 'Ignoring paging op on dying domain', which leaves the 'mb' parameter
> uninitialized upon return from the domctl.
> 
> The binding also doesn't initialize the 'c->mb' parameter (it is meant
> to be used only when setting, not when querying the allocation),
> which results in the VM getting a shadow allocation (and thus multiplier)
> set based on what value happened to be currently on the stack.
> 
> Explicitly initialize the value passed to the domctl, and detect the 
> uninitialized
> case (shadow allocation of 0), and raise an exception in that case.
> The exception will cause xenopsd to skip setting the shadow multiplier.
> 
> Note that the behaviour of Xen here is inconsistent between x86 and ARM:
> ARM would return EINVAL when it gets a paging op on a dying domain,
> and X86-64 would return 0 with possibly uninitialized data.
> 
> It might be desirable to change the x86 path in the hypervisor to return
> EINVAL, although that would require more testing in case it breaks
> somethig.
> But the bindings should be defensive anyway against bugs like this.
> 
> Signed-off-by: Edwin Török 
> ---
> Reason for inclusion in 4.17:
> - fixes a long-standing (>9y old) bug that is still happening today
> 
> Changes since v2:
> - new in v3
> ---
> tools/ocaml/libs/xc/xenctrl_stubs.c | 5 -
> 1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
> b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index e2d897581f..9681a74e40 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -1019,7 +1019,7 @@ CAMLprim value stub_shadow_allocation_get(value xch, 
> value domid)
> {
> CAMLparam2(xch, domid);
> CAMLlocal1(mb);
> -unsigned int c_mb;
> +unsigned int c_mb = 0;
> int ret;
> 
> caml_enter_blocking_section();
> @@ -1029,6 +1029,9 @@ CAMLprim value stub_shadow_allocation_get(value xch, 
> value domid)
> caml_leave_blocking_section();
> if (ret != 0)
> failwith_xc(_H(xch));
> +if ( !c_mb )
> +caml_failwith("domctl returned uninitialized data for shadow "
> +  "allocation, dying domain?");
> 
> mb = Val_int(c_mb);
> CAMLreturn(mb);
> -- 
> 2.34.1
> 



Re: [PATCH for-4.17 v3 09/15] tools/ocaml/xenstored/store.ml: fix build error

2022-11-09 Thread Edwin Torok


> On 9 Nov 2022, at 09:52, Jan Beulich  wrote:
> 
> On 09.11.2022 10:36, Jan Beulich wrote:
>> On 09.11.2022 10:21, Edwin Torok wrote:
>>> 
>>> 
>>>> On 9 Nov 2022, at 07:10, Jan Beulich  wrote:
>>>> 
>>>> On 09.11.2022 03:47, Henry Wang wrote:
>>>>>> -Original Message-
>>>>>> From: Edwin Török 
>>>>>> Subject: [PATCH for-4.17 v3 09/15] tools/ocaml/xenstored/store.ml: fix 
>>>>>> build
>>>>>> error
>>>>>> 
>>>>>> Building with Dune in release mode fails with:
>>>>>> ```
>>>>>> File "ocaml/xenstored/store.ml", line 464, characters 13-32:
>>>>>> Warning 18: this type-based record disambiguation is not principal.
>>>>>> File "ocaml/xenstored/store.ml", line 1:
>>>>>> Error: Some fatal warnings were triggered (1 occurrences)
>>>>>> ```
>>>>>> 
>>>>>> This is a warning to help keep the code futureproof, quoting from its
>>>>>> documentation:
>>>>>>> Check information path during type-checking, to make sure that all types
>>>>>> are
>>>>>>> derived in a principal way. When using labelled arguments and/or
>>>>>> polymorphic
>>>>>>> methods, this flag is required to ensure future versions of the 
>>>>>>> compiler will
>>>>>>> be able to infer types correctly, even if internal algorithms change. 
>>>>>>> All
>>>>>>> programs accepted in -principal mode are also accepted in the default
>>>>>> mode with
>>>>>>> equivalent types, but different binary signatures, and this may slow 
>>>>>>> down
>>>>>> type
>>>>>>> checking; yet it is a good idea to use it once before publishing source 
>>>>>>> code.
>>>>>> 
>>>>>> Fixes: db471408edd46 "tools/ocaml/xenstored: Fix quota bypass on domain
>>>>>> shutdown"
>>>>> 
>>>>> Nit: The format of this "Fixes:" tag might need to be fixed?
>>>>> 
>>>>>> 
>>>>>> Signed-off-by: Edwin Török 
>>>>>> ---
>>>>>> Reason for inclusion in 4.17:
>>>>>> - fixes a build error in a previous commit that is already in master
>>>>> 
>>>>> Yes, given this is a simple enough patch:
>>>>> 
>>>>> Release-acked-by: Henry Wang 
>>>> 
>>>> Afaics this patch was previously posted in isolation, and it was
>>>> already release-acked. What's lacking there is a 2nd maintainer's
>>>> ack or a proper R-b. When it now is patch 9 in a series, it isn't
>>>> really obvious whether this could also be committed in isolation
>>>> (it looks like it does, but a clear statement to this effect
>>>> would have been beneficial).
>>>> 
>>> 
>>> 
>>> You're right it already has both acks, it just hasn't been commited yet: 
>> 
>> Oh, that's only because I overlooked Christian's ack. Will commit this now.
> 
> But, sigh, I had to fix up the patch: Even the one submitted standalone
> used space indentation when the file in the tree uses hard tabs. And
> even if I had wanted to pull from your github tree I would have had to
> fix up at least the Fixes: tag.


I thought I fixed it (the missing '('), but the format of the Fixes: line was 
not documented in 
https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches (which is the 
canonical resource I use when sending patches)
and I just tried to guess the format based on the Fixes: entries I found in 
master (some of which have or less characters in the git hash)

I see there is a 
https://xenbits.xen.org/docs/unstable/process/sending-patches.html which has 
some more useful details,
including a way to automatically generate the Fixes: line using a git config, 
which is very useful and I'll use that in the future instead of crafting it by 
hand
(which is what I've been doing so far).
I've edited the wiki to include this information now.

>  So I ended up hand-editing indentation



Thanks,
--Edwin

Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add 'make format' and remove tabs

2022-11-09 Thread Edwin Torok



> On 9 Nov 2022, at 03:19, Henry Wang  wrote:
> 
> Hi Edwin,
> 
>> -Original Message-
>>> [1]
>> AS8PR08MB7991145C8063D6939AFFED8F92829@AS8PR08MB7991.eurprd08
>> .prod.outlook.com
>> 
>> 
>> Hmm I thought that is my Outlook rewriting the link, but the archive at
>> lore.kernel.org seems to have this mangled URL as well which I cannot open.
>> Could you send it in such a way that it is not encoded when being sent (e.g.
>> base64 encode it...)
> 
> Appending 
> "https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fxen-devel%2Fdata=05%7C01%7Cedvin.torok%40citrix.com%7C1c7455639df84f970eab08dac2012fce%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C638035607613905983%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=l0oQ0815ndwWt4buGasPYYmVV3FdnrC%2F4wKrIoLqki8%3Dreserved=0;
>  before the link from Julien
> works for me. But I think the link is just the release schedule :)


Thanks, this link works indeed (it is an Outlook message ID after all, not one 
of Outlook's link rewrites):
https://lore.kernel.org/xen-devel/as8pr08mb7991145c8063d6939affed8f92...@as8pr08mb7991.eurprd08.prod.outlook.com/

Best regards,
--Edwin


Re: [PATCH for-4.17?] x86/paging: return -EINVAL for paging domctls for dying domains

2022-11-09 Thread Edwin Torok


> On 9 Nov 2022, at 07:48, Jan Beulich  wrote:
> 
> On 08.11.2022 18:15, Roger Pau Monné wrote:
>> On Tue, Nov 08, 2022 at 06:03:54PM +0100, Jan Beulich wrote:
>>> On 08.11.2022 17:43, Roger Pau Monné wrote:
 On Tue, Nov 08, 2022 at 05:14:40PM +0100, Jan Beulich wrote:
> On 08.11.2022 12:38, Roger Pau Monne wrote:
>> Like on the Arm side, return -EINVAL when attempting to do a p2m
>> operation on dying domains.
>> 
>> The current logic returns 0 and leaves the domctl parameter
>> uninitialized for any parameter fetching operations (like the
>> GET_ALLOCATION operation), which is not helpful from a toolstack point
>> of view, because there's no indication that the data hasn't been
>> fetched.
> 
> While I can see how the present behavior is problematic when it comes
> to consuming supposedly returned data, ...
> 
>> --- a/xen/arch/x86/mm/paging.c
>> +++ b/xen/arch/x86/mm/paging.c
>> @@ -694,9 +694,10 @@ int paging_domctl(struct domain *d, struct 
>> xen_domctl_shadow_op *sc,
>> 
>> if ( unlikely(d->is_dying) )
>> {
>> -gdprintk(XENLOG_INFO, "Ignoring paging op on dying domain %u\n",
>> +gdprintk(XENLOG_INFO,
>> + "Tried to do a paging domctl op on dying domain %u\n",
>>  d->domain_id);
>> -return 0;
>> +return -EINVAL;
>> }
> 
> ... going from "success" to "failure" here has a meaningful risk of
> regressing callers. It is my understanding that it was deliberate to
> mimic success in this case (without meaning to assign "good" or "bad"
> to that decision).
 
 I would assume that was the original intention, yes, albeit the commit
 message doesn't go into details about why mimicking success is
 required, it's very well possible the code relying on this was xend.
>>> 
>>> Quite possible, but you never know who else has cloned code from there.
>>> 
> Can you instead fill the data to be returned in
> some simple enough way? I assume a mere memset() isn't going to be
> good enough, though (albeit public/domctl.h doesn't explicitly name
> any input-only fields, so it may not be necessary to preserve
> anything). Maybe zeroing ->mb and ->stats would do?
 
 Hm, it still feels kind of wrong.  We do return errors elsewhere for
 operations attempted against dying domains, and that seems all fine,
 not sure why paging operations need to be different in this regard.
 Arm does also return -EINVAL in that case.
 
 So what about postponing this change to 4.18 in order to avoid
 surprises, but then taking it in its current form at the start of the
 development window, as to have time to detect any issues?
>>> 
>>> Maybe, but to be honest I'm not convinced. Arm can't really be taken
>>> for comparison, since the op is pretty new there iirc.
>> 
>> Indeed, but the tools code paths are likely shared between x86 and
>> Arm, as the hypercalls are the same.
> 
> On x86 we have both xc_shadow_control() and (functional)
> xc_logdirty_control(); on Arm only the former is used, while the latter
> would also be impacted by your change. Plus you're not accounting for
> external tool stacks (like xend would be if anyone had cared to forward
> port it, when - as you said earlier - the suspicion is that the original
> change was made to "please" xend).

I don't see how returning random uninitialised data (current behaviour) or 
wrong data (all zeroes) is better than returning an explicit error code.

Toolstacks (e.g. XAPI) only seemingly "work" with the current behaviour, in 
that it is trying to detect invalid data being returned,
and the chances of the invalid data being >1000 is more likely than it being 
<1000, so "most of the time" the toolstack happens to work.

There was a comment in the code that the bug might be in the binding, but the 
binding itself seems fine (well that one at least, there might be memory 
corruption in other bindings
which is probably what the comment was hinting at...), which is probably why 
the original investigation of this bug stopped and worked around the problem 
toolstack
side instead of digging into the hypervisor to find the real bug, it probably 
trusted the hypervisor being correct too much.

Except when the random data is between 1 and 1000, in which case it accepts 
that value as plausible and things break, 
but not in a way in which you can reproduce (you need to rerun many times until 
the value on the stack happens to be in this range).

And for the latter the toolstack needs to be changed anyway to detect 0 as a 
bad value (i.e. same as an exception), and it already has code
to detect exceptions from these domctls.

Even if such a change was made to "please xend" it is very likely just hiding 
the bad/uninitialized behaviour the same way XAPI does,
it doesn't necessarily mean it worked reliably.

And even this 

Re: [PATCH for-4.17 v3 09/15] tools/ocaml/xenstored/store.ml: fix build error

2022-11-09 Thread Edwin Torok


> On 9 Nov 2022, at 07:10, Jan Beulich  wrote:
> 
> On 09.11.2022 03:47, Henry Wang wrote:
>>> -Original Message-
>>> From: Edwin Török 
>>> Subject: [PATCH for-4.17 v3 09/15] tools/ocaml/xenstored/store.ml: fix build
>>> error
>>> 
>>> Building with Dune in release mode fails with:
>>> ```
>>> File "ocaml/xenstored/store.ml", line 464, characters 13-32:
>>> Warning 18: this type-based record disambiguation is not principal.
>>> File "ocaml/xenstored/store.ml", line 1:
>>> Error: Some fatal warnings were triggered (1 occurrences)
>>> ```
>>> 
>>> This is a warning to help keep the code futureproof, quoting from its
>>> documentation:
 Check information path during type-checking, to make sure that all types
>>> are
 derived in a principal way. When using labelled arguments and/or
>>> polymorphic
 methods, this flag is required to ensure future versions of the compiler 
 will
 be able to infer types correctly, even if internal algorithms change. All
 programs accepted in -principal mode are also accepted in the default
>>> mode with
 equivalent types, but different binary signatures, and this may slow down
>>> type
 checking; yet it is a good idea to use it once before publishing source 
 code.
>>> 
>>> Fixes: db471408edd46 "tools/ocaml/xenstored: Fix quota bypass on domain
>>> shutdown"
>> 
>> Nit: The format of this "Fixes:" tag might need to be fixed?
>> 
>>> 
>>> Signed-off-by: Edwin Török 
>>> ---
>>> Reason for inclusion in 4.17:
>>> - fixes a build error in a previous commit that is already in master
>> 
>> Yes, given this is a simple enough patch:
>> 
>> Release-acked-by: Henry Wang 
> 
> Afaics this patch was previously posted in isolation, and it was
> already release-acked. What's lacking there is a 2nd maintainer's
> ack or a proper R-b. When it now is patch 9 in a series, it isn't
> really obvious whether this could also be committed in isolation
> (it looks like it does, but a clear statement to this effect
> would have been beneficial).
> 


You're right it already has both acks, it just hasn't been commited yet: 
https://patchwork.kernel.org/project/xen-devel/patch/5a453393dad1de8286fe5db16504d3db2906eef8.1667500970.git.edvin.to...@citrix.com/
I've added the acks now to my github branch, so next time I resend the series 
it should be there.
It can be applied independently, I've rebased and moved it at the beginning of 
the series (there was just some whitespace to fix up)

If it helps here is the commit in isolation that could be cherry-picked onto 
master:
https://github.com/edwintorok/xen/commit/da88b438e03da36212d07d24d67ab151ae287f4e

Best regards,
--Edwin

Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add 'make format' and remove tabs

2022-11-08 Thread Edwin Torok


> On 8 Nov 2022, at 17:26, Julien Grall  wrote:
> 
> 
> 
> On 08/11/2022 17:02, Edwin Torok wrote:
>>> On 8 Nov 2022, at 16:03, Julien Grall  wrote:
>>> 
>>> Hi,
>>> 
>>> On 08/11/2022 15:33, Edwin Török wrote:
>>>> See CODING_STYLE: Xen uses spaces, not tabs.
>>>> * OCaml code:
>>>> Using `ocp-indent` for now to just make minimal modifications in
>>>> tabs vs spaces and get the right indentation.
>>>> We can introduce `ocamlformat` later.
>>>> * C stubs:
>>>> just replace tabs with spaces now, using `indent` or `clang-format`
>>>> would change code too much for 4.17.
>>>> This avoids perpetuating a formatting style that is inconsistent with
>>>> the rest of Xen, and that makes preparing and submitting patches more
>>>> difficult (OCaml indentation tools usually only support spaces, not tabs).
>>>> Contains a bugfix for `abi-check` script to handle the change in the
>>>> amount of whitespace.
>>>> No functional change.
>>>> Signed-off-by: Edwin Török 
>>>> --
>>>> Reason for inclusion in 4.17:
>>>> - makes it easier to backport changes from master to 4.17
>>> 
>>> Right, but you will have the problem when backporting to 4.16 and older. So 
>>> the overhead will always be there for a couple of years.
>> There will always be more than one Xen release in support, which means we'd 
>> never be able to fix this.
> 
> Note that I haven't said this should never be done. This just need to be 
> correctly timed. Doing it in the middle of a deep freeze doesn't look right 
> to me.
> 
> [...]
> 
>>>> - avoid perpetuating a different coding style (I thought tabs were
>>>>   mandated by Xen, and was about to fix up my editor config to match
>>>>   when I realized Xen already mandates the use of spaces)
>>>> - should make submitting patches for OCaml easier (OCaml indentation
>>>>   tools know only about spaces, so I either can't use them, or have to
>>>>   manually adjust indentation every time I submit a patch)
>>>> - it can be verified that the only change here is the Makefile change
>>>>   for the new rule, 'git log -p -1 -w' should be otherwise empty
>>> 
>>> While I understand the goal and support, this seems to be a bit too late to 
>>> do it in Xen 4.17 (we are only a couple of weeks away). At this stage of 
>>> the release we should only do bug fix.
>> I think it can be fairly easily proven that there is no functional change by 
>> rerunning the make format command manually, and by looking at the diff with 
>> ignore whitespace as suggested above.
> 
> That's not really the point here. The point is that if we start to allow 
> large coding style change (whether automatic or manual) very late in the 
> release then it will be hard to reject it in the future.
> 
> In fact we already have guidelines for that. If you look at [1], only bug 
> fixes should be done past the code freeze (23rd September).
> 
> As I wrote before, this patch only seem to be a cosmetic/quality improvement. 
> IOW this is not a bug fix and would not qualify for 4.17.
> 
>> I understand the reluctance in including it (which is why I was not sure 
>> whether to post it in the first place), but I think it might be beneficial 
>> to do it.
>> There is a large backlog of work in oxenstored that got piled up during the 
>> past couple of years of XSA work, and it'd be a lot easier to update and 
>> upstream those if we wouldn't have to worry
>> about indentation at all.
> 
> This is an argument for including this patch in Xen 4.18. As I wrote above, I 
> am not against that.
> 
>> Usually patches on LCM and security branches are avoided to reduce the risk 
>> of breaking anything, but a reindentation patch should not really break 
>> anything (well other than the abi-check script in the build, but I fixed 
>> that to accept both ways).
> 
> What does LCM stands for?

LCM is what we internally call the long term release branch where we backport 
fixes (i.e. 4.16 is an LCM branch after it is released). I think it'd be 
equivalent to the stable branch of Xen/Linux, I should've used that 
terminology. (I think it is an abbreviation of software lifecycle management)

> 
>> One alternative would be that I add another step after reformat that runs 
>> sed and turns spaces back into tabs for now, and that way I can still run 
>> 'make format' at each step while preparing patches for master, or 4.17 or 
>> security patches and get 

Re: [PATCH for-4.17?] x86/paging: return -EINVAL for paging domctls for dying domains

2022-11-08 Thread Edwin Torok



> On 8 Nov 2022, at 16:43, Roger Pau Monne  wrote:
> 
> On Tue, Nov 08, 2022 at 05:14:40PM +0100, Jan Beulich wrote:
>> On 08.11.2022 12:38, Roger Pau Monne wrote:
>>> Like on the Arm side, return -EINVAL when attempting to do a p2m
>>> operation on dying domains.
>>> 
>>> The current logic returns 0 and leaves the domctl parameter
>>> uninitialized for any parameter fetching operations (like the
>>> GET_ALLOCATION operation), which is not helpful from a toolstack point
>>> of view, because there's no indication that the data hasn't been
>>> fetched.
>> 
>> While I can see how the present behavior is problematic when it comes
>> to consuming supposedly returned data, ...
>> 
>>> --- a/xen/arch/x86/mm/paging.c
>>> +++ b/xen/arch/x86/mm/paging.c
>>> @@ -694,9 +694,10 @@ int paging_domctl(struct domain *d, struct 
>>> xen_domctl_shadow_op *sc,
>>> 
>>> if ( unlikely(d->is_dying) )
>>> {
>>> -gdprintk(XENLOG_INFO, "Ignoring paging op on dying domain %u\n",
>>> +gdprintk(XENLOG_INFO,
>>> + "Tried to do a paging domctl op on dying domain %u\n",
>>>  d->domain_id);
>>> -return 0;
>>> +return -EINVAL;
>>> }
>> 
>> ... going from "success" to "failure" here has a meaningful risk of
>> regressing callers. It is my understanding that it was deliberate to
>> mimic success in this case (without meaning to assign "good" or "bad"
>> to that decision).
> 
> I would assume that was the original intention, yes, albeit the commit
> message doesn't go into details about why mimicking success is
> required, it's very well possible the code relying on this was xend.
> 
>> Can you instead fill the data to be returned in
>> some simple enough way? I assume a mere memset() isn't going to be
>> good enough, though (albeit public/domctl.h doesn't explicitly name
>> any input-only fields, so it may not be necessary to preserve
>> anything). Maybe zeroing ->mb and ->stats would do?
> 
> Hm, it still feels kind of wrong.

Without the fix in the bindings (and any fixup in other toolstacks as needed, I 
haven't checked): 
https://lore.kernel.org/xen-devel/94f93ee61a4d0bd2fac3f5a753cb935962be20bb.1667920496.git.edvin.to...@citrix.com/T/#u
a value of 0 here might still cause things to go subtly wrong, i.e. cause the 
HVM shadow multiplier of a VM to decrease, although I think there are 
safeguards against it going below 1.0.
However a ->mb value of all zeroes is much easier to debug (and detect!) than a 
completely uninitialised value.

I'd prefer if the return value of domctls could be trusted to mean that 0 = all 
values are initialized on output, and to be potentially uninitialised only on 
failure.
There are a lot of other calls with similar pattern (particularly around vcpu, 
which will go down a different path) in the xenctrl bindings, and I haven't 
checked them all yet, but it'd be a good rule of thumb if the behaviour was 
consistent with other hypercalls/domctls.


>  We do return errors elsewhere for
> operations attempted against dying domains, and that seems all fine,
> not sure why paging operations need to be different in this regard.
> Arm does also return -EINVAL in that case.

How about we return EINVAL *and* in DEBUG builds fill the struct with an easily 
recognizable poison value?

> 
> So what about postponing this change to 4.18 in order to avoid
> surprises, but then taking it in its current form at the start of the
> development window, as to have time to detect any issues?
> 
>> As a minor remark: _If_ you're changing the printk(), then please
>> also switch to using %pd.
> 
> I've considered this, but then printing: "Tried to do a paging domctl
> op on dying domain dX" felt kind of repetitive to me because of the
> usage of domain and dX in the same sentence.  Anyway, will adjust.
> 
> Thanks, Roger.

Thanks,
--Edwin




Re: [PATCH for-4.17 v3 14/15] tools/ocaml/xenstored/syslog_stubs.c: avoid potential NULL dereference

2022-11-08 Thread Edwin Torok


> On 8 Nov 2022, at 16:07, Julien Grall  wrote:
> 
> 
> 
> On 08/11/2022 15:34, Edwin Török wrote:
>> If we are out of memory then strdup may return NULL, and passing NULL to
>> syslog may cause a crash.
>> Avoid this by using `caml_stat_strdup` which will raise an OCaml out of
>> memory exception instead.
>> This then needs to be paired with caml_stat_free.
>> Signed-off-by: Edwin Török 
>> ---
>> Reason for inclusion in 4.17:
>> - fixes a bug in out of memory situations
>> Changes since v2:
>> - new in v3
>> ---
>>  tools/ocaml/xenstored/syslog_stubs.c | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>> diff --git a/tools/ocaml/xenstored/syslog_stubs.c 
>> b/tools/ocaml/xenstored/syslog_stubs.c
>> index 4e5e49b557..4ad85c8eb5 100644
>> --- a/tools/ocaml/xenstored/syslog_stubs.c
>> +++ b/tools/ocaml/xenstored/syslog_stubs.c
>> @@ -14,6 +14,7 @@
>>#include 
>>  #include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>> @@ -35,14 +36,16 @@ static int __syslog_facility_table[] = {
>>  value stub_syslog(value facility, value level, value msg)
>>  {
>>  CAMLparam3(facility, level, msg);
>> -const char *c_msg = strdup(String_val(msg));
>> +char *c_msg = strdup(String_val(msg));
> 
> This change seems to be unrelated with the goal of the commit. IMHO, this 
> should be done in a separate patch.
> 
> The minimum would be to mention in the commit message.

That is to avoid freeing 'const char*' (there is a typecast below).
I'll mention it.

> 
> Cheers,
> 
> -- 
> Julien Grall



Re: [PATCH for-4.17 v3 07/15] CODING_STYLE(tools/ocaml): add 'make format' and remove tabs

2022-11-08 Thread Edwin Torok


> On 8 Nov 2022, at 16:03, Julien Grall  wrote:
> 
> Hi,
> 
> On 08/11/2022 15:33, Edwin Török wrote:
>> See CODING_STYLE: Xen uses spaces, not tabs.
>> * OCaml code:
>> Using `ocp-indent` for now to just make minimal modifications in
>> tabs vs spaces and get the right indentation.
>> We can introduce `ocamlformat` later.
>> * C stubs:
>> just replace tabs with spaces now, using `indent` or `clang-format`
>> would change code too much for 4.17.
>> This avoids perpetuating a formatting style that is inconsistent with
>> the rest of Xen, and that makes preparing and submitting patches more
>> difficult (OCaml indentation tools usually only support spaces, not tabs).
>> Contains a bugfix for `abi-check` script to handle the change in the
>> amount of whitespace.
>> No functional change.
>> Signed-off-by: Edwin Török 
>> --
>> Reason for inclusion in 4.17:
>> - makes it easier to backport changes from master to 4.17
> 
> Right, but you will have the problem when backporting to 4.16 and older. So 
> the overhead will always be there for a couple of years.


There will always be more than one Xen release in support, which means we'd 
never be able to fix this.
At some point we need to decide to go ahead and just do it (which may or may 
not be 4.17)

The whitespace fixup needs to happen somewhere anyway, there are a couple of 
options:
* use git rebase/cherry-pick -Xignore-space-change to avoid backport failing to 
apply
* do a reformat on 4.16 branch too (backport the couple of lines of Makefile 
change and run 'make format' and commit the result)
* include the reformat as a prerequisite of the next security patch (I'd prefer 
not to, hence including it in 4.17)
* do the whitespace fixup prior to submission, turning a correctly formatted 
patch with spaces into an incorrectly formatted one with tabs. I wouldn't have 
minded doing this if the coding style said tabs,
I could've run my reformat tool locally and fix it up with 'sed' prior to 
submission. However this seems unnecessary overhead if doing this works 
actively against the existing coding style.

However if we put this into 4.17 and there are no new security issues 
discovered before 4.16 goes out of support then we'll be in a much better 
position for 4.18 onward.
There is still quite a lot of work outstanding in testing of oxenstored 
(updating my fuzzer for one), so I can't really predict at this point whether 
there are more security bugs in oxenstored or not,
so maybe we need to take this decision at 4.18 time? Not sure.

> 
>> - avoid perpetuating a different coding style (I thought tabs were
>>   mandated by Xen, and was about to fix up my editor config to match
>>   when I realized Xen already mandates the use of spaces)
>> - should make submitting patches for OCaml easier (OCaml indentation
>>   tools know only about spaces, so I either can't use them, or have to
>>   manually adjust indentation every time I submit a patch)
>> - it can be verified that the only change here is the Makefile change
>>   for the new rule, 'git log -p -1 -w' should be otherwise empty
> 
> While I understand the goal and support, this seems to be a bit too late to 
> do it in Xen 4.17 (we are only a couple of weeks away). At this stage of the 
> release we should only do bug fix.

I think it can be fairly easily proven that there is no functional change by 
rerunning the make format command manually, and by looking at the diff with 
ignore whitespace as suggested above.
I understand the reluctance in including it (which is why I was not sure 
whether to post it in the first place), but I think it might be beneficial to 
do it.
There is a large backlog of work in oxenstored that got piled up during the 
past couple of years of XSA work, and it'd be a lot easier to update and 
upstream those if we wouldn't have to worry
about indentation at all.

Usually patches on LCM and security branches are avoided to reduce the risk of 
breaking anything, but a reindentation patch should not really break anything 
(well other than the abi-check script in the build, but I fixed that to accept 
both ways).

One alternative would be that I add another step after reformat that runs sed 
and turns spaces back into tabs for now, and that way I can still run 'make 
format' at each step while preparing patches for master, or 4.17 or security 
patches and get something consistent, and that minimizes other whitespace 
changes, but it wouldn't completely eliminate them (e.g. there are pieces of 
code that are just wrongly indented, so there'd be at least a diff to fix all 
that).

> 
> This is clearly only a comesmetic change and there I would argue this should 
> be deferred to 4.18. That said the last call is from the RM.


Best regards,
--Edwin

> 
> Cheers,


> 
> 
> -- 
> Julien Grall



Re: [PATCH for-4.17 v1 0/2] xenctrl.ml: improve scalability of domain_getinfolist

2022-11-02 Thread Edwin Torok


> On 2 Nov 2022, at 09:11, Christian Lindig  wrote:
> 
> 
> 
>> On 1 Nov 2022, at 17:59, Edwin Török  wrote:
>> 
>> 
>> Edwin Török (2):
>> xenctrl.ml: make domain_getinfolist tail recursive
>> xenctrl: use larger chunksize in domain_getinfolist
>> 
>> tools/ocaml/libs/xc/xenctrl.ml | 25 ++---
>> 1 file changed, 18 insertions(+), 7 deletions(-)
> 
> Acked-by: Christian Lindig 
> 
> 
>> It was calling the Xen domainfolist hypercall N/2 times.
>> Optimize this such that it is called at most 2 times during normal use.
>> 
>> Implement a tail recursive `rev_concat` equivalent to `concat |> rev`,
>> and use it instead of calling `@` multiple times.
> 
> Are there any assurances about the order in elements returned by 
> domain_getinfolist? I understand that the change maintains the current 
> behaviour but are we even required to maintain that order? Because otherwise 
> we could return the reverse list and save more work.


After some discussion with Andrew Cooper apparently the xenctrl API is broken 
and cannot be used safely as is, so I'll be rewriting this patch.
Domids can be assigned randomly, or toolstacks can set a custom domid, so it is 
not guaranteed that new domids always show up as higher numbers,
and the only safe way to use infolist is to either request 1 domid, or request 
them all (domids are 15-bit, so 32768).
Anything else is prone to race conditions and might give you an inconsistent 
snapshot.

This is probably better fixed by changing the prototype of the C function in 
xenctrl to not take a count to force updating all callers
(the callers in the 'xl' toolstack are just as buggy as the one in this OCaml 
binding, and probably better to fix the API in xenctrl than working around the 
race condition in each caller).

Stay tuned for more patches...

Best regards,
--Edwin

Re: [PATCH for-4.17 v1 0/2] xenctrl.ml: improve scalability of domain_getinfolist

2022-11-02 Thread Edwin Torok


> On 2 Nov 2022, at 09:11, Christian Lindig  wrote:
> 
> 
> 
>> On 1 Nov 2022, at 17:59, Edwin Török  wrote:
>> 
>> 
>> Edwin Török (2):
>> xenctrl.ml: make domain_getinfolist tail recursive
>> xenctrl: use larger chunksize in domain_getinfolist
>> 
>> tools/ocaml/libs/xc/xenctrl.ml | 25 ++---
>> 1 file changed, 18 insertions(+), 7 deletions(-)
> 
> Acked-by: Christian Lindig 
> 
> 
>> It was calling the Xen domainfolist hypercall N/2 times.
>> Optimize this such that it is called at most 2 times during normal use.
>> 
>> Implement a tail recursive `rev_concat` equivalent to `concat |> rev`,
>> and use it instead of calling `@` multiple times.
> 
> Are there any assurances about the order in elements returned by 
> domain_getinfolist? I understand that the change maintains the current 
> behaviour but are we even required to maintain that order? Because otherwise 
> we could return the reverse list and save more work.


Maintaining the current (reversed) order in the C stubs is useful to be able to 
extract the largest domid so we know where to continue.
However we don't necessarily have to maintain the order on the higher level 
function available in the .mli, it just happens that `rev_concat` gives us the 
results in the order that we want.

Although perhaps using an array rather a list here might be better, and we 
could move resizing/reallocing that array into the C stub, and have just a 
nicer (and perhaps faster) interface in the C/OCaml API,
by producing less garbage than lists (a single Array.t allocation rather than N 
list elements).
(Although I don't usually like pushing complexity like that into C stubs, for 
now I'm happy with the performance of the code as is).

There are more issues in these C stubs in xenctrl, I'll be pouring over the 
code and try to send a few more patches.

Best regards,
--Edwin

Re: [PATCH for-4.17] tools/ocaml/xenstored: fix live update exception

2022-10-21 Thread Edwin Torok


> On 21 Oct 2022, at 08:50, Christian Lindig  
> wrote:
> 
> 
> 
>> On 20 Oct 2022, at 17:54, Edwin Török  wrote:
>> 
>> During live update we will load the /tool/xenstored path from the previous 
>> binary,
>> and then try to mkdir /tool again which will fail with EEXIST.
>> Check for existence of the path before creating it.
>> 
>> The write call to /tool/xenstored should not need any changes
>> (and we do want to overwrite any previous path, in case it changed).
>> 
>> Prior to 7110192b1df6 live update would work only if the binary path was
>> specified, and with 7110192b1df6 and this live update also works when
>> no binary path is specified in `xenstore-control live-update`.
>> 
>> Fixes: 7110192b1df6 ("tools/oxenstored: Fix Oxenstored Live Update")
>> Signed-off-by: Edwin Török 
>> ---
>> tools/ocaml/xenstored/xenstored.ml | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/ocaml/xenstored/xenstored.ml 
>> b/tools/ocaml/xenstored/xenstored.ml
>> index fc90fcdeb5..3299fe73f7 100644
>> --- a/tools/ocaml/xenstored/xenstored.ml
>> +++ b/tools/ocaml/xenstored/xenstored.ml
>> @@ -353,7 +353,9 @@ let _ =
>>  ) in
>> 
>>  (* required for xenstore-control to detect availability of live-update 
>> *)
>> -Store.mkdir store Perms.Connection.full_rights (Store.Path.of_string 
>> "/tool");
>> +let tool_path = Store.Path.of_string "/tool" in
>> +if not (Store.path_exists store tool_path) then
>> +Store.mkdir store 
>> Perms.Connection.full_rights tool_path;
>>  Store.write store Perms.Connection.full_rights
>>  (Store.Path.of_string "/tool/xenstored") Sys.executable_name;
> 
> I notice inconsistent indentation but let's ignore that or fix it before the 
> committing.
> 
> Acked-by: Christian Lindig 
> 


Thanks, fixed indentation here: 
https://github.com/edwintorok/xen/commit/4a89f1f44cb171e1f92dae2401a580a10fd0c5a0
And v2 patch should show up on the ML with the 2 acks included and fixed 
indentation soon too.

Best regards,
--Edwin

Re: [PATCH for-4.17] tools/oxenstored: Fix Oxenstored Live Update

2022-10-20 Thread Edwin Torok



> On 20 Oct 2022, at 13:45, Henry Wang  wrote:
> 
> Hi Andrew,
> 
>> -Original Message-
>> From: Andrew Cooper 
>> Subject: [PATCH for-4.17] tools/oxenstored: Fix Oxenstored Live Update
>> 
>> tl;dr This hunk was part of the patch emailed to xen-devel, but was missing
>> from what ultimately got committed.
>> 
>> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fxen-data=05%7C01%7Cedvin.torok%40citrix.com%7Cfd3a03c987ce448875f808dab29903ee%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C638018667544851864%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=8AnK0Bhh9D%2B0KR75I0zeqwB8GB10JLjUtqkwTtprans%3Dreserved=0
>> devel/4164cb728313c3b9fc38cf5e9ecb790ac93a9600.1610748224.git.edvin.t
>> o...@citrix.com/
>> is the patch in question, but was part of a series that had threading issues.
>> I have a vague recollection that I sourced the commits from a local branch,
>> which clearly wasn't as up-to-date as I had thought.
>> 
>> Either way, it's my fault/mistake, and this hunk should have been part of
>> what
>> got comitted.
>> 
>> Fixes: 00c48f57ab36 ("tools/oxenstored: Start live update process")
>> Signed-off-by: Andrew Cooper 
>> ---
>> CC: Christian Lindig 
>> CC: David Scott 
>> CC: Edwin Torok 
>> CC: Rob Hoes 
>> CC: Henry Wang 
>> 
>> Found while reviewing the XenServer patchqueue.  This is low risk for 4.17
>> and
>> fixes a feature which we thought had been working since 4.15.
> 
> The commit message and above scissors line have described the situation
> quite clear, so I don't think there is any reason to ignore this patch.
> 
> Release-acked-by: Henry Wang 

Thanks.
Further testing has revealed another bug, patch here:
 
https://lore.kernel.org/xen-devel/12d90632bf881e96e0b6c256df193f00df187dc1.1666284745.git.edvin.to...@citrix.com/T/#u

For convenience the commit is also available from git:
 
https://github.com/edwintorok/xen/commit/12d90632bf881e96e0b6c256df193f00df187dc1

With both of these patches a smoketest 'xenstore-control live-update' with a 
stopped toolstack works now.

Best regards,
--Edwin

> 
> Kind regards,
> Henry
> 
> 
>> ---
>> tools/ocaml/xenstored/xenstored.ml | 5 +
>> 1 file changed, 5 insertions(+)
>> 
>> diff --git a/tools/ocaml/xenstored/xenstored.ml
>> b/tools/ocaml/xenstored/xenstored.ml
>> index d44ae673c42a..fc90fcdeb5d6 100644
>> --- a/tools/ocaml/xenstored/xenstored.ml
>> +++ b/tools/ocaml/xenstored/xenstored.ml
>> @@ -352,6 +352,11 @@ let _ =
>>  rw_sock
>>  ) in
>> 
>> +(* required for xenstore-control to detect availability of live-update 
>> *)
>> +Store.mkdir store Perms.Connection.full_rights (Store.Path.of_string
>> "/tool");
>> +Store.write store Perms.Connection.full_rights
>> +(Store.Path.of_string "/tool/xenstored")
>> Sys.executable_name;
>> +
>>  Sys.set_signal Sys.sighup (Sys.Signal_handle sighup_handler);
>>  Sys.set_signal Sys.sigterm (Sys.Signal_handle (fun _ ->
>>  info "Received SIGTERM";
>> --
>> 2.11.0
> 




Re: [PATCH v2 4/5] tools/ocaml/libs/xc: OCaml 5.0 compatibility

2022-09-30 Thread Edwin Torok


> On 30 Sep 2022, at 15:59, Christian Lindig  
> wrote:
> 
> 
> 
>> On 27 Sep 2022, at 17:13, Edwin Torok  wrote:
>> 
>> 
>> See below for a patch for that. I've included this patch in the correct 
>> place (before the patch that breaks it) in the git repository at: 
>> https://github.com/edwintorok/xen/compare/private/edvint/public0
>> 
> 
> 
> Acked-by: Christian Lindig 
> 
> I believe these changes are fine. We are now allocating the event channel 
> dynamically and this requires using a finaliser to free that memory. 

Thanks,

> 
> 
>> -ifneq ($(MAKECMDGOALS),clean)
>> +ifeq (,$(findstring clean,$(MAKECMDGOALS)))
>> .ocamldep.make: $(ALL_OCAML_OBJ_SOURCES) Makefile 
>> $(OCAML_TOPLEVEL)/Makefile.rules
>>  $(call quiet-command, $(OCAMLDEP) $(ALL_OCAML_OBJ_SOURCES) *.mli 
>> $o,MLDEP,)
>> endif
> 
> Is this the right logic? Moving from ifneq to ifeq here.
> 
> I am not so familiar with the Makfile rules. The gist seems to be: we depend 
> on auto-generated Make dependencies that the Makefile in general depends on. 
> But in a “make clean” (or other “*clean” it is wasteful to generate these 
> only to later remove them. However, these kind of subtleties are obvious 
> enough while we are working on this but over time accumulate to Makefiles 
> that are hard to change. So I wonder if this kind of optimisation, while 
> correct, is worth it, but fine going along with it.
> 

Makefile functions can be a bit confusing to read.

"ifneq ($(MAKECMDGOALS), clean)" means $(MAKECMDGOALS) != "clean"
"ifeq (,$(findstring clean,$(MAKECMDGOALS)))" means that "clean" in 
$(MAKECMDGOALS) == "" (the empty string), or i.o.w. "clean" not in 
$(MAKECMDGOALS), which is a bit more generic than the previous one,
since we have all sorts of rules in the Makefile (especially around subdirs) 
where 'clean' is a substring.
This is quite subtle and I had to reread this line many times too to check it 
is correct.

The real solution here would be to have a single non-recursive Makefile (and 
there is some discussion/patches heading in that direction in xen-devel 
particularly from Anthony), and then evaluating the "clean" rules would be a 
lot less expensive, it'd only have to be done once. But there might be a while 
until we get there, and meanwhile these clean rules slow down the OCaml build 
too much (just running the "clean" takes a lot longer than building the entire 
OCaml libraries and oxenstored sequentially).

Although I only need to use 'clean' when using the upstream Makefiles (where 
almost every incremental change requires a 'clean' inbetween because the 
Makefiles express the dependencies incorrectly),
or when switching from upstream Makefile to 'dune' (a one-off event usually).
Since I use 'Dune' for my daily work anyway (and the makefile is used in our 
internal build system) perhaps this Makefile patch is not needed at all, I can 
change 'Makefile.dune' to not call 'make clean' at all, and I'll know to 
remember to run it if things fail anyway (it'll be pretty obvious when Dune 
says you've got a build artifact in the wrong place).

Best regards,
--Edwin

Re: [PATCH v2 4/5] tools/ocaml/libs/xc: OCaml 5.0 compatibility

2022-09-27 Thread Edwin Torok
This accidentally broke compatibility with OCaml 4.02.3, 
only realized when I went back to my Dune based build system which can 
automatically compile with multiple compiler versions.

See below for a patch for that. I've included this patch in the correct place 
(before the patch that breaks it) in the git repository at: 
https://github.com/edwintorok/xen/compare/private/edvint/public0


From 78a613cbb8db7033fe741488912f21b24eaaef56 Mon Sep 17 00:00:00 2001
Message-Id: 
<78a613cbb8db7033fe741488912f21b24eaaef56.1664295046.git.edvin.to...@citrix.com>
From: =?UTF-8?q?Edwin=20T=C3=B6r=C3=B6k?= 
Date: Tue, 27 Sep 2022 17:06:57 +0100
Subject: [PATCH] tools/ocaml: fix compatibility with OCaml 4.02.3
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Edwin Török 
---
 tools/ocaml/libs/mmap/mmap_stubs.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/ocaml/libs/mmap/mmap_stubs.h 
b/tools/ocaml/libs/mmap/mmap_stubs.h
index 65e4239890..5c65cc86fb 100644
--- a/tools/ocaml/libs/mmap/mmap_stubs.h
+++ b/tools/ocaml/libs/mmap/mmap_stubs.h
@@ -30,4 +30,9 @@ struct mmap_interface
int len;
 };

+/* for compatibility with OCaml 4.02.3 */
+#ifndef Data_abstract_val
+#define Data_abstract_val(v) ((void*) Op_val(v))
+#endif
+
 #endif
--
2.34.1

> On 27 Sep 2022, at 12:15, Edwin Török  wrote:
> 
> Follow the manual to avoid naked pointers:
> https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fv2.ocaml.org%2Fmanual%2Fintfc.html%23ss%3Ac-outside-headdata=05%7C01%7Cedvin.torok%40citrix.com%7C4bf28f7a32074a49cdf008daa079a807%7C335836de42ef43a2b145348c2ee9ca5b%7C0%7C0%7C637998741634627105%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=GU%2BbmB1ai0llQg0z5zWctzwW0ocUQUOHVlMxkeD3U0U%3Dreserved=0
> 
> No functional change, except on OCaml 5.0 where it is a bugfix.
> 
> Signed-off-by: Edwin Török 
> ---
> tools/ocaml/libs/xc/xenctrl_stubs.c | 11 ++-
> 1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
> b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index 19335bdf45..7ff4e00314 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -37,7 +37,7 @@
> 
> #include "mmap_stubs.h"
> 
> -#define _H(__h) ((xc_interface *)(__h))
> +#define _H(__h) *((xc_interface **) Data_abstract_val(__h))
> #define _D(__d) ((uint32_t)Int_val(__d))
> 
> #ifndef Val_none
> @@ -70,14 +70,15 @@ static void Noreturn failwith_xc(xc_interface *xch)
> CAMLprim value stub_xc_interface_open(void)
> {
>   CAMLparam0();
> -xc_interface *xch;
> + CAMLlocal1(result);
> 
> + result = caml_alloc(1, Abstract_tag);
>   /* Don't assert XC_OPENFLAG_NON_REENTRANT because these bindings
>* do not prevent re-entrancy to libxc */
> -xch = xc_interface_open(NULL, NULL, 0);
> -if (xch == NULL)
> + _H(result) = xc_interface_open(NULL, NULL, 0);
> + if (_H(result) == NULL)
>   failwith_xc(NULL);
> -CAMLreturn((value)xch);
> + CAMLreturn(result);
> }
> 
> 
> -- 
> 2.34.1
> 



Design Session notes: tool chain (and other) dependencies of our build and runtime system

2022-09-22 Thread Edwin Torok
Hi,

These are the raw notes for this design session from the Xen Summit: 
https://design-sessions.xenproject.org/uid/discussion/disc_JEWVIhuv2TRAo2AXAZnP/view

Dependencies haven't been updated in quite some time:
 * Toolchain
 * system libraries, and other runtime deps

E.g. when to drop Python2.
Testing on old hardware might need it. Old hardware is difficult to keep 
running,
e.g. older than N years.
Vendors may not support it anymore, but that is a separate discussion.

What distros for newest versions of Xen, as minimum?

Minor releases keep working.
Next release can make another choice. Not necessarily actively break, but 
support might be best-effort.

Support old guests of course, but what about build?

Might limit testing on old hardware, if it is difficult to get new distros 
running on it.

E.g. after announcing new release, announce set of distros for master.
Update list at least once in release cycle.

make change for 4.18.

Accept nominations of distros from community for consideration, and then take 
decision.
Or perhaps use backport (e.g. EPEL).
Distros which have support policy (i.e. they say how long a particular release 
is supported).

Can't test everything on all distros, might be package version based. E.g. GCC 
>= version

CI testing vs do we want to fix the breakage or not even if we can't detect in 
CI

Informal survey on xen-users, resend patch on xen-devel (perhaps survey if lots 
of concerns raised)

Distro packages, even if not available by default.

Might be helpful: repology.org  for checking versions


Re: [PATCH v1 7/7] tools/ocaml/libs/eventchn: do not leak event channels and OCaml 5.0 compat

2022-08-08 Thread Edwin Torok


> On 5 Aug 2022, at 19:06, Andrew Cooper  wrote:
> 
> On 29/07/2022 18:53, Edwin Török wrote:
>> Add a finalizer on the event channel value, so that it calls
>> `xenevtchn_close` when the value would be GCed.
>> 
>> In practice oxenstored seems to be the only user of this,
>> and it creates a single global event channel only,
>> but freeing this could still be useful when run with OCAMLRUNPARAM=c
>> 
>> The code was previously casting a C pointer to an OCaml value,
>> which should be avoided: OCaml 5.0 won't support it.
>> (all "naked" C pointers must be wrapped inside an OCaml value,
>> either an Abstract tag, or Nativeint, see the manual
>> https://ocaml.org/manual/intfc.html#ss:c-outside-head)
>> 
>> Signed-off-by: Edwin Török 
> 
> So while this looks like a good improvement, don't we need to do it for
> all libraries, not just evtchn?
> 
> It doesn't look as if Ocaml 5.0 is very far away.

There are 2 ways to make it safe: use a block with Abstract_tag, or a block 
with Custom_tag:
https://v2.ocaml.org/manual/intfc.html#ss:c-outside-head

(technically it only ever was safe to use naked pointers for word-aligned 
pointers previously, although malloc usually
 ensures some minimal alignment).

There is a mode where the runtime can warn on stderr whenever naked pointers 
are used (there is an opam switch for it,
or one can specify a flag during the compiler build time), but it only does so 
when that codepath is hit,
not at build/link time.

A quick look at the libs:
* libs/mmap: uses Abstract_tag
* eventchn: fixed here to use Custom_tag
* libs/xc: needs fixing, stub_xc_interface_open
* libs/xl: uses Custom_tag (although it has other known issues (it should never 
use caml_callbackN directly, but use caml_callbackN_exn, see 
https://v2.ocaml.org/manual/intfc.html#ss:c-callbacks)
* libs/xentoollog: uses Custom_tag

So we need a patch for libs/xc.

> 
>> ---
>> tools/ocaml/libs/eventchn/xeneventchn_stubs.c | 29 +--
>> 1 file changed, 27 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tools/ocaml/libs/eventchn/xeneventchn_stubs.c 
>> b/tools/ocaml/libs/eventchn/xeneventchn_stubs.c
>> index f889a7a2e4..c0d57e2954 100644
>> --- a/tools/ocaml/libs/eventchn/xeneventchn_stubs.c
>> +++ b/tools/ocaml/libs/eventchn/xeneventchn_stubs.c
>> @@ -33,7 +33,30 @@
>> #include 
>> #include 
>> 
>> -#define _H(__h) ((xenevtchn_handle *)(__h))
>> +/* We want to close the event channel when it is no longer in use,
>> +   which can only be done safely with a finalizer.
>> +   Event channels are typically long lived, so we don't need tighter 
>> control over resource deallocation.
>> +   Use a custom block
>> +*/
>> +
>> +/* Access the xenevtchn_t* part of the OCaml custom block */
>> +#define _H(__h) (*((xenevtchn_handle**)Data_custom_val(__h)))
>> +
>> +static void stub_evtchn_finalize(value v)
>> +{
>> +/* docs say to not use any CAMLparam* macros here */
>> +xenevtchn_close(_H(v));
>> +}
>> +
>> +static struct custom_operations xenevtchn_ops = {
>> +"xenevtchn",
>> +stub_evtchn_finalize,
>> +custom_compare_default, /* raises Failure, cannot compare */
>> +custom_hash_default, /* ignored */
>> +custom_serialize_default, /* raises Failure, can't serialize */
>> +custom_deserialize_default, /* raises Failure, can't deserialize */
>> +custom_compare_ext_default /* raises Failure */
> 
> This wants to gain a trailing comma.
> 
>> +};
>> 
>> CAMLprim value stub_eventchn_init(void)
>> {
>> @@ -48,7 +71,9 @@ CAMLprim value stub_eventchn_init(void)
>>  if (xce == NULL)
>>  caml_failwith("open failed");
>> 
>> -result = (value)xce;
>> +/* contains file descriptors, trigger full GC at least every 128 
>> allocations */
>> +result = caml_alloc_custom(_ops, sizeof(xce), 1, 128);
> 
> The memory allocated for xce is tiny (48 bytes) and invariant for the
> lifetime of the evtchn object, which we've already established tends to
> operate as a singleton anyway.
> 
> Don't we want to use the recommended 0 and 1 here?

It is not just about the memory itself, but also about the file descriptors: 
those are a limited resource,
and if we use the 0 and 1 it means that this will be garbage collected very 
infrequently since the allocation itself is very small,
and you could potentially run out of file descriptors if you keep opening new 
event channels.
Notice there is no API for the user to close the event channel, so it has to 
rely on the GC, which is not ideal.

The mirage version of the eventchn lib does provide a close function: 
https://github.com/mirage/ocaml-evtchn/blob/main/lib/eventchn.mli,
although its implementation just leaks it (to avoid use-after-free): 
https://github.com/mirage/ocaml-evtchn/blob/main/lib/eventchn.ml#L42

Are event channel always expected to be singletons, is there a valid use case 
where you'd want more than 1 event channel/process?

Best regards,
--Edwin

Re: [PATCH v1 4/7] tools/ocaml: Makefile to drive dune

2022-08-03 Thread Edwin Torok


> On 3 Aug 2022, at 14:46, Anthony PERARD  wrote:
> 
> On Fri, Jul 29, 2022 at 06:53:27PM +0100, Edwin Török wrote:
>> create a separate Makefile that can be used to drive dune.
>> 
>> Usage:
>> `make -f Makefile.dune`
>> 
>> There are some files that need to be created by the Makefile based
>> build system (such as all the C code in $(XEN_ROOT)/tools/libs),
>> and those need to exist before dune runs.
>> 
>> Although it'd be possible to automatically call the necessary makefile
>> rules from dune, it wouldn't work reliably:
>> * dune uses sandboxing by default (only files declared or known as
>>  dependencies are visible to individual build commands,
>>  symlinks/hardlinks are used by dune to implement this)
>> * the dune builds always run in a _build subdir, and calling the
>>  makefiles from there would get the wrong XEN_ROOT set
>> * running the make command in the source tree would work, but dune still
>>  wouldn't immediately see the build dependencies since they wouldn't
>>  have been copied/linked under _build
>> 
>> The approach here is to:
>> * use the Makefile to build C-only prerequisites (i.e. most of Xen)
>> * use Dune only to build the OCaml parts once the C prerequisites exist
>> * dune has dependencies declared on the C bits, so if they are missing
>>  you will get an error about a missing rule to create them instead of a
>>  cryptic compilation error
>> * dune is still optional - the old Makefile based buildsystem is still
>>  there for now
>> * use dune exclusively for new code going forward (e.g. OCaml test-suites)
>> 
>> The workspace file needs to be generated by make because this currently
>> cannot be generated by dune, and it doesn't support including external
>> files. But could be generated by configure?
> 
> Potentially, but ./configure doesn't have access to the list of
> xen libraries, so I'm not sure it would be a good idea.

ok I'll remove it from the commit message.

> 
>> LD_LIBRARY_PATH needs to be set, because even with -Wl,-rpath
>> executables wouldn't be able to run using the just-built libraries,
>> unless we'd also link all the transitive dependencies of libs.
>> 
>> No functional change.
>> 
>> Signed-off-by: Edwin Török 
>> ---
>> Makefile  |  5 ++
>> tools/ocaml/Makefile.dune | 88 +++
>> tools/ocaml/dune-workspace.dev.in |  2 +
>> tools/ocaml/dune-workspace.in | 18 +++
>> 4 files changed, 113 insertions(+)
>> create mode 100644 tools/ocaml/Makefile.dune
>> create mode 100644 tools/ocaml/dune-workspace.dev.in
>> create mode 100644 tools/ocaml/dune-workspace.in
> 
> You've added dune-workspace* to .gitignore in the previous patch, should
> the addition be done in this patch instead? (Also feel free to create
> "tools/ocaml/.gitignore".
> 
>> diff --git a/Makefile b/Makefile
>> index b93b22c752..ddb33c3555 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -68,6 +68,11 @@ build-tools-oxenstored: build-tools-public-headers
>>  $(MAKE) -s -C tools/libs
>>  $(MAKE) -C tools/ocaml build-tools-oxenstored
>> 
>> +.PHONY: build-tools-oxenstored-prepare
>> +build-tools-oxenstored-prepare: build-tools-public-headers
>> +test -f tools/config.status || (cd tools && ./configure 
>> --with-xenstored=oxenstored)
> 
> No, do not run ./configure from the makefile. ./configure needs to be
> run before running make.


Perhaps I can add the necessary instructions to a README in tools/ocaml instead 
of this Makefile rule (which was here for documentation/convenience reasons).
The toplevel configure can fail due to various missing dependencies, but for 
OCaml just the configure in tools should be sufficient.

> 
>> +$(MAKE) -C tools/libs V=
> 
> No, do not start a build of the libraries from the root make file. If a
> user were to run `make build-tools-oxenstored-prepare build-tools`, we
> would end up with 2 make running `make -C tools/libs` concurrently which
> isn't going to end well.


I'd like a single command to build everything needed related to oxenstored, 
without necessarily building the rest of Xen (which could either take a long 
time, or fail due to missing dependencies).
Ideally Xen wouldn't use recursive invocations of make, but just one single 
makefile that is aware of all source files (and you could then refer to 
objects/libraries in another directory as dependencies)
and what I'd like to do could be achieved by simply asking 'make' to build 
tools/ocaml/xenstored/oxenstored and let it figure out the minimal set of code 
that needs to be built for that.
However such a change would be quite invasive to the build system (and there 
probably was a reason to use recursive makefiles, they might have some 
advantage I'm not aware of).
Where do you recommend to put this rule instead, should it be in `tools/ocaml`? 
(although in that case it'd have to do a make invocation in `../libs` which 
isn't necessarily nicer)

Or should it be a shell script instead that invokes all the necessary make 

Re: [PATCH v1 1/7] tools/ocaml/Makefile: do not run ocamldep during make clean

2022-08-03 Thread Edwin Torok


> On 3 Aug 2022, at 11:16, Jan Beulich  wrote:
> 
> On 29.07.2022 19:53, Edwin Török wrote:
>> Trying to include .ocamldep.make will cause it to be generated if it
>> doesn't exist.
>> We do not want this during make clean: we would remove it anyway.
>> 
>> Speeds up make clean.
>> 
>> Before (measured on f732240fd3bac25116151db5ddeb7203b62e85ce, July 2022):
>> ```
>> Parsing 
>> /home/edwin/xen2/tools/ocaml/libs/xl/../../../../tools/libs/light/libxl_types.idl
>> Parsing 
>> /home/edwin/xen2/tools/ocaml/libs/xl/../../../../tools/libs/light/libxl_types.idl
>> Parsing 
>> /home/edwin/xen2/tools/ocaml/libs/xl/../../../../tools/libs/light/libxl_types.idl
>> Parsing 
>> /home/edwin/xen2/tools/ocaml/libs/xl/../../../../tools/libs/light/libxl_types.idl
>> Parsing 
>> /home/edwin/xen2/tools/ocaml/libs/xl/../../../../tools/libs/light/libxl_types.idl
>> 
>> Performance counter stats for 'make clean -j8 -s' (5 runs):
>> 
>>4.2233 +- 0.0208 seconds time elapsed  ( +-  0.49% )
>> ```
>> 
>> After:
>> ```
>> perf stat -r 5 --null make clean -j8 -s
>> 
>> Performance counter stats for 'make clean -j8 -s' (5 runs):
>> 
>>2.7325 +- 0.0138 seconds time elapsed  ( +-  0.51% )
>> ```
>> 
>> No functional change.
>> 
>> Signed-off-by: Edwin Török 
> 
> I've committed this as is since it was acked and is an improvement
> in any event, but ...
> 
>> --- a/tools/ocaml/Makefile.rules
>> +++ b/tools/ocaml/Makefile.rules
>> @@ -44,8 +44,10 @@ META: META.in
>> 
>> ALL_OCAML_OBJ_SOURCES=$(addsuffix .ml, $(ALL_OCAML_OBJS))
>> 
>> +ifneq ($(MAKECMDGOALS),clean)
>> .ocamldep.make: $(ALL_OCAML_OBJ_SOURCES) Makefile 
>> $(OCAML_TOPLEVEL)/Makefile.rules
>>  $(call quiet-command, $(OCAMLDEP) $(ALL_OCAML_OBJ_SOURCES) *.mli 
>> $o,MLDEP,)
>> +endif
> 
> ... what about the distclean goal?


Thanks for the suggestion, I see other Makefiles using 'findstring clean', 
would that be appropriate here?

Something like this perhaps?

From 53cbf81a9c7d5090443b5fe5de37a47118ac53d8 Mon Sep 17 00:00:00 2001
Message-Id: 
<53cbf81a9c7d5090443b5fe5de37a47118ac53d8.1659522178.git.edvin.to...@citrix.com>
From: =?UTF-8?q?Edwin=20T=C3=B6r=C3=B6k?= 
To: xen-devel@lists.xenproject.org
Cc: Christian Lindig 
Cc: David Scott 
Cc: Wei Liu 
Cc: Anthony PERARD 
Date: Wed, 3 Aug 2022 11:21:46 +0100
Subject: [PATCH] tools/ocaml/Makefile.rules: do not run ocamldep on distclean
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Edwin Török 
---
 tools/ocaml/Makefile.rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/ocaml/Makefile.rules b/tools/ocaml/Makefile.rules
index d368308d9b..c1c5dd3b39 100644
--- a/tools/ocaml/Makefile.rules
+++ b/tools/ocaml/Makefile.rules
@@ -44,7 +44,7 @@ META: META.in

 ALL_OCAML_OBJ_SOURCES=$(addsuffix .ml, $(ALL_OCAML_OBJS))

-ifneq ($(MAKECMDGOALS),clean)
+ifeq (,$(findstring clean,$(MAKECMDGOALS)))
 .ocamldep.make: $(ALL_OCAML_OBJ_SOURCES) Makefile 
$(OCAML_TOPLEVEL)/Makefile.rules
$(call quiet-command, $(OCAMLDEP) $(ALL_OCAML_OBJ_SOURCES) *.mli 
$o,MLDEP,)
 endif
--
2.34.1

Re: [PATCH 2/2] tools/ocaml: Fix stubs the introduction of domain_create.cpupool_id

2022-05-18 Thread Edwin Torok


> On 17 May 2022, at 20:41, Andrew Cooper  wrote:
> 
> Sadly, cpupool IDs are chosen by the caller, not assigned sequentially, so
> this does need to have a full 32 bits of range.
> 
> Also leave a BUILD_BUG_ON() to catch more obvious ABI changes in the future.
> 
> Fixes: 92ea9c54fc81 ("arm/dom0less: assign dom0less guests to cpupools")
> Signed-off-by: Andrew Cooper 

Thanks for the fix.


> ---
> CC: Christian Lindig 
> CC: Edwin Török 
> CC: Luca Fancellu 
> ---
> tools/ocaml/libs/xc/xenctrl.ml  | 1 +
> tools/ocaml/libs/xc/xenctrl.mli | 1 +
> tools/ocaml/libs/xc/xenctrl_stubs.c | 8 +++-
> 3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl.ml b/tools/ocaml/libs/xc/xenctrl.ml
> index 7503031d8f61..8eab6f60eb14 100644
> --- a/tools/ocaml/libs/xc/xenctrl.ml
> +++ b/tools/ocaml/libs/xc/xenctrl.ml
> @@ -85,6 +85,7 @@ type domctl_create_config =
>   max_grant_frames: int;
>   max_maptrack_frames: int;
>   max_grant_version: int;
> + cpupool_id: int32;

What are the valid values for a CPU pool id, in particular what value should be 
passed here to get back the behaviour prior to these changes in Xen?
(i.e. would it be cpu pool id 0 or -1 if cpu pools aren't otherwise explicitly 
configured on the system)

Thanks,
--Edwin

>   arch: arch_domainconfig;
> }
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl.mli b/tools/ocaml/libs/xc/xenctrl.mli
> index d1d9c9247afc..d3014a2708d8 100644
> --- a/tools/ocaml/libs/xc/xenctrl.mli
> +++ b/tools/ocaml/libs/xc/xenctrl.mli
> @@ -77,6 +77,7 @@ type domctl_create_config = {
>   max_grant_frames: int;
>   max_maptrack_frames: int;
>   max_grant_version: int;
> +  cpupool_id: int32;
>   arch: arch_domainconfig;
> }
> 
> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
> b/tools/ocaml/libs/xc/xenctrl_stubs.c
> index 5b4fe72c8dec..513ee142d2a0 100644
> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> @@ -189,7 +189,8 @@ CAMLprim value stub_xc_domain_create(value xch, value 
> wanted_domid, value config
> #define VAL_MAX_GRANT_FRAMESField(config, 6)
> #define VAL_MAX_MAPTRACK_FRAMES Field(config, 7)
> #define VAL_MAX_GRANT_VERSION   Field(config, 8)
> -#define VAL_ARCHField(config, 9)
> +#define VAL_CPUPOOL_ID  Field(config, 9)
> +#define VAL_ARCHField(config, 10)
> 
>   uint32_t domid = Int_val(wanted_domid);
>   int result;
> @@ -201,6 +202,7 @@ CAMLprim value stub_xc_domain_create(value xch, value 
> wanted_domid, value config
>   .max_maptrack_frames = Int_val(VAL_MAX_MAPTRACK_FRAMES),
>   .grant_opts =
>   XEN_DOMCTL_GRANT_version(Int_val(VAL_MAX_GRANT_VERSION)),
> + .cpupool_id = Int32_val(VAL_CPUPOOL_ID),
>   };
> 
>   domain_handle_of_uuid_string(cfg.handle, String_val(VAL_HANDLE));
> @@ -225,6 +227,9 @@ CAMLprim value stub_xc_domain_create(value xch, value 
> wanted_domid, value config
>   case 1: /* X86 - emulation flags in the block */
> #if defined(__i386__) || defined(__x86_64__)
> 
> + /* Quick & dirty check for ABI changes. */
> + BUILD_BUG_ON(sizeof(cfg) != 64);
> +
> /* Mnemonics for the named fields inside xen_x86_arch_domainconfig */
> #define VAL_EMUL_FLAGS  Field(arch_domconfig, 0)
> 
> @@ -254,6 +259,7 @@ CAMLprim value stub_xc_domain_create(value xch, value 
> wanted_domid, value config
>   }
> 
> #undef VAL_ARCH
> +#undef VAL_CPUPOOL_ID
> #undef VAL_MAX_GRANT_VERSION
> #undef VAL_MAX_MAPTRACK_FRAMES
> #undef VAL_MAX_GRANT_FRAMES
> -- 
> 2.11.0
> 



Re: [XEN PATCH v2 29/29] tools/ocaml: fix build dependency target

2022-02-25 Thread Edwin Torok



> On 25 Feb 2022, at 16:28, Anthony PERARD  wrote:
> 
> On Fri, Feb 25, 2022 at 03:30:59PM +, Christian Lindig wrote:
>> 
>> 
>>> On 25 Feb 2022, at 15:13, Anthony PERARD  wrote:
>>> 
>>> This patch fix ".ocamldep.make" rule by always spelling the variable
>>> $(OCAML_TOPLEVEL).
>>> 
>>> Signed-off-by: Anthony PERARD 
>>> ---
>>> 
>>> Notes:
>>>   v2:
>>>   - new patch
>>> 
>>> tools/ocaml/libs/eventchn/Makefile   | 8 
>>> tools/ocaml/libs/mmap/Makefile   | 8 
>>> tools/ocaml/libs/xb/Makefile | 8 
>>> tools/ocaml/libs/xc/Makefile | 8 
>>> tools/ocaml/libs/xentoollog/Makefile | 8 
>>> tools/ocaml/libs/xl/Makefile | 8 
>>> tools/ocaml/libs/xs/Makefile | 8 
>>> tools/ocaml/Makefile.rules   | 2 +-
>> 
>> Acked-by: Christian Lindig 
>> 
>> I am fine with this but in general think that the OCaml part should be built 
>> using Dune (but invoked from Make), which is now the standard tool to build 
>> OCaml projects and is simple, fast, and accurate. Edwin maintains such a 
>> build for all development work on the OCaml side but it is not upstreamed.
> 
> ocaml-dune doesn't seems to be available on debian oldstable. So I don't
> think we can use it for now.
> 
> But thanks for pointing that out.
> 


I think we should try to add it as an optional build-system: when available use 
it, and at some point in the future remove the old one.
It is pretty much impossible to do development on the codebase without it, any 
developer who wants to make the changes to the OCaml code will likely want it.
(Of course those who only want to build and install oxenstored may not require 
dune, and may be fine with the Makefiles as they wouldn't require incremental 
builds or editor support).

Best regards,
--Edwin

> -- 
> Anthony PERARD




Re: [PATCH 0/6] gnttab: add per-domain controls

2021-09-20 Thread Edwin Torok



> On 20 Sep 2021, at 08:26, Roger Pau Monne  wrote:
> 
> On Fri, Sep 17, 2021 at 06:06:42PM +0200, Christian Lindig wrote:
>> 
>> 
>>> On 17 Sep 2021, at 16:46, Roger Pau Monne  wrote:
>>> 
>>> Hello,
>>> 
>>> The first two patches of this series allows setting the preisoutly host
>>> wide command line `gnttab` option on a per domain basis. That means
>>> selecting the max allowed grant table version and whether transitive
>>> grants are allowed.
>>> 
>>> The last 4 patches attempt to implement support for creating guests
>>> without grant table support at all. This requires some changes to
>>> xenstore in order to map shared ring using foreign memory instead of
>>> grant table.
>>> 
>>> Note that patch 5 will break the save format for xenstore records, and
>>> should not be applied.
>> 
>> Has this relevance for the format used by oxenstored?
> 
> I'm no expert on oxenstored, but I think it has always mapped the
> shared ring as foreign memory, and hence no changes are needed there.
> AFAICT it also stores the mfn on the save format, so I think this is
> all fine.
> 
> Should have mentioned it on the cover letter.
>  


There is a patch series from last year to make oxenstored use gnttab instead of 
map_foreign_range.
https://patchwork.kernel.org/project/xen-devel/cover/cover.1598548832.git.edvin.to...@citrix.com/
This got lost/forgotten amid all the oxenstored XSA work.

Later on I discovered and fixed some bugs in it, and is part of this refreshed 
patch series (part of which got committed, part of which didn't):
https://patchwork.kernel.org/project/xen-devel/list/?series=480623
https://github.com/edwintorok/xen/pull/2

I think the current status is:
* there was an objection that the commit vendoring the external dependencies 
for the unit tests was too big, and should be replaced by just an opam and 
lockfile telling 'opam' or 'opam monorepo' where to download it from
* I've discovered some bugs while testing this code together with other code, 
and need to retest with just this code alone to check that the bug was not in 
this code


As for the save format, that is part of this patch series too, and we don't 
store the mfn anymore. Do we need to go back to storing the mfn?

What do I need to change here? The reason to move away from foreign memory was 
that we could avoid relying on xenctrl for that function (and thus having one 
less unstable interface to link to). If we need to conditionally use foreign 
memory mapping then we're back to using unstable interfaces, unless there is a 
stable interface equivalent to mapping foreign pages?
I see there is a libs/foreignmemory (it has no OCaml bindings though). If we 
wrote OCaml bindings would the API/ABI of libs/foreignmemory be stable?
In which case we should probably replace the commit introducing the use of 
gnttab with the one using foreignmemory and always use foreignmemory instead of 
gnttab libs.

What do you think?

Best regards,
--Edwin






Re: [PATCH] tools/ocaml/libs/xc: add OCaml stubs to query CPU policy

2021-06-18 Thread Edwin Torok


> On 18 Jun 2021, at 14:17, Andrew Cooper  wrote:
> 
> On 18/06/2021 11:45, Edwin Török wrote:
>> diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c 
>> b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> index d05d7bb30e..4a230de8b7 100644
>> --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
>> +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
>> @@ -34,6 +34,9 @@
>> #include 
>> #include 
>> 
>> +#include 
>> +#include 
> 
> https://gitlab.com/xen-project/patchew/xen/-/jobs/1358403495
> 
> CI says no.  This needs to be behind a suitable ifdef, for non-x86 builds.


Should the stubs be disabled completely and raise ENOSYS/failwith on non-x86 
(e.g. ARM), or are there plans on doing equivalent CPU policy on ARM at some 
point?

—Edwin

> 
> (I've not looked at the rest of the patch yet.  I'll get around to it at
> some point.)
> 
> ~Andrew



Re: [PATCH] tools/ocaml/libs/xc: add OCaml stubs to query CPU policy

2021-06-18 Thread Edwin Torok


> On 18 Jun 2021, at 14:09, Christian Lindig  
> wrote:
> 
> 
> 
>> On 18 Jun 2021, at 11:45, Edwin Török  wrote:
>> 
>> Introduces following functions in Xenctrl and associated types:
>> get_system_cpu_policy
>> cpu_policy_to_featureset,
>> string_of_xen_cpu_policy_index
>> 
>> These are wrappers around the existing C functions in xenctrl.h,
>> that will be used by xenopsd initially.
>> 
>> -Wno-declaration-after-statement is disabled to allow mixing
>> declarations and code to simplify writing the stubs
>> by using variable length arrays on the stack instead of
>> allocating/freeing memory
>> (which would require additional error-handling logic).
>> 
>> Signed-off-by: Edwin Török 
>> ---
>> tools/ocaml/libs/xc/Makefile|   2 +-
>> tools/ocaml/libs/xc/xenctrl.ml  |  37 ++
>> tools/ocaml/libs/xc/xenctrl.mli |  71 ++
>> tools/ocaml/libs/xc/xenctrl_stubs.c | 195 
>> 4 files changed, 304 insertions(+), 1 deletion(-)
> 
> Acked-by: Christian Lindig 
> 
>> 
>> +static CAMLprim value Val_leaves(const xen_cpuid_leaf_t *leaves, uint32_t 
>> nr_leaves)
>> +{
>> +CAMLparam0();
>> +CAMLlocal1(result);
>> +uint32_t i;
>> +
>> +result = caml_alloc(nr_leaves, 0);
>> +for (i=0;i> +Store_field(result, i, Val_cpuid_leaf([i]));
>> +
>> +CAMLreturn(result);
>> +}
> 
> Is  caml_alloc(nr_leaves, 0) the right allocation? The 0 is the tag. There is 
> another instance of this below. What is the type of the returned value from 
> an OCaml perspective?


It is an array, so I think tag 0 is. Right (that is what the implementation of 
call_alloc_array does too). I could perhaps try to use caml_alloc_array instead 
to make this more obvious.


Best regards,
—Edwin

> 
> — C



Re: [PATCH v1.1 5/5] tests: Introduce a TSX test

2021-06-14 Thread Edwin Torok
On Mon, 2021-06-14 at 11:47 +0100, Andrew Cooper wrote:
> See the comment at the top of test-tsx.c for details.
> 
> This covers various complexities encountered while trying to address
> the
> recent TSX deprecation on client parts.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Roger Pau Monné 
> CC: Wei Liu 
> 
> v1.1:
>  * Set alternative guest policy, and check.
>  * Cope with !HAP configurations.
>  * Complete the comment at the top of test-tsx.c
> ---
>  tools/tests/Makefile   |   1 +
>  tools/tests/tsx/.gitignore |   1 +
>  tools/tests/tsx/Makefile   |  43 
>  tools/tests/tsx/test-tsx.c | 515
> +
>  4 files changed, 560 insertions(+)
>  create mode 100644 tools/tests/tsx/.gitignore
>  create mode 100644 tools/tests/tsx/Makefile
>  create mode 100644 tools/tests/tsx/test-tsx.c
> 
> diff --git a/tools/tests/Makefile b/tools/tests/Makefile
> index 8746aabe6b..25531a984a 100644
> --- a/tools/tests/Makefile
> +++ b/tools/tests/Makefile
> @@ -5,6 +5,7 @@ SUBDIRS-y :=
>  SUBDIRS-y += resource
>  SUBDIRS-$(CONFIG_X86) += cpu-policy
>  SUBDIRS-$(CONFIG_X86) += mce-test
> +SUBDIRS-$(CONFIG_X86) += tsx
>  ifneq ($(clang),y)
>  SUBDIRS-$(CONFIG_X86) += x86_emulator
>  endif
> diff --git a/tools/tests/tsx/.gitignore b/tools/tests/tsx/.gitignore
> new file mode 100644
> index 00..97ec4db7ff
> --- /dev/null
> +++ b/tools/tests/tsx/.gitignore
> @@ -0,0 +1 @@
> +test-tsx
> diff --git a/tools/tests/tsx/Makefile b/tools/tests/tsx/Makefile
> new file mode 100644
> index 00..7381a4f5a4
> --- /dev/null
> +++ b/tools/tests/tsx/Makefile
> @@ -0,0 +1,43 @@
> +XEN_ROOT = $(CURDIR)/../../..
> +include $(XEN_ROOT)/tools/Rules.mk
> +
> +TARGET := test-tsx
> +
> +.PHONY: all
> +all: $(TARGET)
> +
> +.PHONY: run
> +run: $(TARGET)
> + ./$(TARGET)
> +
> +.PHONY: clean
> +clean:
> + $(RM) -f -- *.o $(TARGET) $(DEPS_RM)
> +
> +.PHONY: distclean
> +distclean: clean
> + $(RM) -f -- *~
> +
> +.PHONY: install
> +install: all
> +
> +.PHONY: uninstall
> +uninstall:
> +
> +CFLAGS += -Werror -std=gnu11
> +CFLAGS += $(CFLAGS_xeninclude)
> +CFLAGS += $(CFLAGS_libxenctrl)
> +CFLAGS += $(CFLAGS_libxenguest)
> +CFLAGS += -I$(XEN_ROOT)/tools/libs/ctrl
> -I$(XEN_ROOT)/tools/libs/guest
> +CFLAGS += $(APPEND_CFLAGS)
> +
> +LDFLAGS += $(LDLIBS_libxenctrl)
> +LDFLAGS += $(LDLIBS_libxenguest)
> +LDFLAGS += $(APPEND_LDFLAGS)
> +
> +test-tsx.o: Makefile
> +
> +test-tsx: test-tsx.o
> + $(CC) -o $@ $< $(LDFLAGS)
> +
> +-include $(DEPS_INCLUDE)
> diff --git a/tools/tests/tsx/test-tsx.c b/tools/tests/tsx/test-tsx.c
> new file mode 100644
> index 00..036b36e797
> --- /dev/null
> +++ b/tools/tests/tsx/test-tsx.c
> @@ -0,0 +1,515 @@
> +/*
> + * TSX settings and consistency tests
> + *
> + * This tests various behaviours and invariants with regards to
> TSX.  It
> + * ideally wants running for several microcode versions, and all
> applicable
> + * tsx= commandline settings, on a single CPU, including after an S3
> + * suspend/resume event.
> + *
> + * It tests specifically:
> + *  - The consistency of MSR_TSX_CTRL/MSR_TSX_FORCE_ABORT values
> across the
> + *system, and their accessibility WRT data in the host CPU
> policy.
> + *  - The actual behaviour of RTM on the system.
> + *  - Cross-check the default/max policies based on the actual RTM
> behaviour.
> + *  - Create some guests, check their defaults, and check that the
> defaults
> + *can be changed.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "xg_private.h"
> +
> +enum {
> +#define XEN_CPUFEATURE(name, value) X86_FEATURE_##name = value,
> +#include 
> +};
> +#define bitmaskof(idx)  (1u << ((idx) & 31))
> +
> +#define MSR_ARCH_CAPABILITIES   0x010a
> +#define  ARCH_CAPS_TSX_CTRL (1 <<  7)
> +#define MSR_TSX_FORCE_ABORT 0x010f
> +#define MSR_TSX_CTRL0x0122
> +
> +static unsigned int nr_failures;
> +#define fail(fmt, ...)  \
> +({  \
> +nr_failures++;  \
> +(void)printf(fmt, ##__VA_ARGS__);   \
> +})
> +
> +static xc_interface *xch;
> +
> +/*
> + * Policies, arranged as an array for easy collection of all of
> them.  We
> + * don't care about the raw policy (index 0) so reuse that for the
> guest
> + * policy.
> + */
> +static struct xc_cpu_policy policies[6];
> +#define guest_policy policies[0]
> +#define host policies[XEN_SYSCTL_cpu_policy_host]
> +#define pv_max   policies[XEN_SYSCTL_cpu_policy_pv_max]
> +#define hvm_max  policies[XEN_SYSCTL_cpu_policy_hvm_max]
> +#define pv_default   policies[XEN_SYSCTL_cpu_policy_pv_default]
> +#define hvm_default  policies[XEN_SYSCTL_cpu_policy_hvm_default]
> +
> +static bool xen_has_pv = 

Re: Preserving transactions accross Xenstored Live-Update

2021-05-19 Thread Edwin Torok
On Wed, 2021-05-19 at 11:09 +0200, Juergen Gross wrote:

On 18.05.21 20:11, Julien Grall wrote:

Hi Juergen,


I have started to look at preserving transaction accross Live-update in


C Xenstored. So far, I managed to transfer transaction that read/write

existing nodes.


Now, I am running into trouble to transfer new/deleted node within a

transaction with the existing migration format.


C Xenstored will keep track of nodes accessed during the transaction but

not the children (AFAICT for performance reason).


Not performance reasons, but because there isn't any need for that:


The children are either unchanged (so the non-transaction node records

apply), or they will be among the tracked nodes (transaction node

records apply). So in both cases all children should be known.


In case a child has been deleted in the transaction, the stream should

contain a node record for that child with the transaction-id and the

number of permissions being zero: see docs/designs/xenstore-migration.md


The problem for oxenstored is that you might've taken a snapshot in the past, 
your root has moved on, but you have in your snapshot a lot of nodes that have 
been deleted in the latest root.

A brute force way might be to diff the transaction's state and the latest root 
state and dump the delta entries as adding/deleting nodes in the migration 
stream.
This could lead to dumping a lot of duplicate state, and result in an explosion 
of file size (e.g. if you run 1000 domain, the current max supported limit  and 
each has one tiny transaction from the past
this will lead to 1000x amplification of xenstore size in the dump. In-memory 
is fine because OCaml will share common tree nodes that are unchanged).
This should correctly restore content but have a bad effect on conflict 
semantics: your migrated transactions will all then likely conflict at the 
root, or near the root and fail anyway.
Whereas without a live-update as long as you do not modify any of the old state 
you would get the conflict marker further down the tree and most of the time 
able to avoid conflicts.
I've tried implementing this last year: 
https://github.com/edwintorok/xen/pull/2/commits/a9f057131b75e1bd2dcb49c795630ab5875b7f76#diff-0f4826471775d78bfc6922c63152e268ef386171ebd985208cb82e21c621e749R288-R365
(ignore the awful indentation that code has been rebased with ignore_all_space 
so many times between different branches of Xen that whitespace correctness has 
been lost)

I've got a fuzzer/unit test for live-update (see xen-devel), but it has 
transactions turned off currently because I couldn't get it to work reliably, 
it always found examples where the transaction conflict state was not identical 
pre/post update.
If we abort all transactions after migration as discussed previously then it 
might be possible to get this to work if we accept the size explosion as a 
possibility and dump transaction state to /var/tmp, not to /tmp (which might be 
a tmpfs that gives you ENOSPC).

Live updates are a fairly niche use case and I'd like to see the current 
variant without transactions proven to work on an actual XSA (likely the next 
oxenstored XSA about queue limits if we find a solution to that),
and only after that deploy live-update support with transactions.
We also completely lack any unit tests for transactions (aside from the fuzzer 
that I started writing, which does just some very minimal state comparisons), 
we do not have a formal model on how transactions
and transaction conflicts should be handled to check whether transactions 
behave correctly, though a fairly good appromixation is: run 2 oxenstored one 
with and without live-update and check that they produce equivalent
(not necessarily identical, txid can change) answers. As long as we do not have 
to change the transaction semantics or code in any way to support live update.

Best regards,
--Edwin




Juergen

[CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments 
unless you have verified the sender and know the content is safe.


Re: [PATCH v2 00/17] live update and gnttab patches

2021-05-12 Thread Edwin Torok
On Wed, 2021-05-12 at 13:51 +0100, Andrew Cooper wrote:
> On 12/05/2021 11:10, Edwin Torok wrote:
> > On Tue, 2021-05-11 at 21:05 +0100, Andrew Cooper wrote:
> > > 
> > diff --git a/tools/ocaml/xenstored/disk.ml
> > b/tools/ocaml/xenstored/disk.ml
> > index 59794324e1..b7678af87f 100644
> > --- a/tools/ocaml/xenstored/disk.ml
> > +++ b/tools/ocaml/xenstored/disk.ml
> > @@ -176,7 +176,7 @@ let write store =
> > output_byte ch i
> >  
> > let w32 ch v =
> > -   assert (v >= 0 && v <= 0x_);
> > +   assert (v >= 0 && Int64.of_int v <= 0x_L);
> 
> In the case that v is 32 bits wide, it will underflow and fail the v
> >=
> 0 check, before the upcast to Int64.

I'll have to review the callers of this, I think my intention was to
forbid dumping negative values because it is ambigous what it means.
In case you are running on 64-bit that is most likely a bug because I
think most 32-bit values were defined as unsigned in the migration spec
or in the original xen public headers (I'll have to double check).

However in case of a 32-bit system we can have negative values where an
otherwise unsigned 32-bit quantity in xen is represented as an ocaml
int, or even silently truncated (if the xen value actually uses all 32-
bits, because OCaml ints are only 31-bits on 32-bit systems, one would
have to use the int32 type to get true 32-bit quantities in ocaml but
that comes with additional boxing and a more complicated syntax,
so in most places in Xen I see the difference just being ignored).

Perhaps this should forbid negative values only on 64-bit systems
(where that would be a bug), and allow it on 32-bit systems (where a
negative value might be legitimate or a bug, we can't tell).
Checking Sys.word_size should tell us what system we are on.

Best regards,
--Edwin


Re: [PATCH v2 00/17] live update and gnttab patches

2021-05-12 Thread Edwin Torok
On Tue, 2021-05-11 at 21:05 +0100, Andrew Cooper wrote:
> On 11/05/2021 19:05, Edwin Török wrote:
> > These patches have been posted previously.
> > The gnttab patches (tools/ocaml/libs/mmap) were not applied at the
> > time
> > to avoid conflicts with an in-progress XSA.
> > The binary format live-update and fuzzing patches were not applied
> > because it was too close to the next Xen release freeze.
> > 
> > The patches depend on each-other: live-update only works correctly
> > when the gnttab
> > patches are taken too (MFN is not part of the binary live-update
> > stream),
> > so they are included here as a single series.
> > The gnttab patches replaces one use of libxenctrl with stable
> > interfaces, leaving one unstable
> > libxenctrl interface used by oxenstored.
> > 
> > The 'vendor external dependencies' may be optional, it is useful to
> > be part
> > of a patchqueue in a specfile so that you can build everything
> > without external dependencies,
> > but might as well commit it so everyone has it easily available not
> > just XenServer.
> > 
> > Note that the live-update fuzz test doesn't yet pass, it is still
> > able to find bugs.
> > However the reduced version with a fixed seed used as a unit test
> > does pass,
> > so it is useful to have it committed, and further improvements can
> > be made later
> > as more bugs are discovered and fixed.
> > 
> > Edwin Török (17):
> >   docs/designs/xenstore-migration.md: clarify that deletes are
> > recursive
> >   tools/ocaml: add unit test skeleton with Dune build system
> >   tools/ocaml: vendor external dependencies for convenience
> >   tools/ocaml/xenstored: implement the live migration binary format
> >   tools/ocaml/xenstored: add binary dump format support
> >   tools/ocaml/xenstored: add support for binary format
> >   tools/ocaml/xenstored: validate config file before live update
> >   Add structured fuzzing unit test
> >   tools/ocaml: use common macros for manipulating mmap_interface
> >   tools/ocaml/libs/mmap: allocate correct number of bytes
> >   tools/ocaml/libs/mmap: Expose stub_mmap_alloc
> >   tools/ocaml/libs/mmap: mark mmap/munmap as blocking
> >   tools/ocaml/libs/xb: import gnttab stubs from mirage
> >   tools/ocaml: safer Xenmmap interface
> >   tools/ocaml/xenstored: use gnttab instead of xenctrl's
> > foreign_map_range
> >   tools/ocaml/xenstored: don't store domU's mfn of ring page
> >   tools/ocaml/libs/mmap: Clean up unused read/write
> 
> Gitlab CI reports failures across the board in Debian Stretch 32-bit
> builds.  All logs
> https://gitlab.com/xen-project/patchew/xen/-/pipelines/301146112 but
> the
> tl;dr seems to be:
> 
> File "disk.ml", line 179, characters 26-37:
> Error: Integer literal exceeds the range of representable integers of
> type int

Thanks, this should fix it, I refreshed my git tree (there is also a
fix there for the older version of Make):
https://gitlab.com/xen-project/patchew/xen/-/pipelines/301146112

Not sure whether it is worth continuing to support 32-bit i686 builds,
any modern Intel/AMD CPU would be 64-bit capable, but perhaps 32-bit is
still popular in the ARM world and keeping 32-bit Intel supported is
the easiest way to build-test it?

diff --git a/tools/ocaml/xenstored/disk.ml
b/tools/ocaml/xenstored/disk.ml
index 59794324e1..b7678af87f 100644
--- a/tools/ocaml/xenstored/disk.ml
+++ b/tools/ocaml/xenstored/disk.ml
@@ -176,7 +176,7 @@ let write store =
output_byte ch i
 
let w32 ch v =
-   assert (v >= 0 && v <= 0x_);
+   assert (v >= 0 && Int64.of_int v <= 0x_L);
output_binary_int ch v
 
let pos = pos_out
@@ -213,7 +213,7 @@ let write store =
 
let r32 t =
(* read unsigned 32-bit int *)
-   let r = input_binary_int t land 0x_ in
+   let r = Int64.logand (Int64.of_int (input_binary_int t))
0x_L |> Int64.to_int in
assert (r >= 0);
r
 
@@ -293,7 +293,7 @@ module LiveRecord = struct
write_record t Type.global_data 8 @@ fun b ->
O.w32 b (FD.to_int rw_sock);
 (* TODO: this needs a unit test/live update test too!
*)
-   O.w32 b 0x_
+   O.w32 b 0x3FFF_
 
let read_global_data t ~len f =
read_expect t "global_data" 8 len;
> 
> ~Andrew


Re: [PATCH v2 00/17] live update and gnttab patches

2021-05-11 Thread Edwin Torok
On Tue, 2021-05-11 at 19:05 +0100, Edwin Török wrote:
> These patches have been posted previously.

A git tree is available here for easier reviewing:
https://github.com/edwintorok/xen/pull/2


> The gnttab patches (tools/ocaml/libs/mmap) were not applied at the
> time
> to avoid conflicts with an in-progress XSA.
> The binary format live-update and fuzzing patches were not applied
> because it was too close to the next Xen release freeze.
> 
> The patches depend on each-other: live-update only works correctly
> when the gnttab
> patches are taken too (MFN is not part of the binary live-update
> stream),
> so they are included here as a single series.
> The gnttab patches replaces one use of libxenctrl with stable
> interfaces, leaving one unstable
> libxenctrl interface used by oxenstored.
> 
> The 'vendor external dependencies' may be optional, it is useful to
> be part
> of a patchqueue in a specfile so that you can build everything
> without external dependencies,
> but might as well commit it so everyone has it easily available not
> just XenServer.
> 
> Note that the live-update fuzz test doesn't yet pass, it is still
> able to find bugs.
> However the reduced version with a fixed seed used as a unit test
> does pass,
> so it is useful to have it committed, and further improvements can be
> made later
> as more bugs are discovered and fixed.
> 
> Edwin Török (17):
>   docs/designs/xenstore-migration.md: clarify that deletes are
> recursive
>   tools/ocaml: add unit test skeleton with Dune build system
>   tools/ocaml: vendor external dependencies for convenience
>   tools/ocaml/xenstored: implement the live migration binary format
>   tools/ocaml/xenstored: add binary dump format support
>   tools/ocaml/xenstored: add support for binary format
>   tools/ocaml/xenstored: validate config file before live update
>   Add structured fuzzing unit test
>   tools/ocaml: use common macros for manipulating mmap_interface
>   tools/ocaml/libs/mmap: allocate correct number of bytes
>   tools/ocaml/libs/mmap: Expose stub_mmap_alloc
>   tools/ocaml/libs/mmap: mark mmap/munmap as blocking
>   tools/ocaml/libs/xb: import gnttab stubs from mirage
>   tools/ocaml: safer Xenmmap interface
>   tools/ocaml/xenstored: use gnttab instead of xenctrl's
> foreign_map_range
>   tools/ocaml/xenstored: don't store domU's mfn of ring page
>   tools/ocaml/libs/mmap: Clean up unused read/write
> 
>  docs/designs/xenstore-migration.md|3 +-
>  tools/ocaml/.gitignore|2 +
>  tools/ocaml/Makefile  |   53 +
>  tools/ocaml/dune-project  |5 +
>  tools/ocaml/duniverse/cmdliner/.gitignore |   10 +
>  tools/ocaml/duniverse/cmdliner/.ocp-indent|1 +
>  tools/ocaml/duniverse/cmdliner/B0.ml  |9 +
>  tools/ocaml/duniverse/cmdliner/CHANGES.md |  255 +++
>  tools/ocaml/duniverse/cmdliner/LICENSE.md |   13 +
>  tools/ocaml/duniverse/cmdliner/Makefile   |   77 +
>  tools/ocaml/duniverse/cmdliner/README.md  |   51 +
>  tools/ocaml/duniverse/cmdliner/_tags  |3 +
>  tools/ocaml/duniverse/cmdliner/build.ml   |  155 ++
>  tools/ocaml/duniverse/cmdliner/cmdliner.opam  |   32 +
>  tools/ocaml/duniverse/cmdliner/doc/api.odocl  |1 +
>  tools/ocaml/duniverse/cmdliner/dune-project   |2 +
>  tools/ocaml/duniverse/cmdliner/pkg/META   |7 +
>  tools/ocaml/duniverse/cmdliner/pkg/pkg.ml |   33 +
>  .../ocaml/duniverse/cmdliner/src/cmdliner.ml  |  309 
>  .../ocaml/duniverse/cmdliner/src/cmdliner.mli | 1624
> +
>  .../duniverse/cmdliner/src/cmdliner.mllib |   11 +
>  .../duniverse/cmdliner/src/cmdliner_arg.ml|  356 
>  .../duniverse/cmdliner/src/cmdliner_arg.mli   |  111 ++
>  .../duniverse/cmdliner/src/cmdliner_base.ml   |  302 +++
>  .../duniverse/cmdliner/src/cmdliner_base.mli  |   68 +
>  .../duniverse/cmdliner/src/cmdliner_cline.ml  |  199 ++
>  .../duniverse/cmdliner/src/cmdliner_cline.mli |   34 +
>  .../duniverse/cmdliner/src/cmdliner_docgen.ml |  352 
>  .../cmdliner/src/cmdliner_docgen.mli  |   30 +
>  .../duniverse/cmdliner/src/cmdliner_info.ml   |  233 +++
>  .../duniverse/cmdliner/src/cmdliner_info.mli  |  140 ++
>  .../cmdliner/src/cmdliner_manpage.ml  |  502 +
>  .../cmdliner/src/cmdliner_manpage.mli |  100 +
>  .../duniverse/cmdliner/src/cmdliner_msg.ml|  116 ++
>  .../duniverse/cmdliner/src/cmdliner_msg.mli   |   56 +
>  .../cmdliner/src/cmdliner_suggest.ml  |   54 +
>  .../cmdliner/src/cmdliner_suggest.mli |   25 +
>  .../duniverse/cmdliner/src/cmdliner_term.ml   |   41 +
>  .../duniverse/cmdliner/src/cmdliner_term.mli  |   40 +
>  .../duniverse/cmdliner/src/cmdliner_trie.ml   |   97 +
>  .../duniverse/cmdliner/src/cmdliner_trie.mli  |   35 +
>  tools/ocaml/duniverse/cmdliner/src/dune   |4 +
>  tools/ocaml/duniverse/cmdliner/test/chorus.ml |   31 +
>  

Re: oxenstored restart after system crash

2021-02-18 Thread Edwin Torok
Hi,

oxenstored doesn't have a tdb file, by default it stores the entire tree in 
memory only.

There is a way to persistently store the tree (--persistent), but that is not 
enabled by default and I don't know whether it even works.
Master (or the hotfixed releases) have a live-update functionality now that 
dump and restore state properly (and reuses some of the persistent disk code, 
but also dumps some additional state).

The default location of the "persistent" database is /var/run/xenstored, which 
is a tmpfs and thus cleared on every boot. So if you'd ensure that oxenstored 
uses the equivalent of that on FreeBSD (or have a script on boot that clears 
it) that would solve any issues like this.

I don't know about C xenstored's behaviour, I'll let someone else answer that.

Best regards,
--Edwin


From: Roger Pau Monne 
Sent: 18 February 2021 09:46
To: xen-devel@lists.xenproject.org 
Cc: Christian Lindig ; Edwin Torok 
; Jürgen Groß ; Ian Jackson 

Subject: oxenstored restart after system crash

Hello,

Last month I got a query from a FreeBSD Xen user having issues with
xenstored after a power failure:

https://lists.freebsd.org/pipermail/freebsd-xen/2021-January/003446.html

I'm not sure what's the right approach here. I've been told cxenstored
will attempt to unlink the tdb file when starting, does oxenstored
attempt to do the same?

Should the tdb file be placed in a path that's cleaned up on boot?

Should xencommons remove the stale tdb before starting xenstored?

Mostly wanted to know what's the approach on Linux so that I can do
the same on FreeBSD.

Thanks, Roger.


Re: [OSSTEST PATCH 7/7] make-flight: Stripy xenstored

2021-01-22 Thread Edwin Torok
On Fri, 2021-01-22 at 15:56 +, Ian Jackson wrote:
> Previously, we let the Xen build system and startup scripts choose
> which xenstored to use.  Before we upgraded to Debian buster, that
> gave us C xentored tests on ARM.  Since then, armhf and arm64 have
> both had enough ocaml support and we haven't been testing C xenstored
> at all !
> 
> Change this, by selecting between C xenstored and Ocaml xenstored
> "at random".  Actually, this is based on the job name.  So the same
> job in different branches will use the same xenstored - which helps
> avoid confusion.
> 
> I have diffed the output of standalone-generate-dump-flight-runvars.
> As expected, this addes a variable all_host_xenstored to every job.
> 
> To make sure we have enough diversity, I eyeballed the results.  In
> particular:
>   * The smoke tests now have 2 C and 2 Ocaml, one of each on
>     ARM and x86.
>   * XTF tests have 2 oxenstored and 3 C xenstored.
>   * The ovmf flight has one of each
>   * The seabios and libvirt flights look reasonably mixed.
> 
> Most other flights have enough jobs that I think things are diverse
> enough without looking at them all in detail.
> 
> I think this lack of testing needs fixing for the Xen 4.15 release.
> So after review I intend to push this to osstest pretest, and may
> force push it even if shows regressions.

Sounds good.

In the patch series that I've recently posted to xen-devel there is
also a 'make check' target in tools/ocaml/xenstored.

There is a bit of plumbing to do before that is ready to be run
automatically (there is a Dockerfile in the patchseries, so it should
be possible to either adapt that, or vendor the 3rdparty dependencies
for the purposes of osstest, whichever is preferable).

These unit tests require OCaml 4.06+, so it'd be good to ensure that at
least one of the builders has that (the Ubuntu:focal from my patch
series should).

Best regards,
--Edwin



Re: [PATCH v2 3/8] docs/designs/xenstore-migration.md: clarify that deletes are recursive

2021-01-22 Thread Edwin Torok
On Fri, 2021-01-22 at 14:04 +0100, Jürgen Groß wrote:
> On 15.01.21 23:28, Edwin Török wrote:
> > Signed-off-by: Edwin Török 
> > ---
> > Changed since V1:
> > * post publicly now that the XSA is out
> > ---
> >   docs/designs/xenstore-migration.md | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/docs/designs/xenstore-migration.md
> > b/docs/designs/xenstore-migration.md
> > index 2ce2c836f5..f44bc0c61d 100644
> > --- a/docs/designs/xenstore-migration.md
> > +++ b/docs/designs/xenstore-migration.md
> > @@ -365,7 +365,8 @@ record previously present).
> >   |  | 0x0001: read   |
> >   |  | 0x0002: written    |
> >   |  |    |
> > -|  | The value will be zero for a deleted node  |
> > +|  | The value will be zero for a recursively   |
> > +|  | deleted node   |
> 
> I don't see the value in this modification.
> 
> The wording is ambiguous: is the value zero only for nodes which were
> deleted due to recursion, or do you mean deletes are recursive?

I was looking at this from the point of view of generating the diff,
where you can optimize and reduce the size of the diff if you notice
that it is sufficient to add a record only for the topmost entry when
the entire subtree is deleted.

You are right that looking at it from the point of view of applying the
transaction record you would reuse the existing delete implementation
which is already recursive.

> 
> Per docs/misc/xenstore.txt all deletes are recursive, so there is no
> need to repeat that here. And a zero value only for the recursions
> makes
> no sense at all.
> 
> So I'd nack this patch.

We can drop it.
--Edwin
> 
> Juergen



Re: [PATCH v2 1/8] tools/xenstore: add live update command to xenstore-control

2021-01-18 Thread Edwin Torok
On Mon, 2021-01-18 at 08:50 +0100, Jürgen Groß wrote:
> On 15.01.21 23:28, Edwin Török wrote:
> > From: Juergen Gross 
> > 
> > Add the "live-update" command to xenstore-control enabling updating
> > xenstored to a new version in a running Xen system.
> > 
> > With -c  it is possible to pass a different command line to
> > the
> > new instance of xenstored. This will replace the command line used
> > for the invocation of the just running xenstored instance.
> > 
> > The running xenstored (or xenstore-stubdom) needs to support live
> > updating, of course.
> > 
> > For now just add a small dummy handler to C xenstore denying any
> > live update action.
> > 
> > Signed-off-by: Juergen Gross 
> > Reviewed-by: Paul Durrant 
> > Reviewed-by: Julien Grall 
> 
> Instead of merging multiple patches of mine into a single one and
> sending it here with my S-o-b I'd prefer a simple caveat for your
> series to depend on my C-xenstore series.

Yes, I should've added a link to that in my cover letter,
in the next revision I'll point to:
https://lore.kernel.org/xen-devel/20210115083000.14186-1-jgr...@suse.com/

> 
> This additionally reduces the risk to miss any modifications in
> my series done in a later iteration than the one you have taken.

I've reordered the patches and changed the base, it should be gone the
next I do git format-patch:
https://github.com/edwintorok/xen/pull/1/commits

Will also remove
https://github.com/edwintorok/xen/pull/1/commits/47dd1d9b99b9210e94fbd4af28afee8088d5267
once I implemented the change in oxenstored to not return BUSY that we
discussed.

> 
> 
> Juergen



Re: [PATCH v2 4/8] tools/ocaml/xenstored: only quit on SIGTERM when a reload is possible

2021-01-18 Thread Edwin Torok
On Mon, 2021-01-18 at 08:51 +0100, Jürgen Groß wrote:
> On 15.01.21 23:28, Edwin Török wrote:
> > Currently when oxenstored receives SIGTERM it dumps its state and
> > quits.
> > It is possible to then restart it if --restart is given, however
> > that is
> > not always safe:
> > 
> > * domains could have active transactions, and after a restart they
> > would
> > either reuse transaction IDs of already open transactions, or get
> > an
> > error back that the transaction doesn't exist
> > 
> > * there could be pending data to send to a VM still in oxenstored's
> >    queue which would be lost
> > 
> > * there could be pending input to be processed from a VM in
> > oxenstored's
> >    queue which would be lost
> > 
> > Prevent shutting down oxenstored via SIGTERM in the above
> > situations.
> > Also ignore domains marked as bad because oxenstored would never
> > talk
> > to them again.
> > 
> > Signed-off-by: Edwin Török 
> > Reviewed-by: Pau Ruiz Safont 
> > Reviewed-by: Christian Lindig 
> > 
> > ---
> > Changed since V1:
> > * post publicly now that the XSA is out
> > ---
> >   tools/ocaml/xenstored/connection.ml  | 35
> > 
> >   tools/ocaml/xenstored/connections.ml |  8 +++
> >   tools/ocaml/xenstored/xenstored.ml   | 13 +--
> >   tools/xenstore/xenstored_core.c  |  7 +-
> 
> I don't think you should modify tools/xenstore/xenstored_core.c in
> your
> series.
> 

Thanks for spotting, I think that hunk ended up in the wrong place
during a patchqueue rebase when solving conflicts, it should be gone
when I post a V3:
https://github.com/edwintorok/xen/pull/1/commits/4ec9ffcee83a9668431b220bef4abdcd9ac51175

Best regards,
--Edwin


Re: [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control

2021-01-13 Thread Edwin Torok
On Wed, 2021-01-13 at 17:34 +0100, Jürgen Groß wrote:
> On 06.01.21 15:42, Edwin Torok wrote:
> > [...]
> > 
> > I'd prefer it if there was a more asynchronous protocol available
> > for
> > live-update:
> > * the live-update on its own queues up the live update request and
> > returns a generation ID. xenstored retries on its own during each
> > of
> > its main loop iterations whether the conditions for live-update are
> > now
> > met
> > * when a generation ID is specified with the live-update command it
> > acts as a query to return the status. A query for generation ID <
> > current ID return success, and for generation ID = current ID it
> > returns either BUSY, or a specific error if known (e.g. timeout
> > reached
> > and list of domains preventing live update)
> > * the generation ID gets incremented every time a live update
> > succeeds
> >   
> > This would eliminiate the need for a client-side timeout, since
> > each of
> > these commands would be non-blocking.
> > It'd also avoid the busy-poll flood.
> > 
> > Obviously any change here has to be backwards compatible with the
> > already deployed xenstored and oxenstored which doesn't know about
> > generation IDs, but you can tell them apart based on the reply: if
> > you
> > get back OK/BUSY/some error it is the old version, if you get back
> > a
> > generation ID it is the new version.
> > 
> > I ran out of time to implement this before the embargo was up,
> > should I
> > go ahead with prototyping a patch for this now, or do you see an
> > alternative way to make the live-update command more robust?
> 
> I think this can be made much easier:
> 
> The live-update should be handled completely in the daemon, so
> returning only with success or failure. Returning BUSY wouldn't
> occur this way, but either "OK" (after the successful LU) or a
> failure reason (e.g. list of domains blocking) would be
> returned.
> 
> I'll have a try with this approach.
> 
> 

In oxenstored each request wants an immediate reply, so delaying the
reply to after the live-update is not trivial.
Should be possible to do though (e.g. mark the socket such that no
further xenstore protocol is processed on it, and "put back" the live
update request, to be replied by the new xenstored after the live
update completes, which would clear the 'do not process this socket'
flag), I'll be curious to see what your approach will look like.

Best regards,
--Edwin


Re: [PATCH v10 06/25] tools/xenstore: add live update command to xenstore-control

2021-01-06 Thread Edwin Torok
On Tue, 2020-12-15 at 17:35 +0100, Juergen Gross wrote:

> [...] 
> +static int live_update_start(struct xs_handle *xsh, bool force,
> unsigned int to)
> +{
> +    int len = 0;
> +    char *buf = NULL, *ret;
> +    time_t time_start;
> +
> +    if (asprintf(, "%u", to) < 0)
> +    return 1;
> +    len = add_to_buf(, "-s", len);
> +    len = add_to_buf(, "-t", len);
> +    len = add_to_buf(, ret, len);
> +    free(ret);
> +    if (force)
> +    len = add_to_buf(, "-F", len);
> +    if (len < 0)
> +    return 1;
> +
> +    for (time_start = time(NULL); time(NULL) - time_start < to;) {
> +    ret = xs_control_command(xsh, "live-update", buf, len);
> +    if (!ret)
> +    goto err;
> +    if (strcmp(ret, "BUSY"))
> +    break;
> +    }

We've discovered issues with this during testing: when a live-update is
not possible (e.g. guest with active transaction held open on purpose)
xenstored gets flooded with live-update requests until the timeout is
reached.

This is problematic for multiple reasons:
* flooding xenstored reduces its throughput, and makes it use 100% CPU.
Depending on the implementation and configuration it may also flood the
logs (which isn't fatal, the system still stays alive if you ENOSPC on
/var/log, but it makes it difficult to debug issues if a live update
gets you to ENOSPC as you may not see actual failures from after the
live update).
* flooding xenstored causes the evtchn to overflow and AFAICT neither
xenstored or oxenstored is capable of coping with that (oxenstored
infinite loops when that happens). IIUC this needs to be fixed in the
kernel, so it doesn't return EFBIG in the first place. As a workaround
I added a sleep(1) in this loop
* the timeout is hit on both client and server sides, but depending on
rounding almost always happens on the client side first which prevents
us from displaying a more informative error message from the server. As
a workaround I increased the client side timeout by 2s to cope with
rounding and give the server a chance to reply. Oxenstored can reply
with a list of domains preventing the live-update for example.

My workarounds looked like this:
@@ -42,6 +43,9 @@ static int live_update_start(struct xs_handle *xsh,
bool force, unsigned int to)
 len = add_to_buf(, "-F", len);
 if (len < 0)
 return 1;
+/* +1 for rounding issues
+ * +1 to give oxenstored a chance to timeout and report back first
*/
+to += 2;
 
 for (time_start = time(NULL); time(NULL) - time_start < to;) {
 ret = xs_control_command(xsh, "live-update", buf, len);
@@ -49,6 +53,12 @@ static int live_update_start(struct xs_handle *xsh,
bool force, unsigned int to)
 goto err;
 if (strcmp(ret, "BUSY"))
 break;
+/* TODO: use task ID for commands, avoid busy loop polling
here
+ * oxenstored checks BUSY condition internally on every main
loop iteration anyway.
+ * Avoid flooding xenstored with live-update requests.
+ * The flooding can also cause the evtchn to overflow in
xenstored which makes
+ * xenstored enter an infinite loop */
+sleep(1);
 }

This also required various heuristics in oxenstored to differentiate
between a new live-update command and querying the status of an already
completed live-update command, otherwise I almost always ended up doing
2 live-updates in a row (first one queued up, returned BUSY, completed
after a while, and then another one from all the repeated live-update
requests).

I'd prefer it if there was a more asynchronous protocol available for
live-update:
* the live-update on its own queues up the live update request and
returns a generation ID. xenstored retries on its own during each of
its main loop iterations whether the conditions for live-update are now
met
* when a generation ID is specified with the live-update command it
acts as a query to return the status. A query for generation ID <
current ID return success, and for generation ID = current ID it
returns either BUSY, or a specific error if known (e.g. timeout reached
and list of domains preventing live update)
* the generation ID gets incremented every time a live update succeeds
 
This would eliminiate the need for a client-side timeout, since each of
these commands would be non-blocking.
It'd also avoid the busy-poll flood.

Obviously any change here has to be backwards compatible with the
already deployed xenstored and oxenstored which doesn't know about
generation IDs, but you can tell them apart based on the reply: if you
get back OK/BUSY/some error it is the old version, if you get back a
generation ID it is the new version.

I ran out of time to implement this before the embargo was up, should I
go ahead with prototyping a patch for this now, or do you see an
alternative way to make the live-update command more robust?


Best regards,
--Edwin


Re: [PATCH v1 4/4] tools/ocaml/libs/xc: backward compatible domid control at domain creation time

2020-11-19 Thread Edwin Torok
On Wed, 2020-11-18 at 18:13 +, Andrew Cooper wrote:
> On 17/11/2020 18:24, Edwin Török wrote:
> > One can specify the domid to use when creating the domain, but this
> > was hardcoded to 0.
> > 
> > Keep the existing `domain_create` function (and the type of its
> > parameters) as is to make
> > backwards compatibility easier.
> > Introduce a new `domain_create_domid` OCaml API that allows
> > specifying the domid.
> > A new version of xenopsd can choose to start using this, while old
> > versions of xenopsd will keep
> > building and using the old API.
> > 
> > Controlling the domid can be useful during testing or migration.
> > 
> > Signed-off-by: Edwin Török 
> > ---
> >  tools/ocaml/libs/xc/xenctrl.ml  | 3 +++
> >  tools/ocaml/libs/xc/xenctrl.mli | 2 ++
> >  tools/ocaml/libs/xc/xenctrl_stubs.c | 9 +++--
> >  3 files changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/ocaml/libs/xc/xenctrl.ml
> > b/tools/ocaml/libs/xc/xenctrl.ml
> > index e878699b0a..9d720886e9 100644
> > --- a/tools/ocaml/libs/xc/xenctrl.ml
> > +++ b/tools/ocaml/libs/xc/xenctrl.ml
> > @@ -182,6 +182,9 @@ let with_intf f =
> >  external domain_create: handle -> domctl_create_config -> domid
> >     = "stub_xc_domain_create"
> >  
> > +external domain_create_domid: handle -> domctl_create_config ->
> > domid -> domid
> > +   = "stub_xc_domain_create_domid"
> 
> Wouldn't this be better as handle -> domid -> domctl_create_config ->
> domid ?
> 
> I'm not overwhelmed with the name, but
> "domain_create_{specific,with}_domid" don't seem much better either.
> 
> > +
> >  external domain_sethandle: handle -> domid -> string -> unit
> >     = "stub_xc_domain_sethandle"
> >  
> > diff --git a/tools/ocaml/libs/xc/xenctrl.mli
> > b/tools/ocaml/libs/xc/xenctrl.mli
> > index e64907df8e..e629022901 100644
> > --- a/tools/ocaml/libs/xc/xenctrl.mli
> > +++ b/tools/ocaml/libs/xc/xenctrl.mli
> > @@ -145,6 +145,8 @@ val close_handle: unit -> unit
> >  
> >  external domain_create : handle -> domctl_create_config -> domid
> >    = "stub_xc_domain_create"
> > +external domain_create_domid : handle -> domctl_create_config ->
> > domid -> domid
> > +  = "stub_xc_domain_create_domid"
> >  external domain_sethandle : handle -> domid -> string -> unit =
> > "stub_xc_domain_sethandle"
> >  external domain_max_vcpus : handle -> domid -> int -> unit
> >    = "stub_xc_domain_max_vcpus"
> > diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c
> > b/tools/ocaml/libs/xc/xenctrl_stubs.c
> > index 94aba38a42..bb718fd164 100644
> > --- a/tools/ocaml/libs/xc/xenctrl_stubs.c
> > +++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
> > @@ -175,7 +175,7 @@ static unsigned int
> > ocaml_list_to_c_bitmap(value l)
> > return val;
> >  }
> >  
> > -CAMLprim value stub_xc_domain_create(value xch, value config)
> > +CAMLprim value stub_xc_domain_create_domid(value xch, value
> > config, value want_domid)
> >  {
> > CAMLparam2(xch, config);
> > CAMLlocal2(l, arch_domconfig);
> > @@ -191,7 +191,7 @@ CAMLprim value stub_xc_domain_create(value xch,
> > value config)
> >  #define VAL_MAX_MAPTRACK_FRAMES Field(config, 7)
> >  #define VAL_ARCH    Field(config, 8)
> >  
> > -   uint32_t domid = 0;
> > +   uint32_t domid = Int_val(want_domid);
> 
> wanted_domid would be a slightly better name, because it isn't
> ambiguous
> with a boolean flag.
> 
> > int result;
> > struct xen_domctl_createdomain cfg = {
> > .ssidref = Int32_val(VAL_SSIDREF),
> > @@ -262,6 +262,11 @@ CAMLprim value stub_xc_domain_create(value
> > xch, value config)
> > CAMLreturn(Val_int(domid));
> >  }
> >  
> > +CAMLprim value stub_xc_domain_create(value xch, value config,
> > value want_domid)
> > +{
> > +    return stub_xc_domain_create_domid(xch, config, Val_int(0));
> > +}
> 
> Using
> https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=36d94c17fa1e48cc9fb9ed15bc9a2237a1738bbb
> as reverse inspiration, couldn't we do the insertion of 0 at the
> Ocaml
> level and avoid doubling up the C stub?

I wanted to retain the old API for backwards compatibility, but you are
right that this could be done just on the OCaml level, I'll update the
patch.

If you upgrade Xen without upgrading xenopsd you'll get a fairly
obvious failure to start xenopsd due to the missing symbol, but that
could be solved with an appropriate dependency at the distro package
level. As long as matching Xen and xenopsd gets installed (even if not
booted into) xenopsd should succeed in restarting then.

Best regards,
--Edwin

> 
> ~Andrew
> 
> > +
> >  CAMLprim value stub_xc_domain_max_vcpus(value xch, value domid,
> >  value max_vcpus)
> >  {
> 



Re: [PATCH v1 2/4] automation/: add Ubuntu:focal container

2020-11-18 Thread Edwin Torok
On Wed, 2020-11-18 at 10:40 -0600, Doug Goldstein wrote:
> 
> 
> On 11/17/20 12:24 PM, Edwin Török wrote:
> > Signed-off-by: Edwin Török 
> 
> Looks good. Do you have permissions to push the container or do you
> need
> me to?
> 
> Acked-by: Doug Goldstein 

Thanks, if you could push it that'd be great.
I don't have any special permissions on the gitlab registry.

Best regards,
--Edwin


Re: [XEN PATCH] tools/xenstore: Log xenstored build ID on startup

2020-11-13 Thread Edwin Torok
On Fri, 2020-11-13 at 17:13 +, Andrew Cooper wrote:
> On 13/11/2020 16:56, Bjoern Doebel wrote:
> > On 13.11.20 16:36, Jürgen Groß wrote:
> > > On 13.11.20 15:18, Bjoern Doebel wrote:
> > > > Right now we do not have a mechanism to determine the version
> > > > of the
> > > > currently running xenstored at runtime. As xenstored runs
> > > > throughout
> > > > the
> > > > lifetime of a Xen host, this may lead to problems when newer
> > > > user space
> > > > builds are staged. Then, the running xenstored will no longer
> > > > match the
> > > > version of the installed xenstored.
> > > > 
> > > > To allow users to always identify the running version of
> > > > xenstored, add
> > > > a linker-generated unique build ID to every xenstored build.
> > > > Add
> > > > functionality to log this build ID into a file upon service
> > > > startup.
> > > > 
> > > > Signed-off-by: Bjoern Doebel 
> > > > Reviewed-by: Martin Mazein 
> > > > Reviewed-by: Paul Durrant 
> > > 
> > > No support for oxenstored or xenstore-stubdom?
> > 
> > Your suggestion further down will apparently help for stubdom. I do
> > not speak ocaml at all - how do we address this?
> 
> CC'ing Edwin and Christian who have done the bulk of the oxenstored
> recently.
> 
> It sounds like it might not be possible right now, but would be
> possible
> with a future plan to switch the Ocaml build system over to dune (the
> new/preferred Ocaml upstream toolchain).

See here what is possible with Dune:
https://dune.readthedocs.io/en/stable/dune-libs.html#build-info

Would the output of 'git describe --always --dirty' (perhaps combined
with a build date) serve as a useful build ID?

> 
> If it does end up being an XS_CONTROL sub-op, we can implement it at
> a
> future point when we can usefully answer the question.

Wouldn't using readelf on /proc//exe give you the running buildid?

readelf -a /usr/sbin/oxenstored /proc/$(pidof oxenstored)/exe | grep
'Build ID'
Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae
Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae

readelf -a /usr/sbin/oxenstored /proc/$(pidof oxenstored)/exe | grep
'Build ID'
Build ID: b44ff99b216db7526e3ee7841068d584cc9c2b95
Build ID: bdd5304c8984ed22570d51308ae8717d03fe60ae


When you're inside a stubdom it is probably not so easy though.

Best regards,
--Edwin

> 
> ~Andrew



Re: [PATCH 3/8] xen/domctl: Introduce and use XEN_DOMCTL_CDF_nested_virt

2020-09-30 Thread Edwin Torok
On Wed, 2020-09-30 at 14:42 +0100, Andrew Cooper wrote:
> Like other major areas of functionality, nested virt (or not) needs
> to be
> known at domain creation time for sensible CPUID handling, and wants
> to be
> known this early for sensible infrastructure handling in Xen.
> 
> Introduce XEN_DOMCTL_CDF_nested_virt and modify libxl to set it
> appropriately
> when creating domains.  There is no need to adjust the ARM logic to
> reject the
> use of this new flag.
> 
> No functional change yet.
> [...]
> diff --git a/tools/ocaml/libs/xc/xenctrl.ml
> b/tools/ocaml/libs/xc/xenctrl.ml
> index 497ded7ce2..e878699b0a 100644
> --- a/tools/ocaml/libs/xc/xenctrl.ml
> +++ b/tools/ocaml/libs/xc/xenctrl.ml
> @@ -64,6 +64,7 @@ type domain_create_flag =
>   | CDF_OOS_OFF
>   | CDF_XS_DOMAIN
>   | CDF_IOMMU
> + | CDF_NESTED_VIRT
>  
>  type domain_create_iommu_opts =
>   | IOMMU_NO_SHAREPT
> diff --git a/tools/ocaml/libs/xc/xenctrl.mli
> b/tools/ocaml/libs/xc/xenctrl.mli
> index f7f6ec570d..e64907df8e 100644
> --- a/tools/ocaml/libs/xc/xenctrl.mli
> +++ b/tools/ocaml/libs/xc/xenctrl.mli
> @@ -57,6 +57,7 @@ type domain_create_flag =
>| CDF_OOS_OFF
>| CDF_XS_DOMAIN
>| CDF_IOMMU
> +  | CDF_NESTED_VIRT

OCaml changes LGTM.
Just a reminder that we'll need to apply the xenctrl changes to the
mock xenctrl used by the xapi-project CI at 
https://github.com/lindig/xenctrl too.

Best regards,
--Edwin


Re: oxenstored performance issue when starting VMs in parallel

2020-09-22 Thread Edwin Torok
On Tue, 2020-09-22 at 15:17 +0200, jerome leseinne wrote:
> Hi,
> 
> Edwin you rock ! This call in qemu is effectively the culprit !
> I have disabled this xen_bus_add_watch call and re-run test on our
> big server:
> 
> - oxenstored is now  between 10% to 20%  CPU usage (previously was
> 100% all the time)
> - All our VMs are responsive
> - All our VM start in less than 10 seconds (before the fix some VMs
> could take more than one minute to be fully up
> - Dom0 is more responsive
> 
> Disabling the watch may not be the ideal solution ( I let the qemu
> experts answer this and the possible side effects),

Hi,

CC-ed Qemu maintainer of Xen code, please see this discussion about
scalability issues with the backend watching code in qemu 4.1+.

I think the scalability issue is due to this code in qemu, which causes
an instance of qemu to see watches from all devices (even those
belonging to other qemu instances), such that adding a single device
causes N watches to be fired on each N instances of qemu:
  xenbus->backend_watch =
   xen_bus_add_watch(xenbus, "", /* domain root node */
 "backend", xen_bus_backend_changed,
 _err);
 
I can understand that for backwards compatibility you might need this
code, but is there a way that an up-to-date (xl) toolstack could tell
qemu what it needs to look at (e.g. via QMP, or other keys in xenstore)
instead of relying on an overly broad watch?

Best regards,
--Edwin


>  but in our
> scenario and usage this fixes the problem and dramatically boosts the
> performance.
> 
> So far we haven't seen any side effect, all the xl orders are ok, the
> VMs are fully functional, no devices leak (like network vif for
> exemple) and once all the VMs are down a call to xenstore-ls show
> that
> the store is indeed empty (dom0 excluded)
> 
> We will continue additional testing and stress but in all cases a
> huge thanks to you and Julien  for your help on this issue !
> 
> Jerome
> 
> Le lun. 21 sept. 2020 à 18:57, Edwin Torok  a
> écrit :
> > On Mon, 2020-09-21 at 17:40 +0100, Julien Grall wrote:
> > > [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open
> > > attachments unless you have verified the sender and know the
> > > content
> > > is safe.
> > > 
> > > On 21/09/2020 14:05, jerome leseinne wrote:
> > > > Hello,
> > > 
> > > Hello,
> > > 
> > > I am only CCing Edwin who is working on OXenStored. Hopefully, he
> > > will
> > > be able to give you some pointers.
> > > 
> > > > We are developing a solution based on Xen 4.13 who is
> > > > constantly
> > > > creating / destroying VMs.
> > > > 
> > > > To summarize our lifecycle :
> > > > 
> > > > - xl restore vmX
> > > > - xl cd-insert 
> > > > - We do our stuff for ~ 2 minutes
> > > > - xl destroy vmX
> > > > 
> > > > So our VMs have a life of approximately 2 minutes.
> > > > 
> > > > The number of VMs we ran in parallel depends on the underlying
> > > > server.
> > > > 
> > > > We are seeing the issue with our larger server who is running
> > > > 30
> > > > VMs
> > > > (HVM) in parallel.
> > > > 
> > > > On this server oxenstored is constantly running at 100% cpu
> > > > usage
> > > > and
> > > > some VMs are almost stucked or unresponsive.
> > > > 
> > > > This is not an hardware issue, 72 xeon cores, 160 GB of memory
> > > > and
> > > > very fast I/O subsystem.
> > > > Everything else is running smoothly on the server.
> > > > 
> > > > what we witness in the xenstore-access.log is that the number
> > > > of
> > > > WATCH
> > > > event is matching the number of currently running VMs
> > > > 
> > > > so for example for a single WRITE event is followed by around
> > > > 30
> > > > watch events :
> > > > 
> > > > [20200918T15:15:18.045Z]  A41354   write
> > > > /local/domain/0/backend/qdisk/1311/5632
> > > > [20200918T15:15:18.046Z]  A41248   w event
> > > > backend/qdisk/1311/5632 38ed11d9-9a38-4022-ad75-7c571d4886ed
> > > > [20200918T15:15:18.046Z]  A41257   w event
> > > > backend/qdisk/1311/5632 98fa91b8-e88b-4667-9813-d95196257288
> > > > [20200918T15:15:18.046Z]  A40648   w event
> > > > backend/qdisk/1311/5632 e6fd9a35-61ec-4750-93eb-999fb7f

Re: [PATCH v4 2/4] Map: backport find_opt/update from 4.06

2020-08-28 Thread Edwin Torok
On Fri, 2020-08-28 at 10:30 +0200, Christian Lindig wrote:
> +let find_opt k t =
> 
> +   (* avoid raising exceptions, they can be expensive *)
> 
> +   if mem k t then Some (find k t) else None
> 
> 
> 
> I disagree with this argument. Exceptions in OCaml are cheap because
> they don't walk the stack and cut to the exception handler directly.
> Is there a reason why they could be expensive here? In any case, the
> code is correct.
> 

Interesting question, it is best to measure.

The answer depends on whether the key is found or not in the map.
I'll change the impl to the one with find+catch, I think we  might be
looking up keys that are present more often than those that are missing
(although the 4.06 series will make this moot, 4.06 version is faster
than both approaches, regardless whether key is present or not).

To sum up the measurements below:
If the key is not found then my approach is faster (takes only 80% of
time), if the key is found then find+catch is faster (again an approx
80% of the time taken).

I based my comment on the documentation of raise_notrace, which says:
"A faster version raise which does not record the backtrace.",
which implies that recording the backtrace has a measurable perf
impact.
One could argue that if performance matters backtraces should be turned
off in production, but I think the value of having backtraces when some
hard-to-reprodue bug occurs outweights any perf penalty.
We should try to use exceptions only for unexpected situations though,
not finding a value in a map doesn't qualify.

See the attachment for a small micro-benchmark:
$ dune exec --profile=release -- ./updatet.exe raise
Estimated testing time 20s (2 benchmarks x 10s). Change using '-quota'.
┌───┬──┬┐
│ Name  │ Time/Run │ Percentage │
├───┼──┼┤
│ raise │  33.52ns │100.00% │
│ raise_notrace │  19.16ns │ 57.16% │
└───┴──┴┘

So raising with a backtrace is measurably slower, taking the backtrace
spends some CPU cycles.

$ dune exec --profile=release -- ./updatet.exe find-opt
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──┬──┬┐
│ Name │ Time/Run │ Percentage │
├──┼──┼┤
│ find_opt 4.06:10 │  49.10ns │ 24.06% │
│ find_opt 4.06:100│ 115.38ns │ 56.55% │
│ find_opt 4.06:1000   │ 161.27ns │ 79.03% │
│ find_opt=mem+find:10 │  50.48ns │ 24.74% │
│ find_opt=mem+find:100│ 110.39ns │ 54.10% │
│ find_opt=mem+find:1000   │ 162.48ns │ 79.63% │
│ find_opt=find+catch:10   │  89.10ns │ 43.67% │
│ find_opt=find+catch:100  │ 160.80ns │ 78.80% │
│ find_opt=find+catch:1000 │ 204.04ns │100.00% │
└──┴──┴┘


4.06 and mem+find take 80% of the time of find+catch.

But of course if the key is actually found in the map then we have
this: 
edwin@edvin-tower:~/uex % dune exec --profile=release -- ./updatet.exe
find-opt-found
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──┬──┬─┬┐
│ Name │ Time/Run │ mWd/Run │ Percentage │
├──┼──┼─┼┤
│ find_opt 4.06:10 │  38.38ns │   2.00w │ 52.65% │
│ find_opt 4.06:100│  20.66ns │   2.00w │ 28.35% │
│ find_opt 4.06:1000   │  20.63ns │   2.00w │ 28.30% │
│ find_opt=mem+find:10 │  72.89ns │   2.00w │100.00% │
│ find_opt=mem+find:100│  39.06ns │   2.00w │ 53.59% │
│ find_opt=mem+find:1000   │  39.07ns │   2.00w │ 53.60% │
│ find_opt=find+catch:10   │  49.54ns │   2.00w │ 67.97% │
│ find_opt=find+catch:100  │  33.01ns │   2.00w │ 45.29% │
│ find_opt=find+catch:1000 │  32.97ns │   2.00w │ 45.23% │
└──┴──┴─┴┘

In this case find+catch is faster.

And here is update for a key that is not present:
$ dune exec --profile=release -- ./updatet.exe update
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───┬──┬─┬┐
│ Name  │ Time/Run │ mWd/Run │ Percentage │
├───┼──┼─┼┤
│ update 4.06:10│  79.96ns │  24.00w │ 17.27% │
│ update 4.06:100   │ 171.96ns │  48.00w │ 37.15% │
│ update 4.06:1000  │ 243.95ns │  66.00w │ 52.70% │
│ update=find+catch+add/remove:10   │ 183.46ns │  24.00w │ 39.63% │
│ update=find+catch+add/remove:100  │ 340.00ns │  48.00w │ 73.45% │
│ update=find+catch+add/remove:1000 │ 462.89ns │  66.00w │100.00% │
│ update=mem+find+add/remove:10 │ 126.06ns │  24.00w │ 27.23% │
│ update=mem+find+add/remove:100│ 274.79ns 

Re: [PATCH v1 0/6] tools/ocaml/xenstored: simplify code

2020-08-27 Thread Edwin Torok
On Thu, 2020-08-27 at 09:41 +, Wei Liu wrote:
> On Tue, Aug 18, 2020 at 01:40:18PM +0100, Andrew Cooper wrote:
> > On 18/08/2020 10:25, Christian Lindig wrote:
> > > I see little reason to support old OCaml releases and requiring
> > > OCaml 4.06 would be fine with me but I assume that the project
> > > might have its own ideas about this.
> > > 
> > > ____
> > > From: Edwin Torok
> > > Sent: 18 August 2020 08:28
> > > To: Christian Lindig; xen-devel@lists.xenproject.org
> > > Cc: Ian Jackson; d...@recoil.org; w...@xen.org
> > > Subject: Re: [PATCH v1 0/6] tools/ocaml/xenstored: simplify code
> > > 
> > > On Mon, 2020-08-17 at 14:56 +0200, Christian Lindig wrote:
> > > > This all looks good - I left a small comment on one of the
> > > > patches
> > > > and I agree that this needs testing. I also wonder about
> > > > compatibility with earlier OCaml releases that we support but I
> > > > see
> > > > no real obstacles.
> > > > 
> > > I've developed the series using OCaml 4.08.1. I think the newest
> > > feature I used was Map.update (OCaml 4.06, nearly 3 years ago).
> > > Looking through https://repology.org/project/ocaml/versions I'm
> > > not
> > > sure if we can require more than 4.05 though.
> > > The README in Xen doesn't specify a minimum version, but
> > > configure
> > > checks for >=4.02.
> > > 
> > > I can try to backport my series to OCaml 4.05 (to use
> > > Map.find_opt
> > > instead of Map.update) and update the configure check to require
> > > 4.05.
> > > It would be possible to backport even further to 4.02 by
> > > introducing
> > > additional inefficiencies (Map.mem + Map.find would traverse the
> > > map
> > > twice, and Map.find on its own would raise an exception on Not
> > > found,
> > > which is more costly than returning None in Map.find_opt), I'd
> > > avoid
> > > doing that.
> > > 
> > > Xen's CI from automation might need some updates to use latest
> > > stable
> > > versions:
> > > * Fedora 29 is EOL, should use at least Fedora 31
> > > * Debian Jessie is EOL. Stretch is present, but Buster is missing
> > 
> > We're working on the CI loop.
> > 
> > As maintainer, it is ultimately Christian's choice to as to if/when
> > to
> > bump the minimum versions.
> > 
> > 
> > As a general rule, we don't want to be sufficiently bleeding edge
> > to
> > rule out in-use distros.  I have no idea if 4.06 is ok there, or
> > whether
> > it is too new.  Then again, the Ocaml components are strictly
> > optional
> > so it is perhaps less important.
> > 
> > Whatever happens WRT version, the configure change should occur
> > before
> > changes in the code which would fail on older versions.
> 
> Yes I would like to see the bump happen before applying a version of
> this series too.

I have a branch that removes the requirement on 4.06 from the series,
and then have a separate one that bumps to 4.06 and removes the
redundant and inefficient workaround.
In the end I only had to backport 2 functions: Map.find_opt and
Map.update.

Will send out the updated series for review once I've done some more
testing on it.

Best regards,
--Edwin

> 
> Wei.
> 
> > ~Andrew


Re: [PATCH v1 0/6] tools/ocaml/xenstored: simplify code

2020-08-18 Thread Edwin Torok
On Mon, 2020-08-17 at 14:56 +0200, Christian Lindig wrote:
> This all looks good - I left a small comment on one of the patches
> and I agree that this needs testing. I also wonder about
> compatibility with earlier OCaml releases that we support but I see
> no real obstacles.
> 

I've developed the series using OCaml 4.08.1. I think the newest
feature I used was Map.update (OCaml 4.06, nearly 3 years ago).
Looking through https://repology.org/project/ocaml/versions I'm not
sure if we can require more than 4.05 though.
The README in Xen doesn't specify a minimum version, but configure
checks for >=4.02.

I can try to backport my series to OCaml 4.05 (to use Map.find_opt
instead of Map.update) and update the configure check to require 4.05.
It would be possible to backport even further to 4.02 by introducing
additional inefficiencies (Map.mem + Map.find would traverse the map
twice, and Map.find on its own would raise an exception on Not found,
which is more costly than returning None in Map.find_opt), I'd avoid
doing that.

Xen's CI from automation might need some updates to use latest stable
versions:
* Fedora 29 is EOL, should use at least Fedora 31
* Debian Jessie is EOL. Stretch is present, but Buster is missing

Best regards,
--Edwin


Re: [PATCH v1 5/6] tools/ocaml/xenstored: use more efficient node trees

2020-08-17 Thread Edwin Torok
On Mon, 2020-08-17 at 14:52 +0200, Christian Lindig wrote:
> +let compare a b =
> +  if equal a b then 0
> +  else -(String.compare a b)
> 
> I think this bit could use an inline comment why the sort order is
> reversed. This could be also simplified to -(String.compare a b)
> because this goes to the internal (polymorphic) compare implemented
> in C which does a physical equivalence check first.

Good point, I've dropped the equal, and instead of negating the compare
I swapped its arguments.

See V3 of the patch (ignore V2, for some reason it looked nearly
identical to V1, not matching what I had in my git tree,
perhaps git-format-patch didn't overwrite the patches?).

Best regards,
--Edwin

> 
> -- C
> 
> ____
> From: Edwin Torok
> Sent: 14 August 2020 23:14
> To: xen-devel@lists.xenproject.org
> Cc: Edwin Torok; Christian Lindig; David Scott; Ian Jackson; Wei Liu
> Subject: [PATCH v1 5/6] tools/ocaml/xenstored: use more efficient
> node trees
> 
> This changes the output of xenstore-ls to be sorted.
> Previously the keys were listed in the order in which they were
> inserted
> in.
> docs/misc/xenstore.txt doesn't specify in what order keys are listed.
> 
> Map.update is used to retain semantics with replace_child:
> only an existing child is replaced, if it wasn't part of the original
> map we don't add it.
> Similarly exception behaviour is retained for del_childname and
> related
> functions.
> 
> Entries are stored in reverse sort order, so that upon Map.fold the
> constructed list is sorted in ascending order and there is no need
> for a
> List.rev.
> 
> Signed-off-by: Edwin Török 
> ---
>  tools/ocaml/xenstored/store.ml   | 46 +++---
> --
>  tools/ocaml/xenstored/symbol.ml  |  4 +++
>  tools/ocaml/xenstored/symbol.mli |  3 +++
>  3 files changed, 29 insertions(+), 24 deletions(-)
> 
> diff --git a/tools/ocaml/xenstored/store.ml
> b/tools/ocaml/xenstored/store.ml
> index 45659a23ee..d9dfa36045 100644
> --- a/tools/ocaml/xenstored/store.ml
> +++ b/tools/ocaml/xenstored/store.ml
> @@ -16,17 +16,19 @@
>   *)
>  open Stdext
> 
> +module SymbolMap = Map.Make(Symbol)
> +
>  module Node = struct
> 
>  type t = {
> name: Symbol.t;
> perms: Perms.Node.t;
> value: string;
> -   children: t list;
> +   children: t SymbolMap.t;
>  }
> 
>  let create _name _perms _value =
> -   { name = Symbol.of_string _name; perms = _perms; value =
> _value; children = []; }
> +   { name = Symbol.of_string _name; perms = _perms; value =
> _value; children = SymbolMap.empty; }
> 
>  let get_owner node = Perms.Node.get_owner node.perms
>  let get_children node = node.children
> @@ -42,38 +44,34 @@ let set_value node nvalue =
>  let set_perms node nperms = { node with perms = nperms }
> 
>  let add_child node child =
> -   { node with children = child :: node.children }
> +   let children = SymbolMap.add child.name child node.children
> in
> +   { node with children }
> 
>  let exists node childname =
> let childname = Symbol.of_string childname in
> -   List.exists (fun n -> Symbol.equal n.name childname)
> node.children
> +   SymbolMap.mem childname node.children
> 
>  let find node childname =
> let childname = Symbol.of_string childname in
> -   List.find (fun n -> Symbol.equal n.name childname)
> node.children
> +   SymbolMap.find childname node.children
> 
>  let replace_child node child nchild =
> -   (* this is the on-steroid version of the filter one-replace
> one *)
> -   let rec replace_one_in_list l =
> -   match l with
> -   | []   -> []
> -   | h :: tl when Symbol.equal h.name child.name ->
> nchild :: tl
> -   | h :: tl  -> h ::
> replace_one_in_list tl
> -   in
> -   { node with children = (replace_one_in_list node.children) }
> +   { node with
> + children = SymbolMap.update child.name
> +(function None -> None | Some _ -> Some nchild)
> +node.children
> +   }
> 
>  let del_childname node childname =
> let sym = Symbol.of_string childname in
> -   let rec delete_one_in_list l =
> -   match l with
> -   | []-> raise Not_found
> -   | h :: tl when Symbol.equal h.name sym -> tl
> -   | h :: tl   -> h ::
> delete_one_in_list tl
> -   in
> -   { node with children = (delete_one_in_l

Re: [PATCH v1 4/6] tools/ocaml/xenstored: drop select based

2020-08-17 Thread Edwin Torok
On Mon, 2020-08-17 at 09:59 +, Wei Liu wrote:
> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open
> attachments unless you have verified the sender and know the content
> is safe.
> 
> The subject line seems to be cut off half way.
> 
> "Drop select based $SOMETHING"?

$SOMETHING = socket watching, I'll fix the message in V2.

> 
> Wei.
> 


Re: [PATCH 2/2] tools/ocaml: Default to useful build output

2020-07-20 Thread Edwin Torok
On Mon, 2020-07-20 at 10:38 +0200, Christian Lindig wrote:
> > Time for a bit of controversy.
> 
> OCaml outside Xen has moved to a different model of building based on
> dune which is fast, declarative and reliable. The OCaml xenstore is
> stagnating because nobody with OCaml experience wants to touch it
> anymore. It would be beneficial for the health of the OCaml xenstore
> to split it out such that it could be worked on independently.

AFAIK there are 2 unstable interfaces used by oxenstored, decoupling it
would make the version of oxenstored more independent from the version
of hypervisor: 
https://andrewcoop-xen.readthedocs.io/en/docs-devel/misc/tech-debt.html#remove-xenstored-s-dependencies-on-unstable-interfaces
IIUC this would also allow some code to be droped from the hypervisor
where oxenstored is the last user of the unstable interface.

>  You might argue that Make is still appropriate for building OCaml
> projects but the OCaml community has moved through several build
> systems, starting from Make, and learned the hard way that this is
> not an easy problem. After years of more-or-less successful build
> system the consensus is that dune is right one and it has resulted in
> combination with the Opam package manager the ecosystem to flourish.
> Alternatively, it would be possible to move OCaml xenstore to dune
> within the Xen tree but it would create a dependency on it.

I'd welcome a Dune based build-system.

The current Makefile based build-system doesn't handle dependencies
correctly for incremental development: I often have to run 'make clean'
in order to successfully build xenstored after changing an .ml file,
otherwise the linker fails with 'inconsistent assumptions over
interface', indicating that Make hasn't rebuilt something that it
should have. (For those unfamiliar with this issue, see the
'Motivation' section in 
https://nicolaspouillard.fr/ocamlbuild/ocamlbuild-user-guide.html)
It also lacks generation of .merlin files (for editor integration, e.g.
Vim or Emacs), which you get for free with Dune.

We could still retain a Makefile as an entrypoint that launches Dune
with appropriate flags, which aside from adding a build requirement on
Dune wouldn't require changes to package building.

I think a nice way forward here would be to try to write a minimal
binding to gnttab to replicate 
https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=38eeb3864d in
OCaml, this would both demonstrate the benefits of Dune (making
contribution easier), and reduce technical debt within Xen.

Best regards,
--Edwin



Re: [PATCH v5] docs/designs: re-work the xenstore migration document...

2020-05-05 Thread Edwin Torok
On Tue, 2020-05-05 at 14:13 +0100, Jürgen Groß wrote:
> On 05.05.20 15:01, Julien Grall wrote:
> > Hi Paul,
> > 
> > On 28/04/2020 16:06, Paul Durrant wrote:
> > > From: Paul Durrant 
> > > 
> > > ... to specify a separate migration stream that will also be
> > > suitable for
> > > live update.
> > > 
> > > The original scope of the document was to support non-
> > > cooperative 
> > > migration
> > > of guests [1] but, since then, live update of xenstored has been 
> > > brought into
> > > scope. Thus it makes more sense to define a separate image format
> > > for
> > > serializing xenstore state that is suitable for both purposes.
> > > 
> > > The document has been limited to specifying a new image format.
> > > The 
> > > mechanism
> > > for acquiring the image for live update or migration is not
> > > covered as 
> > > that
> > > is more appropriately dealt with by a patch to
> > > docs/misc/xenstore.txt. 
> > > It is
> > > also expected that, when the first implementation of live update
> > > or 
> > > migration
> > > making use of this specification is committed, that the document
> > > is 
> > > moved from
> > > docs/designs into docs/specs.
> > > 
> > > NOTE: It will only be necessary to save and restore state for
> > > active 
> > > xenstore
> > >connections, but the documentation for 'RESUME' in
> > > xenstore.txt 
> > > implies
> > >otherwise. That command is unused so this patch deletes it
> > > from 
> > > the
> > >specification.
> 
> Could someone from Citrix please verify that XAPI isn't using
> XS_RESUME?

The implementation of XS_RESUME in oxenstored doesn't do much: it seems
to be just a way for Dom0 to check whether a domain exists or not, and
for a domain to check whether they are Dom0 or not.
If the domain exists, then the resume implementation just returns `()`,
i.e. does nothing.

I can't find any references to Xs.resume in xenopsd (or the other XAPI
repos that I got checked out), so I think it can be safely removed from
the spec and client libraries (I'd keep it in the actual oxenstored
implementation just in case some guest does call it).

Best regards,
--Edwin

> 
> 
> Juergen
> 


Re: [PATCH] docs: update xenstore migration design document

2020-04-15 Thread Edwin Torok
On Tue, 2020-04-14 at 17:59 +0200, Juergen Gross wrote:
> In the past there have been several attempts to make Xenstore
> restartable. This requires to transfer the internal Xenstore state to
> the new instance. With the Xenstore migration protocol added recently
> to Xen's documentation a first base has been defined to represent the
> state of Xenstore. This can be expanded a little bit in order to have
> a full state representation which is needed as a first step for live
> updating Xenstore.
> 
> Add some definitions to designs/xenstore-migration.md which are
> needed
> for live update of xenstored.
> 
> Signed-off-by: Juergen Gross 
> ---
>  docs/designs/xenstore-migration.md | 90
> --
>  1 file changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/docs/designs/xenstore-migration.md
> b/docs/designs/xenstore-migration.md
> index 6ab351e8fe..09bb4700b4 100644
> --- a/docs/designs/xenstore-migration.md
> +++ b/docs/designs/xenstore-migration.md
> @@ -9,6 +9,10 @@ records must include details of registered xenstore
> watches as well as
>  content; information that cannot currently be recovered from
> `xenstored`,
>  and hence some extension to the xenstore protocol[2] will also be
> required.
>  
> +As a similar set of data is needed for transferring xenstore data
> from
> +one instance to another when live updating xenstored the same
> definitions
> +are being used.
> +
>  The *libxenlight Domain Image Format* specification[3] already
> defines a
>  record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
>  transferring xenstore data pertaining to the domain directly as it
> is
> @@ -48,7 +52,10 @@ where type is one of the following values
>  || 0x0001: NODE_DATA|
>  || 0x0002: WATCH_DATA   |
>  || 0x0003: TRANSACTION_DATA |
> -|| 0x0004 - 0x: reserved for future use |
> +|| 0x0004: TRANSACTION_NODE_DATA|
> +|| 0x0005: GUEST_RING_DATA  |
> +|| 0x0006: DOMAIN_START (live update only)  |
> +|| 0x0007 - 0x: reserved for future use |
>  
>  
>  and data is one of the record data formats described in the
> following
> @@ -79,7 +86,7 @@ as follows:
>  +---+
>  | perm count (N)|
>  +---+
> -| perm0 |
> +| perm1 |
>  +---+
>  ...
>  +---+
> @@ -93,7 +100,7 @@ as follows:
>  +---+
>  ```
>  
> -where perm0..N are formatted as follows:
> +where perm1..N are formatted as follows:
>  
>  
>  ```
> @@ -164,6 +171,83 @@ as follows:
>  where tx_id is the non-zero identifier values of an open
> transaction.
>  
>  
> +**TRANSACTION_NODE_DATA**
> +
> +
> +Each TRANSACTION_NODE_DATA record specifies a transaction local
> xenstore
> +node. Its is similar to the NODE_DATA record with the addition of a
> +transaction id:
> +
> +```
> +0   1   2   3 octet
> ++---+---+---+---+
> +| TRANSACTION_NODE_DATA |
> ++---+
> +| tx_id |
> ++---+
> +| path length   |
> ++---+
> +| path data |
> +...
> +| pad (0 to 3 octets)   |
> ++---+
> +| perm count (N)|
> ++---+
> +| perm1 |
> ++---+
> +...
> ++---+
> +| permN |
> ++---+
> +| value length  |
> ++---+
> +| value data|
> +...
> +| pad (0 to 3 octets)   |
> ++---+
> +```
> +
> +where perm1..N are formatted as specified in the NODE_DATA record. A
> perm
> +count of 0 denotes a node having been deleted in the transaction.


oxenstored also tracks the number of operations that a transaction has
performed, which includes readonly operations AFAICT, which cannot be
inferred from counting TRANSACTION_NODE_DATA entries.
I think the operation count would have to be serialized as part of
TRANSACTION_DATA.

> +
> +
> +**GUEST_RING_DATA**
> +
> +
> +The GUEST_RING_DATA record is used to transfer data which is pending
> to be
> +written to the guest's xenstore ring buffer. It si formatted as 

typo: s/si/is/

> follows:
> +
> +
> +```
> ++---+---+---+---+
> +| GUEST_RING_DATA   |
> ++---+
> +| value length  |
> ++---+
> +| value data|
> +...
> +| pad (0 to 3 octets)   |
> ++---+
>