from:"Anthony PERARD"

Xen Project Winter Meetup 2025

2024-11-07 Thread Anthony PERARD

Hello everyone!

Vates is pleased to organize the next Xen Project Winter Meetup. With
this in mind, we're excited to announce that we'll be hosting the Xen
Winter Summit in January 2025, on Thursday 30th and Friday 31st.

Here's what's in store: After a welcoming breakfast, we'll dive into two
days of talks and collaboration, organized as follows:

 - Morning: Talk sessions
 - Afternoon: Design sessions

Whether you're already a Xen developer or just starting out with system
development skills, this event will give you the opportunity to work on
Xen topics to improve the hypervisor. You'll collaborate with Xen
maintainers and help newcomers, including academics, join the project.
We encourage you to register and submit a proposal if you'd like to
share on a related topic - all contributions are welcome, and we'll
review each with great interest.

You can register and submit your proposal at
https://campaign.vates.tech/xen-project-winter-meetup
at the bottom of the page.

This site also contains all the information related to the event.

We're also organizing a private tour of Aconit, Europe's largest and
oldest computer museum collection, on Thursday afternoon, as well as a
gourmet evening out in Grenoble. A detailed schedule of presentations
and activities will follow soon. The link above also provides guides for
accommodations, directions to the university venue, and activities,
should you wish to enjoy the snow over the weekend or explore Grenoble's
local culture.

Please don't hesitate to reach out for any additional information.

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] tools/libxl: remove usage of VLA arrays

2024-11-05 Thread Anthony PERARD

On Mon, Oct 28, 2024 at 12:03:59PM +, Andrew Cooper wrote:
> On 28/10/2024 11:48 am, Roger Pau Monne wrote:
> > Clang 19 complains with the following error when building libxl:
> >
> > libxl_utils.c:48:15: error: variable length array folded to constant array 
> > as an extension [-Werror,-Wgnu-folding-constant]
> >48 | char path[strlen("/local/domain") + 12];
> >   |   ^~~~
> >
> > Replace the usage of strlen() with ARRAY_SIZE(), which allows the literal
> > string length to be known at build time.  Note ARRAY_SIZE() accounts for the
> > NUL terminator while strlen() didn't, hence subtract 1 from the total size
> > calculation.
> >
> > Signed-off-by: Roger Pau Monné 
> > ---
> >  tools/libs/light/libxl_utils.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/libs/light/libxl_utils.c b/tools/libs/light/libxl_utils.c
> > index 10398a6c8611..b3f5e751cc3f 100644
> > --- a/tools/libs/light/libxl_utils.c
> > +++ b/tools/libs/light/libxl_utils.c
> > @@ -45,7 +45,7 @@ unsigned long libxl_get_required_shadow_memory(unsigned 
> > long maxmem_kb, unsigned
> >  char *libxl_domid_to_name(libxl_ctx *ctx, uint32_t domid)
> >  {
> >  unsigned int len;
> > -char path[strlen("/local/domain") + 12];
> > +char path[ARRAY_SIZE("/local/domain") + 11];
> >  char *s;
> >
> >  snprintf(path, sizeof(path), "/local/domain/%d/name", domid);
> > @@ -141,7 +141,7 @@ int libxl_cpupool_qualifier_to_cpupoolid(libxl_ctx 
> > *ctx, const char *p,
> >  char *libxl_cpupoolid_to_name(libxl_ctx *ctx, uint32_t poolid)
> >  {
> >  unsigned int len;
> > -char path[strlen("/local/pool") + 12];
> > +char path[ARRAY_SIZE("/local/pool") + 11];
> >  char *s;
> >
> >  snprintf(path, sizeof(path), "/local/pool/%d/name", poolid);
>
> Acked-by: Andrew Cooper 
>
> Although I have a minor preference for sizeof() as suggested by Frediano.

I have a preference for sizeof() too, we even used it this way (more or
less) in libxl before, for `eom` here:

https://elixir.bootlin.com/xen/v4.19.0/source/tools/libs/light/libxl_qmp.c#L1608

I was a bit supprised by the use of ARRAY_SIZE on a string literal but
it's just an array of char :-).

For the patch, with sizeof() or ARRAY_SIZE():
Acked-by: Anthony PERARD 

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v1 4/6] CI: Refresh the Debian 12 arm32 cross compile container

2024-10-24 Thread Anthony PERARD

On Thu, Oct 24, 2024 at 11:04:19AM +0100, Javi Merino wrote:
> diff --git a/automation/build/debian/12-arm64v8-arm32-gcc.dockerfile 
> b/automation/build/debian/12-arm64v8-arm32-gcc.dockerfile
> new file mode 100644
> index ..bdc935706bfa
> --- /dev/null
> +++ b/automation/build/debian/12-arm64v8-arm32-gcc.dockerfile
> @@ -0,0 +1,28 @@
> +# syntax=docker/dockerfile:1
> +FROM --platform=linux/arm64/v8 debian:bookworm
> +LABEL maintainer.name="The Xen Project" \
> +  maintainer.email="xen-devel@lists.xenproject.org"

Recent update of other dockerfile was repeating the "LABEL" intruction
for each label, maybe we should continue to do that (which avoid the
backslash at the end of lines).

> +
> +ENV DEBIAN_FRONTEND=noninteractive
> +ENV CROSS_COMPILE /usr/bin/arm-linux-gnueabihf-

While the syntax "ENV VAR value" is supported, it's been discouraged by
docker's doc (https://docs.docker.com/reference/dockerfile/#env). Also,
here we have two ENV instructions with two different syntax, could you
use the = for declaring CROSS_COMPILE as well?

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] tools/pvhsim: Inherit the tools debug= setting

2024-10-24 Thread Anthony PERARD

On Tue, Oct 22, 2024 at 05:25:00PM +0100, Andrew Cooper wrote:
> diff --git a/tools/firmware/xen-dir/Makefile b/tools/firmware/xen-dir/Makefile
> index 6f5e208ab413..e19916f76722 100644
> --- a/tools/firmware/xen-dir/Makefile
> +++ b/tools/firmware/xen-dir/Makefile
> @@ -1,4 +1,5 @@
>  XEN_ROOT = $(CURDIR)/../../..
> +include $(XEN_ROOT)/tools/Rules.mk
>  
>  all: xen-shim
>  
> @@ -75,6 +76,8 @@ $(D): linkfarm.stamp
>  
>  $(D)/xen/.config: $(D)
>   $(MAKE) -C $(@D) KBUILD_DEFCONFIG=pvshim_defconfig defconfig
> + echo "CONFIG_DEBUG=$(if $(debug),y,n)" >> $@

I don't think this does what you think it does ;-).

In the GitLab CI, we have both "export debug=y" and "export debug=n", in
both case that $(if ) will return 'y' because $(debug) is non-empty.

For one-liner to do this, there's a few option in Make and shell:
$(if $(filter y,$(debug)),y,n)
$$([ "$(debug)" = y ] && echo y || echo n)
$$(case "$(debug)" in y) echo y;; *) echo n;; esac)

For the $(filter ) option, if one does `debug='y no' make` the $(if )
will expand to 'y', but it's probably good enough.

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Scheduled Maintenance of xenbits, wiki, mail

2024-10-21 Thread Anthony PERARD

Hello everyone,

We planned to upgrade a few host of xenproject.org in the coming week.

This will affect access to git repository (on xenbits), download of Xen
release tarballs (on mail), access to the wiki, and email delivery
delayed for some.

The upgrade will start at 09:00 European Time (so 07:00 then 08:00 UTC)
and will take a few hours on those dates:

- 2024-10-23: mail.xenproject.org
- 2024-10-24: wiki.xenproject.org
- 2024-10-28: xenbits.xenproject.org

Cheers,

-- 
Anthony PERARD

Re: [XEN PATCH v1 3/3] CI: Refresh and upgrade the Fedora container

2024-10-17 Thread Anthony PERARD

On Thu, Oct 17, 2024 at 05:20:21PM +0100, Javi Merino wrote:
> From: Andrew Cooper 
> 
> Fedora 29 is long out of date.  Move forward 5 years to Fedora 40.
> 
> Include all the usual improvements.  Rework the container to be non-root, use
> heredocs for legibility, and switch to the new naming scheme.
> 
> Signed-off-by: Andrew Cooper 
> Signed-off-by: Javi Merino 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v1 1/3] automation: Fix URL to the gitlab container registry documentation

2024-10-17 Thread Anthony PERARD

On Thu, Oct 17, 2024 at 05:20:19PM +0100, Javi Merino wrote:
> The gitlab documentation is now at
> https://docs.gitlab.com/ee/administration/packages/container_registry.html

This link seems to be for self-managed instance of gitlab, but the link
in the patch looks fine.

> Signed-off-by: Javi Merino 

The patch description could be removed, so:
Reviewed-by: Anthony PERARD 

Thanks,

> ---
>  automation/build/README.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/automation/build/README.md b/automation/build/README.md
> index bd0c0e041804..ecc898680c91 100644
> --- a/automation/build/README.md
> +++ b/automation/build/README.md
> @@ -110,7 +110,7 @@ make -C automation/build opensuse/tumbleweed-x86_64 PUSH=1
>  
>  [BuildKit]: https://docs.docker.com/build/buildkit/
>  [registry]: https://gitlab.com/xen-project/xen/container_registry
> -[registry help]: https://gitlab.com/help/user/project/container_registry
> +[registry help]: https://docs.gitlab.com/ee/user/packages/container_registry/
>  
>  
>  Building/Running container for a different architecture

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v6 1/5] x86/boot: create a C bundle for 32 bit boot code and use it

2024-10-17 Thread Anthony PERARD

On Thu, Oct 17, 2024 at 02:31:19PM +0100, Frediano Ziglio wrote:
> +$(obj)/build32.base.lds: AFLAGS-y += -DGAP=$(text_gap) 
> -DTEXT_DIFF=$(text_diff)
> +$(obj)/build32.offset.lds: AFLAGS-y += -DGAP=$(text_gap) 
> -DTEXT_DIFF=$(text_diff) -DFINAL

I was somehow expecting "base" and "offset" to be the other way around,
but that's fine. And by the way, -DFINAL cancel changes to GAP and
TEXT_DIFF ;-).

But overall, the changes looks nice as is,
Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v4 2/6] x86/boot: create a C bundle for 32 bit boot code and use it

2024-10-17 Thread Anthony PERARD

On Wed, Oct 16, 2024 at 04:05:00PM +0100, Frediano Ziglio wrote:
> On Wed, Oct 16, 2024 at 12:25 PM Anthony PERARD >  
> wrote:
> > On Wed, Oct 16, 2024 at 09:33:32AM +0100, Frediano Ziglio wrote:
> > > On Tue, Oct 15, 2024 at 2:51 PM Anthony PERARD 
> > >  wrote:
> > > > I can think of one example where $(if_changed,) is going to really help,
> > > > by looking at this command line:
> > > > One does update the .c file to add a function that they like to
> > > > export, run `make`, realize they forgot to update the makefile so
> > > > update it, run `make`, it's still doesn't work...
> > > > Maybe run `make clean; make`, or something else...
> > > >
> > > > So, could you use $(if_changed,) ?
> > > > Probably:
> > > > quiet_cmd_combine = GEN $@
> > > > cmd_combine = $(PYTHON) ...
> > > > $(obj)/built_in_32.S: $(obj)/built_in_32.other.bin 
> > > > $(obj)/built_in_32.final.bin FORCE
> > > > $(call if_changes,combine)
> > > > targets += built_in_32.S
> > > >
> > > > GEN, for generate, or it could be PY instead, because python script can
> > > > be slow to compile which could explain why the build system output is
> > > > making a pause on this target (on slow machines that is). Or it could be
> > > > COMBINE, or something else, but it's not really necessary to explain,
> > > > the target name is often enough to figure out what's happening, when
> > > > needed.
> > > >
> > >
> > > It just looks more complicated to me.
> >
> > I'm sorry if writing makefile is complicated. GNU make doesn't help with
> > writing build system that work well, especially when doing incremental
> > builds. So we need to use more complicated construction, especially for
> > a complex project like Xen.
> >
>
> It was more a balance consideration. Considering the cases that seem
> to solve (and your case did not much apply) I don't feel that worth.
> Also, dependency to Makefile would solve without additional macros and
> indirection. Do you mind posting a full working change?

Sure, here it is (I notice I've misspell the macro name in what I've written
earlier):

quiet_cmd_combine = GEN $@
cmd_combine = \
$(PYTHON) $(srctree)/tools/combine_two_binaries.py \
--script $(obj)/build32.final.lds \
    --bin1 $(obj)/built_in_32.other.bin \
--bin2 $(obj)/built_in_32.final.bin \
--map $(obj)/built_in_32.final.map \
--exports cmdline_parse_early,reloc \
--output $@

targets += built_in_32.S

$(obj)/built_in_32.S: $(obj)/built_in_32.other.bin $(obj)/built_in_32.final.bin 
\
$(srctree)/tools/combine_two_binaries.py FORCE
$(call if_changed,combine)

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v4 2/6] x86/boot: create a C bundle for 32 bit boot code and use it

2024-10-16 Thread Anthony PERARD

On Wed, Oct 16, 2024 at 09:33:32AM +0100, Frediano Ziglio wrote:
> On Tue, Oct 15, 2024 at 2:51 PM Anthony PERARD  
> wrote:
> > On Mon, Oct 14, 2024 at 05:32:26PM +0100, Frediano Ziglio wrote:
> > > On Mon, Oct 14, 2024 at 4:31 PM Anthony PERARD 
> > >  wrote:
> > > > On Mon, Oct 14, 2024 at 09:53:28AM +0100, Frediano Ziglio wrote:

> > BTW, do we need the rules that generate the input of this rules
> > (built_in_32.tmp.o that is), or can this one takes all 32bit objects as
> > input?
> >
>
> Better not to do it In some conditions it can generate slightly
> different results (like different object alignments) making the
> algorithm fail.

Ok. Thanks for the explanation.

> > > > > +# generate final object file combining and checking above binaries
> > > > > +$(obj)/built_in_32.S: $(obj)/built_in_32.other.bin 
> > > > > $(obj)/built_in_32.final.bin
> > > >
> > > > So, "other" isn't part of "final", I don't really know what those two
> > > > things contains so naming wise I can't suggest anything useful.
> >
> > Instead of "other", is "control" (like in science experiment where you
> > have a control group), or "offseted" (which seems to be how this second
> > binary is constructed) would be better names for this *.bin? It seems
> > the script take both input and play the game of the 7 differences, to
> > find clues about the location of some symbols, right?.
> >
>
> I don't know the game and I think people not familiar with it won't
> find new names more readable but less.

Sorry, the "game" as nothing to do with the name I've proposed. I was
just asking if the script take both *.bin and was looking for
differences.

(The game of 7 differences is simple: there's two similar pictures and
you just look for the 7 differences between them, that's it.)

> Not saying that current names are good, they just need to be located
> at different addresses with some "magic" in the middle.

Well to me "other" evoke a binary that contains functions that are not
in "final", but instead they both contain the sames functions with
slight variation of placement in the file (with added offset, gap), as I
understand. But if you don't like my proposal, so be it.

> > I can think of one example where $(if_changed,) is going to really help,
> > by looking at this command line:
> > One does update the .c file to add a function that they like to
> > export, run `make`, realize they forgot to update the makefile so
> > update it, run `make`, it's still doesn't work...
> > Maybe run `make clean; make`, or something else...
> >
> > So, could you use $(if_changed,) ?
> > Probably:
> > quiet_cmd_combine = GEN $@
> > cmd_combine = $(PYTHON) ...
> > $(obj)/built_in_32.S: $(obj)/built_in_32.other.bin 
> > $(obj)/built_in_32.final.bin FORCE
> > $(call if_changes,combine)
> > targets += built_in_32.S
> >
> > GEN, for generate, or it could be PY instead, because python script can
> > be slow to compile which could explain why the build system output is
> > making a pause on this target (on slow machines that is). Or it could be
> > COMBINE, or something else, but it's not really necessary to explain,
> > the target name is often enough to figure out what's happening, when
> > needed.
> >
>
> It just looks more complicated to me.

I'm sorry if writing makefile is complicated. GNU make doesn't help with
writing build system that work well, especially when doing incremental
builds. So we need to use more complicated construction, especially for
a complex project like Xen.

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v4 3/6] x86/boot: Reuse code to relocate trampoline

2024-10-15 Thread Anthony PERARD

On Mon, Oct 14, 2024 at 09:53:29AM +0100, Frediano Ziglio wrote:
> diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> index 23ad274c89..ca258a9729 100644
> --- a/xen/arch/x86/boot/Makefile
> +++ b/xen/arch/x86/boot/Makefile
> @@ -1,12 +1,16 @@
>  obj-bin-y += head.o
>  obj-bin-y += built_in_32.o
> +obj-bin-y += $(obj64)

Could you move this so $(obj64) is fully set when added to $(obj-bin-y)?
There's no garanties that this is going to keep working. We already have
"obj-y :=", so it is possible we will have "obj-bin-y :=" one day. (Or
that we get rid of $(obj-bin-y) because it becomes unnecessary.)

>  obj32 := cmdline.32.o
>  obj32 += reloc.32.o
> +obj32 += reloc-trampoline.32.o
>  
> -nocov-y   += $(obj32)
> -noubsan-y += $(obj32)
> -targets   += $(obj32)
> +obj64 := reloc-trampoline.o
> +
> +nocov-y   += $(obj32) $(obj64)
> +noubsan-y += $(obj32) $(obj64)
> +targets   += $(obj32) $(obj64)

Technically, the change to $(targets) isn't necessary, $(obj-bin-y)
should already be added to it.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v4 2/6] x86/boot: create a C bundle for 32 bit boot code and use it

2024-10-15 Thread Anthony PERARD

On Mon, Oct 14, 2024 at 05:32:26PM +0100, Frediano Ziglio wrote:
> On Mon, Oct 14, 2024 at 4:31 PM Anthony PERARD
>  wrote:
> >
> > On Mon, Oct 14, 2024 at 09:53:28AM +0100, Frediano Ziglio wrote:
> > > diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> > > index 1199291d2b..23ad274c89 100644
> > > --- a/xen/arch/x86/boot/Makefile
> > > +++ b/xen/arch/x86/boot/Makefile
> > > @@ -1,4 +1,5 @@
> > >  obj-bin-y += head.o
> > > +obj-bin-y += built_in_32.o
> >
> > I don't quite like that this new object name is "built_in_32.o",
> > It's really closed to "built_in.*" which is used by Rules.mk to collect
> > all objects in a subdirectory. I don't really have a better suggestion at
> > the moment.
> >
>
> It was cbundle.o before, but people preferred built_in_32.o.
> It's a collection of object files like built_in.o so it does not seem
> so bad to me.
> But seen other replies, some other people prefer "bundle".

Yeah, I guess it's ok (built_in_32 that is; now that I gave a better
review of the rest of the Makefile changes).

> > > @@ -9,9 +10,6 @@ targets   += $(obj32)
> > >
> > >  obj32 := $(addprefix $(obj)/,$(obj32))
> > >
> > > -$(obj)/head.o: AFLAGS-y += -Wa$(comma)-I$(obj)
> > > -$(obj)/head.o: $(obj32:.32.o=.bin)
> > > -
> > >  CFLAGS_x86_32 := $(subst -m64,-m32 -march=i686,$(XEN_TREEWIDE_CFLAGS))
> > >  $(call cc-options-add,CFLAGS_x86_32,CC,$(EMBEDDED_EXTRA_CFLAGS))
> > >  CFLAGS_x86_32 += -Werror -fno-builtin -g0 -msoft-float -mregparm=3
> > > @@ -25,14 +23,34 @@ $(obj32): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
> > >  $(obj)/%.32.o: $(src)/%.c FORCE
> > >   $(call if_changed_rule,cc_o_c)
> > >
> > > +orphan-handling-$(call ld-option,--orphan-handling=error) := 
> > > --orphan-handling=error
> > >  LDFLAGS_DIRECT-$(call ld-option,--warn-rwx-segments) := 
> > > --no-warn-rwx-segments
> > >  LDFLAGS_DIRECT += $(LDFLAGS_DIRECT-y)
> > >  LD32 := $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT))
> > >
> > > -%.bin: %.lnk
> > > - $(OBJCOPY) -j .text -O binary $< $@
> > > -
> > > -%.lnk: %.32.o $(src)/build32.lds
> > > - $(LD32) -N -T $(filter %.lds,$^) -o $@ $<
> > > -
> > > -clean-files := *.lnk *.bin
> > > +$(obj)/build32.final.lds: AFLAGS-y += -DFINAL
> > > +$(obj)/build32.other.lds $(obj)/build32.final.lds: $(src)/build32.lds.S 
> > > FORCE
> > > + $(call if_changed_dep,cpp_lds_S)

Could you add:

targets += build32.final.lds build32.other.lds

otherwise, the if_changes macro doesn't work, and the target keeps been
rebuilt. `make V=2` will tell you the same thing.

> > > +# link all 32bit objects together
> > > +$(obj)/built_in_32.tmp.o: $(obj32)
> > > + $(LD32) -r -o $@ $^
> > > +
> > > +$(obj)/built_in_32.%.bin: $(obj)/build32.%.lds $(obj)/built_in_32.tmp.o
> > > +## link bundle with a given layout
> >
> > Could you put the comment aligned with the rest of the recipe? That way
> > it won't visually separate the rule into several pieces. You can
> > prefix it with @ so it doesn't show in build logs:
> >
> > @# comments
> >
>
> Yes, they look more or less the same but technically pretty different.
> The "## XXX" is a comment for make command, the "\t@#comment" is a way
> to tell make to not print the command before launching it so make will
> launch a shell command "# comment".
> Yes, indentation does not make it clear. Not that launching a shell to
> execute a comment will take a long time. On the other hand, here that
> comment is more for the rule than for the shell. Maybe something like
>
> target:
>   command \
># do something
>
> (personally I found it even more ugly)

How about removing the comments? I mean the command line is
self-explanatory here. Same thing about the objcopy comment.
(Or a comment before the rule, when really needed)

> > > + $(LD32) $(orphan-handling-y) -N -T $< -Map 
> > > $(obj)/built_in_32.$(*F).map -o $(obj)/built_in_32.$(*F).o 
> > > $(obj)/built_in_32.tmp.o
> >
> > I think this wants to be: -T $(filter %.lds,$^) -Map $(patsubst 
> > %.bin,%.map,$@) -o $(patsubst %.bin,%.o,$@) $(filter %.o,$^)
> >
> > :-(, maybe that's lots of $(patsubst,), not sure which is better between
> > $(patsubst,) and using the stem $*.
> >
>
> Not strong about stem or patsubst.

Actually, I found a

Re: [PATCH v4 2/6] x86/boot: create a C bundle for 32 bit boot code and use it

2024-10-14 Thread Anthony PERARD

/boot/build32.lds
> +++ b/xen/arch/x86/boot/build32.lds.S
> @@ -15,22 +15,47 @@
>   * with this program.  If not, see <http://www.gnu.org/licenses/>.
>   */
>  
> -ENTRY(_start)
> +#ifdef FINAL
> +# define GAP 0
> +# define MULT 0
> +# define TEXT_START
> +#else
> +# define GAP 0x010200
> +# define MULT 1
> +# define TEXT_START 0x408020
> +#endif
> +# define DECLARE_IMPORT(name) name = . + (__LINE__ * MULT)

Is  ^ this a stray space?


Overall, it's kind of mostly style comment that I had, so feel free to
ignore most of them.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 2/3] tools/xenstored: remove unneeded libxenguest reference

2024-10-11 Thread Anthony PERARD

On Thu, Oct 10, 2024 at 05:54:58PM +0200, Juergen Gross wrote:
> Today the xenstored Makefile contains an unneeded reference to the
> not used libxenguest library.
> 
> Remove it.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v2] stubdom: add fine grained library config items to Mini-OS configs

2024-10-10 Thread Anthony PERARD

On Thu, Oct 10, 2024 at 01:19:46PM +0200, Juergen Gross wrote:
> Today Mini-OS can only be configured to use all or none Xen library.
> In order to prepare a more fine grained configuration scheme, add per
> library config items to the Mini-OS config files.
> 
> As some libraries pull in others, the config files need to be
> extended at build time to reflect those indirect library uses.
> 
> Signed-off-by: Juergen Gross 
> ---
> V2:
> - rename BUILD_config to GEN_CONFIG (Anthony PERARD)
> - rename generated config files to *.gen.cfg (Anthony PERARD)
> - don't write config data to file in make macro (Anthony PERARD)
> - remove no longer needed $(CURDIR)/ from prerequisites (Anthony PERARD)

Looks good to me.

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] CI: Stop building QEMU in general

2024-10-08 Thread Anthony PERARD

On Tue, Oct 08, 2024 at 04:50:23PM +0100, Andrew Cooper wrote:
> diff --git a/automation/gitlab-ci/build.yaml b/automation/gitlab-ci/build.yaml
> index c668736bdc2f..c83e0bdbe119 100644
> --- a/automation/gitlab-ci/build.yaml
> +++ b/automation/gitlab-ci/build.yaml
> @@ -339,6 +339,7 @@ alpine-3.18-gcc-debug:
>extends: .gcc-x86-64-build-debug
>variables:
>  CONTAINER: alpine:3.18
> +QEMU: y

Could you use a different name for the variable? This is exposed as an
environment variable, it could easily be used in a build system already,
like used to store a path to a QEMU to use.

We don't really have a name space for CI variable, but maybe
BUILD_QEMU or BUILD_QEMU_XEN would be less likely to clash with other
usages.

Otherwise patch looks fine.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] CI: Drop bin86/dev86 from archlinux container

2024-10-07 Thread Anthony PERARD

On Mon, Oct 07, 2024 at 10:56:37AM +0100, Andrew Cooper wrote:
> These packages have moved out of main to AUR, and are not easily accessible
> any more.  Drop them, because they're only needed for RomBIOS which is very
> legacy these days.
>
> Signed-off-by: Andrew Cooper 
> Reviewed-by: Roger Pau Monné 

Reviewed-by: Anthony PERARD 

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 4/4] stubdom: add fine grained library config items to Mini-OS configs

2024-10-07 Thread Anthony PERARD

On Sat, Oct 05, 2024 at 05:15:48PM +0200, Juergen Gross wrote:
> diff --git a/stubdom/Makefile b/stubdom/Makefile
> index 8c503c2bf8..3b501a0710 100644
> --- a/stubdom/Makefile
> +++ b/stubdom/Makefile
> @@ -340,6 +340,14 @@ endef
>  
>  $(foreach lib,$(STUB_LIBS),$(eval $(call BUILD_lib,$(lib
>  
> +define BUILD_config
> + cp $< $@
> + for i in $(sort $(APP_LIBS) $(call xenlibs-dependencies,$(APP_LIBS))); do \
> +   u=`echo $$i | tr a-z A-Z`; \
> +   echo "CONFIG_LIBXEN$$u=y"; \
> + done >> $@
> +endef

I don't think I like having a recipe hidden like that in a variable,
maybe if it was a full rule it would be a bit less annoying to me. But
how about something slightly different:

First, the name, "GEN_config" would be a bit better, then we could have
it only do the output and not writing any file:

define GEN_config
 (cat '$<' && \
 for i in $(sort $(APP_LIBS) $(call xenlibs-dependencies,$(APP_LIBS))); do \
   u=`echo $$i | tr a-z A-Z`; \
   echo "CONFIG_LIBXEN$$u=y"; \
 done)
endef

The this can be used in rules as:
$(GEN_config) > $@

Would that be ok?

(It might be better to have the macro not depends on the environment
have take parameter explicitly which could be used as $(call
GEN_config,$<,evtchn gnttab) > $@ or take a variable if it's useful
elsewhere, but I'm already fine if $@ is taken out of the macro.)

> +
>  xenstore/stamp: $(XEN_ROOT)/tools/xenstored/Makefile.common
>   $(do_links)
>  
> @@ -373,8 +381,12 @@ $(TARGETS_MINIOS): mini-os-%:
>  # ioemu
>  ###
>  
> -ioemu-minios-config.mk: $(CURDIR)/ioemu-minios.cfg
> - MINIOS_CONFIG="$<" CONFIG_FILE="$(CURDIR)/$@" $(MAKE) DESTDIR= -C 
> $(MINI_OS) config
> +ioemu-minios.out.cfg: APP_LIBS = evtchn gnttab ctrl guest
> +ioemu-minios.out.cfg: $(CURDIR)/ioemu-minios.cfg Makefile

Could you change the suffix to ".gen.cfg"? ".out.cfg" is a bit generic
while "generated" is more common for the kind of file that are
automatically generated by the build system for it's own use.

BTW, in the first prerequisite, $(CURDIR) isn't necessary anymore, it
was only to be used in "MINIOS_CONFIG" just below.

> + $(BUILD_config)
> +
> +ioemu-minios-config.mk: ioemu-minios.out.cfg
> + MINIOS_CONFIG="$(CURDIR)/$<" CONFIG_FILE="$(CURDIR)/$@" $(MAKE) 
> DESTDIR= -C $(MINI_OS) config

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 3/4] build: move xenlibs-dependencies make definition to uselibs.mk

2024-10-07 Thread Anthony PERARD

On Sat, Oct 05, 2024 at 05:15:47PM +0200, Juergen Gross wrote:
> In order to be able to use the xenlibs-dependencies macro from stubdom
> build, move it to tools/libs/uselibs.mk, which is included from
> current users and stubdom/Makefile.
> 
> No functional change intended.
> 
> Signed-off-by: Juergen Gross 

Acked-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 1/4] stubdom: add local .gitignore file

2024-10-07 Thread Anthony PERARD

On Sat, Oct 05, 2024 at 05:15:45PM +0200, Juergen Gross wrote:
> Add a stubdom specfic .gitignore file.

More like "Move stubdom specific ignored file into it." because there's
no changes to the list of ignored paths, and we don't usually need to
repeate the title of the commit in its description in the Xen repo.

> Signed-off-by: Juergen Gross 

In any case all looks fine:
Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] CI: Fix builds following qemu-xen update

2024-10-04 Thread Anthony PERARD

On Fri, Oct 04, 2024 at 06:27:23PM +0100, Andrew Cooper wrote:
> A recent update to qemu-xen has bumped the build requirements, with Python 3.8
> being the new baseline but also needing the 'ensurepip' and 'tomllib/tomli'
> packages.
> 
>  * Ubuntu/Debian package 'ensurepip' separately, but it can be obtained by
>installing the python3-venv package.
> 
>  * 'tomllib' was added to the python standard library in Python 3.11, but
>previously it was a separate package named 'tomli'.
> 
> In terms of changes required to build QEMU:
> 
>  * Ubuntu 24.04 (Noble) has Python 3.12 so only needs python3-venv
> 
>  * Ubuntu 22.04 (Jammy) has Python 3.10 but does have a python3-tomli package
>that QEMU is happy with.
> 
>  * FreeBSD has Python 3.9, but Python 3.11 is available.
> 
> In terms of exclusions:
> 
>  * Ubuntu 20.04 (Focal) has Python 3.8, but lacks any kind of tomli package.
> 
>  * Fedora 29 (Python 3.7), OpenSUSE Leap 15.6 (Python 3.6), and Ubuntu
>18.04/Bionic (Python 3.6) are now too old.
> 
> Detecting tomllib/tomli is more than can fit in build's oneliner, so break it
> out into a proper script.
> 
> Signed-off-by: Andrew Cooper 

The changes on the gitlab side look fine. I don't know if the changes on
the cirrus side are ok, but at least the seems to work, after looking at
the build logs. So:

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] stubdom: Fix newlib build with GCC-14

2024-10-03 Thread Anthony PERARD

On Wed, Oct 02, 2024 at 11:45:31PM +0100, Andrew Cooper wrote:
> Based on a fix from OpenSUSE, but adjusted to be Clang-compatible too.  Pass
> -Wno-implicit-function-declaration library-wide rather than using local GCC
> pragmas.
> 
> Fix of copy_past_newline() to avoid triggering -Wstrict-prototypes.
> 
> Link: https://build.opensuse.org/request/show/1178775
> Signed-off-by: Andrew Cooper 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH 1/2] blkif: Fix alignment description for discard request

2024-09-26 Thread Anthony PERARD

The discard feature have an other xenstore node to described the size
of the blocks than can be discarded, "discard-granularity", which
default to "sector-size" when absent as noted in the properties and in
note 4. So discard request should be aligned on this value.

Fixes: 221f2748e8da ("blkif: reconcile protocol specification with in-use 
implementations")
Signed-off-by: Anthony PERARD 
---
 xen/include/public/io/blkif.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
index 9b00d633d3..789bab65ab 100644
--- a/xen/include/public/io/blkif.h
+++ b/xen/include/public/io/blkif.h
@@ -668,7 +668,7 @@ typedef struct blkif_request blkif_request_t;
  *
  * The 'sector_number' field is in units of 512b, despite the value of the
  * 'sector-size' xenstore node.  Note however that the offset in
- * 'sector_number' must be aligned to 'sector-size'.
+ * 'sector_number' must be aligned to 'discard-granularity'.
  */
 struct blkif_request_discard {
 uint8_toperation;/* BLKIF_OP_DISCARD */
-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH 0/2] blkif: Fix discard req align requirement and various typos

2024-09-26 Thread Anthony PERARD

Patch series available in this git branch:
https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
br.blkif-wording-fix-v1

Cheers,

Anthony PERARD (2):
  blkif: Fix alignment description for discard request
  blkif: Fix a couple of typos

 xen/include/public/io/blkif.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH 2/2] blkif: Fix a couple of typos

2024-09-26 Thread Anthony PERARD

Those where fixed in OVMF's copy. (And one of them fixed in QEMU's
copy but later discarded by an update.)

Signed-off-by: Anthony PERARD 
---
 xen/include/public/io/blkif.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
index 789bab65ab..8407453324 100644
--- a/xen/include/public/io/blkif.h
+++ b/xen/include/public/io/blkif.h
@@ -42,7 +42,7 @@
  * All data in the XenStore is stored as strings.  Nodes specifying numeric
  * values are encoded in decimal.  Integer value ranges listed below are
  * expressed as fixed sized integer types capable of storing the conversion
- * of a properly formated node string, without loss of information.
+ * of a properly formatted node string, without loss of information.
  *
  * Any specified default value is in effect if the corresponding XenBus node
  * is not present in the XenStore.
@@ -328,7 +328,7 @@
  *  access (even when it should be read-only). If the frontend hits the
  *  maximum number of allowed persistently mapped grants, it can fallback
  *  to non persistent mode. This will cause a performance degradation,
- *  since the the backend driver will still try to map those grants
+ *  since the backend driver will still try to map those grants
  *  persistently. Since the persistent grants protocol is compatible with
  *  the previous protocol, a frontend driver can choose to work in
  *  persistent mode even when the backend doesn't support it.
-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[PATCH 1/2] include: update Xen public headers io/blkif.h

2024-09-26 Thread Anthony PERARD

Signed-off-by: Anthony PERARD 
---
 include/hw/xen/interface/io/blkif.h | 52 +
 1 file changed, 39 insertions(+), 13 deletions(-)

diff --git a/include/hw/xen/interface/io/blkif.h 
b/include/hw/xen/interface/io/blkif.h
index 22f1eef0c0..9b00d633d3 100644
--- a/include/hw/xen/interface/io/blkif.h
+++ b/include/hw/xen/interface/io/blkif.h
@@ -237,12 +237,16 @@
  * sector-size
  *  Values: 
  *
- *  The logical block size, in bytes, of the underlying storage. This
- *  must be a power of two with a minimum value of 512.
+ *  The logical block size, in bytes, of the underlying storage. This must
+ *  be a power of two with a minimum value of 512.  The sector size should
+ *  only be used for request segment length and alignment.
  *
- *  NOTE: Because of implementation bugs in some frontends this must be
- *set to 512, unless the frontend advertizes a non-zero value
- *in its "feature-large-sector-size" xenbus node. (See below).
+ *  When exposing a device that uses a logical sector size of 4096, the
+ *  only difference xenstore wise will be that 'sector-size' (and possibly
+ *  'physical-sector-size' if supported by the backend) will be 4096, but
+ *  the 'sectors' node will still be calculated using 512 byte units.  The
+ *  sector base units in the ring requests fields will all be 512 byte
+ *  based despite the logical sector size exposed in 'sector-size'.
  *
  * physical-sector-size
  *  Values: 
@@ -254,9 +258,9 @@
  * sectors
  *  Values: 
  *
- *  The size of the backend device, expressed in units of "sector-size".
- *  The product of "sector-size" and "sectors" must also be an integer
- *  multiple of "physical-sector-size", if that node is present.
+ *  The size of the backend device, expressed in units of 512b.  The
+ *  product of "sectors" * 512 must also be an integer multiple of
+ *  "physical-sector-size", if that node is present.
  *
  *
  *Frontend XenBus Nodes
@@ -338,6 +342,7 @@
  * feature-large-sector-size
  *  Values: 0/1 (boolean)
  *  Default Value:  0
+ *  Notes:  DEPRECATED, 12
  *
  *  A value of "1" indicates that the frontend will correctly supply and
  *  interpret all sector-based quantities in terms of the "sector-size"
@@ -411,6 +416,11 @@
  *(10) The discard-secure property may be present and will be set to 1 if the
  * backing device supports secure discard.
  *(11) Only used by Linux and NetBSD.
+ *(12) Possibly only ever implemented by the QEMU Qdisk backend and the Windows
+ * PV block frontend.  Other backends and frontends supported 'sector-size'
+ * values greater than 512 before such feature was added.  Frontends should
+ * not expose this node, neither should backends make any decisions based
+ * on it being exposed by the frontend.
  */
 
 /*
@@ -619,11 +629,14 @@
 #define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8
 
 /*
- * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as
- * 'sector_number' in blkif_request, blkif_request_discard and
- * blkif_request_indirect are sector-based quantities. See the description
- * of the "feature-large-sector-size" frontend xenbus node above for
- * more information.
+ * NB. 'first_sect' and 'last_sect' in blkif_request_segment are all in units
+ * of 512 bytes, despite the 'sector-size' xenstore node possibly having a
+ * value greater than 512.
+ *
+ * The value in 'first_sect' and 'last_sect' fields must be setup so that the
+ * resulting segment offset and size is aligned to the logical sector size
+ * reported by the 'sector-size' xenstore node, see 'Backend Device Properties'
+ * section.
  */
 struct blkif_request_segment {
 grant_ref_t gref;/* reference to I/O buffer frame*/
@@ -634,6 +647,10 @@ struct blkif_request_segment {
 
 /*
  * Starting ring element for any I/O request.
+ *
+ * The 'sector_number' field is in units of 512b, despite the value of the
+ * 'sector-size' xenstore node.  Note however that the offset in
+ * 'sector_number' must be aligned to 'sector-size'.
  */
 struct blkif_request {
 uint8_toperation;/* BLKIF_OP_??? */
@@ -648,6 +665,10 @@ typedef struct blkif_request blkif_request_t;
 /*
  * Cast to this structure when blkif_request.operation == BLKIF_OP_DISCARD
  * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request)
+ *
+ * The 'sector_number' field is in units of 512b, despite the value of the
+

[PATCH 0/2] Xen: Update sector-size handling in block backend

2024-09-26 Thread Anthony PERARD

The specification have been clarified regarding what "sector" is in Xen PV
block protocol by Xen commit 221f2748e8da ("blkif: reconcile protocol
specification with in-use implementations") and "feature-large-sector-size"
have been removed.

https://xenbits.xenproject.org/gitweb/?p=xen.git;a=commit;h=221f2748e8dabe8361b8cdfcffbeab9102c4c899

This update the header and the backend.

Thanks,

Anthony PERARD (2):
  include: update Xen public headers io/blkif.h
  hw/block/xen-block: Update sector-size handling

 hw/block/dataplane/xen-block.c  | 31 -
 hw/block/xen-block.c|  8 ++---
 include/hw/xen/interface/io/blkif.h | 52 +
 3 files changed, 65 insertions(+), 26 deletions(-)

-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[PATCH 2/2] hw/block/xen-block: Update sector-size handling

2024-09-26 Thread Anthony PERARD

The use of "feature-large-sector-size" have been removed from the
protocol, as it hasn't been evenly implemented across all backend and
frontend. Linux for example will happily expose "sector-size" != 512
even when "feature-large-sector-size" hasn't been set by the frontend.

The specification have been clarified regarding what "sector" is by
Xen commit 221f2748e8da ("blkif: reconcile protocol specification with
in-use implementations").

https://xenbits.xenproject.org/gitweb/?p=xen.git;a=commit;h=221f2748e8dabe8361b8cdfcffbeab9102c4c899

So change QEMU to follow the updated specification.

Frontends that exposes "feature-large-sector-size" will most certainly
misbehave if "sector-size" is different than 512, so don't even try.
(Windows driver is likely to be the only one having implemented it.)

Signed-off-by: Anthony PERARD 
---
 hw/block/dataplane/xen-block.c | 31 ++-
 hw/block/xen-block.c   |  8 
 2 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 98501e6885..43be97b4f3 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -176,7 +176,11 @@ static int xen_block_parse_request(XenBlockRequest 
*request)
 goto err;
 }
 
-request->start = request->req.sector_number * dataplane->sector_size;
+request->start = request->req.sector_number * XEN_BLKIF_SECTOR_SIZE;
+if (!QEMU_IS_ALIGNED(request->start, dataplane->sector_size)) {
+error_report("error: sector_number missaligned with sector-size");
+goto err;
+}
 for (i = 0; i < request->req.nr_segments; i++) {
 if (i == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
 error_report("error: nr_segments too big");
@@ -186,14 +190,23 @@ static int xen_block_parse_request(XenBlockRequest 
*request)
 error_report("error: first > last sector");
 goto err;
 }
-if (request->req.seg[i].last_sect * dataplane->sector_size >=
+if (request->req.seg[i].last_sect * XEN_BLKIF_SECTOR_SIZE >=
 XEN_PAGE_SIZE) {
 error_report("error: page crossing");
 goto err;
 }
+if (!QEMU_IS_ALIGNED(request->req.seg[i].first_sect,
+ dataplane->sector_size / XEN_BLKIF_SECTOR_SIZE)) {
+error_report("error: first_sect missaligned with sector-size");
+goto err;
+}
 
 len = (request->req.seg[i].last_sect -
-   request->req.seg[i].first_sect + 1) * dataplane->sector_size;
+   request->req.seg[i].first_sect + 1) * XEN_BLKIF_SECTOR_SIZE;
+if (!QEMU_IS_ALIGNED(len, dataplane->sector_size)) {
+error_report("error: segment size missaligned with sector-size");
+goto err;
+}
 request->size += len;
 }
 if (request->start + request->size > blk_getlength(dataplane->blk)) {
@@ -227,17 +240,17 @@ static int xen_block_copy_request(XenBlockRequest 
*request)
 if (to_domain) {
 segs[i].dest.foreign.ref = request->req.seg[i].gref;
 segs[i].dest.foreign.offset = request->req.seg[i].first_sect *
-dataplane->sector_size;
+XEN_BLKIF_SECTOR_SIZE;
 segs[i].source.virt = virt;
 } else {
 segs[i].source.foreign.ref = request->req.seg[i].gref;
 segs[i].source.foreign.offset = request->req.seg[i].first_sect *
-dataplane->sector_size;
+XEN_BLKIF_SECTOR_SIZE;
 segs[i].dest.virt = virt;
 }
 segs[i].len = (request->req.seg[i].last_sect -
request->req.seg[i].first_sect + 1) *
-  dataplane->sector_size;
+  XEN_BLKIF_SECTOR_SIZE;
 virt += segs[i].len;
 }
 
@@ -331,12 +344,12 @@ static bool xen_block_split_discard(XenBlockRequest 
*request,
 
 /* Wrap around, or overflowing byte limit? */
 if (sec_start + sec_count < sec_count ||
-sec_start + sec_count > INT64_MAX / dataplane->sector_size) {
+sec_start + sec_count > INT64_MAX / XEN_BLKIF_SECTOR_SIZE) {
 return false;
 }
 
-byte_offset = sec_start * dataplane->sector_size;
-byte_remaining = sec_count * dataplane->sector_size;
+byte_offset = sec_start * XEN_BLKIF_SECTOR_SIZE;
+byte_remaining = sec_count * XEN_BLKIF_SECTOR_SIZE;
 
 do {
 byte_chunk = byte_remaining > BDRV_REQUEST_MAX_BYTES ?
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index aed1d5c330..8c150c6c69 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -189,10 +189,10 @@ static

Re: [PATCH v3] blkif: reconcile protocol specification with in-use implementations

2024-09-26 Thread Anthony PERARD

On Thu, Sep 12, 2024 at 11:57:29AM +0200, Roger Pau Monne wrote:
>  /*
>   * Cast to this structure when blkif_request.operation == BLKIF_OP_DISCARD
>   * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request)
> + *
> + * The 'sector_number' field is in units of 512b, despite the value of the
> + * 'sector-size' xenstore node.  Note however that the offset in
> + * 'sector_number' must be aligned to 'sector-size'.

For discard request, there's "discard-granularity", and I think
`sector_number` should be aligned to it. See "discard-granularity" and
note 4.

>   */
>  struct blkif_request_discard {
>      uint8_toperation;/* BLKIF_OP_DISCARD */

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] hw/xen: Remove deadcode

2024-09-17 Thread Anthony PERARD

On Tue, Sep 17, 2024 at 01:22:12AM +0100, d...@treblig.org wrote:
> From: "Dr. David Alan Gilbert" 
> 
> xen_be_copy_grant_refs is unused since 2019's
>   19f87870ba ("xen: remove the legacy 'xen_disk' backend")
> 
> xen_config_dev_console is unused since 2018's
>   6d7c06c213 ("Remove broken Xen PV domain builder")
> 
> Remove them.
> 
> Signed-off-by: Dr. David Alan Gilbert 

Acked-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v2] blkif: reconcile protocol specification with in-use implementations

2024-09-12 Thread Anthony PERARD

On Tue, Sep 10, 2024 at 04:01:50PM +0200, Jürgen Groß wrote:
> On 10.09.24 15:59, Roger Pau Monné wrote:
> > On Tue, Sep 10, 2024 at 03:46:00PM +0200, Jürgen Groß wrote:
> > > On 10.09.24 13:46, Roger Pau Monne wrote:
> > > > diff --git a/xen/include/public/io/blkif.h 
> > > > b/xen/include/public/io/blkif.h
> > > > index 22f1eef0c0ca..da893eb379db 100644
> > > > --- a/xen/include/public/io/blkif.h
> > > > +++ b/xen/include/public/io/blkif.h
> > > > @@ -237,12 +237,16 @@
> > > > * sector-size
> > > > *  Values: 
> > > > *
> > > > - *  The logical block size, in bytes, of the underlying storage. 
> > > > This
> > > > - *  must be a power of two with a minimum value of 512.
> > > > + *  The logical block size, in bytes, of the underlying storage. 
> > > > This must
> > > > + *  be a power of two with a minimum value of 512.  The sector 
> > > > size should
> > > > + *  only be used for request segment length and alignment.
> > > > *
> > > > - *  NOTE: Because of implementation bugs in some frontends this 
> > > > must be
> > > > - *set to 512, unless the frontend advertizes a non-zero 
> > > > value
> > > > - *in its "feature-large-sector-size" xenbus node. (See 
> > > > below).
> > > > + *  When exposing a device that uses 4096 logical sector sizes, 
> > > > the only
> > >
> > > s/uses 4096 logical sector sizes/uses a logical sector size of 4096/
> >
> > Yes, that's better.
> >
> > > > + *  difference xenstore wise will be that 'sector-size' (and 
> > > > possibly
> > > > + *  'physical-sector-size' if supported by the backend) will be 
> > > > 4096, but
> > > > + *  the 'sectors' node will still be calculated using 512 byte 
> > > > units.  The
> > > > + *  sector base units in the ring requests fields will all be 512 
> > > > byte
> > > > + *  based despite the logical sector size exposed in 'sector-size'.
> > > > *
> > > > * physical-sector-size
> > > > *  Values: 
> > > > @@ -254,8 +258,8 @@
> > > > * sectors
> > > >     *      Values: 
> > > > *
> > > > - *  The size of the backend device, expressed in units of 
> > > > "sector-size".
> > > > - *  The product of "sector-size" and "sectors" must also be an 
> > > > integer
> > > > + *  The size of the backend device, expressed in units of 512b.
> > > > + *  The product of "sector-size" * 512 must also be an integer
> > >
> > > DYM: The product of "sectors" * 512 must also be an integer ... ?
> >
> > Indeed.
>
> With those 2 changes applied you can add my:
>
> Reviewed-by: Juergen Gross 

Reviewed-by: Anthony PERARD 

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [RFC XEN PATCH v15 4/4] tools: Add new function to do PIRQ (un)map on PVH dom0

2024-09-12 Thread Anthony PERARD

On Wed, Sep 11, 2024 at 02:58:32PM +0800, Jiqian Chen wrote:
> When dom0 is PVH, and passthrough a device to dumU, xl will
> use the gsi number of device to do a pirq mapping, see
> pci_add_dm_done->xc_physdev_map_pirq, but the gsi number is
> got from file /sys/bus/pci/devices//irq, that confuses
> irq and gsi, they are in different space and are not equal,
> so it will fail when mapping.
> To solve this issue, to get the real gsi and add a new function
> xc_physdev_map_pirq_gsi to get a free pirq for gsi.
> Note: why not use current function xc_physdev_map_pirq, because
> it doesn't support to allocate a free pirq, what's more, to
> prevent changing it and affecting its callers, so add
> xc_physdev_map_pirq_gsi.
> 
> Besides, PVH dom0 doesn't have PIRQs flag, it doesn't do
> PHYSDEVOP_map_pirq for each gsi. So grant function callstack
> pci_add_dm_done->XEN_DOMCTL_irq_permission will fail at function
> domain_pirq_to_irq. And old hypercall XEN_DOMCTL_irq_permission
> requires passing in pirq, it is not suitable for PVH dom0 that
> doesn't have PIRQs to grant irq permission.
> To solve this issue, use the another hypercall
> XEN_DOMCTL_gsi_permission to grant the permission of irq(
> translate from gsi) to dumU when dom0 has no PIRQs.
> 
> Signed-off-by: Jiqian Chen 
> Signed-off-by: Huang Rui 
> Signed-off-by: Chen Jiqian 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [xen-unstable test] 187507: regressions - FAIL

2024-09-06 Thread Anthony PERARD

On Fri, Sep 06, 2024 at 11:07:06AM +0100, Andrew Cooper wrote:
> Interestingly, Gitlab's x86_32 build test missed this.
> 
> https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/7762103169 passed.
> 
> I wonder if there's anything we should have done to get better coverage.

osstest does `make build` before `make` (equivalent to `make all` or
`make dist`). On gitlab, it's only `make dist`.

`make all` in root Makefile doesn't call `all` in sub-directories,
instead it calls `make install` recursively.

The "install: all" rules seems to only be written in leaf makefile
(probably for the best). Since there's nothing to install for
x86_emulate, it doesn't depends on the "all" rule. So only "osstest" is
going to build the tests.

Cheers,

-- 
Anthony PERARD

Re: [PATCH] blkif: reconcile protocol specification with in-use implementations

2024-09-04 Thread Anthony PERARD

**
> @@ -338,6 +334,7 @@
>   * feature-large-sector-size
>   *  Values: 0/1 (boolean)
>   *  Default Value:  0
> + *  Notes:  DEPRECATED, 12
>   *
>   *  A value of "1" indicates that the frontend will correctly supply and

Could you remove "correctly" from this sentence? It's misleading.

>   *  interpret all sector-based quantities in terms of the "sector-size"
> @@ -411,6 +408,11 @@
>   *(10) The discard-secure property may be present and will be set to 1 if the
>   * backing device supports secure discard.
>   *(11) Only used by Linux and NetBSD.
> + *(12) Possibly only ever implemented by the QEMU Qdisk backend and the 
> Windows
> + * PV block frontend.  Other backends and frontends supported 
> 'sector-size'
> + * values greater than 512 before such feature was added.  Frontends 
> should
> + * not expose this node, neither should backends make any decisions based
> + * on it being exposed by the frontend.
>   */
>
>  /*
> @@ -621,9 +623,12 @@
>  /*
>   * NB. 'first_sect' and 'last_sect' in blkif_request_segment, as well as
>   * 'sector_number' in blkif_request, blkif_request_discard and
> - * blkif_request_indirect are sector-based quantities. See the description
> - * of the "feature-large-sector-size" frontend xenbus node above for
> - * more information.
> + * blkif_request_indirect are all in units of 512 bytes, regardless of 
> whether the
> + * 'sector-size' xenstore node contains a value greater than 512.
> + *
> + * However the value in those fields must be properly aligned to the logical
> + * sector size reported by the 'sector-size' xenstore node, see 'Backend 
> Device
> + * Properties' section.
>   */
>  struct blkif_request_segment {

Textually (that is without reading it) this comment seems to only apply
to `struct blkif_request_segment`. There is an other comment that
separate it from `struct blkif_request` (and it is far away from
blkif_request_discard and blkif_request_indirect). For `struct
blkif_request.sector_number`, the only comment is "start sector idx on
disk" but it's really hard to tell how to interpret it, it could be
interpreted as a "sector-size" quantity because that the size of a
sector on the disk, the underlying storage.

So, I think we need to change the comment of
`blkif_request.sector_number`.


Another thing, there's a "type" `blkif_sector_t` defined at the beginning
of the file, would it be worth it to add a description to it?

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] blkif: reconcile protocol specification with in-use implementations

2024-09-04 Thread Anthony PERARD

On Wed, Sep 04, 2024 at 11:11:51AM +0200, Roger Pau Monné wrote:
> On Wed, Sep 04, 2024 at 09:39:17AM +0100, Paul Durrant wrote:
> > On 04/09/2024 09:21, Roger Pau Monné wrote:
> > > > In the absence of that I'm afraid it is a little harder to
> > > > judge whether the proposal here is the best we can do at this point.
> > >
> > > While I don't mind looking at what we can do to better handle 4K
> > > sector disks, we need IMO to revert to the specification before
> > > 67e1c050e36b, as that change switched the hardcoded sector based units
> > > from 512 to 'sector-size', thus breaking the existing ABI.
> > >
> >
> > But that's the crux of the problem. What *is* is the ABI? We apparently
> > don't have one that all OS subscribe to.
>
> At least prior to 67e1c050e36b the specification in blkif.h and (what
> I consider) the reference implementation in Linux blk{front,back}
> matched.  Previous to 67e1c050e36b blkif.h stated:
>
> /*
>  * NB. first_sect and last_sect in blkif_request_segment, as well as
>  * sector_number in blkif_request, are always expressed in 512-byte units.
>  * However they must be properly aligned to the real sector size of the
>  * physical disk, which is reported in the "physical-sector-size" node in
>  * the backend xenbus info. Also the xenbus "sectors" node is expressed in
>  * 512-byte units.
>  */
>
> I think it was quite clear, and does in fact match the implementation
> in Linux.

That's wrong, Linux doesn't match the specification before 67e1c050e36b,
in particular for "sectors":

sectors
 Values: 

 The size of the backend device, expressed in units of its logical
 sector size ("sector-size").

The only implementation that matches this specification is MiniOS (and
OMVF).

Oh, I didn't notice that that random comment you quoted that comes from
the middle of the header have a different definition for "sectors" ...

Well, the specification doesn't match with the specification ... and the
only possible way to implement the specification is to only ever set
"sector-size" to 512...

No wonder that they are so many different interpretation of the
protocol.

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [RFC XEN PATCH v14 5/5] tools: Add new function to do PIRQ (un)map on PVH dom0

2024-09-03 Thread Anthony PERARD

ror message as
> +* it is not easy to distinguished between valid and
> +* spurious error.
> +*/
> +LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
> +}
> +rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
> +if (rc < 0) {
> +LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
> +}
>  }
> -}
>  
> -fclose(f);
> +fclose(f);
> +}
>  
>  skip_legacy_irq:
>  
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index 60643d6f5376..20e3956f09b8 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> +int libxl__arch_hvm_map_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
> +{
> +int pirq = -1, gsi, r;
> +
> +gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
> +if (gsi < 0) {
> +return ERROR_FAIL;
> +}
> +
> +r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
> +if (r < 0) {
> +LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);

`r` should be -1, I don't think loggin it is useful..

> +return ERROR_FAIL;
> +}
> +
> +r = xc_domain_gsi_permission(CTX->xch, domid, gsi, XEN_DOMCTL_GSI_GRANT);
> +if (r < 0) {
> +LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, 
> r);

Same here. And in the next function.

> +return ERROR_FAIL;
> +}
> +
> +return 0;
> +}
> +
> +int libxl__arch_hvm_unmap_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
> +{
> +int pirq = -1, gsi, r;
> +
> +gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
> +if (gsi < 0) {
> +return ERROR_FAIL;
> +}
> +
> +/* Before unmapping, use mapping to get the already mapped pirq first */
> +r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
> +if (r < 0) {
> +LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);
> +return ERROR_FAIL;
> +}
> +
> +r = xc_physdev_unmap_pirq(CTX->xch, domid, pirq);
> +if (r < 0) {
> +LOGED(ERROR, domid, "xc_physdev_unmap_pirq gsi=%d ret=%d", gsi, r);
> +return ERROR_FAIL;
> +}
> +
> +r = xc_domain_gsi_permission(CTX->xch, domid, gsi, 
> XEN_DOMCTL_GSI_REVOKE);
> +if (r < 0) {
> +LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, 
> r);
> +return ERROR_FAIL;
> +}
> +
> +return 0;
> +}


Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [RFC XEN PATCH v14 4/5] tools: Add new function to get gsi from dev

2024-09-03 Thread Anthony PERARD

On Tue, Sep 03, 2024 at 03:04:23PM +0800, Jiqian Chen wrote:
> diff --git a/tools/include/xen-sys/Linux/privcmd.h 
> b/tools/include/xen-sys/Linux/privcmd.h
> index bc60e8fd55eb..607dfa2287bc 100644
> --- a/tools/include/xen-sys/Linux/privcmd.h
> +++ b/tools/include/xen-sys/Linux/privcmd.h
> @@ -95,6 +95,11 @@ typedef struct privcmd_mmap_resource {
>   __u64 addr;
>  } privcmd_mmap_resource_t;
>  
> +typedef struct privcmd_pcidev_get_gsi {
> + __u32 sbdf;
> + __u32 gsi;
> +} privcmd_pcidev_get_gsi_t;

> diff --git a/tools/libs/ctrl/xc_linux.c b/tools/libs/ctrl/xc_linux.c
> index c67c71c08be3..92591e49a1c8 100644
> --- a/tools/libs/ctrl/xc_linux.c
> +++ b/tools/libs/ctrl/xc_linux.c
> @@ -66,6 +66,26 @@ void *xc_memalign(xc_interface *xch, size_t alignment, 
> size_t size)
>  return ptr;
>  }
>  
> +int xc_pcidev_get_gsi(xc_interface *xch, uint32_t sbdf)
> +{
> +int ret;
> +privcmd_pcidev_get_gsi_t dev_gsi = {
> +.sbdf = sbdf,
> +.gsi = 0,
> +};
> +
> +ret = ioctl(xencall_fd(xch->xcall),
> +IOCTL_PRIVCMD_PCIDEV_GET_GSI, &dev_gsi);
> +
> +if (ret < 0) {
> +PERROR("Failed to get gsi from dev");
> +} else {
> +ret = dev_gsi.gsi;

I've just notice that this is mixing signed and unsigned int.
We are storing a "__u32" into an "int" here. This isn't great as we are
throwing way lots of potentially good value. (Even if I have no idea if
they are possible.)

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [RFC XEN PATCH v14 4/5] tools: Add new function to get gsi from dev

2024-09-03 Thread Anthony PERARD

On Tue, Sep 03, 2024 at 03:04:23PM +0800, Jiqian Chen wrote:
> When passthrough a device to domU, QEMU and xl tools use its gsi
> number to do pirq mapping, see QEMU code
> xen_pt_realize->xc_physdev_map_pirq, and xl code
> pci_add_dm_done->xc_physdev_map_pirq, but the gsi number is got
> from file /sys/bus/pci/devices//irq, that is wrong, because
> irq is not equal with gsi, they are in different spaces, so pirq
> mapping fails.
> 
> And in current codes, there is no method to get gsi for userspace.
> For above purpose, add new function to get gsi, and the
> corresponding ioctl is implemented on linux kernel side.
> 
> Signed-off-by: Jiqian Chen 
> Signed-off-by: Huang Rui 
> Signed-off-by: Chen Jiqian 
> ---
> RFC: it needs to wait for the corresponding third patch on linux kernel side 
> to be merged.
> https://lore.kernel.org/xen-devel/20240607075109.126277-4-jiqian.c...@amd.com/
> ---
> v13->v14 changes:
> No.
> 
> v12->v13 changes:
> Rename the function xc_physdev_gsi_from_pcidev to xc_pcidev_get_gsi to avoid 
> confusion with physdev namesapce.
> Move the implementation of xc_pcidev_get_gsi into xc_linux.c.
> Directly use xencall_fd(xch->xcall) in the function xc_pcidev_get_gsi instead 
> of opening "privcmd".

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] x86/boot: Reinstate -nostdinc for CFLAGS_x86_32

2024-09-03 Thread Anthony PERARD

On Tue, Sep 03, 2024 at 11:49:40AM +0100, Andrew Cooper wrote:
> Most of Xen is build using -nostdinc and a fully specified include path.
> However, the makefile line:
> 
>   $(head-bin-objs): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
> 
> discards XEN_CFLAGS and replaces them with CFLAGS_x86_32.
> 
> Reinstate -nostdinc, and add the arch include path to the command line.  This
> is a latent bug for now, but it breaks properly with subsequent include
> changes.
> 
> Fixes: d58a509e01c4 ("build,x86: remove the need for build32.mk")

I disagree with this. I'm pretty sure I've check that no command line
have changed.

Also, this commit shows that CFLAGS was only coming from root's
Config.mk:
> diff --git a/xen/arch/x86/boot/build32.mk b/xen/arch/x86/boot
> deleted file mode 100644
> index e90680cd9f..00
> --- a/xen/arch/x86/boot/build32.mk
> +++ /dev/null
> @@ -1,40 +0,0 @@
> -override XEN_TARGET_ARCH=x86_32
> -CFLAGS =
> -include $(XEN_ROOT)/Config.mk

So, I'm pretty sure, -nostdinc was never used before. But happy to be
told that I've come to the wrong conclusion. (We would need to check by
building with an old version without this commit to be sure.)

"xen/arch/x86/boot" as it's own sets of CFLAGS, which is annoying and I
haven't really tried to change that. This is also why XEN_CFLAGS is been
discarded.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH] CHANGELOG: Update after fixing dm_restrict

2024-09-03 Thread Anthony PERARD

On Tue, Sep 03, 2024 at 11:20:37AM +0200, Jan Beulich wrote:
> On 03.09.2024 10:54, Anthony PERARD wrote:
> > --- a/CHANGELOG.md
> > +++ b/CHANGELOG.md
> > @@ -16,6 +16,10 @@ The format is based on [Keep a 
> > Changelog](https://keepachangelog.com/en/1.0.0/)
> >   - On x86:
> > - Support for running on Xeon Phi processors.
> >  
> > +### Fixed
> > + - Fixed xl/libxl dm\_restrict feature with QEMU 9.0.
> > +
> > +
> >  ## 
> > [4.19.0](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.19.0)
> >  - 2024-07-29
> >  
> >  ### Changed
> 
> We had no "### Fixed" so far, and I think we also shouldn't gain any. Imo
> this wants expressing in term of "### Changed", also wording the text
> accordingly (needing to adapt to qemu changes isn't a bug fix, strictly
> speaking, but more an enhancement in my view).

>From our CHANGELOG.md:
> The format is based on [Keep a 
> Changelog](https://keepachangelog.com/en/1.0.0/)

following the link, there is:

> Types of changes
> 
> - `Added` for new features.
> - `Changed` for changes in existing functionality.
> - `Deprecated` for soon-to-be removed features.
> - `Removed` for now removed features.
> - `Fixed` for any bug fixes.
> - `Security` in case of vulnerabilities.

So, are we not following keepachangelog.com format? Is there another
document/email/discussion describing which category are allowed?

Been compatible with newer version of dependencies doesn't feels like an
existing functionality have changed. It feels more like a fix to the bug
"can't start guest with newer QEMU".

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH] CHANGELOG: Update after fixing dm_restrict

2024-09-03 Thread Anthony PERARD

Fixed by change 82335a8cc54c ("libxl: Probe QEMU for -run-with
chroot=dir and use it").

Signed-off-by: Anthony PERARD 
---
 CHANGELOG.md | 4 
 1 file changed, 4 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5521ae5bb3..fb9a66c727 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -16,6 +16,10 @@ The format is based on [Keep a 
Changelog](https://keepachangelog.com/en/1.0.0/)
  - On x86:
- Support for running on Xeon Phi processors.
 
+### Fixed
+ - Fixed xl/libxl dm\_restrict feature with QEMU 9.0.
+
+
 ## 
[4.19.0](https://xenbits.xenproject.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.19.0)
 - 2024-07-29
 
 ### Changed
-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v3 3/3] libxl: Update the documentation of libxl_xen_console_read_line()

2024-09-03 Thread Anthony PERARD

On Mon, Sep 02, 2024 at 05:38:39PM +0100, Javi Merino wrote:
> Despite its name, libxl_xen_console_read_line() does not read a line,
> it fills the buffer with as many characters as fit.  Update the
> documentation to reflect the real behaviour of the function.  Rename
> line_r to avoid confusion since it is a pointer to an array of
> characters.
> 
> Signed-off-by: Javi Merino 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v3 2/3] libxl: Remove unnecessary buffer zeroing and zalloc()

2024-09-03 Thread Anthony PERARD

On Mon, Sep 02, 2024 at 05:38:38PM +0100, Javi Merino wrote:
> When reading the console, xen overwrites the contents of the buffer,
> so there is no need to zero the buffer before passing it to xen.
> Instead, add a NULL at the end of the buffer.
> 
> While we are at it, change the zalloc() of the buffer back to
> malloc() as it was before bdf4131 (libxl: don't leak buf in
> libxl_xen_console_read_start error handling, 2013-12-03).  The comment
> in that commit message says that the intent of the commit was to
> change malloc+memset to zalloc(), but only for the
> libxl_xen_console_reader struct, not for the buffer.
> 
> Signed-off-by: Javi Merino 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: Block protocol incompatibilities with 4K logical sector size disks

2024-09-02 Thread Anthony PERARD

On Mon, Sep 02, 2024 at 05:23:37PM +0200, Roger Pau Monné wrote:
> On Mon, Sep 02, 2024 at 03:19:56PM +0100, Paul Durrant wrote:
> > On 02/09/2024 09:55, Roger Pau Monné wrote:
> > [snip]
> > >
> > > Thanks for your input.  I would also like to hear from the blktap and
> > > Windows PV drivers maintainers, as the change that I'm proposing here
> > > will require changes to their implementations.
> > >
> >
> > So IIUC you are proposing to refuse to connect to a frontend that sets
> > feature-large-sector-size if sector-size != 512? Is that right?
>
> Is is worth retrofitting this into existing backends?  My suggestion
> would be to make `feature-large-sector-size` not mandatory to expose a
> sector-size != 512, but I wouldn't go as far as refusing to connect to
> frontends that expose the feature.  I have no idea which frontends
> might expose `feature-large-sector-size` but still be compatible with
> Linux blkback regarding sector-size != 512 (I know the Windows one
> isn't).

The frontend exposing "feature-large-sector-size" are not going to work
with Linux's backend if it set "sector-size" to a value different from
512.

>From blkif.h:
feature-large-sector-size
 A value of "1" indicates that the frontend will correctly supply and
 interpret all sector-based quantities in terms of the "sector-size"
 value supplied in the backend info, whatever that may be set to.
 ...

But Linux interprets "sector-based quantities" as 512 bytes. This is
incompatible with "feature-large-sector-size".

This is why I proposed to mark "feature-large-sector-size" as broken or
incompatible with your proposal to use 512B for all sector-based
quantities.

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: Block protocol incompatibilities with 4K logical sector size disks

2024-08-30 Thread Anthony PERARD

On Thu, Aug 29, 2024 at 05:42:45PM +0200, Roger Pau Monné wrote:
> On Thu, Aug 29, 2024 at 01:15:42PM +0000, Anthony PERARD wrote:
> > On Thu, Aug 29, 2024 at 12:59:43PM +0200, Roger Pau Monné wrote:
> > > The following table attempts to summarize in which units the following 
> > > fields
> > > are defined for the analyzed implementations (please correct me if I got 
> > > some
> > > of this wrong):
> > >
> > > │ sectors xenbus node │ requests sector_number │ 
> > > requests {first,last}_sect
> > > ┼─┼┼───
> > > FreeBSD blk{front,back} │ sector-size │  sector-size   │  
> > >  512
> > > ┼─┼┼───
> > > Linux blk{front,back}   │ 512 │  512   │  
> > >  512
> > > ┼─┼┼───
> > > QEMU blkback│ sector-size │  sector-size   │  
> > >  sector-size
> > > ┼─┼┼───
> > > Windows blkfront│ sector-size │  sector-size   │  
> > >  sector-size
> > > ┼─┼┼───
> > > MiniOS  │ sector-size │  512   │  
> > >  512
> > > ┼─┼┼───
> > > tapdisk blkback │ 512 │  sector-size   │  
> > >  512

Tapdisk situation seems more like:

 tapdisk blkback │  ?? │  ???   │   
  ?

I've looks at the implementation at xapi-project/blktat[1] and the way
sector_number or {first,last}_sect seems to be used varied on which
backend is used (block-vhd, block-nbd, block-aio).

[1] https://github.com/xapi-project/blktap

block-vhd seems mostly sectors of 512 but recalculated with "s->spb"
(sector per block?) but still, sector seems to be only 512.

block-nbd seems to set "sector-size" to always 512, but uses
"sector-size" for sector_number and {first,last}_sect.

The weirdest one is block-aio, where on read it multiply sector_number
and {first,last}_sect by 512, but on write, those are multiplied by
"sector-size". With "sector-size" set by ioctl(BLKSSZGET)

At least, is seems "sectors" is a multiple of 512 on all those, like in
the table, but I've only look at those 3 "drivers".


> > There's OVMF as well, which copied MiniOS's implementation, and looks
> > like it's still the same as MiniOS for the table above:
> >
> > OVMF (base on MiniOS)   │ sector-size │  512   │
> >512
> >
> > >
> > > It's all a mess, I'm surprised we didn't get more reports about 
> > > brokenness when
> > > using disks with 4K logical sectors.
> > >
> > > Overall I think the in-kernel backends are more difficult to update (as it
> > > might require a kernel rebuild), compared to QEMU or blktap.  Hence my 
> > > slight
> > > preference would be to adjust the public interface to match the behavior 
> > > of
> > > Linux blkback, and then adjust the implementation in the rest of the 
> > > backends
> > > and frontends.
> >
> > I would add that making "sector-size" been different from 512 illegal
> > makes going forward easier, has every implementation will work with a
> > "sector-size" of 512, and it probably going to be the most common sector
> > size for a while longer.
>
> My main concern is the amount of backends out there that already
> expose a "sector-size" different than 512.  I fear any changes here
> will take time to propagate to in-kernel backends, and hence my
> approach was to avoid modifying Linux blkback, because (as seen in the
> FreeBSD bug report) there are already instances of 4K logical sector
> disks being used by users.  Modifying the frontends is likely easier,
> as that's under the owner of the VM control.
>
> > > There was an attempt in 2019 to introduce a new frontend feature flag to 
> > > signal
> > > whether the frontend supported `sector-size` xenstore nodes different

[XEN PATCH v2 2/2] libxl: Probe QEMU for -run-with user=user and use it

2024-08-30 Thread Anthony PERARD

"-runas" is deprecated since QEMU 9.1 and will be remove in a future
release.

Signed-off-by: Anthony PERARD 
Reviewed-by: Jason Andryuk 
---

Notes:
v2:
- rename "qemu_cmdline" to "qemu_opts" in struct field and func
  parameters.

 tools/libs/light/libxl_dm.c   | 12 ++--
 tools/libs/light/libxl_internal.h |  1 +
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 15b157060f..1f2f5bd97a 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2052,8 +2052,13 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 }
 
 if (state->dm_runas) {
-flexarray_append(dm_args, "-runas");
-flexarray_append(dm_args, state->dm_runas);
+if (qemu_opts->have_runwith_user) {
+flexarray_append_pair(dm_args, "-run-with",
+  GCSPRINTF("user=%s", state->dm_runas));
+} else {
+flexarray_append(dm_args, "-runas");
+flexarray_append(dm_args, state->dm_runas);
+}
 }
 }
 flexarray_append(dm_args, NULL);
@@ -3073,6 +3078,9 @@ static void device_model_probe_cmdline(libxl__egc *egc,
 if (!strcmp("chroot", libxl__json_object_get_string(o))) {
 dmss->qemu_opts.have_runwith_chroot = true;
 }
+else if (!strcmp("user", libxl__json_object_get_string(o))) {
+dmss->qemu_opts.have_runwith_user = true;
+}
 }
 
 /*
diff --git a/tools/libs/light/libxl_internal.h 
b/tools/libs/light/libxl_internal.h
index 0133c57e01..089a2f949c 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -4145,6 +4145,7 @@ typedef struct libxl__dm_spawn_state 
libxl__dm_spawn_state;
 typedef struct libxl__qemu_available_opts libxl__qemu_available_opts;
 struct libxl__qemu_available_opts {
 bool have_runwith_chroot;
+    bool have_runwith_user;
 };
 
 typedef void libxl__dm_spawn_cb(libxl__egc *egc, libxl__dm_spawn_state*,
-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH v2 0/2] libxl: Implement QEMU command line probe

2024-08-30 Thread Anthony PERARD

Patch series available in this git branch:
https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
br.libxl-qemu-cmdline-probe-v2

v2:
- removed commited patch
- rename "qemu_cmdline" to "qemu_opts" in struct field and func parameters.
- rename "struct libxl__qemu_available_cmd_line" to
  "struct libxl__qemu_available_opts".

Starting with QEMU 9.0, the option "-chroot", that we use for the
"dmrestrict" feature, is removed. We need to find out which to use
between "-chroot" and "-run-with chroot=dir".

Also, "-runas" is deprecated in QEMU 9.1 and will be remove in a future
release, it's replaced with "-run-with user=user".

To find out which command line option we can use, we'll spawn QEMU, and run the
QMP command "query-command-line-options".

Some example of running these patches:
with qemu-xen (8.0):
http://logs.test-lab.xenproject.org/osstest/logs/187352/
with QEMU (upstream, 9.1-rc3):
http://logs.test-lab.xenproject.org/osstest/logs/187353/

Anthony PERARD (2):
  libxl: Probe QEMU for -run-with chroot=dir and use it
  libxl: Probe QEMU for -run-with user=user and use it

 tools/libs/light/libxl_dm.c   | 90 +++++++++--
 tools/libs/light/libxl_internal.h |  6 +++
 2 files changed, 80 insertions(+), 16 deletions(-)

-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH v2 1/2] libxl: Probe QEMU for -run-with chroot=dir and use it

2024-08-30 Thread Anthony PERARD

QEMU 9.0 have removed "-chroot" command line option, which have been
deprecated since QEMU 8.1 in favor of "-run-with chroot=dir".

Look into the result of the QMP command "query-command-line-options"
to find out if "-run-with chroot=dir" is available. Then use it in
place of "-chroot".

Resolves: xen-project/xen#187
Signed-off-by: Anthony PERARD 
Reviewed-by: Jason Andryuk 
---

Notes:
v2:
- rename "qemu_cmdline" to "qemu_opts" in struct field and func
  parameters.
- rename "struct libxl__qemu_available_cmd_line" to
  "struct libxl__qemu_available_opts".

 tools/libs/light/libxl_dm.c   | 78 +--
 tools/libs/light/libxl_internal.h |  5 ++
 2 files changed, 69 insertions(+), 14 deletions(-)

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 46babfed0b..15b157060f 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -1183,11 +1183,12 @@ static int libxl__pre_open_qmp_socket(libxl__gc *gc, 
libxl_domid domid,
 }
 
 static int libxl__build_device_model_args_new(libxl__gc *gc,
-const char *dm, int guest_domid,
-const libxl_domain_config 
*guest_config,
-char ***args, char ***envs,
-const libxl__domain_build_state *state,
-int *dm_state_fd)
+const char *dm, int guest_domid,
+const libxl_domain_config *guest_config,
+char ***args, char ***envs,
+const libxl__domain_build_state *state,
+const libxl__qemu_available_opts *qemu_opts,
+int *dm_state_fd)
 {
 const libxl_domain_create_info *c_info = &guest_config->c_info;
 const libxl_domain_build_info *b_info = &guest_config->b_info;
@@ -1778,8 +1779,13 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 }
 
 /* Add "-chroot [dir]" to command-line */
-flexarray_append(dm_args, "-chroot");
-flexarray_append(dm_args, chroot_dir);
+if (qemu_opts->have_runwith_chroot) {
+flexarray_append_pair(dm_args, "-run-with",
+  GCSPRINTF("chroot=%s", chroot_dir));
+} else {
+flexarray_append(dm_args, "-chroot");
+flexarray_append(dm_args, chroot_dir);
+}
 }
 
 if (state->saved_state) {
@@ -2059,11 +2065,12 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 }
 
 static int libxl__build_device_model_args(libxl__gc *gc,
-const char *dm, int guest_domid,
-const libxl_domain_config 
*guest_config,
-char ***args, char ***envs,
-const libxl__domain_build_state *state,
-int *dm_state_fd)
+const char *dm, int guest_domid,
+const libxl_domain_config *guest_config,
+char ***args, char ***envs,
+const libxl__domain_build_state *state,
+const libxl__qemu_available_opts *qemu_opts,
+int *dm_state_fd)
 /* dm_state_fd may be NULL iff caller knows we are using stubdom
  * and therefore will be passing a filename rather than a fd. */
 {
@@ -2081,7 +2088,9 @@ static int libxl__build_device_model_args(libxl__gc *gc,
 return libxl__build_device_model_args_new(gc, dm,
   guest_domid, guest_config,
   args, envs,
-  state, dm_state_fd);
+  state,
+  qemu_opts,
+  dm_state_fd);
 default:
 LOGED(ERROR, guest_domid, "unknown device model version %d",
   guest_config->b_info.device_model_version);
@@ -2403,7 +2412,9 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
libxl__stub_dm_spawn_state *sdss)
 
 ret = libxl__build_device_model_args(gc, "stubdom-dm", guest_domid,
  guest_config, &args, NULL,
- d_state, NULL);
+ d_state,
+ &sdss->dm.qemu_opts,
+ NULL);
 if (ret) {
 ret = ERROR_FAIL;
 goto out;
@@ -3024,6 +3035,7 @@ static void device_model_probe_detached(libxl__egc *egc,
 static void device_model_probe_cmdline(libxl__egc *egc,
 libxl__ev_qmp *qmp, const libxl__json_object *response, int rc)
 {
+EGC_GC;
 libxl__dm_spawn_state *dmss = CONTAINER_OF(qmp, *dmss, qm

Re: Block protocol incompatibilities with 4K logical sector size disks

2024-08-29 Thread Anthony PERARD

 backends are more difficult to update (as it
> might require a kernel rebuild), compared to QEMU or blktap.  Hence my slight
> preference would be to adjust the public interface to match the behavior of
> Linux blkback, and then adjust the implementation in the rest of the backends
> and frontends.

I would add that making "sector-size" been different from 512 illegal
makes going forward easier, has every implementation will work with a
"sector-size" of 512, and it probably going to be the most common sector
size for a while longer.

> There was an attempt in 2019 to introduce a new frontend feature flag to 
> signal
> whether the frontend supported `sector-size` xenstore nodes different than 
> 512 [0].
> However that was only ever implemented for QEMU blkback and Windows blkfront,
> all the other backends will expose `sector-size` different than 512 without
> checking if `feature-large-sector-size` is exposed by the frontend.  I'm 
> afraid
> it's now too late to retrofit that feature into existing backends, seeing as
> they already expose `sector-size` nodes greater than 512 without checking if
> `feature-large-sector-size` is reported by the frontend.

Much before that, "physical-sector-size" was introduced (2013):

https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=a67e2dac9f8339681b30b0f89274a64e691ea139

Linux seems to implement it, but QEMU or OVMF don't have it.

> My proposal would be to adjust the public interface with:
>
>  * Disk size is calculated as: `sectors` * 512 (`sectors` being the contents 
> of
>such xenstore backend node).
>
>  * All the sector related fields in blkif ring requests use a 512b base sector
>size, regardless of the value in the `sector-size` xenstore node.
>
>  * The `sector-size` contains the disk logical sector size.  The frontend must
>ensure that all request segments addresses are aligned and it's length is
>a multiple of such size.  Otherwise the backend will refuse to process the
>request.

You still want to try to have a "sector-size" different from 512? To me
this just add confusion to the confusion. There would be no way fro
backend or frontend to know if setting something other than 512 is going
to work. Also, it is probably easier to update backend than frontend, so
it is just likely that something is going to lag behind and broke.

Why not make use of the node "physical-sector-size" that have existed
for 10 years, even if unused or unadvertised, and if an IO request isn't
aligned on it, it is just going to be slow (as backend would have to
read,update,write instead of just write sectors).

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v3 3/3] Do not access /dev/mem in MSI-X PCI passthrough on Xen

2024-08-28 Thread Anthony PERARD

On Mon, May 06, 2024 at 02:33:22AM +0200, Marek Marczykowski-Górecki wrote:
> diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
> index 09cca4e..836cc9c 100644
> --- a/hw/xen/xen_pt_msi.c
> +++ b/hw/xen/xen_pt_msi.c
> @@ -493,7 +501,12 @@ static uint64_t pci_msix_read(void *opaque, hwaddr addr,
>  return get_entry_value(&msix->msix_entry[entry_nr], offset);
>  } else {
>  /* Pending Bit Array (PBA) */
> -return *(uint32_t *)(msix->phys_iomem_base + addr);
> +if (s->msix->phys_iomem_base) {
> +return *(uint32_t *)(msix->phys_iomem_base + addr);
> +}
> +XEN_PT_LOG(&s->dev, "reading PBA, addr 0x%lx, offset 0x%lx\n",
> +   addr, addr - msix->total_entries * PCI_MSIX_ENTRY_SIZE);
> +return 0x;

If Xen advertise XENFEAT_dm_msix_all_writes, we are not expecting QEMU
to reach this code, right? A comment might be useful.

>  }
>  }
>
> @@ -576,33 +593,40 @@ int xen_pt_msix_init(XenPCIPassthroughState *s, 
> uint32_t base)
>  msix->table_base = s->real_device.io_regions[bar_index].base_addr;
>  XEN_PT_LOG(d, "get MSI-X table BAR base 0x%"PRIx64"\n", 
> msix->table_base);
>
> +/* Accessing /dev/mem is needed only on older Xen. */
> +if (!(xc_version_info.submap & (1U << XENFEAT_dm_msix_all_writes))) {

Would it be ok to use test_bit() instead?

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v3 2/3] Update Xen's features.h header

2024-08-28 Thread Anthony PERARD

On Mon, May 06, 2024 at 02:33:21AM +0200, Marek Marczykowski-Górecki wrote:
> Update it to get XENFEAT_dm_msix_all_writes for the next patch.
>
> Signed-off-by: Marek Marczykowski-Górecki 

Reviewed-by: Anthony PERARD 

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v3 1/3] hw/xen/xen_pt: Save back data only for declared registers

2024-08-28 Thread Anthony PERARD

On Mon, May 06, 2024 at 02:33:20AM +0200, Marek Marczykowski-Górecki wrote:
> Call pci_default_write_config() only after resolving any handlers from
> XenPTRegInfo structures, and only with a value updated with those
> handlers. This is important for two reasons:
> 1. XenPTRegInfo has ro_mask which needs to be enforced - Xen-specific
>hooks do that on their own (especially xen_pt_*_reg_write()).
> 2. Not setting value early allows hooks to see the old value too.
>
> If it would be only about the first point, setting PCIDevice.wmask would
> probably be sufficient, but given the second point, change those
> writes.
>
> Relevant handlers already save data back to the emulated registers
> space, call the pci_default_write_config() only for its side effects.
>
> Signed-off-by: Marek Marczykowski-Górecki 
> ---
> v3:
>  - use emulated register value for pci_default_write_config() call, not
>the one for writting back to the hardware
>  - greatly simplify the patch by calling pci_default_write_config() on
>the whole value
> v2:
>  - rewrite commit message, previous one was very misleading
>  - fix loop saving register values
>  - fix int overflow when calculating write mask

Reviewed-by: Anthony PERARD 

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH 2/3] libxl: Probe QEMU for -run-with chroot=dir and use it

2024-08-28 Thread Anthony PERARD

On Tue, Aug 27, 2024 at 06:20:42PM -0400, Jason Andryuk wrote:
> On 2024-08-27 06:03, Anthony PERARD wrote:
> > QEMU 9.0 have removed "-chroot" command line option, which have been
> > deprecated since QEMU 8.1 in favor of "-run-with chroot=dir".
> > 
> > Look into the result of the QMP command "query-command-line-options"
> > to find out if "-run-with chroot=dir" is available. Then use it in
> > place of "-chroot".
> > 
> > Resolves: xen-project/xen#187
> > Signed-off-by: Anthony PERARD 
> 
> Reviewed-by: Jason Andryuk 
> 
> though one suggestion below.
> 
> > ---
> >   tools/libs/light/libxl_dm.c   | 78 +--
> >   tools/libs/light/libxl_internal.h |  5 ++
> >   2 files changed, 69 insertions(+), 14 deletions(-)
> > 
> > diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
> > index 46babfed0b..298fbb84fe 100644
> > --- a/tools/libs/light/libxl_dm.c
> > +++ b/tools/libs/light/libxl_dm.c
> > @@ -1183,11 +1183,12 @@ static int libxl__pre_open_qmp_socket(libxl__gc 
> > *gc, libxl_domid domid,
> >   }
> >   static int libxl__build_device_model_args_new(libxl__gc *gc,
> > -const char *dm, int guest_domid,
> > -const libxl_domain_config 
> > *guest_config,
> > -char ***args, char ***envs,
> > -const libxl__domain_build_state 
> > *state,
> > -int *dm_state_fd)
> > +const char *dm, int guest_domid,
> > +const libxl_domain_config *guest_config,
> > +char ***args, char ***envs,
> > +const libxl__domain_build_state *state,
> > +const libxl__qemu_available_cmd_line *qemu_cmdline,
> 
> cmd_line/cmdline makes me think of command line strings.
> qemu_opts/qemu_cli_opts is a little more generic, to me at least.  But not a
> big deal if you want to keep it as is.

Yes, "opts" sounds better than "cmdline" in this context.

I'll rename "libxl__qemu_available_cmd_line" to
"libxl__qemu_available_opts".

And "qemu_cmdline" to "qemu_opts", both in the struct
libxl__dm_spawn_state and as argument of functions.

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH 1/3] libxl: Implement QEMU command line probe

2024-08-28 Thread Anthony PERARD

On Tue, Aug 27, 2024 at 06:17:05PM -0400, Jason Andryuk wrote:
> On 2024-08-27 06:03, Anthony PERARD wrote:
> > From: Anthony PERARD 
> > 
> > Starting with QEMU 9.0, the option "-chroot", that we use for the
> > "dmrestrict" feature, is removed. We need to find out which to use
> > between "-chroot" and "-run-with chroot=dir".
> > 
> > This patch implement the machinery to spawn QEMU, and to run the QMP
> > command "query-command-line-options" but doesn't yet look at the
> > actual result. Whether or not to use "-run-with chroot=dir" will be
> > implemented in a follow up patch.
> > 
> > The command line used to spawn the qemu we want to probe is mostly
> > similar to the one we already use for the device model, "-machine
> > none" comes from libvirt.
> > 
> > This patch implement the probing on qemu-xen, even if we probably not
> > going to use the result. We could check the feature wanted for the
> > domain been created, but this could get complicated fairly quickly.
> 
> "domain being created"?

Yes.

> > We already need to check the options "b_info->dm_restrict" for
> > "-chroot" and "state->dm_runas" for "-runas" (which is deprecated).
> > 
> > Signed-off-by: Anthony PERARD 
> 
> Reviewed-by: Jason Andryuk 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v2 3/3] libxl: Update the documentation of libxl_xen_console_read_line()

2024-08-27 Thread Anthony PERARD

On Fri, Aug 23, 2024 at 06:05:05PM +0100, Javi Merino wrote:
> Despite its name, libxl_xen_console_read_line() does not read a line,
> it fills the buffer with as many characters as fit.  Update the
> documentation to reflect the real behaviour of the function.  Rename
> line_r to avoid confusion since it is a pointer to an array of
> characters.
> 
> Signed-off-by: Javi Merino 
> ---
>  tools/libs/light/libxl_console.c | 22 --
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_console.c 
> b/tools/libs/light/libxl_console.c
> index f42f6a51ee6f..652897e4ef6d 100644
> --- a/tools/libs/light/libxl_console.c
> +++ b/tools/libs/light/libxl_console.c
> @@ -789,17 +789,19 @@ libxl_xen_console_reader *
>  return cr;
>  }
>  
> -/* return values:  *line_r
> - *   1  success, whole line obtained from buffernon-0
> - *   0  no more lines available right now   0
> - *   negative   error code ERROR_*  0
> - * On success *line_r is updated to point to a nul-terminated
> +/* Copy part of the console ring into a buffer
> + *
> + * Return values:
> + *   1: Success, the buffer obtained from the console ring an
> + *   0: No more lines available right now
> + *   -ERROR_* on error
> + *
> + * On success, *line_r is updated to point to a nul-terminated
>   * string which is valid until the next call on the same console
> - * reader.  The libxl caller may overwrite parts of the string
> - * if it wishes. */
> + * reader. */
>  int libxl_xen_console_read_line(libxl_ctx *ctx,
>  libxl_xen_console_reader *cr,
> -char **line_r)
> +char **buff)

You may want to update "tools/include/libxl.h" as well. I don't know why
this comments is here instead of in the public header. At least there's
some documentation I guess.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v2 2/3] libxl: Remove unnecessary buffer zeroing and zalloc()

2024-08-27 Thread Anthony PERARD

On Fri, Aug 23, 2024 at 06:05:04PM +0100, Javi Merino wrote:
> When reading the console, xen overwrites the contents of the buffer,
> so there is no need to zero the buffer before passing it to xen.
> Instead, add a NULL at the end of the buffer.
> 
> While we are at it, change the zalloc() of the buffer back to
> malloc() as it was before bdf4131 (libxl: don't leak buf in
> libxl_xen_console_read_start error handling, 2013-12-03).  The comment
> in that commit message says that the intent of the commit was to
> change malloc+memset to zalloc(), but only for the
> libxl_xen_console_reader struct, not for the buffer.
> 
> Signed-off-by: Javi Merino 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v2 1/3] libxl: Fix nul-termination of the return value of libxl_xen_console_read_line()

2024-08-27 Thread Anthony PERARD

On Fri, Aug 23, 2024 at 06:05:03PM +0100, Javi Merino wrote:
> diff --git a/tools/libs/light/libxl_console.c 
> b/tools/libs/light/libxl_console.c
> index a563c9d3c7f9..012fd996fba9 100644
> --- a/tools/libs/light/libxl_console.c
> +++ b/tools/libs/light/libxl_console.c
> @@ -774,12 +774,14 @@ libxl_xen_console_reader *
>  {
>  GC_INIT(ctx);
>  libxl_xen_console_reader *cr;
> -unsigned int size = 16384;
> +/* We want xen to fill the buffer in as few hypercalls as
> + * possible, but xen will not nul-terminate it.  Leave one byte at
> + * the end for the null */
> +unsigned int size = 16384 + 1;

This comment doesn't really explain why the size choosen is 16k+1, it
kind of explain the +1 but that's about it.

16k seems to be the initial size

https://elixir.bootlin.com/xen/v4.19.0/source/xen/drivers/char/console.c#L110
But then, it is changed to depend on $(nproc) and loglevel

https://elixir.bootlin.com/xen/v4.19.0/source/xen/drivers/char/console.c#L1095
with 16k been the minimum it seems:

https://elixir.bootlin.com/xen/v4.19.0/source/xen/drivers/char/console.c#L1061

So, I think a useful comment here would reflect that 16k is default size
of Xen's console buffer.

Also, multi-line comments are normally expected to be with begin and end
markers on separated lines:
/*
 * Comments.
 */


It be nice to fix both comments, but otherwise the patch looks good:
Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH 3/3] libxl: Probe QEMU for -run-with user=user and use it

2024-08-27 Thread Anthony PERARD

"-runas" is deprecated since QEMU 9.1 and will be remove in a future
release.

Signed-off-by: Anthony PERARD 
---
 tools/libs/light/libxl_dm.c   | 12 ++--
 tools/libs/light/libxl_internal.h |  1 +
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 298fbb84fe..49995b14b8 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2052,8 +2052,13 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 }
 
 if (state->dm_runas) {
-flexarray_append(dm_args, "-runas");
-flexarray_append(dm_args, state->dm_runas);
+if (qemu_cmdline->have_runwith_user) {
+flexarray_append_pair(dm_args, "-run-with",
+  GCSPRINTF("user=%s", state->dm_runas));
+} else {
+flexarray_append(dm_args, "-runas");
+flexarray_append(dm_args, state->dm_runas);
+}
 }
 }
 flexarray_append(dm_args, NULL);
@@ -3073,6 +3078,9 @@ static void device_model_probe_cmdline(libxl__egc *egc,
 if (!strcmp("chroot", libxl__json_object_get_string(o))) {
 dmss->qemu_cmdline.have_runwith_chroot = true;
 }
+else if (!strcmp("user", libxl__json_object_get_string(o))) {
+dmss->qemu_cmdline.have_runwith_user = true;
+}
 }
 
 /*
diff --git a/tools/libs/light/libxl_internal.h 
b/tools/libs/light/libxl_internal.h
index df93d904c2..5c61050f79 100644
--- a/tools/libs/light/libxl_internal.h
+++ b/tools/libs/light/libxl_internal.h
@@ -4145,6 +4145,7 @@ typedef struct libxl__dm_spawn_state 
libxl__dm_spawn_state;
 typedef struct libxl__qemu_available_cmd_line libxl__qemu_available_cmd_line;
 struct libxl__qemu_available_cmd_line {
 bool have_runwith_chroot;
+bool have_runwith_user;
 };
 
 typedef void libxl__dm_spawn_cb(libxl__egc *egc, libxl__dm_spawn_state*,
-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH 0/3] libxl: Implement QEMU command line probe

2024-08-27 Thread Anthony PERARD

Patch series available in this git branch:
https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
br.libxl-qemu-cmdline-probe-v1

Starting with QEMU 9.0, the option "-chroot", that we use for the
"dmrestrict" feature, is removed. We need to find out which to use
between "-chroot" and "-run-with chroot=dir".

Also, "-runas" is deprecated in QEMU 9.1 and will be remove in a future
release, it's replaced with "-run-with user=user".

To find out which command line option we can use, we'll spawn QEMU, and run the
QMP command "query-command-line-options".

Some example of running these patches:
with qemu-xen (8.0):
http://logs.test-lab.xenproject.org/osstest/logs/187352/
with QEMU (upstream, 9.1-rc3):
http://logs.test-lab.xenproject.org/osstest/logs/187353/

Anthony PERARD (3):
  libxl: Implement QEMU command line probe
  libxl: Probe QEMU for -run-with chroot=dir and use it
  libxl: Probe QEMU for -run-with user=user and use it

 tools/libs/light/libxl_dm.c   | 297 +++---
 tools/libs/light/libxl_internal.h |   7 +
 2 files changed, 278 insertions(+), 26 deletions(-)

-- 


Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

[XEN PATCH 1/3] libxl: Implement QEMU command line probe

2024-08-27 Thread Anthony PERARD

From: Anthony PERARD 

Starting with QEMU 9.0, the option "-chroot", that we use for the
"dmrestrict" feature, is removed. We need to find out which to use
between "-chroot" and "-run-with chroot=dir".

This patch implement the machinery to spawn QEMU, and to run the QMP
command "query-command-line-options" but doesn't yet look at the
actual result. Whether or not to use "-run-with chroot=dir" will be
implemented in a follow up patch.

The command line used to spawn the qemu we want to probe is mostly
similar to the one we already use for the device model, "-machine
none" comes from libvirt.

This patch implement the probing on qemu-xen, even if we probably not
going to use the result. We could check the feature wanted for the
domain been created, but this could get complicated fairly quickly.
We already need to check the options "b_info->dm_restrict" for
"-chroot" and "state->dm_runas" for "-runas" (which is deprecated).

Signed-off-by: Anthony PERARD 
---
 tools/libs/light/libxl_dm.c   | 207 --
 tools/libs/light/libxl_internal.h |   1 +
 2 files changed, 198 insertions(+), 10 deletions(-)

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index ff8ddeec9a..46babfed0b 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -2858,6 +2858,20 @@ static void device_model_qmp_cb(libxl__egc *egc, 
libxl__ev_qmp *ev,
 static void device_model_spawn_outcome(libxl__egc *egc,
libxl__dm_spawn_state *dmss,
int rc);
+static void device_model_probe_startup_failed(libxl__egc *egc,
+libxl__spawn_state *spawn, int rc);
+static void device_model_probe_confirm(libxl__egc *egc,
+libxl__spawn_state *spawn, const char *xsdata);
+static void device_model_probe_detached(libxl__egc *egc,
+libxl__spawn_state *spawn);
+static void device_model_probe_cmdline(libxl__egc *egc,
+libxl__ev_qmp *qmp, const libxl__json_object *response, int rc);
+static void device_model_probe_quit(libxl__egc *egc,
+libxl__ev_qmp *qmp, const libxl__json_object *response, int rc);
+static void device_model_probe_spawn_outcome(libxl__egc *egc,
+ libxl__dm_spawn_state *dmss, int rc);
+static void device_model_launch(libxl__egc *egc,
+libxl__dm_spawn_state *dmss, int rc);
 static void device_model_postconfig_chardev(libxl__egc *egc,
 libxl__ev_qmp *qmp, const libxl__json_object *response, int rc);
 static void device_model_postconfig_vnc(libxl__egc *egc,
@@ -2873,25 +2887,18 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 {
 /* convenience aliases */
 const int domid = dmss->guest_domid;
-libxl__domain_build_state *const state = dmss->build_state;
 libxl__spawn_state *const spawn = &dmss->spawn;
 
 STATE_AO_GC(dmss->spawn.ao);
 
-libxl_ctx *ctx = CTX;
 libxl_domain_config *guest_config = dmss->guest_config;
 const libxl_domain_create_info *c_info = &guest_config->c_info;
 const libxl_domain_build_info *b_info = &guest_config->b_info;
-const libxl_vnc_info *vnc = libxl__dm_vnc(guest_config);
-char *path;
-int logfile_w, null;
 int rc;
-char **args, **arg, **envs;
-xs_transaction_t t;
-char *vm_path;
-char **pass_stuff;
 const char *dm;
-int dm_state_fd = -1;
+int logfile_w = -1, null = -1;
+int qmp_probe_fd = -1;
+bool probe_spawned = false;
 
 dmss_init(dmss);
 
@@ -2904,6 +2911,7 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 rc = ERROR_FAIL;
 goto out;
 }
+dmss->dm = dm;
 if (access(dm, X_OK) < 0) {
 LOGED(ERROR, domid, "device model %s is not executable", dm);
 rc = ERROR_FAIL;
@@ -2911,6 +2919,185 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 }
 
 rc = libxl__domain_get_device_model_uid(gc, dmss);
+if (rc)
+goto out;
+
+/* probe QEMU's available command line options */
+if (b_info->device_model_version
+== LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN) {
+
+logfile_w = libxl__create_qemu_logfile(
+gc, GCSPRINTF("qemu-probe-%s", c_info->name));
+if (logfile_w < 0) {
+rc = logfile_w;
+goto out;
+}
+null = open("/dev/null", O_RDONLY);
+if (null < 0) {
+LOGED(ERROR, domid, "unable to open /dev/null");
+rc = ERROR_FAIL;
+goto out;
+}
+
+rc = libxl__pre_open_qmp_socket(gc, domid, &qmp_probe_fd);
+if (rc) goto out;
+
+flexarray_t *dm_args = flexarray_make(gc, 16, 1);
+flexarray_vappend(dm_args, dm,
+  "-S",
+

[XEN PATCH 2/3] libxl: Probe QEMU for -run-with chroot=dir and use it

2024-08-27 Thread Anthony PERARD

QEMU 9.0 have removed "-chroot" command line option, which have been
deprecated since QEMU 8.1 in favor of "-run-with chroot=dir".

Look into the result of the QMP command "query-command-line-options"
to find out if "-run-with chroot=dir" is available. Then use it in
place of "-chroot".

Resolves: xen-project/xen#187
Signed-off-by: Anthony PERARD 
---
 tools/libs/light/libxl_dm.c   | 78 +--
 tools/libs/light/libxl_internal.h |  5 ++
 2 files changed, 69 insertions(+), 14 deletions(-)

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 46babfed0b..298fbb84fe 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -1183,11 +1183,12 @@ static int libxl__pre_open_qmp_socket(libxl__gc *gc, 
libxl_domid domid,
 }
 
 static int libxl__build_device_model_args_new(libxl__gc *gc,
-const char *dm, int guest_domid,
-const libxl_domain_config 
*guest_config,
-char ***args, char ***envs,
-const libxl__domain_build_state *state,
-int *dm_state_fd)
+const char *dm, int guest_domid,
+const libxl_domain_config *guest_config,
+char ***args, char ***envs,
+const libxl__domain_build_state *state,
+const libxl__qemu_available_cmd_line *qemu_cmdline,
+int *dm_state_fd)
 {
 const libxl_domain_create_info *c_info = &guest_config->c_info;
 const libxl_domain_build_info *b_info = &guest_config->b_info;
@@ -1778,8 +1779,13 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 }
 
 /* Add "-chroot [dir]" to command-line */
-flexarray_append(dm_args, "-chroot");
-flexarray_append(dm_args, chroot_dir);
+if (qemu_cmdline->have_runwith_chroot) {
+flexarray_append_pair(dm_args, "-run-with",
+  GCSPRINTF("chroot=%s", chroot_dir));
+} else {
+flexarray_append(dm_args, "-chroot");
+flexarray_append(dm_args, chroot_dir);
+}
 }
 
 if (state->saved_state) {
@@ -2059,11 +2065,12 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 }
 
 static int libxl__build_device_model_args(libxl__gc *gc,
-const char *dm, int guest_domid,
-const libxl_domain_config 
*guest_config,
-char ***args, char ***envs,
-const libxl__domain_build_state *state,
-int *dm_state_fd)
+const char *dm, int guest_domid,
+const libxl_domain_config *guest_config,
+char ***args, char ***envs,
+const libxl__domain_build_state *state,
+const libxl__qemu_available_cmd_line *qemu_cmdline,
+int *dm_state_fd)
 /* dm_state_fd may be NULL iff caller knows we are using stubdom
  * and therefore will be passing a filename rather than a fd. */
 {
@@ -2081,7 +2088,9 @@ static int libxl__build_device_model_args(libxl__gc *gc,
 return libxl__build_device_model_args_new(gc, dm,
   guest_domid, guest_config,
   args, envs,
-  state, dm_state_fd);
+  state,
+  qemu_cmdline,
+  dm_state_fd);
 default:
 LOGED(ERROR, guest_domid, "unknown device model version %d",
   guest_config->b_info.device_model_version);
@@ -2403,7 +2412,9 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
libxl__stub_dm_spawn_state *sdss)
 
 ret = libxl__build_device_model_args(gc, "stubdom-dm", guest_domid,
  guest_config, &args, NULL,
- d_state, NULL);
+ d_state,
+ &sdss->dm.qemu_cmdline,
+ NULL);
 if (ret) {
 ret = ERROR_FAIL;
 goto out;
@@ -3024,6 +3035,7 @@ static void device_model_probe_detached(libxl__egc *egc,
 static void device_model_probe_cmdline(libxl__egc *egc,
 libxl__ev_qmp *qmp, const libxl__json_object *response, int rc)
 {
+EGC_GC;
 libxl__dm_spawn_state *dmss = CONTAINER_OF(qmp, *dmss, qmp);
 
 if (rc)
@@ -3033,6 +3045,43 @@ static void device_model_probe_cmdline(libxl__egc *egc,
  * query-command-line-options response:
  * [ { 'option': 'str', 'parameters': [{ 'name': &#

Re: [XEN PATCH v1] libxl: Fix nul-termination of the return value of libxl_xen_console_read_line()

2024-08-23 Thread Anthony PERARD

On Fri, Aug 23, 2024 at 10:34:05AM +0100, Andrew Cooper wrote:
> On 22/08/2024 2:13 pm, Javi Merino wrote:
> This looks like it will fix the ASAN issue, but I think a better fix
> would be:
>
> -    printf("%s", line);
> +   printf("%.*s", cr->count, line);

nul-terminated string are safer to manipulate, and cr->count isn't
available outside of libxl, so that's a no go.

> because otherwise there's a world of sharp corners once Xen has wrapped
> the buffer for the first time.

If wrapped buffer are visible to libxl or `xl dmesg`, that a bug in xen,
in XEN_SYSCTL_readconsole. It doesn't looks like it's possible to know
when the console ring buffer wrapped.

>
> Which brings us a lot of other WTFs in this code...
>
> First, libxl_console.c describes it's functionality in terms of lines,
> and line_reader() in the API.  Yet it's not lines, it's a 16k buffer
> with generally multi-line content.  It's too late to fix the naming, but
> we could at least rewrite the comments not to be blatant lies.
>
>
> Just out of context above the hunk is:
>
>     unsigned int size = 16384;
>
> which isn't accurate.  The size of Xen's console ring can be changed at
> compile time (like XenServer does), and/or by command line variable.

This buffer size isn't link to Xen's console ring size, both value don't
need to match.

> Because the dmesg ring is never full (it just keeps producing and
> overwriting tail data), it's impossible to get a clean copy except in a
> single hypercall; the incremental/offset parameters are an illusion, and
> do not function correctly if a new printk() has occurred between
> adjacent hypercalls.

Well, let's hope there's no more that "ring_size - size" is been written
to the ring in between two hypercall. There's doesn't seems to be
anything that can be done to work around this with this hypercall
(beside using a huge buffer size).

I think xenconsoled can read the console ring buffer, but does so
continuously, so if `xl dmesg` have some shortcoming, it might be fine.

> And that is why setting count to size - 1 probably isn't wise.  It means
> that, even in the ideal case where Xen's ringbuffer is 16k, you've still
> got to make 2 hypercalls to get the content.

Well, I've run `xl dmesg` on one machine, it seems to have taken 9
hypercall to get everything, and there's still the first few line of the
boot available. So if the ringbuffer was indeed 16k, it would not be
ideal at all... as lots of stuff would have been overwritten.

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v1] libxl: Fix nul-termination of the return value of libxl_xen_console_read_line()

2024-08-23 Thread Anthony PERARD

On Fri, Aug 23, 2024 at 08:31:50AM +0200, Jan Beulich wrote:
> On 22.08.2024 15:13, Javi Merino wrote:
> > When built with ASAN, "xl dmesg" crashes in the "printf("%s", line)"
> > call in main_dmesg().  ASAN reports a heap buffer overflow: an
> > off-by-one access to cr->buffer.
> >
> > The readconsole sysctl copies up to count characters into the buffer,
> > but it does not add a null character at the end.  Despite the
> > documentation of libxl_xen_console_read_line(), line_r is not
> > nul-terminated if 16384 characters were copied to the buffer.
> >
> > Fix this by making count one less that the size of the allocated
> > buffer so that the last byte is always null.
> >
> > Reported-by: Edwin Török 
> > Signed-off-by: Javi Merino 
>
> Perhaps wants a Fixes: tag as well?

Seems to be:
Fixes: 4024bae739cc ("xl: Add subcommand 'xl dmesg'")

> > --- a/tools/libs/light/libxl_console.c
> > +++ b/tools/libs/light/libxl_console.c
> > @@ -779,7 +779,7 @@ libxl_xen_console_reader *
> >  cr = libxl__zalloc(NOGC, sizeof(libxl_xen_console_reader));
> >  cr->buffer = libxl__zalloc(NOGC, size);
> >  cr->size = size;
> > -cr->count = size;
> > +cr->count = size - 1;
> >  cr->clear = clear;
> >  cr->incremental = 1;
>
> While this looks to be addressing the issue at hand, I still wonder: Why
> does the "count" field exist at all? It's certainly odd to be set right
> here (when the buffer actually is empty). It's used solely in
> libxl_xen_console_read_line(), so could be a local variable there.

It isn't only odd to have "count" field, it is also used the wrong way.
Once "cr->count" is set to 0, the next call to
libxl_xen_console_read_line() will simply return an empty NULL, even if
in the mean time more data is available to read from the string.
Also, if the last call to libxl_xen_console_read_line() was shorter than
the buffer size, none of the following call will use the full size of
the buffer. This isn't really a problem for `xl dmesg`, but could be if
we want to implement "--follow" option for example.

So Javi, could you remove that `cr->count` field and use a local
variable instead?

And a comment in the code might be useful about why there's a "-1".

But I think a better way to address the issue would be to actually set
'\0' at the end of the string after the call to xc_readconsolering(),
and remove the not very useful memset(0). That way, it will be more
explicite as to why we need "-1".

> Then, further, I wonder why struct libxl__xen_console_reader lives in
> libxl_internal.h - it's used solely in libxl_console.c.

History? It actually use to live in libxl.h, 14 yr ago.

> Finally - why libxl__zalloc() when libxl_xen_console_read_line() clears
> the buffer anyway?

Overuse of libxl__zalloc when it was use to replace the open coding that
was used to allocate "cr". While it doesn't seems necessary, I don't
think it hurt to keep it there.

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v1 2/4] ocaml/libs: Implement a dynamically-loaded plugin for Xenctrl.domain_getinfo

2024-08-22 Thread Anthony PERARD

On Thu, Aug 22, 2024 at 10:06:03AM +0100, Andrii Sultanov wrote:
> diff --git a/tools/ocaml/libs/xenstoredglue/domain_getinfo_plugin_v1/Makefile 
> b/tools/ocaml/libs/xenstoredglue/domain_getinfo_plugin_v1/Makefile
> new file mode 100644
> index 00..eae44f8326
> --- /dev/null
> +++ b/tools/ocaml/libs/xenstoredglue/domain_getinfo_plugin_v1/Makefile
> @@ -0,0 +1,38 @@
[...]
> +.PHONY: install
> +install: $(LIBS) META
> + $(INSTALL_DIR) $(DESTDIR)$(LIBEXEC_BIN)/xenctrl_plugin
> + $(INSTALL_PROG) domain_getinfo_v1.cmxs 
> $(DESTDIR)$(LIBEXEC_BIN)/xenctrl_plugin

Is there any reason to put that new library in "/usr/libexec"?
It doesn't seems like a good place for it, and using "/usr/lib" instead
seems better.
libexec is mostly for binary, according to
https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s07.html

It seems that location for ocaml libs is in $(OCAMLDESTDIR), any reason
to deviate from that?

Also, on the following patch, "XEN_CTRL_DOMAININFO_PLUGIN" is
introduced. If that value is still useful, it would be better to use it
at installation time as well.

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v2] tools/helpers/init-dom0less: fix vcpu availability

2024-08-21 Thread Anthony PERARD

On Wed, Aug 21, 2024 at 11:08:59AM +0530, Amneesh Singh wrote:
> On 16:24-20240820, Anthony PERARD wrote:
> > On Tue, Aug 20, 2024 at 01:34:17PM +0530, Amneesh Singh wrote:
> > > @@ -330,14 +336,24 @@ int main(int argc, char **argv)
> > >
> > >  for (i = 0; i < nb_vm; i++) {
> > >  domid_t domid = info[i].domid;
> > > +libxl_vcpuinfo *vcpuinfo;
> > > +int nb_vcpu = 0, nr_cpus = 0;
> > > +
> > >
> > >  /* Don't need to check for Dom0 */
> > >  if (!domid)
> > >  continue;
> > >
> > > +vcpuinfo = libxl_list_vcpu(ctx, domid, &nb_vcpu, &nr_cpus);
> > > +
> > > +if (!vcpuinfo) {
> > > +  fprintf(stderr, "libxl_list_vcpu failed.\n");
> > > +  nb_vcpu = 0;
> > 
> > Is there any value to keep going if libxl_list_vcpu() fails?
> > Or is the reasoning is that cpu/*/availability was broken before, so
> > it's not important enough to stop init-dom0less?
> Yes, so missing cpu/*/availability nodes would mean we cannot
> pin/remove/add vcpus using xenlight I believe. However, we can still
> hotplug other stuff like net or block devices. In fact, I was doing
> exactly this when cpu/*/availability was broken.

Without the "availability" nodes, it probably mean that guest (probably
PV ones) will just use all available CPUs, it seems that Linux is doing
that, and only disable CPUs that are explicitly marked as "offline" via
that node.

But I guess it's ok.

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] automation: restore CR filtering

2024-08-21 Thread Anthony PERARD

On Tue, Aug 20, 2024 at 08:48:55PM -0700, Stefano Stabellini wrote:
> On Tue, 20 Aug 2024, Anthony PERARD wrote:
> > So I guess `sed` the output of `expect` will do. Maybe put that in a
> > script that also call the expect script? (To avoid duplication, and help
> > with maintainability of the whole thing.)
> 
> I tried a couple more times with expect but couldn't get it to work.
> With sed it seems to behave correctly. Can you guys spot any issues with
> it? If not, I'll send the full patch.
> 
> https://gitlab.com/xen-project/people/sstabellini/xen/-/jobs/7627376496

That job looks fine to me.

> diff --git a/automation/scripts/qemu-smoke-dom0-arm64.sh 
> b/automation/scripts/qemu-smoke-dom0-arm64.sh
> index 0094bfc8e1..e0cea742b0 100755
> --- a/automation/scripts/qemu-smoke-dom0-arm64.sh
> +++ b/automation/scripts/qemu-smoke-dom0-arm64.sh
> @@ -109,4 +109,4 @@ export QEMU_LOG="smoke.serial"
>  export LOG_MSG="Domain-0"
>  export PASSED="BusyBox"
>  
> -./automation/scripts/qemu-key.exp
> +./automation/scripts/qemu-key.exp | sed 's/\r\+$//'

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v2] tools/helpers/init-dom0less: fix vcpu availability

2024-08-20 Thread Anthony PERARD

On Tue, Aug 20, 2024 at 01:34:17PM +0530, Amneesh Singh wrote:
> diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
> index fee9345..722a5af 100644
> --- a/tools/helpers/init-dom0less.c
> +++ b/tools/helpers/init-dom0less.c
> @@ -167,15 +167,20 @@ retry_transaction:
>  /* /domain */
>  if (!do_xs_write_dom(xsh, t, domid, "vm", vm_val_str)) goto err;
>  if (!do_xs_write_dom(xsh, t, domid, "name", dom_name_str)) goto err;
> -if (!do_xs_write_dom(xsh, t, domid, "cpu", "")) goto err;

You should probably keep this node even if "*/availability" isn't going
to be written. It might be useful for watching everything under the
"cpu" node. (libxl create this node independently from all the other
"availability" sub-nodes.)

> @@ -330,14 +336,24 @@ int main(int argc, char **argv)
>  
>  for (i = 0; i < nb_vm; i++) {
>  domid_t domid = info[i].domid;
> +libxl_vcpuinfo *vcpuinfo;
> +int nb_vcpu = 0, nr_cpus = 0;
> +
>  
>  /* Don't need to check for Dom0 */
>  if (!domid)
>  continue;
>  
> +vcpuinfo = libxl_list_vcpu(ctx, domid, &nb_vcpu, &nr_cpus);
> +
> +if (!vcpuinfo) {
> +  fprintf(stderr, "libxl_list_vcpu failed.\n");
> +  nb_vcpu = 0;

Is there any value to keep going if libxl_list_vcpu() fails?
Or is the reasoning is that cpu/*/availability was broken before, so
it's not important enough to stop init-dom0less?


Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] automation: restore CR filtering

2024-08-20 Thread Anthony PERARD

On Mon, Aug 19, 2024 at 06:56:47PM -0700, Stefano Stabellini wrote:
> On Mon, 19 Aug 2024, Anthony PERARD wrote:
> > On Mon, Aug 19, 2024 at 09:21:22AM +0200, Michal Orzel wrote:
> > > On 17/08/2024 01:46, Stefano Stabellini wrote:
> > > > diff --git a/automation/scripts/qemu-xtf-dom0less-arm64.sh 
> > > > b/automation/scripts/qemu-xtf-dom0less-arm64.sh
> > > > index 0666f6363e..ed44aab0f0 100755
> > > > --- a/automation/scripts/qemu-xtf-dom0less-arm64.sh
> > > > +++ b/automation/scripts/qemu-xtf-dom0less-arm64.sh
> > > > @@ -65,4 +65,4 @@ export UBOOT_CMD="virtio scan; dhcp; tftpb 0x4000 
> > > > boot.scr; source 0x400
> > > >  export QEMU_LOG="smoke.serial"
> > > >  export PASSED="${passed}"
> > > > 
> > > > -./automation/scripts/qemu-key.exp
> > > > +./automation/scripts/qemu-key.exp | sed 's/\r//'
> > > 
> > > I compared 3 pipelines:
> > > 1) one before c36efb7fcea6 
> > > (https://gitlab.com/xen-project/hardware/xen/-/jobs/7566986885)
> > > 2) one after c36efb7fcea6 
> > > (https://gitlab.com/xen-project/hardware/xen/-/jobs/7603830706)
> > > 3) one with this fix 
> > > (https://gitlab.com/xen-project/people/sstabellini/xen/-/jobs/7603783403)
> > > 
> > > In 1), there is Xen log + Linux log in Gitlab web page
> > > In 2), there is no log at all
> > > In 3), there is only Xen log visible
> > 
> > It's nice that you can select uboot/Xen logs or Linux logs based on the
> > number of '\r' at the end of a line (output cat -A):
> > U-Boot 2023.01+dfsg-2+deb12u1 (Apr 18 2024 - 22:00:21 +)^M^M$
> > (XEN) [0.013864] Xen version 4.20-unstable (root@) (gcc (Alpine 
> > 12.2.1_git20220924-r10) 12.2.1 20220924) debug=n Sat Aug 17 00:54:57 UTC 
> > 2024^M^M$
> > [0.00] Booting Linux on physical CPU 0x00 
> > [0x411fd070]^M^M^M$
> > 
> > But to display to GitLab's job logs, we want: sed 's/\r\+$//'
> > 
> > Also, do you have to edit every single script to overcome a shortcoming
> > from the "expect" script? Can't you write a bit of Tcl and edit the line
> > in the script instead?
> 
> The sed route is not perfect but it works :-)
> 
> I did try using expect but the logs were mangled. I think I missed that
> there can be multilple \r. I managed to get close to the wanted behavior
> with the below, but the Xen logs are still missing and I don't know why. 
> 
> https://gitlab.com/xen-project/people/sstabellini/xen/-/jobs/7617161552

Well, it just looks like the output is duplicated, if you look at the
raw output:
https://gitlab.com/xen-project/people/sstabellini/xen/-/jobs/7617161552/raw

So I don't know if it's possible to edit the output before `expect`
prints it. It probably is, but not easy to do.

I did try to edit the command line to change QEMU's output:
-eval spawn $env(QEMU_CMD)
+spawn sh -c "$env(QEMU_CMD) | sed s/r+//"

But then many failure, probably because expect can't interact with qemu
anymore.

So I guess `sed` the output of `expect` will do. Maybe put that in a
script that also call the expect script? (To avoid duplication, and help
with maintainability of the whole thing.)

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] automation: restore CR filtering

2024-08-19 Thread Anthony PERARD

On Mon, Aug 19, 2024 at 09:21:22AM +0200, Michal Orzel wrote:
> On 17/08/2024 01:46, Stefano Stabellini wrote:
> > diff --git a/automation/scripts/qemu-xtf-dom0less-arm64.sh 
> > b/automation/scripts/qemu-xtf-dom0less-arm64.sh
> > index 0666f6363e..ed44aab0f0 100755
> > --- a/automation/scripts/qemu-xtf-dom0less-arm64.sh
> > +++ b/automation/scripts/qemu-xtf-dom0less-arm64.sh
> > @@ -65,4 +65,4 @@ export UBOOT_CMD="virtio scan; dhcp; tftpb 0x4000 
> > boot.scr; source 0x400
> >  export QEMU_LOG="smoke.serial"
> >  export PASSED="${passed}"
> > 
> > -./automation/scripts/qemu-key.exp
> > +./automation/scripts/qemu-key.exp | sed 's/\r//'
> 
> I compared 3 pipelines:
> 1) one before c36efb7fcea6 
> (https://gitlab.com/xen-project/hardware/xen/-/jobs/7566986885)
> 2) one after c36efb7fcea6 
> (https://gitlab.com/xen-project/hardware/xen/-/jobs/7603830706)
> 3) one with this fix 
> (https://gitlab.com/xen-project/people/sstabellini/xen/-/jobs/7603783403)
> 
> In 1), there is Xen log + Linux log in Gitlab web page
> In 2), there is no log at all
> In 3), there is only Xen log visible

It's nice that you can select uboot/Xen logs or Linux logs based on the
number of '\r' at the end of a line (output cat -A):
U-Boot 2023.01+dfsg-2+deb12u1 (Apr 18 2024 - 22:00:21 +)^M^M$
(XEN) [0.013864] Xen version 4.20-unstable (root@) (gcc (Alpine 
12.2.1_git20220924-r10) 12.2.1 20220924) debug=n Sat Aug 17 00:54:57 UTC 
2024^M^M$
[0.00] Booting Linux on physical CPU 0x00 
[0x411fd070]^M^M^M$

But to display to GitLab's job logs, we want: sed 's/\r\+$//'

Also, do you have to edit every single script to overcome a shortcoming
from the "expect" script? Can't you write a bit of Tcl and edit the line
in the script instead?

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v6 3/4] x86/ucode: Introduce --force option to xen-ucode

2024-08-19 Thread Anthony PERARD

On Mon, Aug 19, 2024 at 09:56:57AM +0100, Fouad Hilly wrote:
> On Thu, Jul 25, 2024 at 9:44 AM Jan Beulich  wrote:
>
> > On 25.07.2024 10:27, Fouad Hilly wrote:
> > > @@ -79,7 +81,9 @@ static void usage(FILE *stream, const char *name)
> > >  "options:\n"
> > >  "  -h, --helpdisplay this help\n"
> > >  "  -s, --show-cpu-info   show CPU information\n"
> > > -"Usage: %s [microcode file | options]\n", name, name);
> > > +"  -f, --force   skip certain checks; do not use 
> > > unless\n"
> > > +"you know exactly what you are doing\n"
> >
> > Did you look at the produced output? Imo you want to have
> >
> > "  -f, --force   skip certain checks; do not use 
> > unless\n"
> > "you know exactly what you are doing\n"
> >
> > > +"Usage: %s [microcode file [-f,--force] | options]\n", name, 
> > > name);
> >
> > At least
> >
> > "Usage: %s [microcode file [-f|--force] | options]\n", name, 
> > name);
> >
> > But: "options" now includes -f / --force, yet that on its own makes no 
> > sense.
> > I think this needs further textual clarification to properly indicate what 
> > is
> > valid to use and what is not.
> >
>
> Will be fixed in v7:
> static void usage(FILE *stream, const char *name)
> {
> fprintf(stream,
> "%s: Xen microcode updating tool\n"
> "Usage: %s [options | microcode-file]\n"
> "options:\n"
> "  -h, --help   display this help\n"
> "  -s, --show-cpu-info  show CPU information\n"
> "  -f, --force  skip certain checks; do not
> \n"

If I recall correctly, "--force" doesn't take any argument, so this
usage is misleading. One could be tempted to execute `./xen-ucode
-fmicrocode` or event `./xen-ucode --force -microcode` and expect it to
work with files "microcode" or "-microcode" but instead I think getopt()
is just going to return an error.

Instead of writing "--force ", could you change the help
text, with something like "skip certain checks when applying microcode"?

> "   use unless you know exactly
> \n"
> "   what you are doing\n",
> name, name);
> show_curr_cpu(stream);

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v2] mktarball: only archive Xen

2024-08-16 Thread Anthony PERARD

On Wed, Aug 14, 2024 at 09:42:46AM +0200, Jan Beulich wrote:
> --- a/Makefile
> +++ b/Makefile
> @@ -200,10 +200,6 @@ rpmball: dist
>  subtree-force-update: mini-os-dir-force-update
>   $(MAKE) -C tools subtree-force-update
>  
> -.PHONY: subtree-force-update-all
> -subtree-force-update-all: mini-os-dir-force-update
> - $(MAKE) -C tools subtree-force-update-all

If this rule is removed, the one in tools/Makefile and the one in
tools/firmware/Makefile should be removed as well. (Also remove that
string from $(no-configure-targets) in tools/Rules.mk.)

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] tools/helpers/init-dom0less: fix vcpu availability

2024-08-14 Thread Anthony PERARD

Hi Amneesh,

On Fri, Aug 02, 2024 at 01:28:25PM +0530, Amneesh Singh wrote:
> diff --git a/tools/helpers/init-dom0less.c b/tools/helpers/init-dom0less.c
> index fee9345..a8cdc6d 100644
> --- a/tools/helpers/init-dom0less.c
> +++ b/tools/helpers/init-dom0less.c
> @@ -300,7 +301,7 @@ int main(int argc, char **argv)
>  {
>  libxl_dominfo *info = NULL;
>  libxl_ctx *ctx;
> -int nb_vm = 0, rc = 0, i;
> +int nb_vm = 0, nb_vcpu = 0, nr_cpus = 0, rc = 0, i;
>  struct xs_handle *xsh = NULL;
>  struct xc_interface_core *xch = NULL;
>  xenforeignmemory_handle *xfh = NULL;
> @@ -330,14 +331,17 @@ int main(int argc, char **argv)
>  
>  for (i = 0; i < nb_vm; i++) {
>  domid_t domid = info[i].domid;
> +libxl_vcpuinfo *vcpuinfo;
>  
>  /* Don't need to check for Dom0 */
>  if (!domid)
>  continue;
>  
> +vcpuinfo = libxl_list_vcpu(ctx, domid, &nb_vcpu, &nr_cpus);

You should check that libxl_list_vcpu() didn't return an error. If it
did, `vcpuinfo` would be NULL, and `nb_vcpu` would be an outdated value.


>  printf("Checking domid: %u\n", domid);
>  if (!domain_exists(xsh, domid)) {
> -rc = init_domain(xsh, xch, xfh, &info[i]);
> +rc = init_domain(xsh, xch, xfh, &info[i], vcpuinfo);
>  if (rc < 0) {
>  fprintf(stderr, "init_domain failed.\n");
>  goto out;
> @@ -345,6 +349,8 @@ int main(int argc, char **argv)
>  } else {
>  printf("Domain %u has already been initialized\n", domid);
>  }
> +
> +    libxl_vcpuinfo_list_free(vcpuinfo, nb_vcpu);
>  }
>  out:
>  libxl_dominfo_list_free(info, nb_vm);


The rest of the patch looks fine.

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] Fixed incorrect output in xl's "help" command.

2024-08-13 Thread Anthony PERARD

On Mon, Aug 05, 2024 at 04:14:34PM +0200, Juergen Gross wrote:
> On 05.08.24 16:11, John E. Krokes wrote:
> > In "xl help", the output includes this line:
> > 
> >   vsnd-list   List virtual display devices for a domain
> > 
> > This should obviously say "sound devices" instead of "display devices".
> > 
> > Signed-off-by: John E. Krokes 
> 
> Reviewed-by: Juergen Gross 

Acked-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] libxl/disk: avoid aliasing of abs()

2024-08-13 Thread Anthony PERARD

On Tue, Aug 06, 2024 at 10:40:14AM +0200, Jan Beulich wrote:
> Tool chains with -Wshadow enabled by default won't like the function
> parameter name "abs", for aliasing stdlib.h's abs(). Rename the
> parameter to what other similar functions use.
> 
> Fixes: a18b50614d97 ("libxl: Enable stubdom cdrom changing")
> Signed-off-by: Jan Beulich 

Acked-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] mktarball: only archive Xen

2024-08-02 Thread Anthony PERARD

On Fri, Aug 02, 2024 at 11:34:53AM +0200, Jan Beulich wrote:
> On 02.08.2024 11:29, Anthony PERARD wrote:
> > As for removing the rule "subtree-force-update-all", I don't think
> > that's a good idea at all. As long as Xen's build systems is cloning
> > subtrees we need to keep the rule. Those subtrees aren't going away
> > anytime soon, especially mini-os for stubdom.
> 
> So you expect people might be using that goal from the command line?
> Fair enough to keep it if so. I simply didn't think a goal with this
> long a name would be intended for manual use.

Well, there's "subtree-force-update" described in `make help`, intended
to be used by developers. The "-all" variate update every subtree
regardless whether they are enable with ./configure or not. I did confuse the
two when replying, so I don't know if "-all" variate is still useful or
not.

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH] mktarball: only archive Xen

2024-08-02 Thread Anthony PERARD

On Thu, Aug 01, 2024 at 05:46:37PM +0200, Jan Beulich wrote:
> This is the simplistic approach; I'm sure this could now be done quite a
> bit more efficiently. I also expect there's no longer a need to run
> ./configure ahead of the invocation of this script, but since I have no
> idea why it was needed earlier on, I'm not removing that here from the
> doc. I further expect that the subtree-force-update-all prereq could
> also be dropped from the two involved Makefile goals. As that
> intermediate goal isn't used for any other purpose, the question there
> would be whether there's any reason to retain it. IOW all cleanup that's
> probably better done separately, by someone actually using all of that
> machinery.

The ./configure step might not be needed anymore since
00691c6c90b2 ("tools: Allow to make *-dir-force-update without ./configure")
and `make src-tarball-release` works without issue, without running
./configure that is.

And yes, `subtree-force-update-all` prereq could be dropped from
"src-tarbal%" targets in this patch. I've just tried, and the produced
tarball is the same without or with the prereq.

As for removing the rule "subtree-force-update-all", I don't think
that's a good idea at all. As long as Xen's build systems is cloning
subtrees we need to keep the rule. Those subtrees aren't going away
anytime soon, especially mini-os for stubdom.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [xen-4.19-testing test] 187097: regressions - FAIL

2024-08-02 Thread Anthony PERARD

On Fri, Aug 02, 2024 at 08:33:09AM +0200, Jan Beulich wrote:
> On 01.08.2024 23:38, osstest service owner wrote:
> > flight 187097 xen-4.19-testing real [real]
> > flight 187112 xen-4.19-testing real-retest [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/187097/
> > http://logs.test-lab.xenproject.org/osstest/logs/187112/
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop   fail REGR. vs. 
> > 187044
> 
> Hmm, this is now pretty persistent in failing. Yet none of the changes
> under test should have affected behavior in any way. Ideas, anyone?

This test passed only a single time, across all branches in the last
year. And that happen to be on the new stable branch.

http://logs.test-lab.xenproject.org/osstest/results/history/test-amd64-amd64-xl-qemut-ws16-amd64/ALL.html

The success rate is incredebly low.

force-push ?

Cheers,

-- 
Anthony PERARD

Re: [XEN PATCH v7] tools/lsevtchn: Use errno macro to handle hypercall error cases

2024-08-01 Thread Anthony PERARD

On Wed, Jul 31, 2024 at 05:18:44PM +0100, Matthew Barnes wrote:
> Currently, lsevtchn aborts its event channel enumeration when it hits
> an event channel that is owned by Xen.
> 
> lsevtchn does not distinguish between different hypercall errors, which
> results in lsevtchn missing potential relevant event channels with
> higher port numbers.
> 
> Use the errno macro to distinguish between hypercall errors, and
> continue event channel enumeration if the hypercall error is not
> critical to enumeration.
> 
> Signed-off-by: Matthew Barnes 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [OSSTEST PATCH] preseed_base: Use "keep" NIC NamePolicy when "force-mac-address"

2024-07-31 Thread Anthony PERARD

On Tue, Jun 18, 2024 at 12:04:05PM +0100, Andrew Cooper wrote:
> On 18/06/2024 10:44 am, Anthony PERARD wrote:
> > On Mon, Jun 17, 2024 at 04:34:09PM +0100, Andrew Cooper wrote:
> >> On 17/06/2024 3:40 pm, Anthony PERARD wrote:
> >>> diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm
> >>> index 3545f3fd..d974fea5 100644
> >>> --- a/Osstest/Debian.pm
> >>> +++ b/Osstest/Debian.pm
> >>> @@ -972,7 +972,19 @@ END
> >>>  # is going to be added to dom0's initrd, which is used by some 
> >>> guests
> >>>  # (created with ts-debian-install).
> >>>  preseed_hook_installscript($ho, $sfx,
> >>> -'/usr/lib/base-installer.d/', '05ifnamepolicy', <<'END');
> >>> +'/usr/lib/base-installer.d/', '05ifnamepolicy',
> >>> +$ho->{Flags}{'force-mac-address'} ? <<'END' : <<'END');
> >> The conditional looks suspicious if both options are <<'END'.
> > That works fine, this pattern is already used in few places in osstest,
> > like here:
> > https://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=ts-host-install;h=0b6aaeeae228551064618abfa624321992a2eb2d;hb=HEAD#l240
> > >  $ho->{Flags}{'force-mac-address'} ? < >
> > Or even here:
> > https://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=ts-xen-build;h=c294a51eafc26e53b5417529b943224902870acf;hb=HEAD#l173
> > > buildcmd_stamped_logged(600, 'xen', 'configure', < >
> >> Doesn't this just write 70-eth-keep-policy.link unconditionally?
> > I've check that, on a different host, and the "mac" name policy is used
> > as expected, so the file "70-eth-keep-policy.link" isn't created on that
> > host.
>
> This is horrifying.  Given a construct which specifically lets you
> choose a semantically meaningful name, using END for all options is rude.
>
> Despite the pre-existing antipatterns, it would be better to turn this
> one into:
>
> $ho->{Flags}{'force-mac-address'} ? <<'END_KEEP' : <<'END_MAC');

I've push the patch with this change.

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [RFC XEN PATCH v12 6/7] tools: Add new function to get gsi from dev

2024-07-29 Thread Anthony PERARD

On Tue, Jul 09, 2024 at 03:35:57AM +, Chen, Jiqian wrote:
> On 2024/7/8 21:27, Anthony PERARD wrote:
> > You could reuse the already opened fd from libxencall:
> > xencall_fd(xch->xcall)
> Do I need to check it this fd<0?

No, it should be good to use.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v6] tools/lsevtchn: Use errno macro to handle hypercall error cases

2024-07-29 Thread Anthony PERARD

On Mon, Jul 29, 2024 at 04:02:41PM +0100, Matthew Barnes wrote:
> diff --git a/tools/xcutils/lsevtchn.c b/tools/xcutils/lsevtchn.c
> index d1710613ddc5..29504c9d2077 100644
> --- a/tools/xcutils/lsevtchn.c
> +++ b/tools/xcutils/lsevtchn.c
> @@ -24,7 +25,28 @@ int main(int argc, char **argv)
>  status.port = port;
>  rc = xc_evtchn_status(xch, &status);
>  if ( rc < 0 )
> -break;
> +{
> +switch ( errno )
> +{
> +case EACCES: /* Xen-owned evtchn */
> +continue;
> +
> +case EINVAL: /* Port enumeration has ended */
> +rc = 0;
> +break;
> +
> +case ESRCH:
> +perror("Domain ID does not correspond to valid domain");

I think this is going to print:
"Domain ID does not correspond to valid domain: No such process"

> +rc = ESRCH;

So, `rc` is going to be the value returned by main(), that is the exit
value of the program `lsevtchn`. Having different exit value might
be useful sometime to distinguish between different errors but that
would at least need to be written in some document. Can we just return
"1" here on error? (I mean "rc = 1")

> +break;
> +
> +default:
> +perror(NULL);

It would be slightly more useful to print which function fails, simply
with:
perror("xc_evtchn_status");

> +rc = errno;

Same here, just "rc=1" should be good enough. Then if someone want to
know why `lsevtchn` failed, they can read the error messages.

At least, errno is likely to be between 0 and 255, but still, usually
not very useful as an exit value for a program.

> +break;
> +}
> +goto out;
> +}
>  
>  if ( status.status == EVTCHNSTAT_closed )
>  continue;
> @@ -58,7 +80,8 @@ int main(int argc, char **argv)
>  printf("\n");
>  }
>  
> + out:
>  xc_interface_close(xch);
>  
> -return 0;
> +return rc;
>  }

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v4] libxl: Enable stubdom cdrom changing

2024-07-29 Thread Anthony PERARD

On Sun, Jul 28, 2024 at 05:08:56PM -0400, Jason Andryuk wrote:
> +static void cdrom_insert_stubdom_parse_fdset(libxl__egc *egc,
> + libxl__ev_qmp *qmp,
> + const libxl__json_object 
> *response,
> + int rc)
> +{
> +EGC_GC;
> +libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp);
> +int devid;
> +int fdset;
> +
> +if (rc) goto out;
> +
> +/* Only called for qemu-xen/linux stubdom. */
> +assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN);
> +
> +devid = libxl__device_disk_dev_number(cis->disk->vdev, NULL, NULL);
> +fdset = query_fdsets_find_fdset(gc, response, devid);
> +if (fdset == ERROR_NOTFOUND) {
> +/* Give the stubdom a little time before trying again. */
> +rc = libxl__ev_time_register_rel(cis->ao, &cis->retry_timer,
> + cdrom_insert_stubdom_query_fdset,
> + 200);
> +if (rc) goto out;

There's a missing "return;" here.

And looks like it's the only issue,
with that fixed: Reviewed-by: Anthony PERARD 

> +} else if (fdset < 0) {
> +rc = fdset;
> +goto out;
> +}
> +
> +cis->stubdom_fdset = fdset;
> +
> +LOGD(DEBUG, cis->domid, "Found fdset %d", cis->stubdom_fdset);
> +    cdrom_insert_ejected(egc, &cis->qmp, NULL, rc);
> +return;
> +
> + out:
> +cdrom_insert_done(egc, cis, rc); /* must be last */
> +}
> +

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 07/12] libxl: Allow stubdomain to control interupts of PCI device

2024-07-26 Thread Anthony PERARD

On Thu, Jul 25, 2024 at 04:16:37PM +0200, Marek Marczykowski-Górecki wrote:
> On Thu, Jul 25, 2024 at 02:06:04PM +0000, Anthony PERARD wrote:
> > On Thu, May 16, 2024 at 03:58:28PM +0200, Marek Marczykowski-Górecki wrote:
> > > Especially allow it to control MSI/MSI-X enabling bits. This part only
> > > writes a flag to a sysfs, the actual implementation is on the kernel
> > > side.
> > >
> > > This requires Linux >= 5.10 in dom0 (or relevant patch backported).
> >
> > Does it not work before 5.10? Because the
> > Documentation/ABI/testing/sysfs-driver-pciback in linux tree say that
> > allow_interrupt_control is in 5.6.
>
> For MSI-X to work at least with Linux it needs a fixup
> 2c269f42d0f382743ab230308b836ffe5ae9b2ae, which was backported to
> 5.10.201, but not further.
>
> > > diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> > > index 96cb4da0794e..6f357b70b815 100644
> > > --- a/tools/libs/light/libxl_pci.c
> > > +++ b/tools/libs/light/libxl_pci.c
> > > @@ -1513,6 +1513,14 @@ static void pci_add_dm_done(libxl__egc *egc,
> > >  rc = ERROR_FAIL;
> > >  goto out;
> > >  }
> > > +} else if (libxl_is_stubdom(ctx, domid, NULL)) {
> > > +/* Allow acces to MSI enable flag in PCI config space for the 
> > > stubdom */
> >
> > s/acces/access/
> >
> > > +if ( sysfs_write_bdf(gc, 
> > > SYSFS_PCIBACK_DRIVER"/allow_interrupt_control",
> > > + pci) < 0 ) {
> > > +LOGD(ERROR, domainid, "Setting allow_interrupt_control for 
> > > device");
> > > +rc = ERROR_FAIL;
> > > +goto out;
> >
> > Is it possible to make this non-fatal for cases where the kernel is
> > older than the introduction of the new setting? Or does pci passthrough
> > doesn't work at all with a stubdom before the change in the kernel?
>
> MSI/MSI-X will not work. And if QEMU wouldn't hide MSI/MSI-X (upstream
> one doesn't), Linux won't fallback to INTx, so the device won't work at
> all.

Ok

I guess this patch is fine then:
Reviewed-by: Anthony PERARD 

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v3] libxl: Enable stubdom cdrom changing

2024-07-25 Thread Anthony PERARD

On Wed, May 15, 2024 at 10:10:10PM -0400, Jason Andryuk wrote:
> +static void cdrom_insert_stubdom_parse_fdset_rm(libxl__egc *egc,
> +libxl__ev_qmp *qmp,
> +const libxl__json_object 
> *resp,
> +int rc)
> +{
> +EGC_GC;
> +libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp);
> +int devid;
> +int fdset;
> +
> +if (rc) goto out;
> +
> +/* Only called for qemu-xen/linux stubdom. */
> +assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN);
> +
> +devid = libxl__device_disk_dev_number(cis->disk->vdev, NULL, NULL);
> +fdset = query_fdsets_find_fdset(gc, resp, devid);
> +if (fdset < 0) {
> +rc = fdset;
> +goto out;
> +}
> +
> +LOGD(DEBUG, cis->domid, "Found fdset %d", fdset);
> +
> +libxl__json_object *args = NULL;
> +
> +libxl__qmp_param_add_integer(gc, &args, "fdset-id", fdset);
> +
> +cis->qmp.callback = cdrom_insert_stubdom_ejected;
> +
> +rc = libxl__ev_qmp_send(egc, &cis->qmp, "remove-fd", args);
> +if (rc) goto out;
> +
> +return;
> +
> + out:
> +if (rc == ERROR_NOTFOUND) {
> +LOGD(DEBUG, cis->domid, "No fdset found - skipping remove-fd");
> +cdrom_insert_stubdom_ejected(egc, qmp, resp, 0);

I think technically, cdrom_insert_stubdom_ejected also "must be last",
like cdrom_insert_done, for one thing it actually call cdrom_insert_done
in some cases. I think we used "/* must be last */" to indicate that
resources used by the current chain of callback could be freed,
including `egc`, `gc`, `ao`. There's quite a few more calls in this
patch that would benefit from the annotation. But we can live without
those.

> +} else {
> +cdrom_insert_done(egc, cis, rc); /* must be last */
> +}
> +}
[...]
> +static void cdrom_insert_stubdom_parse_fdset(libxl__egc *egc,
> + libxl__ev_qmp *qmp,
> + const libxl__json_object 
> *response,
> + int rc)
> +{
> +EGC_GC;
> +libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp);
> +int devid;
> +int fdset;
> +
> +if (rc) goto out;
> +
> +/* Only called for qemu-xen/linux stubdom. */
> +assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN);
> +
> +devid = libxl__device_disk_dev_number(cis->disk->vdev, NULL, NULL);
> +fdset = query_fdsets_find_fdset(gc, response, devid);
> +if (fdset < 0) {
> +rc = fdset;
> +goto out;
> +}
> +
> +cis->stubdom_fdset = fdset;
> +
> +LOGD(DEBUG, cis->domid, "Found fdset %d", cis->stubdom_fdset);
> +cdrom_insert_ejected(egc, &cis->qmp, NULL, rc);
> +return;
> +
> + out:
> +if (rc == ERROR_NOTFOUND) {

While in the previous function it seems ok to deal with the NOTFOUND
error in the "out:" path, I don't think it's a good idea here as it is
an expected part of the workflow of this callback, and not an error.

Could you move setting this timer above again? I guess something like
that would work fine:

  fdset = query_fdsets_find_fdset()
  if (fdset == ERROR_NOTFOUND) {
// doesn't exist yet, wait a bit
rc = libxl__ev_time_register_rel()
if (rc) goto out;
return
  }
  if (fdset < 0) {
  ...

Or also possible to deal with all error from query_fdsets..() first with
something like:
  if (fdset < 0 && fdset != ERROR_NOTFOUND) {
rc = fdset;
goto out;
   ...


> +rc = libxl__ev_time_register_rel(cis->ao, &cis->retry_timer,
> + cdrom_insert_stubdom_query_fdset,
> + 200);
> +if (rc) goto out;

That "goto out" after the "out" label makes this even stranger and even
a potential infinite loop if `rc` would happen to be set to
ERROR_NOTFOUND again, which I don't think can happen right now.

> +return;
> +}
> +
> +cdrom_insert_done(egc, cis, rc); /* must be last */
> +}


Otherwise patch looks good, with just the comment in
cdrom_insert_stubdom_parse_fdset
act on: Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 07/12] libxl: Allow stubdomain to control interupts of PCI device

2024-07-25 Thread Anthony PERARD

On Thu, May 16, 2024 at 03:58:28PM +0200, Marek Marczykowski-Górecki wrote:
> Especially allow it to control MSI/MSI-X enabling bits. This part only
> writes a flag to a sysfs, the actual implementation is on the kernel
> side.
>
> This requires Linux >= 5.10 in dom0 (or relevant patch backported).

Does it not work before 5.10? Because the
Documentation/ABI/testing/sysfs-driver-pciback in linux tree say that
allow_interrupt_control is in 5.6.

> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 96cb4da0794e..6f357b70b815 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -1513,6 +1513,14 @@ static void pci_add_dm_done(libxl__egc *egc,
>  rc = ERROR_FAIL;
>  goto out;
>  }
> +} else if (libxl_is_stubdom(ctx, domid, NULL)) {
> +/* Allow acces to MSI enable flag in PCI config space for the 
> stubdom */

s/acces/access/

> +if ( sysfs_write_bdf(gc, 
> SYSFS_PCIBACK_DRIVER"/allow_interrupt_control",
> + pci) < 0 ) {
> +LOGD(ERROR, domainid, "Setting allow_interrupt_control for 
> device");
> +rc = ERROR_FAIL;
> +goto out;

Is it possible to make this non-fatal for cases where the kernel is
older than the introduction of the new setting? Or does pci passthrough
doesn't work at all with a stubdom before the change in the kernel?

If making this new setting conditional is an option, you could
potentially improve the error code returned by sysfs_write_bdf() to
distinguish between an open() failure and write() failure, to avoid
checking the existance of the path ahead of the call. But maybe that
pointless because it doesn't appear possible to distinguish between
permission denied and not found.

Thanks,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v5 2/4] x86/ucode: refactor xen-ucode to utilize getopt

2024-07-24 Thread Anthony PERARD

On Fri, Jul 12, 2024 at 02:07:47PM +0100, Fouad Hilly wrote:
> diff --git a/tools/misc/xen-ucode.c b/tools/misc/xen-ucode.c
> index 390969db3d1c..8de82e5b8a10 100644
> --- a/tools/misc/xen-ucode.c
> +++ b/tools/misc/xen-ucode.c
> @@ -71,12 +72,29 @@ static void show_curr_cpu(FILE *f)
>  }
>  }
>  
> +static void usage(FILE *stream, const char *name)
> +{
> +fprintf(stream,
> +"%s: Xen microcode updating tool\n"
> +"options:\n"
> +"  -h, --helpdisplay this help\n"
> +"  -s, --show-cpu-info   show CPU information\n"
> +"Usage: %s [microcode file] [options]\n", name, name);

FYI, I disagree with Andy about the order of this message. First is
"Usage:" which explain where the option (dash-prefixed) can go, and
which are the mandatory arguments, sometime having all the single-letter
option in this line as well. Then there's an explanation of what the
options are. I've check `bash`, `cat`, `xl`, `gcc`.

I wonder which CLI program would print the minimum amount of information
on how to run the program as the last line of the help message.

> @@ -86,22 +104,34 @@ int main(int argc, char *argv[])
>  exit(1);
>  }
>  
> -if ( argc < 2 )
> +while ( (opt = getopt_long(argc, argv, "hs", options, NULL)) != -1 )
>  {
> -fprintf(stderr,
> -"xen-ucode: Xen microcode updating tool\n"
> -"Usage: %s [ | show-cpu-info]\n", argv[0]);
> -show_curr_cpu(stderr);
> -exit(2);
> +switch (opt)
> +{
> +case 'h':
> +usage(stdout, argv[0]);
> +exit(EXIT_SUCCESS);
> +
> +case 's':
> +show_curr_cpu(stdout);
> +exit(EXIT_SUCCESS);
> +
> +default:
> +goto ext_err;
> +}
>  }
>  
> -if ( !strcmp(argv[1], "show-cpu-info") )
> +if ( optind == argc )
> +goto ext_err;
> +
> +/* For backwards compatibility to the pre-getopt() cmdline handling */
> +if ( !strcmp(argv[optind], "show-cpu-info") )
>  {
>  show_curr_cpu(stdout);
>  return 0;
>  }
>  
> -filename = argv[1];
> +filename = argv[optind];
>  fd = open(filename, O_RDONLY);
>  if ( fd < 0 )
>  {
> @@ -146,4 +176,10 @@ int main(int argc, char *argv[])
>  close(fd);
>  
>  return 0;
> +
> + ext_err:
> +fprintf(stderr,
> +"%s: unable to process command line arguments\n", argv[0]);

A nice to have would be to have a better error message to point out
what's wrong with the arguments. For that you could print the error
message before "goto ext_err". One would be "unknown option" for the
first goto, and "missing microcode file" for the second goto, that is
instead of printing this more generic error message.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: xen | Failed pipeline for staging-4.19 | 2d7b6170

2024-07-24 Thread Anthony PERARD

On Wed, Jul 24, 2024 at 03:18:50PM +0200, Jan Beulich wrote:
> On 24.07.2024 15:15, GitLab wrote:
> > 
> > 
> > Pipeline #1385987377 has failed!
> > 
> > Project: xen ( https://gitlab.com/xen-project/hardware/xen )
> > Branch: staging-4.19 ( 
> > https://gitlab.com/xen-project/hardware/xen/-/commits/staging-4.19 )
> > 
> > Commit: 2d7b6170 ( 
> > https://gitlab.com/xen-project/hardware/xen/-/commit/2d7b6170cc69f8a1a60c52d87ba61f6b1f180132
> >  )
> > Commit Message: hotplug: Restore block-tap phy compatibility (a...
> > Commit Author: Jason Andryuk ( https://gitlab.com/jandryuk-amd )
> > Committed by: Jan Beulich ( https://gitlab.com/jbeulich )
> > 
> > 
> > Pipeline #1385987377 ( 
> > https://gitlab.com/xen-project/hardware/xen/-/pipelines/1385987377 ) 
> > triggered by Jan Beulich ( https://gitlab.com/jbeulich )
> > had 3 failed jobs.
> > 
> > Job #7415912260 ( 
> > https://gitlab.com/xen-project/hardware/xen/-/jobs/7415912260/raw )
> > 
> > Stage: test
> > Name: qemu-alpine-x86_64-gcc
> 
> This is the one known to fail more often than not, I think, but ...
> 
> > Job #7415912175 ( 
> > https://gitlab.com/xen-project/hardware/xen/-/jobs/7415912175/raw )
> > 
> > Stage: build
> > Name: ubuntu-24.04-x86_64-clang
> > Job #7415912173 ( 
> > https://gitlab.com/xen-project/hardware/xen/-/jobs/7415912173/raw )
> > 
> > Stage: build
> > Name: ubuntu-22.04-x86_64-gcc
> 
> ... for these two I can't spot any failure in the referenced logs. What's 
> going on
> there?

They are crutial information missing in that email, the actual reason
given by gitlab for the failures: "There has been a timeout failure or
the job got stuck." (That message can be seen when going to the url,
removing "/raw" part, and scrolling to the top. Or looking at the side
bar and seen a duration that well above 1h)

Communication between gitlab and the runner might be broken in those
cases, or the runner stop working.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH for-4.19] hotplug: Restore block-tap phy compatibility (again)

2024-07-24 Thread Anthony PERARD

On Tue, Jul 23, 2024 at 07:31:58PM +0200, oleksii.kuroc...@gmail.com wrote:
> On Tue, 2024-07-23 at 11:04 -0400, Jason Andryuk wrote:
> > Oleksii, this is a fix (for an incomplete fix) for 4.19.  76a484193d
> > broke compatibility for block-tap with the blktap2 kernel model (when
> > adding support for tapback).  This finishes restoring blktap2
> > support.
> >
> > I realize it's late in the release if you don't want to take it.
> It's pretty late but I just wanted to clarify:
> 1. Is so critical that we should have this in 4.19?

I don't think it's critical enough to justify delaying the release.
blktap2 is I think an out-of-tree kernel module so probably not very
used.
We might want to at least put a note in "known issue" about blktap2
support by the "block-tap" script been broken.

> 2. If we won't take it now, then will it be backported anyway?

Yes, we want to backport that as soon as 4.19 tree is open for
backports, to have the fix in 4.19.1.

Cheers,

--

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [XEN PATCH v3 RESEND] tools/lsevtchn: Use errno macro to handle hypercall error cases

2024-07-23 Thread Anthony PERARD

On Mon, Jul 15, 2024 at 04:36:03PM +0100, Matthew Barnes wrote:
> Currently, lsevtchn aborts its event channel enumeration when it hits
> its first hypercall error, namely:
> * When an event channel doesn't exist at the specified port
> * When the event channel is owned by Xen
> 
> lsevtchn does not distinguish between different hypercall errors, which
> results in lsevtchn missing potential relevant event channels with
> higher port numbers.
> 
> Use the errno macro to distinguish between hypercall errors, and
> continue event channel enumeration if the hypercall error is not
> critical to enumeration.
> 
> Signed-off-by: Matthew Barnes 
> ---
>  tools/xcutils/lsevtchn.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/xcutils/lsevtchn.c b/tools/xcutils/lsevtchn.c
> index d1710613ddc5..e4b3f88fe4ec 100644
> --- a/tools/xcutils/lsevtchn.c
> +++ b/tools/xcutils/lsevtchn.c
> @@ -24,7 +25,18 @@ int main(int argc, char **argv)
>  status.port = port;
>  rc = xc_evtchn_status(xch, &status);
>  if ( rc < 0 )
> -break;
> +{
> +if ( errno == ESRCH )
> +{
> +fprintf(stderr, "Domain ID '%d' does not correspond to valid 
> domain.\n", domid);
> +break;
> +}
> +
> +if ( errno == EINVAL )
> +break;
> +
> +continue;

Usually, when there's an error, we want to deal with it properly, and
not blindly ignore them. Instead, could you check for error that are
non-fatal, like described in the patch description?

Also, the code change doesn't reflect the patch description. The
description says "continue if error not fatal", but the implementation
is "exit on few known fatal error".

One other potentially useful thing to do would be to print why
xc_evtchn_status() failed like you did for ESRCH, but with perror() (or
err() or warn()) which print a translation of the errno value into text.
Of course, only in case where we stop the enumeration.

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH for-4.19] hotplug: Restore block-tap phy compatibility (again)

2024-07-23 Thread Anthony PERARD

On Mon, Jul 15, 2024 at 07:46:31PM -0400, Jason Andryuk wrote:
> "$dev" needs to be set correctly for backendtype=phy as well as
> backendtype=tap.  Move the setting into the conditional, so it can be
> handled properly for each.
> 
> (dev could be captured during tap-ctl allocate for blktap module, but it
> would not be set properly for the find_device case.  The backendtype=tap
> case would need to be handled regardless.)
> 
> Fixes: 6fcdc84927 ("hotplug: Restore block-tap phy compatibility")

Do you mean f16ac12bd418 ("hotplug: Restore block-tap phy compatibility") ?

> Fixes: 76a484193d ("hotplug: Update block-tap")
> 
> Signed-off-by: Jason Andryuk 

With the fixes tag fix:
Reviewed-by: Anthony PERARD 

> ---
> Tested with backendtype=tap & tapback and backendtype=phy & blktap
> module.

Thanks for the extra testing.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH v3 2/2] Add tools/fuzz/oss-fuzz/build.sh

2024-07-22 Thread Anthony PERARD

On Mon, Jul 22, 2024 at 02:20:21PM +0200, Jan Beulich wrote:
> On 22.07.2024 13:27, Tamas K Lengyel wrote:
> > +# SPDX-License-Identifier: Apache-2.0
> 
> ... there wouldn't want to be an "-only" tag here. Yet I'm entirely uncertain
> with this, as CC-BY-4.0 also has no such tag.

FYI, all valid SPDX identifier are listed there:
https://spdx.org/licenses/

"Apache-2.0" is the correct one to use.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [linux-linus test] 186932: regressions - FAIL

2024-07-22 Thread Anthony PERARD

On Mon, Jul 22, 2024 at 03:06:07PM +0200, Jan Beulich wrote:
> On 22.07.2024 15:03, Anthony PERARD wrote:
> > On Mon, Jul 22, 2024 at 09:05:43AM +0200, Jan Beulich wrote:
> >> On 22.07.2024 06:56, osstest service owner wrote:
> >>> flight 186932 linux-linus real [real]
> >>> http://logs.test-lab.xenproject.org/osstest/logs/186932/
> >>>
> >>> Regressions :-(
> >>>
> >>> Tests which did not succeed and are blocking,
> >>> including tests which could not be run:
> >>>  test-arm64-arm64-examine  8 reboot   fail REGR. vs. 
> >>> 186827
> >>>  test-arm64-arm64-xl   8 xen-boot fail REGR. vs. 
> >>> 186827
> >>
> >> There looks to be a basic problem as of flight 186925, yet a brief look at 
> >> one
> >> of the logs doesn't really give any hint other than the system perhaps 
> >> simply
> >> being slow. Ideas, anyone?
> > 
> > Well, yes, it's really slow to reach having a running ssh server. If I
> > let the machine boot, there's two reason in the log for the long time:
> > 
> > Jul 22 11:44:25.216867 Waiting for /dev to be fully populated...Timed out 
> > for waiting the udev queue being empty.
> > Jul 22 11:46:25.469002 done (timeout).
> > 
> > Jul 22 11:46:29.103350 Configuring network interfaces...
> > Jul 22 11:46:32.127350 ^@Timed out for waiting the udev queue being empty.
> 
> Yet both of these instances of timing out look suspiciously like a regression
> (in or caused by the kernel)?

Yes, no timeout in the logs of previous runs. So it's likely a
regression in Linux, between f83e38fc9f10 and 3c3ff7be9729.
http://logs.test-lab.xenproject.org/osstest/results/history/test-arm64-arm64-xl/linux-linus.html

-- 
Anthony PERARD

Re: [linux-linus test] 186932: regressions - FAIL

2024-07-22 Thread Anthony PERARD

On Mon, Jul 22, 2024 at 09:05:43AM +0200, Jan Beulich wrote:
> On 22.07.2024 06:56, osstest service owner wrote:
> > flight 186932 linux-linus real [real]
> > http://logs.test-lab.xenproject.org/osstest/logs/186932/
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-arm64-arm64-examine  8 reboot   fail REGR. vs. 
> > 186827
> >  test-arm64-arm64-xl   8 xen-boot fail REGR. vs. 
> > 186827
> 
> There looks to be a basic problem as of flight 186925, yet a brief look at one
> of the logs doesn't really give any hint other than the system perhaps simply
> being slow. Ideas, anyone?

Well, yes, it's really slow to reach having a running ssh server. If I
let the machine boot, there's two reason in the log for the long time:

Jul 22 11:44:25.216867 Waiting for /dev to be fully populated...Timed out for 
waiting the udev queue being empty.
Jul 22 11:46:25.469002 done (timeout).

Jul 22 11:46:29.103350 Configuring network interfaces...
Jul 22 11:46:32.127350 ^@Timed out for waiting the udev queue being empty.
Jul 22 11:48:29.256918 [  403.298102] NET: Registered PF_INET6 protocol family
... more network kernel info after that.
Jul 22 11:48:33.204921 Waiting for xenbr0 to get ready (MAXWAIT is 2 seconds).
Jul 22 11:48:33.204984 done.

(on previous run, "configuring network interface" is followed by
"waiting for xenbr0")

So, we lost already 4 min waiting, out of a budget of 7min for full
reboot.

Also, "reboot" doesn't work with this newer kernel, the machine prints
"Will now restart." then nothing happen.

I have also try without Xen, it's the same behavior.

Cheers,

-- 
Anthony PERARD

Re: [PATCH for-4.19] docs/checklist: Fix XEN_EXTRAVERSION inconsistency for release candidates

2024-07-16 Thread Anthony PERARD

On Tue, Jul 16, 2024 at 10:22:18AM +0200, Juergen Gross wrote:
> On 16.07.24 09:46, Jan Beulich wrote:
> > On 16.07.2024 09:33, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 16/07/2024 08:24, Jan Beulich wrote:
> > > > On 16.07.2024 09:22, Julien Grall wrote:
> > > > > On 16/07/2024 07:47, Jan Beulich wrote:
> > > > > > On 15.07.2024 18:56, Julien Grall wrote:
> > > > > > > On 15/07/2024 16:50, Andrew Cooper wrote:
> > > > > > > > An earlier part of the checklist states:
> > > > > > > > 
> > > > > > > >   * change xen-unstable README. The banner (generated using 
> > > > > > > > figlet) should say:
> > > > > > > >   - "Xen 4.5" in releases and on stable branches
> > > > > > > >   - "Xen 4.5-unstable" on unstable
> > > > > > > >   - "Xen 4.5-rc" for release candidate
> > > > > > > > 
> > > > > > > > Update the notes about XEN_EXTRAVERSION to match.
> > > > 
> > > > When this is the purpose of the patch, ...
> > > > 
> > > > > > > We have been tagging the tree with 4.5.0-rcX. So I think it would 
> > > > > > > be
> > > > > > > better to update the wording so we use a consistent naming.
> > > > > > 
> > > > > > I find:
> > > > > > 
> > > > > > 4.18-rc
> > > > > > 4.17-rc
> > > > > > 4.16-rc
> > > > > > 4.15-rc
> > > > > 
> > > > > Hmmm... I don't think we are looking at the same thing. I was
> > > > > specifically looking at the tag and *not* XEN_EXTRAVERSION.
> > > > 
> > > > ... why would we be looking at tags?
> > > 
> > > As I wrote, consistency across the naming scheme we use.
> > > 
> > > > The tags (necessarily) have RC numbers,
> > > 
> > > Right but they also *have* the .0.
> > > 
> > > > so are going to be different from XEN_EXTRAVERSION in any event.
> > > 
> > > Sure they are not going to be 100% the same. However, they could have
> > > some similarity.
> > > 
> > > As I pointed out multiple times now, to me it is odd we are tagging the
> > > tree with 4.19.0-rcX, but we use 4.19-rc.
> > > 
> > > Furthermore, if you look at the history of the document. It is quite
> > > clear that the goal was consistency (the commit mentioned by Andrew
> > > happened after). Yes it wasn't respected but I can't tell exactly why.
> > > 
> > > So as we try to correct the documentation, I think we should also look
> > > at consistency. If you *really* want to drop the .0, then I think it
> > > should happen for the tag as well (again for consistency).
> > 
> > I don't see why (but I also wouldn't mind the dropping from the tag).
> > They are going to be different. Whether they're different in one or two
> > aspects is secondary to me. I rather view the consistency goal to be
> > with what we've been doing in the last so many releases.
> 
> Another aspect to look at would be version sorting. This will be interesting
> when e.g. having a Xen rpm package installed and upgrading it with a later
> version. I don't think we want to regard replacing an -rc version with the
> .0 version to be a downgrade, so the the version numbers should be sorted by
> "sort -V" in the correct order.

Packages version from distribution is not something we have to deal with
upstream. It's for the one writing the package version to make sure
that -rc are older than actual release.

While trying to to find if SPEC files where dealing with "-rc" suffix,
I found a doc for fedora telling how to deal with RCs:
https://docs.fedoraproject.org/en-US/packaging-guidelines/Versioning/
They say to replace the dash with a tilde, so "-rc" become "~rc", and
package manager know what to do with it.

Some other distribution know how to deal with "rc" suffix, but the dash
"-" isn't actually allowed in the version string:
https://man.archlinux.org/man/vercmp.8

So unless we forgo "-rc" in tags, there's no way we can take into
account how distributions package manager sorts version numbers. Also,
there's no need to, it is the job of the packager to deal with version
number, we just need to make is simple enough and consistent.

Cheers,

-- 
Anthony PERARD

Re: [PATCH 22/12] tools/examples: Remove more obsolete content

2024-07-15 Thread Anthony PERARD

On Mon, Jul 15, 2024 at 04:16:40PM +0100, Andrew Cooper wrote:
> xeninfo.pl was introduced in commit 1b0a8bb57e3e ("Added xeninfo.pl, a script
> for collecting statistics from Xen hosts using the Xen-API") and has been
> touched exactly twice since to remove hardcoded IP addresses and paths.

Xen-API, is that xapi?

> The configuration files in vnc/* date from when we had a vendered version of
> Qemu living in the tree.

I guess QEMU at that time didn't have VNC support? Because looks like
the vnc config was supposed to be put in a guest, and `xm` had support
for it.

> These have never (AFAICT) been wired into the `make install` rule.
> 
> Signed-off-by: Andrew Cooper 

Anyway, patch looks good:
Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 21/12] CI: Refresh and upgrade the Fedora container

2024-07-15 Thread Anthony PERARD

On Sat, Jul 13, 2024 at 07:10:05PM +0100, Andrew Cooper wrote:
> Fedora 29 is long out of date.  Move forward 5 years to Fedora 39.
> 
> Inlcude all the usual improvements.  Rework the container to be non-root, use
> heredocs for legibility, and switch to the new naming scheme.

"clang" as been removed, but it's not mention in the commit message. The
other changes are kind of "the usual improvement" yes, as been done to
other containers (libaio, nasm, ...).

The patch is missing an update of the "containerize" script.

> diff --git a/automation/build/fedora/39-x86_64.dockerfile 
> b/automation/build/fedora/39-x86_64.dockerfile
> new file mode 100644
> index ..e2048a437581
> --- /dev/null
> +++ b/automation/build/fedora/39-x86_64.dockerfile
...
> +
> +dnf -y install "${DEPS[@]}"

You might want to add --setopt=install_weak_deps=False to avoid
installing "git" for example.
When running the original command by hand, `dnf` want to install those weak 
deps:
Installing weak dependencies:
 apr-util-bdb
 apr-util-openssl
 cryptsetup-libs
 diffutils
 git
 libbpf
 libxkbcommon
 mercurial
 perl-NDBM_File
 python3-fb-re2
 python3-pip
 qrencode-libs
 subversion
 systemd-networkd
 systemd-resolved
And comparring the list of deps, these extra pkgs are installed because
of weak deps:
apr
apr-util
dbus
dbus-broker
dbus-common
device-mapper
device-mapper-libs
git-core-doc
kmod-libs
libargon2
libseccomp
libserf
perl-Error
perl-File-Find
perl-Git
perl-TermReadKey
perl-lib
python3-zombie-imp
re2
subversion-libs
systemd
systemd-pam
utf8proc
xkeyboard-config

So, probably only "perl-File-Find" (can be written "perl(File::Find)" I
think as well) might be needed, but only for docs/ I think, from the
previous email.

It seems I go from 1.26GB to 1.18GB without those weak deps.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 20/12] CI: Swap from perl to perl-base in build containers

2024-07-15 Thread Anthony PERARD

On Sat, Jul 13, 2024 at 07:09:52PM +0100, Andrew Cooper wrote:
> We only need a basic perl environment, not the things that a primarily-perl
> project would want.
> 
> Discovered during the Fedora refresh where the difference is ~40M, but it's
> more modest for OpenSUSE and Ubuntu.

"perl-base" on Ubuntu is quite different from "perl-base" on OpenSUSE
Leap. I haven't check Tumbleweed but is probably the same as Leap.
So, I think these would deserve to be in separated patches.

On Ubuntu, installing "perl" or "perl-base" or none of them makes no
difference. We install "build-essentials" which pulls "dpkg-dev" which
pulls "perl", so we don't test properly if "perl-base" is enough or if
we need more.

I tried to find out which Perl module we would need, and I've got this
list:

- get_maintainer
Getopt::Long;
- add_maintainer
Getopt::Long;
File::Basename;
List::MoreUtils;
IO::Handle;
- stubdom/vtpmmgr:
Digest::SHA qw(sha1);
Math::BigInt only => 'GMP';
- kconfig
Getopt::Long;
- ocaml/lib/xc/abi-check
Data::Dumper;
- docs/gen-html-index
Getopt::Long;
IO::File;
File::Basename;
- docs/xen-headers
Getopt::Long;
File::Find;
IO::File;
(there's also tools/examples/xeninfo.pl, I dont't if that can still work
or if we need to purge it)

Then I've got whatever module is in "perl-base" pkg.

For Leap's perl-base:
getopt:long
file:basename
io:handle
io:file
file:find
data:dumper
digest:sha
So, add_maintainer.pl and the script in stubdom/vtpmmgr won't work. So
that's probably fine for Leap.

For Ubuntu 22.04 (not check others, but likely about the same)
there's more packages:
perl-base
getopt::long
file:basename
io:handle
io:file
perl-modules-5.34
file:find
math:bigint
libperl5.34
data:dumper
digest:sha
liblist-moreutils-perl
list:MoreUtils
librpc-xml-perl
rpc:xml
("perl" pkg pulls "perl-modules-5.34" and "libperl5.34")

So with just "perl-base", the ocaml's abi-check wouldn't work, as well
as the docs's script (probably ok for "docs" if we don't use containers
to build them). Also add_maintainer and stubdom/vtpmmgr like for Leap.

So I would suggest to not touch the Ubuntu containers, make the change
to the OpenSUSE one and maybe add the limitation to the commit message.

Cheers,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 19/12] docs: Fix install-man$(1)-pages if no manpages are generated

2024-07-15 Thread Anthony PERARD

On Sat, Jul 13, 2024 at 07:09:39PM +0100, Andrew Cooper wrote:
> All tools to build manpages are optional, and if none of them happen to be
> present, the intermediate working directory may not even be created.
> 
> Treat this as non-fatal, bringing the behaviour in line with install-html.
> Like the html side, it needs to be not-or to avoid Make thinking the rule has
> failed.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 18/12] CI: Add Ubuntu 22.04 (Jammy) and 24.04 (Noble) testing

2024-07-12 Thread Anthony PERARD

On Fri, Jul 12, 2024 at 11:49:07AM +0100, Andrew Cooper wrote:
> The containers are exactly as per 20.04 (Focal).  However, this now brings us
> to 5 releases * 4 build jobs worth of Ubuntu testing, which is overkill.
> 
> The oldest and newest toolchains are the most likely to find problems with new
> code, so reduce the middle 3 releases (18/20/22) to just a single smoke test
> each.

That would mean a bit less `clang` build-test, but I guess that would be
mostly covered by FreeBSD testing on GitHub.

I tried to find out which version of clang were been used with this
patch applied:
Alpine clang version 16.0.6
Debian clang version 14.0.6
clang version 17.0.6
clang version 18.1.8
clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
Ubuntu clang version 18.1.3 (1ubuntu1)
and the versions that won't be tested anymore:
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
clang version 10.0.0-4ubuntu1
Ubuntu clang version 14.0.0-1ubuntu1.1

> Signed-off-by: Andrew Cooper 

Anyway:
Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [PATCH 17/12] CI: Refresh Ubuntu Focal container as 20.04-x86_64

2024-07-12 Thread Anthony PERARD

On Fri, Jul 12, 2024 at 11:48:55AM +0100, Andrew Cooper wrote:
> Exactly as per 18.04 (Bionic).  This saves ~500M:
> 
>   registry.gitlab.com/xen-project/xen/ubuntu20.04-x86_64   1.06GB
>   registry.gitlab.com/xen-project/xen/ubuntufocal  1.57GB
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Anthony PERARD 

Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 2112 matches

Mail list logo