Re: [PATCH] autogen.sh: Restore --no-git (avoid git submodule update)

2020-06-03 Thread Ian Jackson
Pavel Hrdina writes ("Re: [PATCH] autogen.sh: Restore --no-git (avoid git 
submodule update)"):
> On Tue, Jun 02, 2020 at 04:47:45PM +0100, Ian Jackson wrote:
> > -git submodule update --init || exit 1
> > +if [ "x$1" = x--no-git ]; then
> > +   shift
> > +else
> > +   git submodule update --init || exit 1
> 
> I changed the TAB into spaces.

Ah, sorry for not following the right style.

> Reviewed-by: Pavel Hrdina 
> 
> and pushed.

Thanks :-).

Ian.



Re: [PATCH] autogen.sh: Restore --no-git (avoid git submodule update)

2020-06-03 Thread Ian Jackson
Pavel Hrdina writes ("Re: [PATCH] autogen.sh: Restore --no-git (avoid git 
submodule update)"):
> To be honest I don't understand why would anyone want to keep track of
> all submodules of all projects for any CI and update it manually every
> time the upstream project changes these submodules. Sounds like a lot
> of unnecessary work but maybe I'm missing something.

Maybe I should answer this.  The short answer is that this can be done
entirely automatically.

> Well, we will break a lot more by switching to Meson build system where
> everyone building libvirt will have to change their scripts as it will
> not be compatible at all.

When that occurs we'll have to change our build rune, of course.

Our CI system (which does bisection and stable tracking and so on)
will wants to build old versions, so it'll have to have both build
runes and look for something in-tree to tell the two methods apart.

Thanks,
Ian.



Re: [PATCH] autogen.sh: Restore --no-git (avoid git submodule update)

2020-06-03 Thread Ian Jackson
Pavel Hrdina writes ("Re: [PATCH] autogen.sh: Restore --no-git (avoid git 
submodule update)"):
> There should not be any need to disable this explicitly unless you want
> to build libvirt with different revisions of submodules.

The Xen Project CI has a cross-tree bisector.  It is capable of
automatically selecting revisions, including revisions of submodules,
for testing.

It therefore needs to be able to build with different revisions of
submodules, indeed.

Thanks,
Ian.



[PATCH] autogen.sh: Restore --no-git (avoid git submodule update)

2020-06-02 Thread Ian Jackson
Prior to 2621d48f005a "gnulib: delete all gnulib integration",
one could pass ./autogen.sh --no-git to prevent the libvirt build
system from running git submodule update.

This feature is needed by systems like the Xen Project CI which want
to explicitly control the revisions of every tree.  These will
typically arrange to initialise the submodules check out the right
version of everything, and then expect the build system not to mess
with it any more.

Despite to the old documentation comments referring only to gnulib,
the --no-git feature is required not only because of gnulib but also
because of the other submodule, src/keycodemapdb.

(And in any case, even if it were no longer required because all the
submodules were removed, it ought ideally to have been retained as a
no-op for compaibility reasons.)

So restore the --no-git feature.

Because of the way the argument parsing of autogen.sh works, it is
easiest to recognise this option only if it comes first.  This works
for the Xen Project CI, which has always passed this option first.

If something else is using this option (and hasn't introduced a
different workaround in the meantime), not in the first position,
then perhaps a more sophisticated approach will be needed.  But I
think this will do for now.

Signed-off-by: Ian Jackson 
---
 autogen.sh | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/autogen.sh b/autogen.sh
index 4e1bbceb0a..1c98273452 100755
--- a/autogen.sh
+++ b/autogen.sh
@@ -1,5 +1,10 @@
 #!/bin/sh
 # Run this to generate all the initial makefiles, etc.
+#
+# THe following options must come first.  All other or subsequent
+# arguments are passed to configure:
+#   --no-git   skip `git submodule update --init`
+
 test -n "$srcdir" || srcdir=$(dirname "$0")
 test -n "$srcdir" || srcdir=.
 
@@ -13,7 +18,11 @@ cd "$srcdir"
 exit 1
 }
 
-git submodule update --init || exit 1
+if [ "x$1" = x--no-git ]; then
+   shift
+else
+   git submodule update --init || exit 1
+fi
 
 autoreconf --verbose --force --install || exit 1
 
-- 
2.11.0



Re: [libvirt] Likely build race, "/usr/bin/ld: cannot find -lvirt"

2018-06-13 Thread Ian Jackson
Jiri Denemark writes ("Re: [libvirt] Likely build race, "/usr/bin/ld: cannot 
find -lvirt""):
> On Tue, Jun 12, 2018 at 07:57:40 -0500, Eric Blake wrote:
> > Can you add that line directly into Makefile.am, or does doing that 
> > cause automake to complain and/or omit its normal rules because it 
> > thinks you are overriding its defaults?
> 
> Yeah. It doesn't complain, but it omits its normal
> install-modLTLIBRARIES rule which mean nothing will be installed.
> However, the error is still reported so there are other libraries which
> are not in mod_LTLIBRARIES affected too.

I did a web search for "automake add dependency" and ended up at a
stackmumble page which referred to EXTRA_a_DEPENDENCIES.

See
  (automake-1) Program and Library Variables
which says


'maude_DEPENDENCIES'
'EXTRA_maude_DEPENDENCIES'
 It is also occasionally useful to have a target (program or
 library) depend on some other file that is not actually part of
 that target.  This can be done using the '_DEPENDENCIES' variable.
 Each target depends on the contents of such a variable, but no
 further interpretation is done.

 Since these dependencies are associated to the link rule used to
 create the programs they should normally list files used by the
 link command.  That is '*.$(OBJEXT)', '*.a', or '*.la' files for
 programs; '*.lo' and '*.la' files for Libtool libraries; and
 '*.$(OBJEXT)' files for static libraries.  In rare cases you may
 need to add other kinds of files such as linker scripts, but
 _listing a source file in '_DEPENDENCIES' is wrong_.  If some
 source file needs to be built before all the components of a
 program are built, consider using the 'BUILT_SOURCES' variable
 (*note Sources::).

 If '_DEPENDENCIES' is not supplied, it is computed by Automake.
 The automatically-assigned value is the contents of '_LDADD' or
 '_LIBADD', with most configure substitutions, '-l', '-L', '-dlopen'
 and '-dlpreopen' options removed.  The configure substitutions that
 are left in are only '$(LIBOBJS)' and '$(ALLOCA)'; these are left
 because it is known that they will not cause an invalid value for
 '_DEPENDENCIES' to be generated.

 '_DEPENDENCIES' is more likely used to perform conditional
 compilation using an 'AC_SUBST' variable that contains a list of
 objects.  *Note Conditional Sources::, and *note Conditional
 Libtool Sources::.

 The 'EXTRA_*_DEPENDENCIES' variable may be useful for cases where
 you merely want to augment the 'automake'-generated '_DEPENDENCIES'
 variable rather than replacing it.


Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Likely build race, "/usr/bin/ld: cannot find -lvirt"

2018-05-24 Thread Ian Jackson
Ian Jackson writes ("Likely build race, "/usr/bin/ld: cannot find -lvirt""):
> tl;dr:
> 
> I think there is a bug in libvirt's build system which, with
> low probability, causes a build failure containing this message:
>   /usr/bin/ld: cannot find -lvirt
> 
> Complete build logs of two attempts:
> 
>   
> http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386-libvirt/6.ts-libvirt-build.log
> 
>   
> http://logs.test-lab.xenproject.org/osstest/logs/123096/build-i386-libvirt/6.ts-libvirt-build.log

I have run a number of attempts.  Out of 5 more, 1 succeeded.  So out
of a total of 7 attempts, 1 succeeded.  This repro rate is an IMO
excellent opportunity to debug this race :-).

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] Likely build race, "/usr/bin/ld: cannot find -lvirt"

2018-05-23 Thread Ian Jackson
tl;dr:

I think there is a bug in libvirt's build system which, with
low probability, causes a build failure containing this message:
  /usr/bin/ld: cannot find -lvirt

Complete build logs of two attempts:

  
http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386-libvirt/6.ts-libvirt-build.log

  
http://logs.test-lab.xenproject.org/osstest/logs/123096/build-i386-libvirt/6.ts-libvirt-build.log

Snippet from 123046 containing the error is enclosed below.


Longer explanation:

I have two new machines for the Xen Project CI, which I am trying to
commission.  As part of commissioning I run a complete test run (a
"flight" in osstest terminology) on just those new hosts.  The i386
libvirt build failed:

  
http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386-libvirt/6.ts-libvirt-build.log

Everything else that would be expected to work was fine.  The test
programme was identical to flight 122815, except that that ran on
other hosts in the test farm (and, there, it passed).  The error is
the kind of error one sees with missing dependencies in parallel
builds, etc.

I wanted to have some 32-bit libvirt tests actually run, so I reran a
new flight containing the relevant parts.  That failed too in a very
similar way:

  
http://logs.test-lab.xenproject.org/osstest/logs/123096/build-i386-libvirt/6.ts-libvirt-build.log

The two machines are Dell R230s (and therefore hardly unusual).  The
main novelty of these machines is that the firmware is UEFI booting in
UEFI mode.  I doubt that has anything to do with it.  The host,
including compiler, is Debian jessie i386.

As you can see from the log, we were trying to build libvirt
  764a7483f189e6de841163647c14296e693dbb2e
What may be less obvious is that we were trying to build it against
xen.git#0306a1311d02ea52b4a9a9bc339f8bab9354c5e3.
  
http://logs.test-lab.xenproject.org/osstest/logs/123064/build-i386-libvirt/info.html
  http://logs.test-lab.xenproject.org/osstest/logs/123046/build-i386/info.html

Does this seem like a likely explanation ?  Have other people
experienced occasional problems with make -j ?  If someone wants to
suggest a patch that might fix it I can test it.

In the meantime I have set off a number of new attempts, to try to
guess the failure probability, and also one attempt on other hosts to
check that nothing unexpected was broken.

Ian.


/usr/bin/ld: cannot find -lvirt
/usr/bin/ld: cannot find -lvirt
 /bin/mkdir -p 
'/home/osstest/build.123046.build-i386-libvirt/dist/usr/local/lib/libvirt/storage-backend'
 /bin/bash ../libtool   --mode=install /usr/bin/install -c   
libvirt_storage_backend_fs.la libvirt_storage_backend_logical.la 
libvirt_storage_backend_scsi.la libvirt_storage_backend_mpath.la 
'/home/osstest/build.123046.build-i386-libvirt/dist/usr/local/lib/libvirt/storage-backend'
libtool: install: warning: relinking `libvirt_storage_backend_fs.la'
libtool: install: (cd 
/home/osstest/build.123046.build-i386-libvirt/libvirt/src; /bin/bash 
/home/osstest/build.123046.build-i386-libvirt/libvirt/libtool  --silent --tag 
CC --mode=relink gcc -std=gnu99 -I./conf -I/usr/include/libxml2 -fno-common -W 
-Waddress -Waggressive-loop-optimizations -Wall -Wattributes 
-Wbad-function-cast -Wbuiltin-macro-redefined -Wcast-align -Wchar-subscripts 
-Wclobbered -Wcomment -Wcomments -Wcoverage-mismatch -Wcpp -Wdate-time 
-Wdeprecated-declarations -Wdiv-by-zero -Wdouble-promotion -Wempty-body 
-Wendif-labels -Wextra -Wformat-contains-nul -Wformat-extra-args 
-Wformat-security -Wformat-y2k -Wformat-zero-length -Wfree-nonheap-object 
-Wignored-qualifiers -Wimplicit -Wimplicit-function-declaration -Wimplicit-int 
-Winit-self -Winline -Wint-to-pointer-cast -Winvalid-memory-model -Winvalid-pch 
-Wjump-misses-init -Wlogical-op -Wmain -Wmaybe-uninitialized 
-Wmemset-transposed-args -Wmissing-braces -Wmissing-declarations 
-Wmissing-field-initializers -Wmissing-includ
 e-dirs -Wmissing-parameter-type -Wmissing-prototypes -Wmultichar -Wnarrowing 
-Wnested-externs -Wnonnull -Wold-style-declaration -Wold-style-definition 
-Wopenmp-simd -Woverflow -Woverride-init -Wpacked-bitfield-compat -Wparentheses 
-Wpointer-arith -Wpointer-sign -Wpointer-to-int-cast -Wpragmas -Wpsabi 
-Wreturn-local-addr -Wreturn-type -Wsequence-point -Wshadow 
-Wsizeof-pointer-memaccess -Wstrict-aliasing -Wstrict-prototypes 
-Wsuggest-attribute=const -Wsuggest-attribute=format 
-Wsuggest-attribute=noreturn -Wsuggest-attribute=pure -Wswitch -Wsync-nand 
-Wtrampolines -Wtrigraphs -Wtype-limits -Wuninitialized -Wunknown-pragmas 
-Wunused -Wunused-but-set-parameter -Wunused-but-set-variable -Wunused-function 
-Wunused-label -Wunused-local-typedefs -Wunused-parameter -Wunused-result 
-Wunused-value -Wunused-variable -Wvarargs -Wvariadic-macros 
-Wvector-operation-performance -Wvolatile-register-var -Wwrite-strings 
-Wnormalized=nfc -Wno-sign-compare -Wjump-misses-init -Wswitch-enum 
-Wno-format-no
 nliteral -fstack-protector-strong -fexceptions -fasyn
chronous-unwind-tables -fipa-p

Re: [libvirt] [OSSTEST PATCH 1/2] libvirt: Check migration capabilities using proper XML parser

2016-10-11 Thread Ian Jackson
Julien Grall writes ("Re: [OSSTEST PATCH 1/2] libvirt: Check migration 
capabilities using proper XML parser"):
> live migration is not currently the supported but will be in the future. 
> FWIW, there is a patch series on xen-devel to support dead migration 
> (first step for live migration) [1].
> 
> I hope this answer to your question.

Yes, thanks.

Thanks everyone, the patches (as discessed etc.) are now in the
osstest self-push-gate.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [OSSTEST PATCH 2/2] libvirt: Do not attempt save/restore when migration not advertised

2016-10-06 Thread Ian Jackson
Martin Kletzander writes ("Re: [OSSTEST PATCH 2/2] libvirt: Do not attempt 
save/restore when migration not advertised"):
> Well then, unfortunately you do.
> 
> Also, looking at how the code is structured, if you have live migration
> but don't have save/restore, you won't have  there
> at all.

Right.  OK, thanks.  I will add the patch below to my osstest queue.

Ian.

>From 5330ff9222e4e611505149945eef7dc074b4f9b5 Mon Sep 17 00:00:00 2001
In-Reply-To: <20161006104255.GP16414@wheatley>
References: <20161006104255.GP16414@wheatley>
From: Ian Jackson 
Date: Thu, 6 Oct 2016 17:38:29 +0100
Subject: [OSSTEST PATCH 3/2] libvirt: Check 
/capabilities/host/migration_features/live for live migration
Cc: libvir-list@redhat.com

libvirt is capable of advertising this separately from
/capabilities/host/migration_features, so if save/restore is supported
but live migration is not, this will do the right thing.

We would have preferred libvirt to advertise
  /capabilities/host/migration_features/save
or something, but it doesn't right now, so we continue to use
  /capabilities/host/migration_features
to detect save/restore support.

If libvirt changes its feature presentation, then at some future point
we should change osstest too.

Signed-off-by: Ian Jackson 
CC: Martin Kletzander 
CC: Jim Fehlig 
---
 Osstest/Toolstack/libvirt.pm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm
index 250fe47..81e724d 100644
--- a/Osstest/Toolstack/libvirt.pm
+++ b/Osstest/Toolstack/libvirt.pm
@@ -93,7 +93,8 @@ sub migrate_check ($$) {
 # local migration is not supported
 $rc = 1;
 } else {
-   $rc = $self->check_capability('/capabilities/host/migration_features');
+   $rc = $self->check_capability
+   ('/capabilities/host/migration_features/live');
 }
 
 logm("rc=$rc");
-- 
2.1.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [OSSTEST PATCH 1/2] libvirt: Check migration capabilities using proper XML parser

2016-10-06 Thread Ian Jackson
Julien Grall writes ("Re: [OSSTEST PATCH 1/2] libvirt: Check migration 
capabilities using proper XML parser"):
> On 04/10/2016 10:05, Ian Jackson wrote:
> > Missing _.  I didn't test this again before sending it.  I'd still
> > like a review from libvirt (and Xen/ARM) folks.
> 
> I am not sure what kind of input you would need from Xen/ARM. This patch 
> set looks very libvirt specific.

It is.  Sorry to spam you with the whole thread.  The only thing I
need from Xen/ARM people is a sanity check that my understanding of
the current and likely future supported features is correct.

Thanks,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [OSSTEST PATCH 2/2] libvirt: Do not attempt save/restore when migration not advertised

2016-10-06 Thread Ian Jackson
Martin Kletzander writes ("Re: [OSSTEST PATCH 2/2] libvirt: Do not attempt 
save/restore when migration not advertised"):
> Since offline migration (as in migrating a domain between hosts without
> being running) is not that used in the code and talked about, I'm
> guessing offline means save restore.  Looking at the history it was
> added before the "offline" migration, so it probably means
> save/restore.  To avoid confusion, I would suggest we add either
>  or rather  (the naming is not important) and document
> what it means.  And then you can use it exactly how you'd like.  And
> you'll be also sure it means what you need it to mean ;)  The patches
> will be straigh-forward, let me know if I can help anyhow.

Except that the point of the exercise is to detect which features are
supported in which versions.  Whatever I do in osstest needs to work
with older libvirt versions, which do not report
  /capabilities/host/migration_features/save
even on x86, where it is supported.  I suppose I could detect
  /capabilities/host/migration_features/live
and assume that save/restore was supported (since it's unlikely that
live migration would be supported but not save/restore).

So for now I think I need to use
  /capabilities/host/migration_features
as a proxy for save/restore ?

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [OSSTEST PATCH 1/2] libvirt: Check migration capabilities using proper XML parser

2016-10-04 Thread Ian Jackson
Ian Jackson writes ("[OSSTEST PATCH 1/2] libvirt: Check migration capabilities 
using proper XML parser"):
> Do not grep the virsh capabilities output (!)  Instead, parse the XML
> using perl's XML modules and look for the specific feature flag using
> an XPATH pattern.
...>  
> +sub _check_capability ($$) {
...
> + $rc = $self->check_capability('/capabilities/host/migration_features');
 ^

Missing _.  I didn't test this again before sending it.  I'd still
like a review from libvirt (and Xen/ARM) folks.

Thanks,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [OSSTEST PATCH 2/2] libvirt: Do not attempt save/restore when migration not advertised

2016-10-04 Thread Ian Jackson
Currently, osstest wrongly thinks that ARM can do save/restore,
because `virsh help' does mention the save command (on all
architectures).

Additionally, check the virth capabilities xpath
   /capabilities/host/migration_features
to try to see whether this host supports migration.

I am not sure if this is the right path to check.  Perhaps
   /capabilities/host/migration_features/live
is more correct, but this may be wrong if Xen comes to support save/restore
on ARM, but not live migration (but perhaps libvirt cannot express this
distinction in which case perhaps it's right after all).

Signed-off-by: Ian Jackson 
CC: Julien Grall 
CC: Jim Fehlig 
---
 Osstest/Toolstack/libvirt.pm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm
index b7db7af..250fe47 100644
--- a/Osstest/Toolstack/libvirt.pm
+++ b/Osstest/Toolstack/libvirt.pm
@@ -111,7 +111,9 @@ sub check_for_command($$) {
 
 sub saverestore_check ($) {
 my ($self) = @_;
-return check_for_command($self, "save");
+return
+   _check_capability($self, '/capabilities/host/migration_features') &&
+   check_for_command($self, "save");
 }
 
 sub migrate () {
-- 
2.1.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [OSSTEST PATCH 0/2] libvirt: Fix save/restore capability check on ARM

2016-10-04 Thread Ian Jackson
Currently, osstest, the Xen Project's automated test framework,
erroneously thinks that save/restore is supported with libvirt on ARM.
In fact, save/restore is not supported by Xen on ARM at all.

The result is that osstest then actually attempts the save/restore,
and abandons the test job as a failure.  This is not desirable.

In these two patches I try to fix the feature detection to get this
right.  I'd appreciate advice about whether I have done the right
thing.

My code is based partly on empirical observation of the output of
`virsh capabilities' on x86 and ARM.  (See below.)

Thanks,
Ian.


>From baroque0, x86.
As left by 101253.test-amd64-amd64-libvirt-pair; osstest "branch" osstest


   

 
   
 x86_64
 
   
 
 Haswell-noTSX
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
   
   
   
 
   
   vif
   
 
   
 9699328
 
   
   
   
   
   
   
   
   
 
   
 
   
 

 
   xen
   
 64
 /usr/lib/xen/bin/qemu-system-i386
 xenpv
 
   
 

 
   xen
   
 32
 /usr/lib/xen/bin/qemu-system-i386
 xenpv
 
   
   
 
   
 

 
   hvm
   
 32
 /usr/lib/xen/bin/qemu-system-i386
 /usr/lib/xen/boot/hvmloader
 xenfv
 
   
   
 
 
 
 
 
   
 

 
   hvm
   
 64
 /usr/lib/xen/bin/qemu-system-i386
 /usr/lib/xen/boot/hvmloader
 xenfv
 
   
   
 
 
 
   
 

   



>From arndale-lakeside, ARM.
As left by 101251 | test-armhf-armhf-libvirt; osstest "branch" qemu-mainline

   

 
   
 armv7l
   
   
   vif
   
 
   
 2097152
 
   
   
 
   
 
   
 

 
   xen
   
 32
 /usr/lib/xen/bin/qemu-system-i386
 xenpv
 
   
 

   


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [OSSTEST PATCH 1/2] libvirt: Check migration capabilities using proper XML parser

2016-10-04 Thread Ian Jackson
Do not grep the virsh capabilities output (!)  Instead, parse the XML
using perl's XML modules and look for the specific feature flag using
an XPATH pattern.

AFAICT from looking at the XML, that's


  

  

But the original code does not test for .

Xen could in principle (and AIUI might in the future, on ARM) support
save/restore but not live migration.  Currently it supports neither.

I don't know whether libvirt's capabilities system can capture this
distinction.  libvirt.git#1d37a4c4 "libxl: detect support for save and
restore" suggests not.

Perhaps relatedly, I am not sure whether this test should be changed
to look for the xpath
  /capabilities/host/migration_features/live
instead.  The schema (libvirt.git/docs/schemas/capability.rng) seems
to suggest that it probably should.

For now, this osstest commit has no ultimate functional change (with
libvirt output as it currently appears to be on real hosts).

Signed-off-by: Ian Jackson 
CC: Julien Grall 
CC: Jim Fehlig 
---
 Osstest/Toolstack/libvirt.pm | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm
index 69ff0bb..b7db7af 100644
--- a/Osstest/Toolstack/libvirt.pm
+++ b/Osstest/Toolstack/libvirt.pm
@@ -21,6 +21,8 @@ use strict;
 use warnings;
 
 use Osstest::TestSupport;
+use XML::LibXML::XPathContext;
+use XML::LibXML;
 
 sub new {
 my ($class, $ho, $methname,$asset) = @_;
@@ -72,6 +74,17 @@ sub shutdown_wait ($$$) {
 guest_await_destroy($gho,$timeout);
 }
 
+sub _check_capability ($$) {
+my ($self, $xpath) = @_;
+my $ho = $self->{Host};
+my $caps = target_cmd_output_root($ho, 'virsh capabilities');
+my $stash = open_unique_stashfile('virsh-capabilities');
+my $dom = XML::LibXML->load_xml(string => $caps);
+my $xc = XML::LibXML::XPathContext->new($dom);
+my @nodes = $xc->findnodes($xpath);
+return !!@nodes;
+}
+
 sub migrate_check ($$) {
 my ($self, $local) = @_;
 my $rc;
@@ -80,9 +93,7 @@ sub migrate_check ($$) {
 # local migration is not supported
 $rc = 1;
 } else {
-   my $ho = $self->{Host};
-   my $caps = target_cmd_output_root($ho, "virsh capabilities");
-   $rc = ($caps =~ m//) ? 0 : 1
+   $rc = $self->check_capability('/capabilities/host/migration_features');
 }
 
 logm("rc=$rc");
-- 
2.1.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver breakage -- where to define LIBXL_API_VERSION?

2016-06-28 Thread Ian Jackson
Jan Beulich writes ("Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver 
breakage -- where to define LIBXL_API_VERSION?"):
> On 27.06.16 at 18:54,  wrote:
> > OK.  Thanks for the feedback.  I'll go ahead with my plan with the
> > git commit ids named in my earlier email.
> 
> The only (hopefully highly theoretical) problem I see with this is that
> we may end up picking a libvirt commit which subsequently (e.g. via
> a libxl backport) turns out to have an issue. Such a problem could be
> dealt with in the suggested the stable branch tracking model (or any
> other model not dealing with something completely frozen).

I don't think there is anything stopping us manually updating one of
these frozen "branches", should such a situation occur.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver breakage -- where to define LIBXL_API_VERSION?

2016-06-27 Thread Ian Jackson
Jim Fehlig writes ("Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver 
breakage -- where to define LIBXL_API_VERSION?"):
> On 06/27/2016 10:12 AM, Ian Jackson wrote:
> > Does libvirt have stable release branches ?  One approach would be to
> > have osstest track a suitable libvirt stable release branche for each
> > Xen stable release branch.
> 
> I see Daniel already answered this question.
> 
> >
> > That would involve setting up a push gate for each of the chosen
> > libvirt stable branches.  That would be worthwhile if we expect those
> > stable branches to acquire commits which break Xen, and which we could
> > like to be told about.  But I'm not sure that's the case.
> 
> I occasionally backport Xen bug fixes to -maint branches. Cole has
> also grabbed some Xen bug fixes when making a stable release of a
> -maint branch. But such backports should be trivial and obvious bug
> fixes that shouldn't cause build or runtime breakage with Xen.

OK.  Thanks for the feedback.  I'll go ahead with my plan with the
git commit ids named in my earlier email.

Thanks,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver breakage -- where to define LIBXL_API_VERSION?

2016-06-27 Thread Ian Jackson
Daniel P. Berrange writes ("Re: [libvirt] [Xen-devel] Fixing libvirt's libxl 
driver breakage -- where to define LIBXL_API_VERSION?"):
> On Mon, Jun 27, 2016 at 04:54:35PM +0100, Ian Jackson wrote:
> > Created the following branch refs on xenbits in the toplevel
> > libvirt.git:
> > 
> >  osstest/frozen/xen-4.3-testing 9a0c7f5f834185db9017c34aabc03ad99cf37bed
> >  osstest/frozen/xen-4.4-testing 33fb8ff185846a7b4974105d2c9400690a6f95cf
> >  osstest/frozen/xen-4.5-testing cda1cc170f07b45911b3dad03e42c8ebfc210fa1
> >  osstest/frozen/xen-4.6-testing eac167e2610d3e59b32f7ec7ba78cbc8c420a425
> >  osstest/frozen/xen-4.7-testing 1a41ed5af5e1704dd9b0bdca40df5c9cacbdabfc
> 
> How did you pick those hashes ? Would it make more sense to pick the
> nearest libvirt release tag ? eg v1.3.2 instead of 33fb8ff18584 ?
> 
> > These were those tested by the following `tolerable' osstest push gate
> > flights for the corresponding Xen tree:
> > 
> >  xen-4.3-testing 9a0c7f5f8341 86673
> >  xen-4.4-testing 33fb8ff18584 85031
> >  xen-4.5-testing cda1cc170f07 83135
> >  xen-4.6-testing eac167e2610d 96031
> >  xen-4.7-testing 1a41ed5af5e1 95728

I picked them by searching my mail archives for osstest `tolerable'
push gate flights - ie, passes in our CI system.

That minimises the risk that the selected versions are themselves
troublesome for some reason, needing another round of adjustment.

It might indeed be better to convert them to nearby release tags.
However:

mariner:libvirt> git-describe 9a0c7f5f834185db9017c34aabc03ad99cf37bed
v1.3.2-202-g9a0c7f5
mariner:libvirt> git-describe 33fb8ff185846a7b4974105d2c9400690a6f95cf
v1.3.2-rc2-1-g33fb8ff
mariner:libvirt> git-describe cda1cc170f07b45911b3dad03e42c8ebfc210fa1
v1.3.1-262-gcda1cc1
mariner:libvirt> git-describe eac167e2610d3e59b32f7ec7ba78cbc8c420a425
v1.3.5-318-geac167e
mariner:libvirt> git-describe 1a41ed5af5e1704dd9b0bdca40df5c9cacbdabfc
v1.3.5-129-g1a41ed5
mariner:libvirt> 

So in most cases these hashes are well away from a release tag.

Does libvirt have stable release branches ?  One approach would be to
have osstest track a suitable libvirt stable release branche for each
Xen stable release branch.

That would involve setting up a push gate for each of the chosen
libvirt stable branches.  That would be worthwhile if we expect those
stable branches to acquire commits which break Xen, and which we could
like to be told about.  But I'm not sure that's the case.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver breakage -- where to define LIBXL_API_VERSION?

2016-06-27 Thread Ian Jackson
(Adding Jan Beulich)

Ian Jackson writes ("Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver 
breakage -- where to define LIBXL_API_VERSION?"):
> It seems that the libvirt LIBXL_API_VERSION is now rather higher, at
> 0x040400, since libvirt#fccf27253ced
>   libxl: switch to using libxl_domain_create_restore from v4.4 API
> 
> One unfortunate effect of this is to break the osstest tests of the
> Xen 4.3 stable branch, which for the moment is still allegedly in
> security support.
> 
> I can't really see a way that this kind of problem could be avoided
> in principle, without
>   - providing a more sophisticated way for libxl callers to set the
> requested version
>   - providing more compatibility code in libvirt, too, and retaining
> it for some time
> 
> I think instead that it would probably be better for osstest to
> "freeze" the version of libvirt that it is using, every time we branch
> Xen.
> 
> So Xen 4.4 would be tested with whatever libvirt we were using when
> the stable branch for Xen 4.4 was made, and so on.
> 
> Does that sound sensible ?

In the assumption that it is, I have:

Created the following branch refs on xenbits in the toplevel
libvirt.git:

 osstest/frozen/xen-4.3-testing 9a0c7f5f834185db9017c34aabc03ad99cf37bed
 osstest/frozen/xen-4.4-testing 33fb8ff185846a7b4974105d2c9400690a6f95cf
 osstest/frozen/xen-4.5-testing cda1cc170f07b45911b3dad03e42c8ebfc210fa1
 osstest/frozen/xen-4.6-testing eac167e2610d3e59b32f7ec7ba78cbc8c420a425
 osstest/frozen/xen-4.7-testing 1a41ed5af5e1704dd9b0bdca40df5c9cacbdabfc

These were those tested by the following `tolerable' osstest push gate
flights for the corresponding Xen tree:

 xen-4.3-testing 9a0c7f5f8341 86673
 xen-4.4-testing 33fb8ff18584 85031
 xen-4.5-testing cda1cc170f07 83135
 xen-4.6-testing eac167e2610d 96031
 xen-4.7-testing 1a41ed5af5e1 95728

And I have prepared the patch below, which (together with a
prerequisite, in my tree) will implement this in osstest.

Ian.

>From 5d1c91d3c53b580305e96d62f8ca84f85f8d3011 Mon Sep 17 00:00:00 2001
From: Ian Jackson 
Date: Mon, 27 Jun 2016 16:49:52 +0100
Subject: [OSSTEST PATCH] cr-daily-branch: libvirt: use frozen version on
 stable branches

libvirt master might increase its LIBXL_API_VERSION.  When this feeds
through osstest it can cause the push gates of Xen stable branches to
break.

So for stable Xen branches do not track libvirt upstream.  Instead,
use a frozen revision.  (Only for main push gate tests of stable Xen
branches.)

The frozen branch is never going to be updated so it is not suitable
for other kinds of uses.  In particular it won't get security fixes.
So we call the refs   osstest/frozen/xen-K.L-testing  to discourage
users from using them.

Deployment note: The Xen release checklist needs a new item "add this
frozen libvirt branch".

Signed-off-by: Ian Jackson 
---
 cr-daily-branch | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/cr-daily-branch b/cr-daily-branch
index 8b7c789..21780b8 100755
--- a/cr-daily-branch
+++ b/cr-daily-branch
@@ -186,6 +186,12 @@ if [ "x$REVISION_OVMF" = x ]; then
 fi
 fi
 if [ "x$REVISION_LIBVIRT" = x ]; then
+   case "$xenbranch" in
+   xen-[0-9]*-testing)
+   BASE_TAG_LIBVIRT=osstest/frozen/$xenbranch
+   export BASE_TAG_LIBVIRT
+   ;;
+   esac
determine_version REVISION_LIBVIRT libvirt LIBVIRT
export REVISION_LIBVIRT
 fi
-- 
2.1.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver breakage -- where to define LIBXL_API_VERSION?

2016-06-27 Thread Ian Jackson
Jim Fehlig writes ("Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver 
breakage -- where to define LIBXL_API_VERSION?"):
> I think that is a good idea too. I've sent a patch setting
> LIBXL_API_VERSION to 0x040200
> 
> https://www.redhat.com/archives/libvir-list/2016-April/msg00771.html

It seems that the libvirt LIBXL_API_VERSION is now rather higher, at
0x040400, since libvirt#fccf27253ced
  libxl: switch to using libxl_domain_create_restore from v4.4 API

One unfortunate effect of this is to break the osstest tests of the
Xen 4.3 stable branch, which for the moment is still allegedly in
security support.

I can't really see a way that this kind of problem could be avoided
in principle, without
  - providing a more sophisticated way for libxl callers to set the
requested version
  - providing more compatibility code in libvirt, too, and retaining
it for some time

I think instead that it would probably be better for osstest to
"freeze" the version of libvirt that it is using, every time we branch
Xen.

So Xen 4.4 would be tested with whatever libvirt we were using when
the stable branch for Xen 4.4 was made, and so on.

Does that sound sensible ?

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] gnulib and 32-bit libvirt build, rpl_canonicalize_file_name

2016-06-20 Thread Ian Jackson
Ján Tomko writes ("Re: [libvirt] gnulib and 32-bit libvirt build, 
rpl_canonicalize_file_name"):
> This has been fixed in gnulib already:
> http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=246b3b2
> commit 246b3b28808ee5f4664be674dce573af9497fc7a
> Author: Eric Blake 
> CommitDate: 2016-05-27 14:04:35 -0600
> 
> canonicalize: Fix broken probe for realpath.

Great, thanks.

> Pulled into libvirt by:
> commit 0ebd0b19d35b93eeea9d49318d7a69e147da
> Author: Eric Blake 
> CommitDate: 2016-05-27 14:06:45 -0600
> 
> maint: update to latest gnulib
> 
> Fix a regression in checking for realpath (which caused link
> failures regarding duplicate rpl_canonicalize_file_name), and
> fix the mingw build regarding unsetenv.
> 
> * .gnulib: Update to latest.
> 
> Signed-off-by: Eric Blake 

Right.

> The osstest service seems to be building libvirt commit:
> commit 6ec319b84f67d72bf59fe7e0fd41d88ee9c393c7
> Author: John Ferlan 
> 
> logical: Clean up allocation when building regex on the fly
> 
> git describe: v1.3.1-92-g6ec319b contains: v1.3.2-rc1~262
> 
> which used gnulib from before the commit that broke it.

OK, great.  Sorry for the noise.

Thanks,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] gnulib and 32-bit libvirt build, rpl_canonicalize_file_name

2016-06-20 Thread Ian Jackson
The Xen Project's automated test (CI) system has reported a build
failure in libvirt; libvirt uses gnulib.  osstest has bisected it to a
specific gnulib commit.  (Email from osstest is quoted below.)

The error message is:

  ../gnulib/lib/.libs/libgnu.a(canonicalize-lgpl.o): In function
  `rpl_canonicalize_file_name':

  
/home/osstest/build.95966.build-i386-libvirt/libvirt/gnulib/lib/canonicalize-lgpl.c:400:
 multiple definition of `rpl_canonicalize_file_name'

  
.libs/virpcimock_la-virpcimock.o:/home/osstest/build.95966.build-i386-libvirt/libvirt/tests/virpcimock.c:954:
 first defined here

  collect2: error: ld returned 1 exit status

Do you already know about this ?  Would you like us to submit a
patch to gnulib ?

Thanks,
Ian.
 

osstest service owner writes ("[qemu-upstream-4.3-testing bisection] complete 
build-i386-libvirt"):
> branch xen-4.3-testing
> xenbranch xen-4.3-testing
> job build-i386-libvirt
> testid libvirt-build
> 
> Tree: libvirt git://xenbits.xen.org/libvirt.git
> Tree: libvirt_gnulib git://git.sv.gnu.org/gnulib.git
> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
> 
> *** Found and reproduced problem changeset ***
> 
>   Bug is in tree:  libvirt_gnulib git://git.sv.gnu.org/gnulib.git
>   Bug introduced:  54615b95ff238e235e806855efc46a9abad09f2e
>   Bug not present: e78f894d0bdc770101bc040613f4ea94e45f38f7
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/95966/
> 
> 
>   commit 54615b95ff238e235e806855efc46a9abad09f2e
>   Author: Paul Eggert 
>   Date:   Sat Feb 6 18:11:48 2016 -0800
>   
>   misc: port better to gcc -fsanitize=address
>   
>   Without these patches, ./configure CFLAGS='-fsanitize=address'
>   would compute incorrect values.  This patch fixes some (but not all)
>   test failures with recent glibc, with this configuration.
>   * m4/acl.m4 (gl_ACL_GET_FILE):
>   * m4/calloc.m4 (_AC_FUNC_CALLOC_IF):
>   * m4/canonicalize.m4 (gl_FUNC_REALPATH_WORKS):
>   * m4/d-ino.m4 (gl_CHECK_TYPE_STRUCT_DIRENT_D_INO):
>   * m4/duplocale.m4 (gl_FUNC_DUPLOCALE):
>   * m4/getcwd.m4 (gl_FUNC_GETCWD_NULL):
>   * m4/getdelim.m4 (gl_FUNC_GETDELIM):
>   * m4/getgroups.m4 (gl_FUNC_GETGROUPS):
>   * m4/getline.m4 (gl_FUNC_GETLINE):
>   * m4/malloc.m4 (_AC_FUNC_MALLOC_IF):
>   * m4/realloc.m4 (_AC_FUNC_REALLOC_IF):
>   * m4/regex.m4 (gl_REGEX):
>   * m4/strndup.m4 (gl_FUNC_STRNDUP):
>   * tests/test-calloc-gnu.c (main):
>   * tests/test-duplocale.c (main):
>   * tests/test-getgroups.c (main):
>   * tests/test-getline.c (main):
>   * tests/test-inttostr.c (main):
>   * tests/test-localename.c (test_locale_name)
>   (test_locale_name_thread, test_locale_name_environ)
>   (test_locale_name_default):
>   * tests/test-regex.c (main):
>   * tests/test-setlocale1.c (main):
>   * tests/test-stat.h (test_stat_func):
>   Free heap-allocated storage before exiting.
>   * m4/asm-underscore.m4 (gl_ASM_SYMBOL_PREFIX):
>   Don't match *_foo symbols inserted by AddressSanitizer.
>   * tests/test-regex.c, tests/test-stat.c: Include stdlib.h, for 'free'.
> 
> 
> For bisection revision-tuple graph see:
>
> http://logs.test-lab.xenproject.org/osstest/results/bisect/qemu-upstream-4.3-testing/build-i386-libvirt.libvirt-build.html
> Revision IDs in each graph node refer, respectively, to the Trees above.
> 
> 
> Running cs-bisection-step 
> --graph-out=/home/logs/results/bisect/qemu-upstream-4.3-testing/build-i386-libvirt.libvirt-build
>  --summary-out=tmp/95966.bisection-summary --basis-template=80927 
> --blessings=real,real-bisect qemu-upstream-4.3-testing build-i386-libvirt 
> libvirt-build
> Searching for failure / basis pass:
>  95930 fail [host=fiano0] / 80927 [host=pinot1] 80730 [host=baroque0] 77983 
> [host=pinot0] 77930 [host=italia0] 77853 [host=italia1] 62112 [host=nocera1] 
> 62045 [host=nocera1] 61805 [host=nocera1] 61729 [host=nocera0] 61620 
> [host=pinot1] 60903 [host=nocera1] 60700 [host=nocera0] 60676 [host=nocera0] 
> 60159 [host=italia1] 58381 [host=pinot1] 57828 [host=nocera0] 56721 
> [host=nocera1] 56679 [host=nocera0] 56637 [host=nocera1] 56604 [host=nocera1] 
> 56583 [host=italia1] 56556 [host=nocera0] 56506 [host=nocera1] 56427 
> [host=nocera1] 56373 [host=nocera0] 55875 [host=nocera1] 50282 [host=fiano1] 
> 36518 [host=scape-moth] 31652 [host=lace-bug] template as basis? using 
> template as basis.
> Failure / basis pass flights: 95930 / 80927
> (tree with no url: seabios)
> Tree: libvirt git://xenbits.xen.org/libvirt.git
> Tree: libvirt_gnulib git://git.sv.gnu.org/gnulib.git
> Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
> Tree: qemuu git://xenbits.xen.org/qemu-xen.git
> Tree: xen git://xenbits.xen.org/xen.git
> Latest eac167e2610d3e59b32f7ec7ba78cbc8c420a425 
> 246b3b28808ee5f46

Re: [libvirt] [Xen-devel] Fixing libvirt's libxl driver breakage -- where to define LIBXL_API_VERSION?

2016-04-14 Thread Ian Jackson
Dario Faggioli writes ("Re: [Xen-devel] Fixing libvirt's libxl driver breakage 
-- where to define LIBXL_API_VERSION?"):
> And, in those cases, usage should be gated by the appropriate
> LIBXL_HAVE_FOOBAR symbol, which I see in the sources (e.g.,
> for LIBXL_HAVE_NO_SUSPEND_RESUME or LIBXL_HAVE_DOMAIN_NODEAFFINITY) to
> be the case, so good again. :-)

If libvirt specifies LIBXL_API_VERSION then it does not need to check
any LIBXL_HAVE_*, since it will actually get an API corresponding to
the specified API_VERSION.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] libxl: libxl_domain_create_restore has an extra argument

2016-04-05 Thread Ian Jackson
Wei Liu writes ("Re: [PATCH] libxl: libxl_domain_create_restore has an extra 
argument"):
> CC Jim as well
> 
> On Tue, Apr 05, 2016 at 03:20:12PM +0100, Wei Liu wrote:
> > In the latest libxenlight code, libxl_domain_create_restore accepts a
> > new argument. Update libvirt's libxl driver for that. Use the macro
> > provided by libxenlight to detect which version should be used.
> > 
> > The new parameter (send_back_fd) is set to -1 because libvirt provides
> > no such fd.
...
> > -#ifdef LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS
> > +#if defined(LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_BACK_FD)
> > +params.checkpointed_stream = 0;
> > +ret = libxl_domain_create_restore(cfg->ctx, &d_config, &domid,
> > +  restore_fd, -1, ¶ms, NULL,
> > +  &aop_console_how);
> > +#elif defined(LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS)
> >  params.checkpointed_stream = 0;
> >  ret = libxl_domain_create_restore(cfg->ctx, &d_config, &domid,
> >restore_fd, ¶ms, NULL,

Another approach would be 

 ret = libxl_domain_create_restore(cfg->ctx, &d_config, &domid,
   restore_fd,
#ifdef LIBXL_HAVE_DOMAIN_CREATE_RESTORE_SEND_BACK_FD
   -1,
#endif
#ifdef LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS
   ¶ms,
#endif
   NULL, &aop_console_how);

But which to choose is a matter of taste.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH RFC] libxl: use libxl_event_wait to process libxl events

2015-11-23 Thread Ian Jackson
Jim Fehlig writes ("[PATCH RFC] libxl: use libxl_event_wait to process libxl 
events"):
> Prior to this patch, libxl events were delivered to libvirt via
> the libxlDomainEventHandler callback registered with libxl.
> Documenation in $xensrc/tools/libxl/libxl_event.h states that the
> callback "may occur on any thread in which the application calls
> libxl". This can result in deadlock since many of the libvirt
> callees of libxl hold a lock on the virDomainObj they are working
> on. When the callback is invoked, it attempts to find a virDomainObj
> corresponding to the domain ID provided by libxl. Searching the
> domain obj list results in locking each obj before checking if it is
> active, and its ID equals the requested ID. Deadlock is possible
> when attempting to lock an obj that is already locked further up
> the call stack. Indeed, Max Ustermann recently reported an instance
> of this deadlock

This sounds like a very plausible failure mode.

> https://www.redhat.com/archives/libvir-list/2015-November/msg00130.html
> 
> This patch moves processing of libxl events to a thread, where
> libxl_event_wait() is used to collect events. This allows processing
> libxl events asynchronously in libvirt, avoiding the deadlock.

The reasoning is sound and the remedy is IMO correct.  (However, you
mean "this allows processing libxl events _synchronously_", since what
you are doing is serialising them all into your main loop.)

So:

Acked-by: Ian Jackson 

NB that I have not reviewed the patch in detail.  I can do that if you
like, although of course my knowledge of the innards of libvirt is not
wonderful.

> The only reservations I have about this patch come from comments
> about libxl_event_wait in libxl_event.h
> 
>   Like libxl_event_check but blocks if no suitable events are
>   available, until some are.  Uses libxl_osevent_beforepoll/
>   _afterpoll so may be inefficient if very many domains are being
>   handled by a single program.

If this turns out to be a problem in practice, we will improve libxl's
data structures to not involve so many linear searches.  (In fact I
think you are probably already calling synchronous libxl ao functions,
which have the same performance properties, although this is not
documented.)

Given what you say above I don't think there is a reasonable
alternative remedy.  So you should go ahead with this patch.

Regards,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] [vbox-dev] Assert with libvirt + xen hvm

2015-11-11 Thread Ian Jackson
Cole Robinson writes ("Re: [Xen-devel] [libvirt] [vbox-dev] Assert with libvirt 
+ xen hvm"):
> I took a cursory look at libxl's sigchld handling... it's intense to say the
> least, but there's some driver options that tweak the handling. Maybe there's
> a simple fix.

Sadly the way that there's a single SIGCHLD handler in the Unix API is
extremely awkward in multithreaded programs.  The complicated code in
libxl is trying to help cope with that.

> On 01/26/2015 08:17 AM, Klaus Espenlaub wrote:
> > The rationale why the runtime installs a dummy signal handler is that if
> > SIGCHLD is set to be ignored (that's the default) then POSIX compliant
> > waitpid() won't work. It's mentioned in the wait(2) man page on my linux
> > system, see the notes about the ECHILD error code.

I think you are wrong about this.

SIGCHLD (like all signals) is set to SIG_DFL by default.  But it is
only SIG_IGN that causes automatic reaping of children.  I normally
use the online Open Group spec to check this kind of thing - it's more
authoritative than manpages.

  
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03

  SIG_IGN

  ...

  If the action for the SIGCHLD signal is set to SIG_IGN, child
  processes of the calling processes shall not be transformed into
  zombie processes when they terminate. If the calling process
  subsequently waits for its children, and the process has no
  unwaited-for children that were transformed into zombie processes,
  it shall block until all of its children terminate, and wait(),
  waitid(), and waitpid() shall fail and set errno to
  [ECHILD].

I don't know exactly what virtualbox does with children but if the
only reason it is setting a SIGCHLD handler is to avoid its children
being automatically reaped, leaving it set to SIG_DFL will work just
fine.

If virtualbox always reaps its children synchronously (ie, it does not
use SIGCHLD to get notified of child death) then setting the libxl
signal mode to

  libxl_sigchld_owner_libxl_always_selective_reap

in libvirt should work.  This is what libvirt (at laast in master)
does.

There would then be no need for virtualbox to do anything to SIGCHLD.
Although, it might be worth checking that it isn't SIG_IGN and
screaming somewhere if it is.

Incidentally it seems to me possible that the reason why virtualbox
explictly sets a SIGCHLD handler may be that at some point in the past
you had encountered a bug with someone in the same process setting
SIGCHLD to SIG_IGN.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 0/2] Re: libvirtd live-locking on CTX_LOCK when doing 'virsh save /tmp/blah' with guest corrupting memory (on purpose).

2015-04-16 Thread Ian Jackson
Jim Fehlig writes ("Re: [PATCH 0/2] Re: libvirtd live-locking on CTX_LOCK when 
doing 'virsh  save /tmp/blah' with guest corrupting memory (on 
purpose)."):
> On 04/14/2015 11:31 AM, Ian Jackson wrote:
> > I have produced what I think are two patches that will fix this.  I
> > have compiled them but I haven't tested them.  Konrad, are you able to
> > check whether they fix your bug ?
> 
> I too saw this bug just before Konrad's report, but the patches don't seem to 
> help.  Running a script that continually saves and restores domains will 
> eventually lock libvirtd with essentially the same traces reported by Konrad

I'm a total idiot.  I do the recheck but I don't pay any attention to
the result.

I will send an updated approach which does this centrally.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/2] Re: libvirtd live-locking on CTX_LOCK when doing 'virsh save /tmp/blah' with guest corrupting memory (on purpose).

2015-04-14 Thread Ian Jackson
Konrad Rzeszutek Wilk writes ("libvirtd live-locking on CTX_LOCK when doing 
'virsh  save /tmp/blah' with guest corrupting memory (on purpose)."):
> It looks like thread #10 is blocking in libxl_read_exactly waiting
> for 'libxl-save-helper'. Said application (see below) has dispatched
> an message through helper_getreply and is blocking on __read_nocancel.

This is not supposed to block.

helper_stdout_readable assumes that the fd is actually readable.
However, for complicated reasons it can happen in a multithreaded
program that the fd was _reviously_ readable and is now no longer.

This was not clearly documented in the internal API documentation.

I have produced what I think are two patches that will fix this.  I
have compiled them but I haven't tested them.  Konrad, are you able to
check whether they fix your bug ?

If they do they are candidates for backporting.

Thanks,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] [PATCH V2] libxl: Set path to console on domain startup.

2014-12-16 Thread Ian Jackson
Ian Jackson writes ("Re: [Xen-devel] [PATCH V2] libxl: Set path to console on 
domain startup."):
> I mention all of these for completeness, and for future reference for
> anyone reading the archives later.
> 
> In libxl you want option I.

I mean `in xl you want option I'.  In libvirt you probably want that
too.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] [PATCH V2] libxl: Set path to console on domain startup.

2014-12-16 Thread Ian Jackson
Anthony PERARD writes ("Re: [Xen-devel] [PATCH V2] libxl: Set path to console 
on domain startup."):
> On Tue, Dec 16, 2014 at 09:30:28AM +, Ian Campbell wrote:
> > Unless by "not running" you meant bottlenecked or not keeping up
> > perhaps.
> 
> Well, I did meant no xenconsoled process. But after, I also tried `kill
> -STOP`, but the same things is happening.

I think this is a bug.  I imagine that you can cause `xl create -c' to
malfunction too.

> > > Or, should I set the callback to NULL and have the
> > > domain_create_console_available event sent through
> > > the callback set by libxl_event_register_callbacks()?
> > 
> > That would make sense, except I don't see libxl_evdisable_foo for these
> > events, so I'm not sure it is possible.
> 
> Well, the domain_create_console_available event is report by
> libxl__ao_progress_report which will either callback() or call
> libxl__event_occurred(). So does not seams better to set callback to
> NULL.

The console available notification is only available via the ao
progress mechanism.  So you have to pass a non-NULL aop_console_how,
and the notification will always be generated during the create.

Your options for having the notification delivered are:

I. aop_console_how->callback != NULL:

   libxl will call
 aop_console_how->callback(aop_console_how->for_callback)

II. aop_console_how->callback == NULL:

   libxl will construct an libxl_event *event with
  event->domid = the domid of the domain being constructed
  event->domuuid = its uuid
  event->for_user = aop_console_how->for_event
  event->type = LIBXL_EVENT_TYPE_DOMAIN_CREATE_CONSOLE_AVAILABLE

What happens to that event depends on whether and how
libxl_event_register_callbacks has been called.

II(a): If libxl_event_register_callbacks(,hooks,user) was called
   and also the bit is set ie
 !!(hooks->event_occurs_mask
  & (1UL << LIBXL_EVENT_TYPE_DOMAIN_CREATE_CONSOLE_AVAILABLE)):

   libxl will call
  hooks->event_occurs(user,event)

II(b): If libxl_event_register_callbacks(,hooks,user) was NOT called
   or if the bit is clear:

   libxl will queue the event for retrieval by libxl_event_check
   or libxl_event-wait.  (And if the application doesn't call those,
   the application will never be notified; the event will be
   retained until the libxl ctx is destroyed and then discarded.)

I mention all of these for completeness, and for future reference for
anyone reading the archives later.

In libxl you want option I.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility

2014-01-30 Thread Ian Jackson
Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD 
flexibility"):
> Ok, thanks.  I'm currently testing on your git branch referenced earlier
> in this thread
> 
> git://xenbits.xen.org/people/iwj/xen.git#wip.enumerate-pids-v2.1

Great.  That's the one.  My current version is pretty much identical -
some unused variables deleted and comments edited.

> >  * You need to fix the timer deregistration arrangements in the
> >libvirt/libxl driver to avoid the crash you identified the other day.
> 
> Yes, I'm testing a fix now.

Great.

> >  * Something needs to be done about the 20ms slop in the libvirt event
> >loop (as it could cause libxl to lock up).  If you can't get rid of
> >it in the libvirt core, then adding 20ms to the every requested
> >callback time in the libvirt/libxl driver would work for now.
> >   
> 
> The commit msg adding the fuzz says
> 
> Fix event test timer checks on kernels with HZ=100
>
> On kernels with HZ=100, the resolution of sleeps in poll() is
> quite bad. Doing a precise check on the expiry time vs the
> current time will thus often thing the timer has not expired
> even though we're within 10ms of the expected expiry time. This
> then causes another pointless sleep in poll() for <10ms. Timers
> do not need to have such precise expiration, so we treat a timer
> as expired if it is within 20ms of the expected expiry time. This
> also fixes the eventtest.c test suite on kernels with HZ=100

I think this is a bug in the kernel.  poll() may sleep longer, but not
shorter, than expected.

> * daemon/event.c: Add 20ms fuzz when checking for timer expiry
> 
> I could handle this in the libxl driver as you say, but doing so makes
> me a bit nervous.  Potentially locking up libxl makes me nervous too :).

I was going to say that the code in libxl_osevent_occurred_timeout
checked the time against the requested time and would ignore the event
(thinking it was stale) if it was too early.

But in fact now that I read the code this is not true.  In fact I
think it will work OK (modulo some things happening too soon).  So the
upshot is that I still think this is a bug in libvirt but I don't
think it's critical to fix it.

Sorry to cause undue alarm.

> Yes.  I've been running my tests for about 24 hours now with no problems
> noted.  The tests include starting/stopping a persistent VM,
> creating/stopping a transient VM, rebooting a persistent VM,
> saving/restoring a transient VM, and getting info on all of these VMs.
> 
> I should probably add saving/restoring a persistent VM to the mix since
> the associated libxl_ctx is never freed.  Only when a persistent VM is
> undefined is the libxl_ctx freed.

Right.  Great.

Thanks,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility

2014-01-30 Thread Ian Jackson
Jim Fehlig writes ("Re: [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD 
flexibility"):
> Looking at the libvirt code again, it seems a single thread services the
> event loop. See virNetServerRun() in src/util/virnetserver.c. Indeed, I
> see the same thread ID in all the timer and fd callbacks. One of the
> libvirt core devs can correct me if I'm wrong.

OK.  So just to recap where we stand:

 * I think libxl needs the SIGCHLD flexibility series.  I'll repost
   that (v3) but it's had hardly any changes.

 * You need to fix the timer deregistration arrangements in the
   libvirt/libxl driver to avoid the crash you identified the other day.

 * Something needs to be done about the 20ms slop in the libvirt event
   loop (as it could cause libxl to lock up).  If you can't get rid of
   it in the libvirt core, then adding 20ms to the every requested
   callback time in the libvirt/libxl driver would work for now.

 * I think we can get away with not doing anything about the fd
   deregistration race in libvirt because both Linux and FreeBSD have
   behaviours that are tolerable.

 * libxl should have the fd deregistration race fixed in Xen 4.5.

Have you managed to fix the timer deregistration crash, and retest ?

Thanks,
Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: fork: SIGCHLD flexibility

2014-01-29 Thread Ian Jackson
Daniel P. Berrange writes ("Re: [libvirt] [Xen-devel] [PATCH 00/12] libxl: 
fork: SIGCHLD flexibility"):
> Yes, you are correct. The threading model for libvirtd is that the
> process thread leader executes the event loop, dealing with timer
> and file descriptor I/O callbacks. There are also 'n' worker threads
> which exclusively handle public API calls from libvirt clients. IOW
> all your timer callbacks will be in one thread - which also means
> you want your timer callbacks to be fast to execute.

Right.  Good.  All of libxl's callbacks should be fast.

There is still the problem that libxl functions on other threads may
make an fd deregistration call, while libvirt's event loop is still
polling (or selecting) on the fd in the main loop.

libxl might then close the fd, dup2 something onto it, leave the fd to
be reused for some other object.

Depending on the underlying kernel, that can cause side effects.  I
have reproduced the analogous bug with libxl's event loop, but I had
to use an fd connected to the controlling tty, from a background
process group, when the tty has stty tostop.  polling such a thing for
POLLOUT raises SIGTTOU.

Of course the libvirt xl driver still needs to cope with being told by
libxl to register or modify a timeout, or register deregister or
modify an fd, in other than the master thread.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Segfault in libxl_osevent_occurred_timeout [Was: Re: [libvirt-users] About debugging of libvirt.]

2013-12-19 Thread Ian Jackson
Dario Faggioli writes ("Segfault in libxl_osevent_occurred_timeout [Was: Re: 
[libvirt-users] About debugging of libvirt.]"):
> [Moving this to libvir, libvir-users in Bcc. Also, added xen-devel]

4.2.1 is lacking
  libxl: fix stale timeout event callback race
  libxl: fix stale fd event callback race

These are both in stable-4.2 as
  6f0f339dd4378d062a211969f45cd23af12bf386
  a87ef897295ec17788e41e9a8f4c0ada7a5a45f8

Around the time I was developing those in xen-unstable.git there were
some libvirt patches which were necessary too.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] libxl: cannot connect to PV console

2013-12-18 Thread Ian Jackson
Jim Fehlig writes ("Re: libxl: cannot connect to PV console"):
> Dario Faggioli wrote:
> > Hey Jim,
> >
> > I'm on libvirt.git's trunk and I see the following:
> > [xen@ghoul3 libvirt.git]$ sudo ./tools/virsh list
> >  IdName   State
> > 
> >  3 debian_32  running
> >  4 fedora20_64running
> >
> > [xen@ghoul3 libvirt.git]$ sudo ./tools/virsh console debian_32
> > Connected to domain debian_32
> > Escape character is ^]
> > error: internal error: cannot find character device (null)
> >
> > [xen@ghoul3 libvirt.git]$ sudo ./tools/virsh console fedora20_64
> > Connected to domain fedora20_64
> > Escape character is ^]
> > error: internal error: cannot find character device (null)
> >   
> 
> Looking at libxlDomainOpenConsole() in src/libxl/libxl_driver.c, it
> currently only supports a serial console.  Do you have one defined in
> these domains?  E.g.

If this is the root cause, it's a pretty bad error message.  Should I
or Dario take a look at the libvirt code to see if we can improve it ?

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [Xen-devel] Re: [PATCH V2] Add libxenlight driver

2011-02-23 Thread Ian Jackson
Paolo Bonzini writes ("[Xen-devel] Re: [PATCH V2] Add libxenlight driver"):
> On 02/23/2011 03:56 AM, Jim Fehlig wrote:
> > Add a new xen driver based on libxenlight [1], which is the primary
> > toolstack starting with Xen 4.1.0.  The driver is stateful, runs
> > privileged only, and is accessed with libxl:/// URI.
> 
> Why not let the user keep xen:/// ?

Because for now you want to be able to have both drivers.  Unless,
perhaps, it's somehow possible to select between them at runtime for
the same URI scheme ?

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] Re: [RFC] libxenlight driver

2011-01-21 Thread Ian Jackson
Stefano Stabellini writes ("Re: [Xen-devel] Re: [libvirt] [RFC] libxenlight 
driver"):
> I should also mention that one of the design goals of libxenlight is
> that no libxenlight users (toolstacks like libvirt) should need to use
> anything else but libxenlight to perform Xen operations. So you
> shouldn't need to call any xenctrl functions or deal with xenstore
> directly.

This is quite important.  We have tried to do this and we have
succeeded for the "xl" command's requirements, at least.

However since we currently have only one user of libxl, "xl" itself,
it is very likely that you will find that there are deficiencies in
the libxl API which make it difficult to use properly in libvirt.

We would very much like to fix these problems so please report them
all to us.  We will work with you to make libxl have the API you need
so that you can write libvirt properly.

Inevitably this will mean that the libvirt adaptation will need the
libxl from Xen 4.2, not the Xen 4.1 version which is currently in
freeze.  However in Xen 4.1 xend is still supported.

Please also send your libvirt patches to xen-devel for review, etc.

> I don't know much about libvirt but I would write a brand new
> libxenlight driver.

Yes, absolutely.  Having taken a look at the existing code in libvirt
for driving xenstore/xen/etc., none of it looks sensibly reuseable.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Xen-devel] Re: [RFC] libxenlight driver

2011-01-21 Thread Ian Jackson
Jim Fehlig writes ("Re: [Xen-devel] Re: [libvirt] [RFC] libxenlight driver"):
> Ian Campbell wrote:
> > The xapi toolstack which implements XenAPI is now open source as well
> > and is used in XCP (as well as XenServer).
> 
> But there is no intent on putting this toolstack in the traditional Xen
> releases from xen.org correct?

That's right, there isn't.

Ian.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list