Re: [libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

2017-12-22 Thread Nikolay Shirokovskiy


On 22.12.2017 17:13, John Ferlan wrote:
> [...]
> 
>>>
>>> Still adding the "virHashRemoveAll(dmn->servers);" into
>>> virNetDaemonClose doesn't help the situation as I can still either crash
>>> randomly or hang, so I'm less convinced this would really fix anything.
>>> It does change the "nature" of the hung thread stack trace though, as
>>> the second thread is now:
>>
>>> virHashRemoveAll is not enough now. Due to unref reordering, the last ref
>>> to @srv is unrefed after virStateCleanup. So we need to
>>> virObjectUnref(srv|srvAdm) before virStateCleanup. Or we can call
>>> virThreadPoolFree from virNetServerClose (as in the first version of the
>>> patch and as Erik suggests) instead of virHashRemoveAll.
>>
> 
> Patches w/
> 
>  1. Long pause before GetAllStats (without using [u]sleep)
>  2. Adjustment to call virNetServerServiceToggle in
> virNetServerServiceClose (instead of virNetServerDispose)
>  3. Call virHashRemoveAll in virNetDaemonClose
>  4. Call virThreadPoolFree in virNetServerClose
>  5. Perform Unref (adminProgram, srvAdm, qemuProgram, lxcProgram,
> remoteProgram, and srv) before virNetDaemonClose
> 
> Still has the virCondWait's - so as Daniel points out there's quite a
> bit more work to be done. Like most Red Hat engineers - I will not be
> very active over the next week or so (until the New Year) as it's a
> holiday break/vacation for us.
> 
> So unless you have the burning desire to put together some patches and
> do the work yourself, more thoughts/work will need to wait.
> 
> John
> 

OK. Happy holidays!

I'm going to check what's going on nevertheless ))

Nikolay

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

2017-12-22 Thread John Ferlan
[...]

>>
>> Still adding the "virHashRemoveAll(dmn->servers);" into
>> virNetDaemonClose doesn't help the situation as I can still either crash
>> randomly or hang, so I'm less convinced this would really fix anything.
>> It does change the "nature" of the hung thread stack trace though, as
>> the second thread is now:
> 
> virHashRemoveAll is not enough now. Due to unref reordering, the last ref to @srv is
> unrefed after virStateCleanup. So we need to virObjectUnref(srv|srvAdm) before
> virStateCleanup. Or we can call virThreadPoolFree from virNetServerClose (
> as in the first version of the patch and as Erik suggests) instead
> of virHashRemoveAll.
> 

Patches w/

 1. Long pause before GetAllStats (without using [u]sleep)
 2. Adjustment to call virNetServerServiceToggle in
virNetServerServiceClose (instead of virNetServerDispose)
 3. Call virHashRemoveAll in virNetDaemonClose
 4. Call virThreadPoolFree in virNetServerClose
 5. Perform Unref (adminProgram, srvAdm, qemuProgram, lxcProgram,
remoteProgram, and srv) before virNetDaemonClose (see the sketch below)
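
A rough sketch of the resulting teardown ordering, using the names from this
thread (not the literal libvirtd.c code, just the shape of the change):

 cleanup:
    /* Drop our references first, so the last @srv/@srvAdm reference is
     * released inside virNetDaemonClose (item 3) rather than after
     * virStateCleanup, and virNetServerClose frees the worker pool
     * (item 4) before the drivers go away. */
    virObjectUnref(adminProgram);
    virObjectUnref(srvAdm);
    virObjectUnref(qemuProgram);
    virObjectUnref(lxcProgram);
    virObjectUnref(remoteProgram);
    virObjectUnref(srv);
    virNetDaemonClose(dmn);
    virStateCleanup();
    virObjectUnref(dmn);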

Still has the virCondWait's - so as Daniel points out there's quite a
bit more work to be done. Like most Red Hat engineers - I will not be
very active over the next week or so (until the New Year) as it's a
holiday break/vacation for us.

So unless you have the burning desire to put together some patches and
do the work yourself, more thoughts/work will need to wait.

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 00/14] Fix race on srv->nclients_unauth and some other changes

2017-12-22 Thread Marc Hartmayer
On Thu, Dec 21, 2017 at 07:23 PM +0100, John Ferlan  wrote:
> On 12/21/2017 09:28 AM, Marc Hartmayer wrote:
>> This patch series fixes some locking issues, a memory leak, some other
>> cosmetic changes, and it fixes a bug that led to a libvirtd which
>> doesn't accept new connections.
>> 
>> Changelog:
>>  v1->v2:
>>   + Added r-b's
>>   + Patch 2: Replaced the patch with a new fix for the memory leak
>>   + Patch 5: Introduce *Locked and simplified the function as John suggested
>>   + Patch 7: Fixed coding style issues, merged in the former patch 9, 
>> reworded commit message
>>   + Patch 9: (former patch 10) reworded the commit message and fixed coding 
>> style issues
>>   + Patch 10: (former patch 11) Fixed coding style issues, moved check
>>   to ExecRestart, reworded function description of
>>   virNetServerSetClientAuthCompletedLocked
>>   + New patch 14: Add locking around the critical section in remoteSASLFinish
>> 
>> Marc Hartmayer (14):
>>   rpc: Remove duplicate declaration of virNetServerAddClient
>>   tests: virnetserverclienttest: Fix memory leak @client
>>   rpc: Use the enum value instead of a numerical value
>>   rpc: Add typedef for the anonymous enum used for authentication
>> methods
>>   rpc: Be more precise in which cases the authentication is needed and
>> introduce *Locked
>>   rpc: First test if authentication is required
>>   rpc: Refactor the condition whether a client needs authentication
>>   rpc: Correct locking and simplify the function
>>   rpc: Introduce virNetServerSetClientAuthenticated
>>   rpc: virnetserver: Fix race on srv->nclients_unauth
>>   tests: virnetdaemontest: Enable testing for 'auth_pending'
>>   rpc: Remove virNetServerClientNeedAuthLocked
>>   rpc: Replace virNetServerClientNeedAuth with
>> virNetServerClientIsAuthenticated
>>   remote: add locking around the critical section in remoteSASLFinish
>> 
>>  daemon/remote.c|  33 +++---
>>  src/libvirt_remote.syms|  14 +--
>>  src/rpc/virnetserver.c |  82 +-
>>  src/rpc/virnetserver.h |   6 +-
>>  src/rpc/virnetserverclient.c   | 118 
>> -
>>  src/rpc/virnetserverclient.h   |  11 +-
>>  src/rpc/virnetserverprogram.c  |   9 +-
>>  src/rpc/virnetserverservice.h  |   4 +-
>>  .../input-data-client-auth-pending-failure.json|  44 
>>  .../input-data-client-auth-pending.json|  70 
>>  .../virnetdaemondata/output-data-admin-nomdns.json |   4 +
>>  .../output-data-admin-server-names.json|   4 +
>>  .../virnetdaemondata/output-data-anon-clients.json |   2 +
>>  ...s.json => output-data-client-auth-pending.json} |   4 +-
>>  tests/virnetdaemondata/output-data-client-ids.json |   2 +
>>  .../output-data-client-timestamp.json  |   2 +
>>  .../output-data-initial-nomdns.json|   2 +
>>  tests/virnetdaemondata/output-data-initial.json|   2 +
>>  .../output-data-no-keepalive-required.json |   4 +
>>  tests/virnetdaemontest.c   |   2 +
>>  tests/virnetserverclienttest.c |   1 +
>>  21 files changed, 331 insertions(+), 89 deletions(-)
>>  create mode 100644 
>> tests/virnetdaemondata/input-data-client-auth-pending-failure.json
>>  create mode 100644 
>> tests/virnetdaemondata/input-data-client-auth-pending.json
>>  copy tests/virnetdaemondata/{output-data-client-ids.json => 
>> output-data-client-auth-pending.json} (94%)
>> 
>
> For patches 1-13,
>
> Reviewed-by: John Ferlan 
>
> Before pushing - probably should wait to make sure there's no other
> objections from anyone else. I'm guessing Dan will take a look.
>
> John

Thanks.

-- 
Beste Grüße / Kind regards
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Chairwoman of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Böblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v2 14/14] remote: add locking around the critical section in remoteSASLFinish

2017-12-22 Thread Marc Hartmayer
On Thu, Dec 21, 2017 at 07:20 PM +0100, John Ferlan  wrote:
> On 12/21/2017 09:29 AM, Marc Hartmayer wrote:
>> ...as there is an access to priv->sasl the priv->lock is needed.
>>
>> Signed-off-by: Marc Hartmayer 
>> Reviewed-by: Bjoern Walk 
>> Reviewed-by: Boris Fiuczynski 
>> ---
>>  daemon/remote.c | 20 +++-
>>  1 file changed, 15 insertions(+), 5 deletions(-)
>>
>
> Both callers remoteDispatchAuthSaslStart and remoteDispatchAuthSaslStep
> already have priv->lock taken (unless I'm missing something).


Ohhh, you're right! Sorry for that, and thanks for checking!

>
>
> John
>
>> diff --git a/daemon/remote.c b/daemon/remote.c
>> index b6fe6d8539ff..81d570b6e269 100644
>> --- a/daemon/remote.c
>> +++ b/daemon/remote.c
>> @@ -3389,6 +3389,9 @@ remoteSASLFinish(virNetServerPtr server,
>>  const char *identity;
>>  struct daemonClientPrivate *priv = virNetServerClientGetPrivateData(client);
>>  int ssf;
>> +int rv = 0;
>> +
>> +virMutexLock(&priv->lock);
>>
>>  /* TLS or UNIX domain sockets trivially OK */
>>  if (!virNetServerClientIsSecure(client)) {
>> @@ -3398,15 +3401,15 @@ remoteSASLFinish(virNetServerPtr server,
>>  VIR_DEBUG("negotiated an SSF of %d", ssf);
>>  if (ssf < 56) { /* 56 is good for Kerberos */
>>  VIR_ERROR(_("negotiated SSF %d was not strong enough"), ssf);
>> -return -2;
>> +goto rejected;
>>  }
>>  }
>>
>>  if (!(identity = virNetSASLSessionGetIdentity(priv->sasl)))
>> -return -2;
>> +goto rejected;
>>
>>  if (!virNetSASLContextCheckIdentity(saslCtxt, identity))
>> -return -2;
>> +goto rejected;
>>
>>  if (!(clnt_identity = virNetServerClientGetIdentity(client)))
>>  goto error;
>> @@ -3425,10 +3428,17 @@ remoteSASLFinish(virNetServerPtr server,
>>  virObjectUnref(priv->sasl);
>>  priv->sasl = NULL;
>>
>> -return 0;
>> + cleanup:
>> +virMutexUnlock(&priv->lock);
>> +return rv;
>>
>>   error:
>> -return -1;
>> +rv = -1;
>> +goto cleanup;
>> +
>> + rejected:
>> +rv = -2;
>> +goto cleanup;
>>  }
>>
>>  /*
>>
>
--
Beste Grüße / Kind regards
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Chairwoman of the Supervisory Board: Martina Koederitz
Management: Dirk Wittkopp
Registered office: Böblingen
Commercial register: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] RFC: Introduce a dlm-corosync for Lock manager plugin

2017-12-22 Thread Daniel P. Berrange
On Fri, Dec 22, 2017 at 12:59:24AM -0700, Lin Fu wrote:
> > How are locks acquired by libdlm scoped ?  The reason we have virtlockd is
> > that the fcntl() locks need to be held by a running process, and we wanted
> > them to persist across libvirtd restarts. This required holding them in a
> > separate process.
> Locks are managed by the 'dlm_controld' daemon. There is a flag,
> `LKF_PERSISTENT`, that achieves the goal of locks still existing after
> restarting libvirtd. As far as I currently understand, a lock would only
> disappear after rebooting the OS (if the locks are not released manually);
> at the same time, other nodes in the cluster could recover the lock context.
> > Are libdlm locks automatically released when the process that acquired
> > them dies, or is a manual release action required ?  If they always require
> > a manual release, then there would be no need to use virtlockd - the plugin
> > can just take care of acquire & release, and still cope with libvirtd
> > restarts.
> So why do I talk about virtlockd? It is about the lockid. And what is a
> lockid? In dlm, there are concepts of 'lockspace', 'resource' and 'lockid'.
> The relationship: multiple locks can be associated with the same resource,
> and one lockspace can contain multiple resources. When acquiring a lock on
> a resource, a lockid is returned by the acquiring API call.
> 
> Question: How could I know which lockid is associated with a particular
> resource? After all, all lockid information will generally be lost after
> restarting libvirtd.

It is valid for the lock manager plugin to create its own state files on
disk to record this information across restarts if desired. No need to pass
the info to virtlockd  in this case.
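
For illustration, the plugin could keep something as simple as a
line-per-lock state file mapping each resource to its lockid; a hypothetical
helper (not part of any existing API) might look like:

#include <stdio.h>
#include <stdint.h>

/* Append one "lockspace resource lockid" record so the mapping survives a
 * libvirtd restart; a matching loader would read the records back when the
 * plugin initialises. */
static int
dlmLockRecordSave(const char *statefile,
                  const char *lockspace,
                  const char *resource,
                  uint32_t lockid)
{
    FILE *fp = fopen(statefile, "a");

    if (!fp)
        return -1;

    fprintf(fp, "%s %s %u\n", lockspace, resource, lockid);
    fclose(fp);
    return 0;
}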


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

2017-12-22 Thread Daniel P. Berrange
On Fri, Dec 22, 2017 at 10:52:45AM +0300, Nikolay Shirokovskiy wrote:
> 
> 
> On 21.12.2017 15:14, Erik Skultety wrote:
> > On Thu, Dec 21, 2017 at 12:48:44PM +0300, Nikolay Shirokovskiy wrote:
> >>
> >>
> >> On 21.12.2017 11:49, Erik Skultety wrote:
> >>> On Thu, Dec 21, 2017 at 09:39:16AM +0100, Cedric Bosdonnat wrote:
>  Hi John,
> 
>  On Wed, 2017-12-20 at 16:52 -0500, John Ferlan wrote:
> > So I moved the Unref(dmn) after virStateCleanup... That of course led to
> > the virt-manager issue.  So I figured I'd give adding this patch in and
> > that then fixed the virt-manager crash (obviously as we'd be Unref'ing
> > all those servers we added).
> 
>  Seems like I forgot to git-send my patch yesterday evening: here it is:
> 
>  https://www.redhat.com/archives/libvir-list/2017-December/msg00739.html
> 
>  That fixes the virt-manager crash at least.
> >>>
> >>> Hi, so the patch looks good, at first glance it makes complete sense
> >>> to close the client connections when closing the server and unreffing
> >>> the objects when disposing of the server object. However, since this
> >>> and the other related issues have been bugging us for quite some time
> >>> I'm not going to say ACK just yet, I'd like to think about it for a
> >>> bit - but if someone else already has and they feel brave enough :P,
> >>> then I'm not going to block this to go in.
> >>>
> >>> Erik
> >>>
> >>
> >> Patch looks good to me too. But still the original "libvirtd: fix crash
> >> on termination" fixes another issue and if applied fixes the
> >> "virt-manager issue" as well, as John figured out.
> > 
> > Finally I'm back on track with this (sorry for it taking so long), although
> > you're right about this, it's not the correct fix, it's just a byproduct of
> > your patch, however, the whole thing about closing connections and releasing
> > memory is a bit of a mess. For example, right now, we only dispose of
> > service objs (but we don't close them, we do that in virNetServerClose),
> > but we both close and dispose of the client objs. Another thing, we
> > toggle the service to stop accepting any new connections, but that's
> > irrelevant at that point because we've closed them already by that time
> > as a product of calling
> > virNetDaemonClose->virNetServerClose->virNetServerServiceClose - so that
> > should be removed...I drifted a bit here, anyway, we need to make a
> > clear distinction when we're releasing memory and when we're shutting
> > down some kind of service - sometimes both can be done at the same time
> > (see below), however it's very clear we can't do it here. Your issue is
> > that Close currently only closes the sockets (this reflects my point of
> > "shutting down a service") but does nothing with the threadpool
> > connected to the server, thus leaving the worker threads to continue
> > running and executing APIs the results of which they shouldn't even be
> > able to return back to the client, since we're shutting down.
> 
> To me it is not the first-order problem. On daemon shutdown, clients will
> most probably receive some kind of error anyway.
> 
> But we should finish worker threads before we clean up drivers.

Yes, that is absolutely critical. With the current situation, where we leave
clients connected and worker threads running while we clean up drivers, we're
pretty much guaranteed a tonne of bizarre crashes.

For killing clients we should simply make virNetServerClose() tear down all
client connections, at the same time as it tears down listener sockets. There
is no point separating those actions, as they're conceptually related.
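
Something along these lines (a sketch only - the service/client array member
names inside virNetServer are assumptions here, not the actual fields):

void virNetServerClose(virNetServerPtr srv)
{
    size_t i;

    if (!srv)
        return;

    virObjectLock(srv);

    /* tear down the listener sockets ... */
    for (i = 0; i < srv->nservices; i++)
        virNetServerServiceClose(srv->services[i]);

    /* ... and the client connections at the same time, since the two
     * actions are conceptually related */
    for (i = 0; i < srv->nclients; i++)
        virNetServerClientClose(srv->clients[i]);

    virObjectUnlock(srv);
}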

> > 1) split virThreadPoolFree into 2 functions, one which only broadcasts
> > the "die" message and joins the threads (or waits for them in this
> > case...) and the other releasing the resources - can't say I'm a fan of
> > this one
> > 2) call virThreadPoolFree from virNetServerClose instead...
> > 
> > None of those approaches is ideal, but I can't seem to think of anything
> > better at the moment.

Whether the worker threads are running or not is actually not the key
issue here.

What causes the crashes is whether the worker threads are processing jobs
or not. IOW, all we need to do is first purge all queued jobs, and then
wait for the threads to finish any jobs they already had. The threads
themselves can stay running. This is a conceptually useful operation beyond
just the shutdown scenario.

IOW, we should create a virThreadPoolDrain() method to purge all jobs
and wait for idle.
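
Roughly like this - a sketch only, with the pool's mutex/cond/job-list member
names invented for illustration rather than taken from virthreadpool.c:

void virThreadPoolDrain(virThreadPoolPtr pool)
{
    virMutexLock(&pool->mutex);

    /* purge every queued job that no worker has picked up yet */
    while (pool->jobList.head) {
        virThreadPoolJobPtr job = pool->jobList.head;
        pool->jobList.head = job->next;
        pool->jobQueueDepth--;
        VIR_FREE(job);
    }

    /* wait for the workers to finish whatever jobs they already took;
     * the worker threads themselves stay alive */
    while (pool->nJobsRunning > 0)
        virCondWait(&pool->idleCond, &pool->mutex);

    virMutexUnlock(&pool->mutex);
}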


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] add support of iSER transport type in qemu with libiscsi

2017-12-22 Thread lichstor
From: zhangshengyu 

---
 docs/schemas/domaincommon.rng  | 28 +
 src/conf/domain_conf.c |  8 
 src/qemu/qemu_block.c  | 24 ++-
 src/qemu/qemu_command.c|  3 ++
 src/qemu/qemu_parse_command.c  | 10 -
 src/storage/storage_backend_gluster.c  |  1 +
 src/util/virstoragefile.c  |  3 +-
 src/util/virstoragefile.h  |  1 +
 .../disk-drive-network-iser-auth.args  | 25 
 .../disk-drive-network-iser-auth.xml   | 44 
 .../qemuargv2xmldata/disk-drive-network-iser.args  | 25 
 tests/qemuargv2xmldata/disk-drive-network-iser.xml | 41 +++
 tests/qemuargv2xmltest.c   |  2 +
 ...-drive-network-iser-auth-secrettype-invalid.xml | 33 +++
 ...sk-drive-network-iser-auth-wrong-secrettype.xml | 33 +++
 .../disk-drive-network-iser-auth.args  | 31 ++
 .../disk-drive-network-iser-auth.xml   | 43 
 .../disk-drive-network-iser-lun.args   | 27 +
 .../disk-drive-network-iser-lun.xml| 28 +
 .../qemuxml2argvdata/disk-drive-network-iser.args  | 29 +
 tests/qemuxml2argvdata/disk-drive-network-iser.xml | 37 +
 tests/qemuxml2argvtest.c   |  1 +
 .../disk-drive-network-iser-auth.xml   | 47 ++
 .../qemuxml2xmloutdata/disk-drive-network-iser.xml | 41 +++
 tests/qemuxml2xmltest.c|  1 +
 25 files changed, 562 insertions(+), 4 deletions(-)
 create mode 100644 tests/qemuargv2xmldata/disk-drive-network-iser-auth.args
 create mode 100644 tests/qemuargv2xmldata/disk-drive-network-iser-auth.xml
 create mode 100644 tests/qemuargv2xmldata/disk-drive-network-iser.args
 create mode 100644 tests/qemuargv2xmldata/disk-drive-network-iser.xml
 create mode 100644 
tests/qemuxml2argvdata/disk-drive-network-iser-auth-secrettype-invalid.xml
 create mode 100644 
tests/qemuxml2argvdata/disk-drive-network-iser-auth-wrong-secrettype.xml
 create mode 100644 tests/qemuxml2argvdata/disk-drive-network-iser-auth.args
 create mode 100644 tests/qemuxml2argvdata/disk-drive-network-iser-auth.xml
 create mode 100644 tests/qemuxml2argvdata/disk-drive-network-iser-lun.args
 create mode 100644 tests/qemuxml2argvdata/disk-drive-network-iser-lun.xml
 create mode 100644 tests/qemuxml2argvdata/disk-drive-network-iser.args
 create mode 100644 tests/qemuxml2argvdata/disk-drive-network-iser.xml
 create mode 100644 tests/qemuxml2xmloutdata/disk-drive-network-iser-auth.xml
 create mode 100644 tests/qemuxml2xmloutdata/disk-drive-network-iser.xml

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index f22c932f6..819a8791d 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -1534,6 +1534,7 @@
   
 tcp
 rdma
+iser
   
 
   
@@ -1613,6 +1614,15 @@
   
 
   
+  
+
+  
+tcp
+rdma
+iser
+  
+
+  
 
   
 
@@ -4386,6 +4396,15 @@
   
 
   
+  
+
+  
+tcp
+rdma
+iser
+  
+
+  
   
 
   
@@ -4397,6 +4416,15 @@
 
   
 
+
+  
+
+  tcp
+  rdma
+  iser
+
+  
+
 
   
 
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9a62bc472..4c71855b4 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -7080,6 +7080,7 @@ virDomainStorageNetworkParseHosts(xmlNodePtr node,
   virStorageNetHostDefPtr *hosts,
   size_t *nhosts)
 {
+char *transport = NULL;
 xmlNodePtr child;
 
 for (child = node->children; child; child = child->next) {
@@ -7091,6 +7092,9 @@ virDomainStorageNetworkParseHosts(xmlNodePtr node,
 }
 }
 
+if ((*hosts) && (transport = virXMLPropString(node, "transport")))
+(*hosts)->transport = virStorageNetHostTransportTypeFromString(transport);
+
 return 0;
 }
 
@@ -8495,6 +8499,7 @@ virDomainDiskSourceNetworkParse(xmlNodePtr node,
 if (virDomainStorageNetworkParseHosts(node, &src->hosts, &src->nhosts) < 0)
 goto cleanup;
 
+
 virStorageSourceNetworkAssignDefaultPorts(src);
 
 ret = 0;
@@ -22364,6 +22369,9 @@ 

Re: [libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

2017-12-22 Thread Nikolay Shirokovskiy


On 21.12.2017 17:59, John Ferlan wrote:
> [...]
> 
>>>
>>> Patch looks good to me too. But still the original "libvirtd: fix crash
>>> on termination" fixes another issue and if applied fixes the
>>> "virt-manager issue" as well, as John figured out.
> 
> Not sure there's enough coffee in the house this morning to make me
> "awake enough" for all this stuff, especially just before a holiday
> break. But perhaps better now than after the new year... sooo
> 
>>
>> Finally I'm back on track with this (sorry for it taking so long), although
>> you're right about this, it's not the correct fix, it's just a byproduct of
>> your patch, however, the whole thing about closing connections and releasing
>> memory is a bit of a mess. For example, right now, we only dispose of service
>> objs (but we don't close them, we do that in virNetServerClose), but we both
>> close and dispose of the client objs. Another thing, we toggle the service to
>> stop accepting any new connections, but that's irrelevant at that point
>> because we've closed them already by that time as a product of calling
>> virNetDaemonClose->virNetServerClose->virNetServerServiceClose - so that
> 
> virNetServerServiceClose could call virNetServerServiceToggle(,false),
> even though it probably doesn't matter at this point.
> 
> Makes me wonder why virNetServerUpdateServicesLocked wasn't called in
> virNetServerDispose instead of open coding the nservices loop (sigh).
> 
>> should be removed...I drifted a bit here, anyway, we need to make a clear
>> distinction when we're releasing memory and when we're shutting down some
>> kind of service - sometimes both can be done at the same time (see below),
>> however it's very clear we can't do it here. Your issue is that Close
>> currently only closes the sockets (this reflects my point of "shutting
>> down a service") but does nothing with the threadpool connected to the
>> server, thus leaving the worker threads to continue running and executing
>> APIs the results of which they shouldn't even be able to return back to
>> the client, since we're shutting down.
>> Now the thing with threadpool is that both memory release and teardown are
>> merged into one "object disposal" operation and therefore done as part of
>> virNetServerDispose. Since I understand a removal from the hash table as a
>> memory release operation, we should not be calling virHashRemoveAll from
>> virNetDaemonClose. Now, I see 2 options:
>>
>> 1) split virThreadPoolFree into 2 functions, one which only broadcasts
>> the "die" message and joins the threads (or waits for them in this
>> case...) and the other releasing the resources - can't say I'm a fan of
>> this one
>>
> 
> Kind of a "virThreadPoolStop" type operation...
> 
> 
>> 2) call virThreadPoolFree from virNetServerClose instead...
>>
>> None of those approaches is ideal, but I can't seem to think of anything
>> better at the moment.
> 
> I like 2 better, but it doesn't fix the problem of a long running thread
> (such as GetAllDomainStats), it just moves the cheese.  Although I have
> a feeling the virStateShutdown series Nikolay is promoting may solve
> that issue.

No, no. The problem will be fixed as I mentioned in a different reply.
It just cannot be solved solely by virHashRemoveAll with the current
unref ordering in libvirtd.c.

> 
> It's really a conundrum w/r/t how much time to spend on this especially
> if the short/long term goal is a shims for libvirtd (e.g. libvirt_qemu)
> which will move the cheese even more.
> 
> I'm going to move away from this for now, maybe a fresh look will help.
> Right now I'm not sure I can focus on this with the other threads I'm
> involved in.
> 
> John
> 
> 
>>
>> I'm open to discuss any other suggestions.
>> Erik
>>
>> --
>> libvir-list mailing list
>> libvir-list@redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list
>>

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v3 0/2] qemu: Add support for hot unplugging

2017-12-22 Thread Chen Hanxiao
v3:
  use helper qemuDomainDelChardevTLSObjects and address some comments
v2:
  rebased on master

Chen Hanxiao (2):
  qemu: Add support for hot unplugging redirdev device which can use the
detach-device --live
  news: add change of hot unplug redirdev

 docs/news.xml   | 11 ++
 src/qemu/qemu_driver.c  |  4 ++-
 src/qemu/qemu_hotplug.c | 91 +
 src/qemu/qemu_hotplug.h |  4 +++
 4 files changed, 109 insertions(+), 1 deletion(-)

-- 
2.14.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] RFC: Introduce a dlm-corosync for Lock manager plugin

2017-12-22 Thread Lin Fu
> How are locks acquired by libdlm scoped ?  The reason we have virtlockd is
> that the fcntl() locks need to be held by a running process, and we wanted
> them to persist across libvirtd restarts. This required holding them in a
> separate process.
Locks are managed by the 'dlm_controld' daemon. There is a flag,
`LKF_PERSISTENT`, that achieves the goal of locks still existing after
restarting libvirtd. As far as I currently understand, a lock would only
disappear after rebooting the OS (if the locks are not released manually);
at the same time, other nodes in the cluster could recover the lock context.
> Are libdlm locks automatically released when the process that acquired
> them dies, or is a manual release action required ?  If they always require
> a manual release, then there would be no need to use virtlockd - the plugin
> can just take care of acquire & release, and still cope with libvirtd
> restarts.
So why do I talk about virtlockd? It is about the lockid. And what is a
lockid? In dlm, there are concepts of 'lockspace', 'resource' and 'lockid'.
The relationship: multiple locks can be associated with the same resource,
and one lockspace can contain multiple resources. When acquiring a lock on
a resource, a lockid is returned by the acquiring API call.

Question: How could I know which lockid is associated with a particular
resource? After all, all lockid information will generally be lost after
restarting libvirtd.

I have thought about some solutions: one is to use virtlockd to maintain
this information; another is to use some mechanism (such as shared memory,
or writing it to disk) to persist the information. Fortunately, maybe I
don't have to use virtlockd. The LVM project also uses dlm, so I could
refer to its implementation.

-
>> In the first half of 2017, QEMU introduced `share-rw` and `file.locking` to
>> handle a problem: https://bugzilla.redhat.com/show_bug.cgi?id=1080152 ,
link correction: https://bugzilla.redhat.com/show_bug.cgi?id=1378241

-- 
Regards
River



--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v5 3/3] libvirtd: fix crash on termination

2017-12-22 Thread Nikolay Shirokovskiy


On 21.12.2017 15:57, John Ferlan wrote:
> [...]
> 
>>>
>>> So short story made really long, I think the best course of action will
>>> be to add this patch and reorder the Unref()'s (adminProgram thru srv,
>>> but not dmn). It seems to resolve these corner cases, but I'm also open
>>> to other suggestions. Still need to think about it some more too before
>>> posting any patches.
>>>
>>>
>> Hi.
>>
>> I haven't grasped the whole picture yet, but I've managed to find out what
>> triggered the crash. It is not 2f3054c22, where you reordered unrefs, but
>> 1fd1b766105, which moves event unregistering from netserver client closing
>> to netserver client disposing. Before 1fd1b766105 we don't have the crash
>> because clients simply do not get disposed.
> 
> Oh yeah, that one. But considering Erik's most recent response in
> this overall thread vis-a-vis the separation of "close" vs. "dispose"
> and the timing of each w/r/t Unref and Free, I think having the call to
> remoteClientFreePrivateCallbacks in remoteClientCloseFunc is perhaps
> better than in remoteClientFreeFunc.
> 
>>  
>> As to fixing the crash with this patch, I think it is a coincidence. I
>> wanted to dispose of netservers early to join rpc threads, and it turns
>> out that disposing also closes clients too and this fixes the problem.
>>
>> Nikolay
>>
> 
> With Cedric's patch in place, the virt-manager client issue is fixed. So
> that's goodness.
> 
> If I then add the sleep (or usleep) into qemuConnectGetAllDomainStats as
> noted in what started this all, then I can either get libvirtd to crash
> dereferencing a NULL driver pointer or (my favorite) hang with two
> threads stuck waiting:
> 
> (gdb) t a a bt
> 
> Thread 5 (Thread 0x7fffe535b700 (LWP 15568)):
> #0  0x73dc909d in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x73dc1e23 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #2  0x77299a15 in virMutexLock (m=)
> at util/virthread.c:89
> #3  0x7fffc760621e in qemuDriverLock (driver=0x7fffbc190510)
> at qemu/qemu_conf.c:100
> #4  virQEMUDriverGetConfig (driver=driver@entry=0x7fffbc190510)
> at qemu/qemu_conf.c:1002
> #5  0x7fffc75dfa89 in qemuDomainObjBeginJobInternal (
> driver=driver@entry=0x7fffbc190510, obj=obj@entry=0x7fffbc3bcd60,
> job=job@entry=QEMU_JOB_QUERY,
> asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE)
> at qemu/qemu_domain.c:4690
> #6  0x7fffc75e3b2b in qemuDomainObjBeginJob (
> driver=driver@entry=0x7fffbc190510, obj=obj@entry=0x7fffbc3bcd60,
> job=job@entry=QEMU_JOB_QUERY) at qemu/qemu_domain.c:4842
> #7  0x7fffc764f744 in qemuConnectGetAllDomainStats
> (conn=0x7fffb80009a0,
> doms=, ndoms=, stats=,
> retStats=0x7fffe535aaf0, flags=) at
> qemu/qemu_driver.c:20219
> #8  0x7736430a in virDomainListGetStats (doms=0x7fffa8000950,
> stats=0,
> retStats=retStats@entry=0x7fffe535aaf0, flags=0) at
> libvirt-domain.c:11595
> #9  0x5557948d in remoteDispatchConnectGetAllDomainStats (
> server=, msg=, ret=0x7fffa80008e0,
> args=0x7fffa80008c0, rerr=0x7fffe535abf0, client=)
> at remote.c:6538
> #10 remoteDispatchConnectGetAllDomainStatsHelper (server=,
> client=, msg=, rerr=0x7fffe535abf0,
> args=0x7fffa80008c0, ret=0x7fffa80008e0) at remote_dispatch.h:615
> #11 0x773bf59c in virNetServerProgramDispatchCall
> (msg=0x5586cdd0,
> client=0x5586bea0, server=0x5582ed90, prog=0x55869190)
> at rpc/virnetserverprogram.c:437
> #12 virNetServerProgramDispatch (prog=0x55869190,
> server=server@entry=0x5582ed90, client=0x5586bea0,
> msg=0x5586cdd0) at rpc/virnetserverprogram.c:307
> #13 0x555a9318 in virNetServerProcessMsg (msg=,
> prog=, client=, srv=0x5582ed90)
> at rpc/virnetserver.c:148
> #14 virNetServerHandleJob (jobOpaque=, opaque=0x5582ed90)
> at rpc/virnetserver.c:169
> #15 0x7729a521 in virThreadPoolWorker (
> opaque=opaque@entry=0x5583aa40) at util/virthreadpool.c:167
> #16 0x77299898 in virThreadHelper (data=)
> at util/virthread.c:206
> #17 0x73dbf36d in start_thread () from /lib64/libpthread.so.0
> #18 0x73af3e1f in clone () from /lib64/libc.so.6
> 
> Thread 1 (Thread 0x77ef9d80 (LWP 15561)):
> #0  0x73dc590b in pthread_cond_wait@@GLIBC_2.3.2 ()
> ---Type  to continue, or q  to quit---
>from /lib64/libpthread.so.0
> #1  0x77299af6 in virCondWait (c=, m=)
> at util/virthread.c:154
> #2  0x7729a760 in virThreadPoolFree (pool=)
> at util/virthreadpool.c:290
> #3  0x555a8ec2 in virNetServerDispose (obj=0x5582ed90)
> at rpc/virnetserver.c:767
> #4  0x7727923b in virObjectUnref (anyobj=)
> at util/virobject.c:356
> #5  0x7724f069 in virHashFree (table=)
> at util/virhash.c:318
> #6  0x773b8295 in virNetDaemonDispose (obj=0x5582eb10)
> at rpc/virnetdaemon.c:105
> #7  0x7727923b in virObjectUnref 

[libvirt] [PATCH v3 2/2] news: add change of hot unplug redirdev

2017-12-22 Thread Chen Hanxiao
From: Chen Hanxiao 

Add hot unplugging of a redirdev device to the news 'Improvements' section

Signed-off-by: Chen Hanxiao 
---
v3:
  move it into 'Improvements' section

 docs/news.xml | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/docs/news.xml b/docs/news.xml
index bb611b7ba..513728d09 100644
--- a/docs/news.xml
+++ b/docs/news.xml
@@ -42,6 +42,17 @@
           Xen: Add support for multiple IP addresses on interface devices
         </summary>
       </change>
+      <change>
+        <summary>
+          qemu: Add support for hot unplugging a redirdev device,
+          which can use detach-device --live
+        </summary>
+        <description>
+          We have supported hotplugging a redirdev device, but lacked
+          the corresponding hot unplug support.
+          Now libvirt can do it.
+        </description>
+      </change>
 
 
 
-- 
2.14.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v3 1/2] qemu: Add support for hot unplugging redirdev device which can use the detach-device --live

2017-12-22 Thread Chen Hanxiao
From: Chen Hanxiao 

We lacked support for hot unplugging a redirdev device.
This patch adds support for it.
We can use detach-device --live now.

Signed-off-by: Chen Hanxiao 
---
v3:
  use helper qemuDomainDelChardevTLSObjects
  address John's comments

v2:
  rebase on master

 src/qemu/qemu_driver.c  |  4 ++-
 src/qemu/qemu_hotplug.c | 91 +
 src/qemu/qemu_hotplug.h |  4 +++
 3 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 97b194b05..a91288d4b 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -7787,6 +7787,9 @@ qemuDomainDetachDeviceLive(virDomainObjPtr vm,
 case VIR_DOMAIN_DEVICE_INPUT:
 ret = qemuDomainDetachInputDevice(vm, dev->data.input);
 break;
+case VIR_DOMAIN_DEVICE_REDIRDEV:
+ret = qemuDomainDetachRedirdevDevice(driver, vm, dev->data.redirdev);
+break;
 
 case VIR_DOMAIN_DEVICE_FS:
 case VIR_DOMAIN_DEVICE_SOUND:
@@ -7796,7 +7799,6 @@ qemuDomainDetachDeviceLive(virDomainObjPtr vm,
 case VIR_DOMAIN_DEVICE_SMARTCARD:
 case VIR_DOMAIN_DEVICE_MEMBALLOON:
 case VIR_DOMAIN_DEVICE_NVRAM:
-case VIR_DOMAIN_DEVICE_REDIRDEV:
 case VIR_DOMAIN_DEVICE_NONE:
 case VIR_DOMAIN_DEVICE_TPM:
 case VIR_DOMAIN_DEVICE_PANIC:
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index b79807300..724ee4f3f 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -4388,6 +4388,54 @@ qemuDomainRemoveInputDevice(virDomainObjPtr vm,
 }
 
 
+static int
+qemuDomainRemoveRedirdevDevice(virQEMUDriverPtr driver,
+   virDomainObjPtr vm,
+   virDomainRedirdevDefPtr dev)
+{
+qemuDomainObjPrivatePtr priv = vm->privateData;
+virObjectEventPtr event;
+char *charAlias = NULL;
+ssize_t idx;
+int ret = -1;
+
+VIR_DEBUG("Removing redirdev device %s from domain %p %s",
+  dev->info.alias, vm, vm->def->name);
+
+if (!(charAlias = qemuAliasChardevFromDevAlias(dev->info.alias)))
+goto cleanup;
+
+qemuDomainObjEnterMonitor(driver, vm);
+/* DeviceDel from Detach may remove chardev,
+ * so we cannot rely on return status to delete TLS chardevs.
+ */
+ignore_value(qemuMonitorDetachCharDev(priv->mon, charAlias));
+
+if (qemuDomainObjExitMonitor(driver, vm) < 0)
+goto cleanup;
+
+if (qemuDomainDelChardevTLSObjects(driver, vm,
+   dev->source, charAlias) < 0)
+goto cleanup;
+
+virDomainAuditRedirdev(vm, dev, "detach", true);
+
+event = virDomainEventDeviceRemovedNewFromObj(vm, dev->info.alias);
+qemuDomainEventQueue(driver, event);
+
+if ((idx = virDomainRedirdevDefFind(vm->def, dev)) >= 0)
+virDomainRedirdevDefRemove(vm->def, idx);
+qemuDomainReleaseDeviceAddress(vm, &dev->info, NULL);
+virDomainRedirdevDefFree(dev);
+
+ret = 0;
+
+ cleanup:
+VIR_FREE(charAlias);
+return ret;
+}
+
+
 int
 qemuDomainRemoveDevice(virQEMUDriverPtr driver,
virDomainObjPtr vm,
@@ -5128,6 +5176,49 @@ qemuDomainDetachWatchdog(virQEMUDriverPtr driver,
 }
 
 
+int
+qemuDomainDetachRedirdevDevice(virQEMUDriverPtr driver,
+   virDomainObjPtr vm,
+   virDomainRedirdevDefPtr dev)
+{
+int ret = -1;
+qemuDomainObjPrivatePtr priv = vm->privateData;
+virDomainRedirdevDefPtr tmpRedirdevDef;
+ssize_t idx;
+
+if ((idx = virDomainRedirdevDefFind(vm->def, dev)) < 0) {
+virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+   _("no matching redirdev was not found"));
+return -1;
+}
+
+tmpRedirdevDef = vm->def->redirdevs[idx];
+
+if (!tmpRedirdevDef->info.alias) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("alias not set for redirdev device"));
+return -1;
+}
+
+qemuDomainMarkDeviceForRemoval(vm, &tmpRedirdevDef->info);
+
+qemuDomainObjEnterMonitor(driver, vm);
+if (qemuMonitorDelDevice(priv->mon, tmpRedirdevDef->info.alias) < 0) {
+ignore_value(qemuDomainObjExitMonitor(driver, vm));
+goto cleanup;
+}
+if (qemuDomainObjExitMonitor(driver, vm) < 0)
+goto cleanup;
+
+if ((ret = qemuDomainWaitForDeviceRemoval(vm)) == 1)
+ret = qemuDomainRemoveRedirdevDevice(driver, vm, tmpRedirdevDef);
+
+ cleanup:
+qemuDomainResetDeviceRemoval(vm);
+return ret;
+}
+
+
 int
 qemuDomainDetachNetDevice(virQEMUDriverPtr driver,
   virDomainObjPtr vm,
diff --git a/src/qemu/qemu_hotplug.h b/src/qemu/qemu_hotplug.h
index 3e0d618e0..9a0c057f1 100644
--- a/src/qemu/qemu_hotplug.h
+++ b/src/qemu/qemu_hotplug.h
@@ -126,6 +126,10 @@ int qemuDomainDetachWatchdog(virQEMUDriverPtr driver,
  virDomainObjPtr vm,