Re: [libvirt] [PATCH v2] rpc: make daemon spawning a bit more intelligent

2014-09-17 Thread Martin Kletzander

On Tue, Sep 16, 2014 at 10:22:53AM -0400, John Ferlan wrote:



On 09/16/2014 05:16 AM, Martin Kletzander wrote:

This way it behaves more like the daemon itself does (acquiring a
pidfile, deleting the socket before binding, etc.).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=927369
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138604

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
 src/rpc/virnetsocket.c | 65 +++---
 1 file changed, 57 insertions(+), 8 deletions(-)



The error path/retry logic needs a tweak... I added some inline thinking
since we don't have a virtual whiteboard to share on this!



Yes, it does.  I didn't think it through from scratch, just adjusted
to your comments.  This time I went through it few times.  Just let me
confirm I understood what you meant everywhere.


diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
index 80aeddf..7be1492 100644
--- a/src/rpc/virnetsocket.c
+++ b/src/rpc/virnetsocket.c
@@ -51,9 +51,11 @@
 #include virlog.h
 #include virfile.h
 #include virthread.h
+#include virpidfile.h
 #include virprobe.h
 #include virprocess.h
 #include virstring.h
+#include dirname.h
 #include passfd.h

 #if WITH_SSH2
@@ -541,7 +543,10 @@ int virNetSocketNewConnectUNIX(const char *path,
const char *binary,
virNetSocketPtr *retsock)
 {
+char *binname = NULL;
+char *pidpath = NULL;
 int fd, passfd = -1;
+int pidfd = -1;
 virSocketAddr localAddr;
 virSocketAddr remoteAddr;

@@ -580,16 +585,47 @@ int virNetSocketNewConnectUNIX(const char *path,
 goto error;
 }

+if (!(binname = last_component(binary)) || binname[0] == '\0') {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _(Cannot determine basename for binary '%s'),
+   binary);
+goto error;
+}
+
+if (virPidFileConstructPath(false, NULL, binname, pidpath)  0)
+goto error;


Since the first param is false, we are guaranteed only that 'pidpath' is
the path to the virGetUserRuntimeDirectory() for $binname.pid.

We are not sure if we created the path in virFileMakePathHelper() or
not. If we later then delete the file on the error path how does that
affect the daemon that wins the race?  See the conundrum?



This is only about deleting the pidfile, right?  Deleting it only when
it is acquired (pidfd = 0) should fix this.

I'll try describing it here a little bit more:

virNetSocketNewConnectUNIX() is called with (spawnDaemon == true) only
if (privileged == false).  virPidFileConstructPath() is called also
only when (spawnDaemon == true) and guarantees that the path for the
pidfile exists and is constructed the same way it is in daemon.  The
path should not be deleted no matter whether we fail or not, those
directories should be kept there for later.


+
+pidfd = virPidFileAcquirePath(pidpath, false, getpid());
+VIR_FREE(pidpath);


Because you VIR_FREE() here, there is no way for the error: path to have
a non NULL pidpath... and delete the pidpath.



Using VIR_FREE(pidpath); in both error path and before return 0 (my
initial idea) should take care of this.


+if (pidfd  0) {


Getting here means we were unable to get the pidfile lock and I don't
think we want to delete the pidpath... Since it's probably owned by some
other daemon



if (pidfd  0) then it means it will not get deleted.  By the way it
doesn't have to be deleted, closing should be enough to unlock the
file.


+/*
+ * This can happen in a very rare case of two clients spawning two
+ * daemons at the same time, and the error in the logs that gets
+ * reset here can be a clue to some future debugging.
+ */
+virResetLastError();
+spawnDaemon = false;
+goto retry;
+}


If we get here, we've written our pid into the pidfile and we have have
the lock... So that means we should own the file. Errors from here
should delete the file.



Right, there's nothing wrong.


+
 if ((passfd = socket(PF_UNIX, SOCK_STREAM, 0))  0) {
 virReportSystemError(errno, %s, _(Failed to create socket));
 goto error;
 }

 /*
- * We have to fork() here, because umask() is set
- * per-process, chmod() is racy and fchmod() has undefined
- * behaviour on sockets according to POSIX, so it doesn't
- * work outside Linux.
+ * We already even acquired the pidfile, so no one else should be using
+ * @path right now.  So we're OK to unlink it and paying attention to
+ * the return value makes no real sense here.  Only if it's not an
+ * abstract socket, of course.
+ */
+if (path[0] != '@')
+unlink(path);
+
+/*
+ * We have to fork() here, 

Re: [libvirt] [PATCH v2] rpc: make daemon spawning a bit more intelligent

2014-09-17 Thread Martin Kletzander

On Wed, Sep 17, 2014 at 10:59:21AM -0400, John Ferlan wrote:



On 09/17/2014 10:00 AM, Martin Kletzander wrote:

On Tue, Sep 16, 2014 at 10:22:53AM -0400, John Ferlan wrote:



On 09/16/2014 05:16 AM, Martin Kletzander wrote:

This way it behaves more like the daemon itself does (acquiring a
pidfile, deleting the socket before binding, etc.).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=927369
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138604

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
 src/rpc/virnetsocket.c | 65 +++---
 1 file changed, 57 insertions(+), 8 deletions(-)



The error path/retry logic needs a tweak... I added some inline thinking
since we don't have a virtual whiteboard to share on this!



Yes, it does.  I didn't think it through from scratch, just adjusted
to your comments.  This time I went through it few times.  Just let me
confirm I understood what you meant everywhere.



You did :-)

...snip...



Wow, I just reached this part of the mail when I wrote it already :)


Funny how that happens.



My current diff to the previous version looks like this:

diff --git i/src/rpc/virnetsocket.c w/src/rpc/virnetsocket.c
index 7be1492..e0efb14 100644
--- i/src/rpc/virnetsocket.c
+++ w/src/rpc/virnetsocket.c
@@ -596,7 +596,6 @@ int virNetSocketNewConnectUNIX(const char *path,
 goto error;

 pidfd = virPidFileAcquirePath(pidpath, false, getpid());
-VIR_FREE(pidpath);
 if (pidfd  0) {
 /*
  * This can happen in a very rare case of two clients
  spawning two
@@ -650,6 +649,7 @@ int virNetSocketNewConnectUNIX(const char *path,
  * time without spawning the daemon.
  */
 spawnDaemon = false;
+virPidFileDeletePath(pidpath);
 VIR_FORCE_CLOSE(pidfd);
 VIR_FORCE_CLOSE(passfd);
 goto retry;
@@ -674,6 +674,7 @@ int virNetSocketNewConnectUNIX(const char *path,
  * virCommandHook inside a virNetSocketForkDaemon().
  */
 VIR_FORCE_CLOSE(pidfd);
+pidfd = -1;


VIR_FORCE_CLOSE() will do this for you


 if (virNetSocketForkDaemon(binary, passfd)  0)
 goto error;
 }
@@ -687,10 +688,12 @@ int virNetSocketNewConnectUNIX(const char *path,
 if (!(*retsock = virNetSocketNew(localAddr, remoteAddr, true,
 fd, -1, 0)))
 goto error;

+VIR_FREE(pidpath);
+
 return 0;

  error:
-if (pidfd  0)
+if (pidfd = 0)
 virPidFileDeletePath(pidpath);
 VIR_FREE(pidpath);
 VIR_FORCE_CLOSE(fd);
--

Is that fine or should I use that goto error; everywhere after
acquiring the pidfile or is it better for you to see it in another
version?



This is fine - I think things are now covered.

ACK



Thanks for you thorough review, I squashed it in, remove that one
unnecessary pidfd = -1 and pushed it.

Martin


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH v2] rpc: make daemon spawning a bit more intelligent

2014-09-16 Thread Martin Kletzander
This way it behaves more like the daemon itself does (acquiring a
pidfile, deleting the socket before binding, etc.).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=927369
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138604

Signed-off-by: Martin Kletzander mklet...@redhat.com
---
 src/rpc/virnetsocket.c | 65 +++---
 1 file changed, 57 insertions(+), 8 deletions(-)

diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
index 80aeddf..7be1492 100644
--- a/src/rpc/virnetsocket.c
+++ b/src/rpc/virnetsocket.c
@@ -51,9 +51,11 @@
 #include virlog.h
 #include virfile.h
 #include virthread.h
+#include virpidfile.h
 #include virprobe.h
 #include virprocess.h
 #include virstring.h
+#include dirname.h
 #include passfd.h

 #if WITH_SSH2
@@ -541,7 +543,10 @@ int virNetSocketNewConnectUNIX(const char *path,
const char *binary,
virNetSocketPtr *retsock)
 {
+char *binname = NULL;
+char *pidpath = NULL;
 int fd, passfd = -1;
+int pidfd = -1;
 virSocketAddr localAddr;
 virSocketAddr remoteAddr;

@@ -580,16 +585,47 @@ int virNetSocketNewConnectUNIX(const char *path,
 goto error;
 }

+if (!(binname = last_component(binary)) || binname[0] == '\0') {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _(Cannot determine basename for binary '%s'),
+   binary);
+goto error;
+}
+
+if (virPidFileConstructPath(false, NULL, binname, pidpath)  0)
+goto error;
+
+pidfd = virPidFileAcquirePath(pidpath, false, getpid());
+VIR_FREE(pidpath);
+if (pidfd  0) {
+/*
+ * This can happen in a very rare case of two clients spawning two
+ * daemons at the same time, and the error in the logs that gets
+ * reset here can be a clue to some future debugging.
+ */
+virResetLastError();
+spawnDaemon = false;
+goto retry;
+}
+
 if ((passfd = socket(PF_UNIX, SOCK_STREAM, 0))  0) {
 virReportSystemError(errno, %s, _(Failed to create socket));
 goto error;
 }

 /*
- * We have to fork() here, because umask() is set
- * per-process, chmod() is racy and fchmod() has undefined
- * behaviour on sockets according to POSIX, so it doesn't
- * work outside Linux.
+ * We already even acquired the pidfile, so no one else should be using
+ * @path right now.  So we're OK to unlink it and paying attention to
+ * the return value makes no real sense here.  Only if it's not an
+ * abstract socket, of course.
+ */
+if (path[0] != '@')
+unlink(path);
+
+/*
+ * We have to fork() here, because umask() is set per-process, chmod()
+ * is racy and fchmod() has undefined behaviour on sockets according to
+ * POSIX, so it doesn't work outside Linux.
  */
 if ((pid = virFork())  0)
 goto error;
@@ -607,12 +643,15 @@ int virNetSocketNewConnectUNIX(const char *path,

 if (status != EXIT_SUCCESS) {
 /*
- * OK, so the subprocces failed to bind() the socket.  This may 
mean
- * that another daemon was starting at the same time and succeeded
- * with its bind().  So we'll try connecting again, but this time
- * without spawning the daemon.
+ * OK, so the child failed to bind() the socket.  This may mean 
that
+ * another daemon was starting at the same time and succeeded with
+ * its bind() (even though it should not happen because we using a
+ * pidfile for the race).  So we'll try connecting again, but this
+ * time without spawning the daemon.
  */
 spawnDaemon = false;
+VIR_FORCE_CLOSE(pidfd);
+VIR_FORCE_CLOSE(passfd);
 goto retry;
 }

@@ -629,6 +668,12 @@ int virNetSocketNewConnectUNIX(const char *path,
 goto error;
 }

+/*
+ * Do we need to eliminate the super-rare race here any more?  It would
+ * need incorporating the following VIR_FORCE_CLOSE() into a
+ * virCommandHook inside a virNetSocketForkDaemon().
+ */
+VIR_FORCE_CLOSE(pidfd);
 if (virNetSocketForkDaemon(binary, passfd)  0)
 goto error;
 }
@@ -645,8 +690,12 @@ int virNetSocketNewConnectUNIX(const char *path,
 return 0;

  error:
+if (pidfd  0)
+virPidFileDeletePath(pidpath);
+VIR_FREE(pidpath);
 VIR_FORCE_CLOSE(fd);
 VIR_FORCE_CLOSE(passfd);
+VIR_FORCE_CLOSE(pidfd);
 if (spawnDaemon)
 unlink(path);
 return -1;
-- 
2.1.0

--
libvir-list mailing list
libvir-list@redhat.com

Re: [libvirt] [PATCH v2] rpc: make daemon spawning a bit more intelligent

2014-09-16 Thread John Ferlan


On 09/16/2014 05:16 AM, Martin Kletzander wrote:
 This way it behaves more like the daemon itself does (acquiring a
 pidfile, deleting the socket before binding, etc.).
 
 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=927369
 Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138604
 
 Signed-off-by: Martin Kletzander mklet...@redhat.com
 ---
  src/rpc/virnetsocket.c | 65 
 +++---
  1 file changed, 57 insertions(+), 8 deletions(-)
 

The error path/retry logic needs a tweak... I added some inline thinking
since we don't have a virtual whiteboard to share on this!

 diff --git a/src/rpc/virnetsocket.c b/src/rpc/virnetsocket.c
 index 80aeddf..7be1492 100644
 --- a/src/rpc/virnetsocket.c
 +++ b/src/rpc/virnetsocket.c
 @@ -51,9 +51,11 @@
  #include virlog.h
  #include virfile.h
  #include virthread.h
 +#include virpidfile.h
  #include virprobe.h
  #include virprocess.h
  #include virstring.h
 +#include dirname.h
  #include passfd.h
 
  #if WITH_SSH2
 @@ -541,7 +543,10 @@ int virNetSocketNewConnectUNIX(const char *path,
 const char *binary,
 virNetSocketPtr *retsock)
  {
 +char *binname = NULL;
 +char *pidpath = NULL;
  int fd, passfd = -1;
 +int pidfd = -1;
  virSocketAddr localAddr;
  virSocketAddr remoteAddr;
 
 @@ -580,16 +585,47 @@ int virNetSocketNewConnectUNIX(const char *path,
  goto error;
  }
 
 +if (!(binname = last_component(binary)) || binname[0] == '\0') {
 +virReportError(VIR_ERR_INTERNAL_ERROR,
 +   _(Cannot determine basename for binary '%s'),
 +   binary);
 +goto error;
 +}
 +
 +if (virPidFileConstructPath(false, NULL, binname, pidpath)  0)
 +goto error;

Since the first param is false, we are guaranteed only that 'pidpath' is
the path to the virGetUserRuntimeDirectory() for $binname.pid.

We are not sure if we created the path in virFileMakePathHelper() or
not. If we later then delete the file on the error path how does that
affect the daemon that wins the race?  See the conundrum?

 +
 +pidfd = virPidFileAcquirePath(pidpath, false, getpid());
 +VIR_FREE(pidpath);

Because you VIR_FREE() here, there is no way for the error: path to have
a non NULL pidpath... and delete the pidpath.

 +if (pidfd  0) {

Getting here means we were unable to get the pidfile lock and I don't
think we want to delete the pidpath... Since it's probably owned by some
other daemon

 +/*
 + * This can happen in a very rare case of two clients spawning 
 two
 + * daemons at the same time, and the error in the logs that gets
 + * reset here can be a clue to some future debugging.
 + */
 +virResetLastError();
 +spawnDaemon = false;
 +goto retry;
 +}

If we get here, we've written our pid into the pidfile and we have have
the lock... So that means we should own the file. Errors from here
should delete the file.

 +
  if ((passfd = socket(PF_UNIX, SOCK_STREAM, 0))  0) {
  virReportSystemError(errno, %s, _(Failed to create socket));
  goto error;
  }
 
  /*
 - * We have to fork() here, because umask() is set
 - * per-process, chmod() is racy and fchmod() has undefined
 - * behaviour on sockets according to POSIX, so it doesn't
 - * work outside Linux.
 + * We already even acquired the pidfile, so no one else should be 
 using
 + * @path right now.  So we're OK to unlink it and paying attention to
 + * the return value makes no real sense here.  Only if it's not an
 + * abstract socket, of course.
 + */
 +if (path[0] != '@')
 +unlink(path);
 +
 +/*
 + * We have to fork() here, because umask() is set per-process, 
 chmod()
 + * is racy and fchmod() has undefined behaviour on sockets according 
 to
 + * POSIX, so it doesn't work outside Linux.
   */
  if ((pid = virFork())  0)
  goto error;
 @@ -607,12 +643,15 @@ int virNetSocketNewConnectUNIX(const char *path,
 
  if (status != EXIT_SUCCESS) {
  /*
 - * OK, so the subprocces failed to bind() the socket.  This may 
 mean
 - * that another daemon was starting at the same time and 
 succeeded
 - * with its bind().  So we'll try connecting again, but this time
 - * without spawning the daemon.
 + * OK, so the child failed to bind() the socket.  This may mean 
 that
 + * another daemon was starting at the same time and succeeded 
 with
 + * its bind() (even though it should not happen because we using 
 a
 + * pidfile for the race).  So we'll try connecting again, but 
 this
 +