Hi!

Probably I sent my patch proposals to the wrong list, so I'll resent them (see 
attachments). I also added corresponding new patches (0002-0003) for resetting 
the ringid file. I think the program should not mess with umask, specifically 
not inside an "if" in a subroutine. I failed to get the sense of it, so I added 
a comment saying so.

Regards,
Ulrich

--- Begin Message ---
Answering my own questions:

Those ringid files are created by corosync_ring_id_store() from main.c, and the 
contents is the ring sequence number (binary). The routine is called 
(indrirectly) from memb_state_commit_enter() (totemsrp.c). That routine is 
called by memb_join_process() and message_handler_memb_commit_token().

An obvious question arises:

As the sequence number is only stored for the active ring, what happens if 
rings are switched (rrp_mode == passive)? Likewise it seems corosync is using 
the IP of the first available ring as representative on startup, it seems. 
Shouldn't the representative (and identifier) be stable (as per slide 4 of 
[AMMA00] Amir, Y.; Moser, L. E.; Melliar-Smith, P. M.; Agarwal, D. A.; 
Ciarfella, P.: The Totem Single-Ring Ordering and Membership Protocol)? I doubt 
that representative and ringid are persistent just for each ring individually...

The "rxw"-Permissions are the natural consequence of this code 
(corosync_ring_id_store()):
        fd = open (filename, O_WRONLY, 0777);
        if (fd == -1) {
                fd = open (filename, O_CREAT|O_RDWR, 0777);
        }

So here's my first patch proposal also.

Regards,
Ulrich

>>> "Ulrich Windl" <[email protected]> schrieb am 02.06.2015 um
16:07 in Nachricht <[email protected]>:
>>>> Jan Friesse <[email protected]> schrieb am 02.06.2015 um 12:57 in 
>>>> Nachricht
> <[email protected]>:
> [...]
>> Last but not least is RRP. RRP itself works very well sadly it works in
>> totally different way then most of people expects.
> [...]
> 
> Probably because the documentation leaves to many questions unanswered...
> You can't tell what's wrong as long as you don't know how it should work.
> 
> Example: What's the purpose of /var/lib/corosync/ringid_* files, and why do 
> some hosts have /var/lib/corosync/ringid_127.0.0.1? The file content seems to 
> be some binary number, but the file is created with rwx permissions...
> 
> On systems with two rings (not on 127.*) I see all combinations from one to 
> three ringid files.
> 
> Regards,
> Ulrich
> 
> 
> 
> _______________________________________________
> Users mailing list: [email protected] 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 




>From 70fba1e1a547d94e0480aa24b92810a410e36a08 Mon Sep 17 00:00:00 2001
From: Ulrich Windl <[email protected]>
Date: Mon, 8 Jun 2015 12:52:04 +0200
Subject: [PATCH] exec/main.c: Improve corosync_ring_id_store()

exec/main.c: Change file creation mode from 0777 to S_IRUSR|S_IWUSR
improving error messages in corosync_ring_id_store().
---
 exec/main.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/exec/main.c b/exec/main.c
index 2e4fd02..6c9e1b6 100644
--- a/exec/main.c
+++ b/exec/main.c
@@ -1043,15 +1043,14 @@ static void corosync_ring_id_store (
        snprintf (filename, sizeof(filename), "%s/ringid_%s",
                get_run_dir(), totemip_print (addr));
 
-       fd = open (filename, O_WRONLY, 0777);
+       fd = open (filename, O_WRONLY, 0);
        if (fd == -1) {
-               fd = open (filename, O_CREAT|O_RDWR, 0777);
+               fd = open (filename, O_CREAT|O_RDWR, S_IRUSR|S_IWUSR);
        }
        if (fd == -1) {
                LOGSYS_PERROR(errno, LOGSYS_LEVEL_ERROR,
-                       "Couldn't store new ring id %llx to stable storage",
-                       memb_ring_id->seq);
-
+                             "Couldn't open %s: %s",
+                             filename, strerror(errno));
                corosync_exit_error (AIS_DONE_STORE_RINGID);
        }
        log_printf (LOGSYS_LEVEL_DEBUG,
@@ -1060,8 +1059,8 @@ static void corosync_ring_id_store (
        close (fd);
        if (res != sizeof(memb_ring_id->seq)) {
                LOGSYS_PERROR(errno, LOGSYS_LEVEL_ERROR,
-                       "Couldn't store new ring id %llx to stable storage",
-                       memb_ring_id->seq);
+                             "Couldn't store new ring id %llx in %s: %s",
+                             memb_ring_id->seq, filename, strerror(errno));
 
                corosync_exit_error (AIS_DONE_STORE_RINGID);
        }
-- 
1.7.12.4


--- End Message ---
--- Begin Message ---
Hi!

Maybe we want "O_DSYNC" also:
--
diff --git a/exec/main.c b/exec/main.c
index 6c9e1b6..9be33e4 100644
--- a/exec/main.c
+++ b/exec/main.c
@@ -1043,9 +1043,9 @@ static void corosync_ring_id_store (
        snprintf (filename, sizeof(filename), "%s/ringid_%s",
                get_run_dir(), totemip_print (addr));

-       fd = open (filename, O_WRONLY, 0);
+       fd = open (filename, O_WRONLY|O_DSYNC, 0);
        if (fd == -1) {
-               fd = open (filename, O_CREAT|O_RDWR, S_IRUSR|S_IWUSR);
+               fd = open (filename, O_CREAT|O_RDWR|O_DSYNC, S_IRUSR|S_IWUSR);
        }
        if (fd == -1) {
                LOGSYS_PERROR(errno, LOGSYS_LEVEL_ERROR,
--

Ulrich



--- End Message ---
>From 0fcf0cfacf5a60f3d84d2407fa917bd72b3b24ca Mon Sep 17 00:00:00 2001
From: Ulrich Windl <[email protected]>
Date: Tue, 9 Jun 2015 08:39:02 +0200
Subject: [PATCH 2/3] Fix open flags when creating ringid file

exec/main.c (corosync_ring_id_create_or_load): Use zero flags when
opening file O_RDONLY.  Use symbolic permissions when creating ring id
file, dropping useless execute bit.  Questin change of umask when creating
file.
---
 exec/main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/exec/main.c b/exec/main.c
index 6c9e1b6..9cc0351 100644
--- a/exec/main.c
+++ b/exec/main.c
@@ -996,7 +996,7 @@ static void corosync_ring_id_create_or_load (
 
        snprintf (filename, sizeof(filename), "%s/ringid_%s",
                get_run_dir(), totemip_print (addr));
-       fd = open (filename, O_RDONLY, 0700);
+       fd = open (filename, O_RDONLY, 0);
        /*
         * If file can be opened and read, read the ring id
         */
@@ -1009,8 +1009,8 @@ static void corosync_ring_id_create_or_load (
         */
        if ((fd == -1) || (res != sizeof (uint64_t))) {
                memb_ring_id->seq = 0;
-               umask(0);
-               fd = open (filename, O_CREAT|O_RDWR, 0700);
+               umask(0);               /* why that? */
+               fd = open (filename, O_CREAT|O_RDWR, S_IRUSR|S_IWUSR);
                if (fd != -1) {
                        res = write (fd, &memb_ring_id->seq, sizeof (uint64_t));
                        close (fd);
-- 
1.7.12.4

>From 5a29ecee27d7fa71d7871c3c8d907dad61572fca Mon Sep 17 00:00:00 2001
From: Ulrich Windl <[email protected]>
Date: Tue, 9 Jun 2015 08:44:57 +0200
Subject: [PATCH 3/3] exec/main.c: Improve comments and log messages

exec/main.c (corosync_ring_id_create_or_load): Improve comments.  Add log
message when ring id is reset to zero.  Change wording of error messages
when new ringid file could not be created.
---
 exec/main.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/exec/main.c b/exec/main.c
index 9cc0351..8734507 100644
--- a/exec/main.c
+++ b/exec/main.c
@@ -1005,9 +1005,11 @@ static void corosync_ring_id_create_or_load (
                close (fd);
        }
        /*
-        * If file could not be opened or read, create a new ring id
+        * If file could not be opened or read, reset ring id
         */
        if ((fd == -1) || (res != sizeof (uint64_t))) {
+               log_printf (LOGSYS_LEVEL_NOTICE, "Resetting ring id in %s\n",
+                           filename);
                memb_ring_id->seq = 0;
                umask(0);               /* why that? */
                fd = open (filename, O_CREAT|O_RDWR, S_IRUSR|S_IWUSR);
@@ -1016,13 +1018,15 @@ static void corosync_ring_id_create_or_load (
                        close (fd);
                        if (res == -1) {
                                LOGSYS_PERROR (errno, LOGSYS_LEVEL_ERROR,
-                                       "Couldn't write ringid file '%s'", 
filename);
+                                       "Couldn't write initial ringid file 
'%s'",
+                                              filename);
 
                                corosync_exit_error (AIS_DONE_STORE_RINGID);
                        }
                } else {
                        LOGSYS_PERROR (errno, LOGSYS_LEVEL_ERROR,
-                               "Couldn't create ringid file '%s'", filename);
+                               "Couldn't create ringid file '%s': %s",
+                                      filename, strerror(errno));
 
                        corosync_exit_error (AIS_DONE_STORE_RINGID);
                }
-- 
1.7.12.4

_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss

Reply via email to