------------------------------------------------------------
revno: 569
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: Andrew Tridgell <[EMAIL PROTECTED]>
branch nick: tridge
timestamp: Tue 2007-07-10 14:59:23 +1000
message:
  merge from ronnie
modified:
  config/events.d/60.nfs         nfs-20070601141008-hy3h4qgbk1jd2jci-1
  server/ctdb_recoverd.c         recoverd.c-20070503213540-bvxuyd9jm1f7ig90-1
  tools/ctdb.c                   ctdb_control.c-20070426122705-9ehj1l5lu2gn9kuj-1
  web/nfs.html                   nfs.html-20070608234340-a8i1dxro7a7i6jz6-1
    ------------------------------------------------------------
    revno: 432.1.121
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Tue 2007-07-10 13:09:35 +1000
    message:
      use the socketkiller to kill off all lock manager sessions as well
    ------------------------------------------------------------
    revno: 432.1.120
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Tue 2007-07-10 12:43:46 +1000
    message:
      update the documentation for NFS to mention that the lock manager must 
      run on the same port on all nodes.
      
      remove the CTDB_MANAGES_NFSLOCK variable that is no longer used
    ------------------------------------------------------------
    revno: 432.1.119
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Tue 2007-07-10 10:24:20 +1000
    message:
      make it possible to specify how many times ctdb killtcp will try to RST 
      the tcp connection
      
      change the 60.nfs script to run ctdb killtcp in the foreground so we 
      don't get lots of these running in parallel when there are a lot of 
      tcp connections to RST
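      
      The tools/ctdb.c hunk further below adds the extra <num> argument; in 
      outline it amounts to the following (a sketch of the idea only, not the 
      verbatim code):
      
          /* third argument: how many times to attempt the RST */
          numrst = strtoul(argv[2], NULL, 0);
      
          for (i = 0; i < numrst; i++) {
                  /* one attempt to reset the src -> dst connection */
                  ret = ctdb_sys_kill_tcp(ctdb->ev, &src, &dst);
                  printf("ret:%d\n", ret);
          }
      
      Running "ctdb killtcp <srcip:port> <dstip:port> <num>" in the foreground 
      from 60.nfs then lets the attempts for one connection finish before the 
      script moves on to the next connection.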
    ------------------------------------------------------------
    revno: 432.1.118
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Tue 2007-07-10 10:07:26 +1000
    message:
      run the ctdb killtcp in the background
    ------------------------------------------------------------
    revno: 432.1.117
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Tue 2007-07-10 09:45:14 +1000
    message:
      don't restart the tcp service after an ip takeover, it is more efficient 
      to just kill off the tcp connections
    ------------------------------------------------------------
    revno: 432.1.116
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Mon 2007-07-09 17:40:15 +1000
    message:
      nicer handling of the DISCONNECTED flag when we update the node flags 
      from a remote message
    ------------------------------------------------------------
    revno: 432.1.115
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Mon 2007-07-09 13:21:17 +1000
    message:
      when a remote node has sent us a message to update the flags for a node, 
      don't let those messages modify the DISCONNECTED flag.
      
      the DISCONNECTED flag must be managed locally since it describes whether 
      the local node can communicate with the remote node or not
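      
      The ctdb_recoverd.c hunk further below implements this by masking the 
      DISCONNECTED bit out of the incoming value; roughly (a sketch of the 
      same idea, not the verbatim code):
      
          /* take every flag from the remote update except DISCONNECTED,
             which only the local node is allowed to decide */
          c->flags &= ~NODE_FLAGS_DISCONNECTED;
          c->flags |= (nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED);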
    ------------------------------------------------------------
    revno: 432.1.114
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Mon 2007-07-09 12:55:15 +1000
    message:
      a better way to fix the DISCONNECT|BANNED vs DISCONNECT bug
    ------------------------------------------------------------
    revno: 432.1.113
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Mon 2007-07-09 12:33:00 +1000
    message:
      when checking the nodemap flags for consistency while monitoring the 
      cluster, we can't require that the BANNED and the DISCONNECTED flags 
      are both set at the same time, since if a node becomes banned just 
      before it is DISCONNECTED there is no guarantee that all other nodes 
      will have seen the BANNED flag.
      
      So we must first check the DISCONNECTED flag only, and only if the 
      DISCONNECTED flag is not set should we check the BANNED flag.
      
      Otherwise this can cause a recovery loop while some nodes think the 
      disconnected node is DISCONNECTED|BANNED and others think it is just 
      DISCONNECTED.
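      
      In outline, the check order described above looks like this (a sketch 
      only, not the verbatim recoverd code; the helper check_node_flags() is 
      hypothetical, and NODE_FLAGS_BANNED is assumed to exist alongside the 
      NODE_FLAGS_DISCONNECTED flag used elsewhere in this diff):
      
          /* compare a remote node's view of node i with our own */
          static int check_node_flags(uint32_t remote_flags, uint32_t local_flags)
          {
                  /* check DISCONNECTED on its own first: a node banned just
                     before it disconnected may not have had the BANNED flag
                     propagated to every other node yet */
                  if ((remote_flags & NODE_FLAGS_DISCONNECTED) !=
                      (local_flags & NODE_FLAGS_DISCONNECTED)) {
                          return -1;      /* inconsistent - recover */
                  }
                  if (remote_flags & NODE_FLAGS_DISCONNECTED) {
                          return 0;       /* don't also require BANNED to match */
                  }
                  /* only now compare the BANNED flag */
                  if ((remote_flags & NODE_FLAGS_BANNED) !=
                      (local_flags & NODE_FLAGS_BANNED)) {
                          return -1;
                  }
                  return 0;
          }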
    ------------------------------------------------------------
    revno: 432.1.112
    merged: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    parent: [EMAIL PROTECTED]
    committer: Ronnie Sahlberg <[EMAIL PROTECTED]>
    branch nick: ctdb
    timestamp: Mon 2007-07-09 08:38:01 +1000
    message:
      merge from tridge
=== modified file 'config/events.d/60.nfs'
--- a/config/events.d/60.nfs    2007-07-06 00:54:42 +0000
+++ b/config/events.d/60.nfs    2007-07-10 03:09:35 +0000
@@ -61,12 +61,32 @@
        ;;
 
      recovered)
-        # restart NFS to ensure that all TCP connections to the released ip
-       # are closed
+       [ -f /etc/ctdb/state/nfs/restart ] && [ ! -z "$LOCKD_TCPPORT" ] && {
+               # RST all tcp connections used for NLM to ensure that they do
+               # not survive in ESTABLISHED state across a failover/failback
+               # and create an ack storm
+               netstat -tn |egrep "^tcp.*\s+[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:${LOCKD_TCPPORT}\s+.*ESTABLISHED" | awk '{print $4" "$5}' | while read dest src; do
+                       srcip=`echo $src | cut -d: -f1`
+                       srcport=`echo $src | cut -d: -f2`
+                       destip=`echo $dest | cut -d: -f1`
+                       destport=`echo $dest | cut -d: -f2`
+                       ctdb killtcp $srcip:$srcport $destip:$destport 1 >/dev/null 2>&1 
+#                      ctdb killtcp $destip:$destport $srcip:$srcport 1 >/dev/null 2>&1
+               done
+       } > /dev/null 2>&1
+
        [ -f /etc/ctdb/state/nfs/restart ] && {
-               ( service nfs status > /dev/null 2>&1 && 
-                      service nfs restart > /dev/null 2>&1 &&
-                     service nfslock restart > /dev/null 2>&1 ) &
+               # RST all tcp connections used for NFS to ensure that they do
+               # not survive in ESTABLISHED state across a failover/failback
+               # and create an ack storm
+               netstat -tn |egrep '^tcp.*\s+[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:2049\s+.*ESTABLISHED' | awk '{print $4" "$5}' | while read dest src; do
+                       srcip=`echo $src | cut -d: -f1`
+                       srcport=`echo $src | cut -d: -f2`
+                       destip=`echo $dest | cut -d: -f1`
+                       destport=`echo $dest | cut -d: -f2`
+                       ctdb killtcp $srcip:$srcport $destip:$destport 1 >/dev/null 2>&1 
+                       ctdb killtcp $destip:$destport $srcip:$srcport 1 >/dev/null 2>&1
+               done
        } > /dev/null 2>&1
        /bin/rm -f /etc/ctdb/state/nfs/restart
 

=== modified file 'server/ctdb_recoverd.c'
--- a/server/ctdb_recoverd.c    2007-07-03 22:36:59 +0000
+++ b/server/ctdb_recoverd.c    2007-07-09 07:40:15 +0000
@@ -385,11 +385,6 @@
        for (i=0;i<nodemap->num;i++) {
                struct ctdb_node_flag_change c;
                TDB_DATA data;
-               uint32_t flags = nodemap->nodes[i].flags;
-
-               if (flags & NODE_FLAGS_DISCONNECTED) {
-                       continue;
-               }
 
                c.vnn = nodemap->nodes[i].vnn;
                c.flags = nodemap->nodes[i].flags;
@@ -1073,6 +1068,15 @@
                return;
        }
 
+       /* Don't let messages from remote nodes change the DISCONNECTED flag. 
+          This flag is handled locally based on whether the local node
+          can communicate with the node or not.
+       */
+       c->flags &= ~NODE_FLAGS_DISCONNECTED;
+       if (nodemap->nodes[i].flags&NODE_FLAGS_DISCONNECTED) {
+               c->flags |= NODE_FLAGS_DISCONNECTED;
+       }
+
        if (nodemap->nodes[i].flags != c->flags) {
                DEBUG(0,("Node %u has changed flags - now 0x%x\n", c->vnn, 
c->flags));
        }
@@ -1327,7 +1331,7 @@
                        }
                        if ((remote_nodemap->nodes[i].flags & NODE_FLAGS_INACTIVE) != 
                            (nodemap->nodes[i].flags & NODE_FLAGS_INACTIVE)) {
-                               DEBUG(0, (__location__ " Remote node:%u has different nodemap flags for %d (0x%x vs 0x%x)\n", 
+                               DEBUG(0, (__location__ " Remote node:%u has different nodemap flag for %d (0x%x vs 0x%x)\n", 
                                          nodemap->nodes[j].vnn, i,
                                          remote_nodemap->nodes[i].flags, nodemap->nodes[i].flags));
                                do_recovery(rec, mem_ctx, vnn, num_active, nodemap, 

=== modified file 'tools/ctdb.c'
--- a/tools/ctdb.c      2007-07-05 00:00:51 +0000
+++ b/tools/ctdb.c      2007-07-10 00:24:20 +0000
@@ -308,10 +308,10 @@
  */
 static int kill_tcp(struct ctdb_context *ctdb, int argc, const char **argv)
 {
-       int i, ret;
+       int i, ret, numrst;
        struct sockaddr_in src, dst;
 
-       if (argc < 2) {
+       if (argc < 3) {
                usage();
        }
 
@@ -325,7 +325,9 @@
                return -1;
        }
 
-       for (i=0;i<5;i++) {
+       numrst = strtoul(argv[2], NULL, 0);
+
+       for (i=0;i<numrst;i++) {
                ret = ctdb_sys_kill_tcp(ctdb->ev, &src, &dst);
 
                printf("ret:%d\n", ret);
@@ -889,7 +891,7 @@
        { "recover",         control_recover,           true,  "force recovery" 
},
        { "freeze",          control_freeze,            true,  "freeze all 
databases" },
        { "thaw",            control_thaw,              true,  "thaw all 
databases" },
-       { "killtcp",         kill_tcp,                  false, "kill a tcp 
connection", "<srcip:port> <dstip:port>" },
+       { "killtcp",         kill_tcp,                  false, "kill a tcp 
connection. Try <num> times.", "<srcip:port> <dstip:port> <num>" },
        { "tickle",          tickle_tcp,                false, "send a tcp 
tickle ack", "<srcip:port> <dstip:port>" },
 };
 

=== modified file 'web/nfs.html'
--- a/web/nfs.html      2007-06-12 04:43:26 +0000
+++ b/web/nfs.html      2007-07-10 02:43:46 +0000
@@ -47,16 +47,18 @@
 This file should look something like :
 <pre>
   CTDB_MANAGES_NFS=yes
-  CTDB_MANAGES_NFSLOCK=yes
+  LOCKD_TCPPORT=599
+  LOCKD_UDPPORT=599
   STATD_SHARED_DIRECTORY=/gpfs0/nfs-state
-  STATD_HOSTNAME=\"ctdb -P $STATD_SHARED_DIRECTORY/192.168.1.1 -H /etc/ctdb/statd-callout -p 97\"
+  STATD_HOSTNAME="ctdb -P $STATD_SHARED_DIRECTORY/192.168.1.1 -H /etc/ctdb/statd-callout -p 97"
 </pre>
 
 The CTDB_MANAGES_NFS line tells the events scripts that CTDB is to manage startup and shutdown of the NFS and NFSLOCK services.<br>
 
-The CTDB_MANAGES_NFSLOCK line tells the events scripts that CTDB is also to manage the nfs lock manager.<br>
+With this set to yes, CTDB will start/stop/restart these services as required.<br><br>
 
-With these set to yes, CTDB will start/stop/restart these services as required.<br><br>
+You need to make sure that the lock manager runs on the same port on all nodes in the cluster, since some clients will have "issues" and take a very long time to recover if the port suddenly changes.<br>
+599 above is only an example. You can run the lock manager on any available port as long as you use the same port on all nodes.<br><br>
 
 STATD_SHARED_DIRECTORY is the shared directory where statd and the statd-callout script expects that the state variables and lists of clients to notify are found.<br>
 
