Tatsuo,

I modified the source file for Pgpool -II 2.3.2.2.
I attached the patch(diff).

Regards,
Rumman
________________________________________
From: Tatsuo Ishii [[email protected]]
Sent: Tuesday, February 23, 2010 9:14 AM
To: Brian Maguire
Cc: [email protected]; Gazi Rahman; Ahmed Iftekhar; Arthur Vossberg; 
[email protected]
Subject: Re: Pgpool II 2.3.1 Patch Contribution

[Cced to pgpool-hackers]

Brian,

Thanks for your contribution. I will look into the code. Also I would
like to hear from pgpool hackers's opinion if any.

If possible, it would be great if you could generate patches (diff)
against CVS HEAD. Also you should change pool_config.l, rather than
pool_config.c. Another request is, please provide documentation. You
can modify doc/pgpool-en.html.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Tatsuo
>
>
>
> I would like to contribibution of a patch of a code change to support a 
> config change for the killing of all child processes rather than waiting for 
> disconnect.  The description of the rational and the code chnage are below 
> and attached.  Please let me know if this can be contributed or even better 
> avoided all together some how.
>
>
>
>
>
> Cheers,
>
>
>
> Brian
>
>
>
>
>
>
>
>
>
>
>
>
>
> We changed the source files of Pgpool II 2.3.1 for the following two purposes:
>     1. To avoid the wait for the disconnection of all sessions at 2nd stage 
> of pgpool recovery.
>     2. To avoid the disconnection of active sessions at Primary node when the 
> secondary node stop communication with the primary pgpool.
>
> 1. To avoid the wait for the disconnection of all sessions at 2nd stage of 
> pgpool recovery
>
> With default implementation of Pgpool II 2.3.1, during recovery, at 2nd stage 
> pgpool waits for disconnection of all sessions.  But in 24x7 production 
> database, without stopping the application, it is difficult to ensure that no 
> connection will exist in the database and if there is a session connected in 
> the database, the recovery process will wait for its disconnection and if it 
> does not happen within a specific period then the recovery will fail.
> To avoid this recovery failure we added a parameter 
> 'kill_all_child_process_before_2nd_stage"
>
> if kill_all_child_process_before_2nd_stage = true/1 then
>     kill all child process before the start of 2nd stage so that no need to 
> wait for disconnections of active sessions
> else
>     wait for the disconnections of all active sessions; that is same as the 
> Pgpool II 2.3.1
> end if
>
> Default value for kill_all_child_process_before_2nd_stage is set to false/0.
>
> The following 3 files have been modified:
>
> pool.h
> pool_config.c
> recovery.c
>
>
>
> File : pool.h   //added the parameter
>
>
>
> typedef struct
> {
> ....
> int kill_all_child_process_during_before_2nd_stage;
> } POOL_CONFIG;
>
>
> File: pool_config.c  //read the pgpool.conf value set by the pgpool 
> administrator
>
>
>
> int pool_init_config()
> {
> ...
> pool_config->kill_all_child_process_before_2nd_stage = 0;
>
> } //int pool_init_config()
>
>
> int pool_get_config(char *confpath, POOL_CONFIG_CONTEXT context)
> {
> ....
>     //kill_all_child_process_before_2nd_stage
>     else if (!strcmp(key, "kill_all_child_process_before_2nd_stage") && 
> CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
>         {
>             int v = eval_logical(yytext);
>
>             if (v < 0)
>             {
>                 pool_error("pool_config: invalid value %s for %s", yytext, 
> key);
>                 return(-1);
>             }
>             pool_config->kill_all_child_process_before_2nd_stage = v;
>     } //kill_all_child_process_before_2nd_stage
> ....
> }//int pool_get_config
>
>
> File: recovery.c //added the parameterized functionality in the block 
> "modified here"
>
>
>
> int start_recovery(int recovery_node)
> {
>     ...
>     pool_log("1st stage is done");
>
>     --------------- modified here -------------------------
>         if (pool_config->kill_all_child_process_before_2nd_stage == 1)
>                 {       /* kill all children processes */
>
>                         pool_log("Disconnection of all sessions before 2nd 
> stage started");
>                         pool_debug("pool_config 
> =%d",pool_config->num_init_children);
>                         int i;
>                         for (i = 0; i < pool_config->num_init_children; i++)
>                         {
>                                 pid_t pid = pids[i].pid;
>                                 if (pid)
>                                 {
>                                         kill(pid, SIGQUIT);
>                                         pool_debug("Disconnect session before 
> 2nd stage: kill %d", pid);
>                                 }//if (pids[i].pid)
>                         }//for (i = 0; i < pool_config->num_init_children; 
> i++)
>                         pool_log("Killed all active sessions");
>                 }// if (pool_config->kill_all_child_process_before_2nd_stage 
> == 1)
>
>     ------------ end of modification ---------------------------------
>     pool_log("starting 2nd stage");
>
>     /* 2nd stage */
>     *InRecovery = 1;
>         -------- modified here ----------------
>         pool_log("---->Waiting for all connection closed <---");
>         --------- end of modification ----------
>     if (wait_connection_closed() != 0)
>     {
>         PQfinish(conn);
>         pool_error("start_recovery: timeover for waiting connection closed");
>         return 1;
>     }
>     pool_log("all connections from clients have been closed");
>
>     ...
> } //int start_recovery
> 2. To avoid the disconnection of active sessions at Primary node when the 
> secondary node stop communication with the primary pgpool
>
> With default implementation of Pgpool II 2.3.1, if either node is down pgpool 
> kills all child processes to stop TCP/IP retrying. But this may cause problem 
> in our environment.
> Let a situation where secondary node somehow disconnected and stopped 
> communication with the primary node. With default implementation, pgpool 
> kills all active calls and other activities of our application.
> It is not desirable. Activities at primary node must not be hampered if 
> secondary node stops communication.
> To avoid this, we added a parameter 
> "restart_all_process_for_same_master_node_in_failover".
>
> If restart_all_process_for_same_master_node_in_failover = true/1 then
>    restart all child processes that is same as the default with Pgpool II 
> 2.3.1
> else
>    don't restart the child processes for same master node
> end if
>
> Default value for restart_all_process_for_same_master_node_in_failover set to 
> 1/true;
>
> Actually, here we have only enabled the old functionaluty of do not restart 
> child process for same master node with the parameter 
> restart_all_process_for_same_master_node_in_failover.
>
>
>
> The following 3 files have been changed:
> pool.h
> pool_config.c
> main.c
>
>
>
> File: pool.h    //added the parameter
>
>
>
> typedef struct
> {
> ....
> int restart_all_process_for_same_master_node_in_failover;
> } POOL_CONFIG;
>
>
> File: pool_config.c  //read the pgpool.conf value set by the pgpool 
> administrator
>
>
>
> int pool_init_config()
> {
> ...
> pool_config->restart_all_process_for_same_master_node_in_failover = 1;
> } //int pool_init_config()
>
> int pool_get_config(char *confpath, POOL_CONFIG_CONTEXT context)
> {
> ....
> else if (!strcmp(key, "restart_all_process_for_same_master_node_in_failover") 
> &&
>                  CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
>         {
>             int v = eval_logical(yytext);
>
>             if (v < 0)
>             {
>                 pool_error("pool_config: invalid value %s for %s", yytext, 
> key);
>                 fclose(fd);
>                 return(-1);
>             }
>             pool_debug("restart_all_process_for_same_master_node_in_failover: 
> %d", v);
>             pool_config->restart_all_process_for_same_master_node_in_failover 
> = v;
>         }
> ....
>
> } //int pool_get_config(char *confpath, POOL_CONFIG_CONTEXT context)
>
>
>
>
> File: main.c // added the parameterized functionality; the block "modified 
> here" indicates our modification
>
>
>
> ...
> static void failover(void)
> {
>
>     ...
>     if (new_master == pool_config->backend_desc->num_backends)
>     {
>         pool_error("failover_handler: no valid DB node found");
>     }
>
> /*
>  * Before we tried to minimize restarting pgpool to protect existing
>  * connections from clients to pgpool children. What we did here was,
>  * if children other than master went down, we did not fail over.
>  * This is wrong. Think about following scenario. If someone
>  * accidentally plugs out the network cable, the TCP/IP stack keeps
>  * retrying for long time (typically 2 hours). The only way to stop
>  * the retry is restarting the process.  Bottom line is, we need to
>  * restart all children in any case.  See pgpool-general list posting
>  * "TCP connections are *not* closed when a backend timeout" on Jul 13
>  * 2008 for more details.
>  * It has been parameterized as people may want behavior depending on their 
> system.
>  */
>
> ----------- modified here ------------------------
> //#ifdef NOT_USED //
>     else
>     {
>         if ( 
> pool_config->restart_all_process_for_same_master_node_in_failover == 0 &&   
> Req_info->master_node_id == new_master && *InRecovery == 0)
>         {
>             pool_log("failover_handler: do not restart pgpool. same master 
> node %d was selected", new_master);
>             if (Req_info->kind == NODE_UP_REQUEST)
>             {
>                 pool_log("failback done. reconnect host %s(%d)",
>                          BACKEND_INFO(node_id).backend_hostname,
>                          BACKEND_INFO(node_id).backend_port);
>             }
>             else
>             {
>                 pool_log("failover done. shutdown host %s(%d)",
>                          BACKEND_INFO(node_id).backend_hostname,
>                          BACKEND_INFO(node_id).backend_port);
>             }
>
>             /* exec failover_command */
>             for (i = 0; i < pool_config->backend_desc->num_backends; i++)
>             {
>                 if (nodes[i])
>                     trigger_failover_command(i, 
> pool_config->failover_command);
>             }
>
>             pool_semaphore_unlock(REQUEST_INFO_SEM);
>             switching = 0;
>             kill(pcp_pid, SIGUSR2);
>             switching = 0;
>             return;
>         }
>     }
> //#endif //
> ----------- end of modification ------------------------
>
>     /* kill all children */
>     for (i = 0; i < pool_config->num_init_children; i++)
>     {
>         pid_t pid = pids[i].pid;
>         if (pid)
>         {
>             kill(pid, SIGQUIT);
>             pool_debug("failover_handler: kill %d", pid);
>         }
>     }
>
>     ...
> } // end of failover()
> ...
diff -aur pgpool-II-2.3.2.2/doc/pgpool-en.html 
pgpool-II-2.3.2.2.modified/doc/pgpool-en.html
--- pgpool-II-2.3.2.2/doc/pgpool-en.html        2010-02-02 06:30:58.000000000 
+0600
+++ pgpool-II-2.3.2.2.modified/doc/pgpool-en.html       2010-03-10 
17:17:36.000000000 +0600
@@ -1035,6 +1035,22 @@
           client_idle_limit_in_recovery is 0, which means the functionality is 
turned
           off. You need to reload pgpool.conf if you change
           client_idle_limit_in_recovery.</p>
+          
+       <dt>kill_all_child_process_before_2nd_stage
+  <dd>
+  <p> This parameter takes effect in recovery stage. If set to true, it kills 
all sessions both active or inactive connected 
+       in the pgpool before the 2nd stage of online recovery and avoid the 
wait for connections close. If set to false, 
+  then this operation is ommited. You need to reload pgpool.conf if you change 
the value of this parameter.</p> 
+       <p>Default value is false.
+       </p>
+         
+  <dt>restart_all_process_for_same_master_node
+  <dd>
+  <p> This parameter takes effect in failover operation. If set to true, 
pgpool kills all connections for any node goes down. Otherwise, that is, if set 
to false, then
+        pgpool does not kill connections for children other than master node 
go down.You need to reload pgpool.conf 
+    if you change the value of this parameter.</p>       
+       <p>Default value is true.
+       </p>
 
 <dt>lobj_lock_table
 <dd>
diff -aur pgpool-II-2.3.2.2/main.c pgpool-II-2.3.2.2.modified/main.c
--- pgpool-II-2.3.2.2/main.c    2010-02-13 17:23:55.000000000 +0600
+++ pgpool-II-2.3.2.2.modified/main.c   2010-03-10 17:17:08.000000000 +0600
@@ -1383,10 +1383,10 @@
  * 2008 for more details.
  */
 
-#ifdef NOT_USED
+//#ifdef NOT_USED
        else
        {
-               if (Req_info->master_node_id == new_master && *InRecovery == 0)
+               if ( 
pool_config->restart_all_process_for_same_master_node_in_failover==0 &&  
Req_info->master_node_id == new_master && *InRecovery == 0)
                {
                        pool_log("failover_handler: do not restart pgpool. same 
master node %d was selected", new_master);
                        if (Req_info->kind == NODE_UP_REQUEST)
@@ -1416,7 +1416,7 @@
                        return;
                }
        }
-#endif
+//#endif
        /* kill all children */
        for (i = 0; i < pool_config->num_init_children; i++)
        {
diff -aur pgpool-II-2.3.2.2/pool_config.l 
pgpool-II-2.3.2.2.modified/pool_config.l
--- pgpool-II-2.3.2.2/pool_config.l     2010-01-31 08:22:24.000000000 +0600
+++ pgpool-II-2.3.2.2.modified/pool_config.l    2010-03-10 17:17:09.000000000 
+0600
@@ -182,7 +182,10 @@
        pool_config->ssl_key = "";
        pool_config->ssl_ca_cert = "";
        pool_config->ssl_ca_cert_dir = "";
-
+  
+  pool_config->kill_all_child_process_before_2nd_stage = 0;
+  pool_config->restart_all_process_for_same_master_node_in_failover = 1;
+  
        res = gethostname(localhostname,sizeof(localhostname));
        if(res !=0 )
        {
@@ -1185,7 +1188,8 @@
                                (context == RELOAD_CONFIG && (status == 
CON_UNUSED || status == CON_DOWN)))
                                
strncpy(BACKEND_INFO(slot).backend_data_directory, str, MAX_PATH_LENGTH);
                }
-               else if (!strcmp(key, "log_statement") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
+
+    else if (!strcmp(key, "log_per_node_statement") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
                {
                        int v = eval_logical(yytext);
 
@@ -1194,9 +1198,10 @@
                                pool_error("pool_config: invalid value %s for 
%s", yytext, key);
                                return(-1);
                        }
-                       pool_config->log_statement = v;
+                       pool_config->log_per_node_statement = v;
                }
-               else if (!strcmp(key, "log_per_node_statement") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
+               
+    else if (!strcmp(key, "log_statement") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
                {
                        int v = eval_logical(yytext);
 
@@ -1205,9 +1210,10 @@
                                pool_error("pool_config: invalid value %s for 
%s", yytext, key);
                                return(-1);
                        }
-                       pool_config->log_per_node_statement = v;
+                       pool_config->log_statement = v;
                }
-               else if (!strcmp(key, "log_statement") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
+               
+       else if (!strcmp(key, "kill_all_child_process_before_2nd_stage") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
                {
                        int v = eval_logical(yytext);
 
@@ -1216,8 +1222,21 @@
                                pool_error("pool_config: invalid value %s for 
%s", yytext, key);
                                return(-1);
                        }
-                       pool_config->log_statement = v;
-               }
+                       pool_config->kill_all_child_process_before_2nd_stage = 
v;
+    } //kill_all_child_process_before_2nd_stage
+       
+       else if (!strcmp(key, 
"restart_all_process_for_same_master_node_in_failover") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
+               {
+                       int v = eval_logical(yytext);
+
+                       if (v < 0)
+                       {
+                               pool_error("pool_config: invalid value %s for 
%s", yytext, key);
+                               return(-1);
+                       }
+                       
pool_config->restart_all_process_for_same_master_node_in_failover = v;
+    } //restart_all_process_for_same_master_node_in_failover
+
 
                else if (!strcmp(key, "lobj_lock_table") && 
CHECK_CONTEXT(INIT_CONFIG|RELOAD_CONFIG, context))
                {
diff -aur pgpool-II-2.3.2.2/pool.h pgpool-II-2.3.2.2.modified/pool.h
--- pgpool-II-2.3.2.2/pool.h    2010-02-03 14:11:42.000000000 +0600
+++ pgpool-II-2.3.2.2.modified/pool.h   2010-03-10 17:17:10.000000000 +0600
@@ -223,6 +223,9 @@
        char *ssl_key;  /* path to ssl key (frontend only) */
        char *ssl_ca_cert;      /* path to root (CA) certificate */
        char *ssl_ca_cert_dir;  /* path to directory containing CA certificates 
*/
+       
+       int kill_all_child_process_before_2nd_stage;
+       int restart_all_process_for_same_master_node_in_failover;
 } POOL_CONFIG;
 
 #define MAX_PASSWORD_SIZE              1024
diff -aur pgpool-II-2.3.2.2/recovery.c pgpool-II-2.3.2.2.modified/recovery.c
--- pgpool-II-2.3.2.2/recovery.c        2009-12-23 15:30:43.000000000 +0600
+++ pgpool-II-2.3.2.2.modified/recovery.c       2010-03-10 17:17:12.000000000 
+0600
@@ -90,11 +90,30 @@
        }
 
        pool_log("1st stage is done");
-
+  
+   if (pool_config->kill_all_child_process_before_2nd_stage == 1)
+         {       /* kill all children processes */
+       
+                 pool_log("Disconnection of all sessions before 2nd stage 
started");
+                 pool_debug("pool_config =%d",pool_config->num_init_children);
+                 int i;
+                 for (i = 0; i < pool_config->num_init_children; i++)
+                 {
+                         pid_t pid = pids[i].pid;
+                         if (pid)
+                         {
+                                 kill(pid, SIGQUIT);
+                                 pool_debug("Disconnect session before 2nd 
stage: kill %d", pid);
+                         }//if (pids[i].pid)
+                 }//for (i = 0; i < pool_config->num_init_children; i++)
+                 pool_log("Killed all active sessions");
+         }// if (pool_config->kill_all_child_process_before_2nd_stage == 1)
+  
        pool_log("starting 2nd stage");
 
        /* 2nd stage */
        *InRecovery = 1;
+       pool_log("---->Waiting for all connection closed <---");
        if (wait_connection_closed() != 0)
        {
                PQfinish(conn);
_______________________________________________
Pgpool-hackers mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-hackers

Reply via email to