Re: [Pgpool-general] unexpected EOF on client connection

2011-09-14 Thread Lonni J Friedman
On Wed, Sep 14, 2011 at 6:00 PM, Tatsuo Ishii  wrote:
>> On Wed, Sep 14, 2011 at 4:22 PM, Tatsuo Ishii  wrote:
>>>>>> I'm pretty sure that's not the case, as the messages stop whenever
>>>>>> pgpool isn't running, they were not present prior to using pgpool, and
>>>>>> pg_hba.conf is set up such that the database servers only accept
>>>>>> connections from each other and from the server running pgpool.  None of
>>>>>> these servers have normal users connected directly to them (such as
>>>>>> with ssh), nor are they running anything that would connect to the
>>>>>> database as a client.  Also, the volume of these messages is such
>>>>>> that something significant has to be causing them.  Last night, in the
>>>>>> span of 5 minutes, there were 117 of these messages.
>>>>>
>>>>> Ok. I would like to narrow down the reason why we frequently see the
>>>>> "unexpected EOF on client connection" message. I think there are
>>>>> currently two possibilities:
>>>>>
>>>>> 1) a pgpool child was killed for some unknown reason (we can rule out
>>>>>   the segfault case because you don't see it in the pgpool log)
>>>>>
>>>>> 2) a pgpool child disconnects from PostgreSQL in an ungraceful manner
>>>>>
>>>>> For 1) I would like to know whether the pgpool child processes have
>>>>> been fine since they were spawned. Have you seen any pgpool child
>>>>> process disappear since pgpool started?
>>>>
>>>> I assume this should be determined by num_init_children (which I've
>>>> set to 195 in pgpool.conf)?  If so, then I currently have 195
>>>> processes in either the "wait for connection request" state or
>>>> actively connected state.
>>>
>>> No. The pgpool parent process automatically respawns a child process if
>>> it dies, so having num_init_children child processes running does not
>>> tell us anything useful. Record the 195 process IDs and later compare
>>> them with the current process IDs. If some of them have changed, we can
>>> assume that child processes are dying.
>>
>> Ah, good point.  I just diffed the list of PIDs associated with pgpool
>> processes before and after another EOF message in the log, and there
>> were no differences.  So I think that rules out any processes dying?
>
> Right.
>
>> One other thing that I just noticed from comparing logs between all of
>> the database servers is that the time stamps for every one of the
>> 'unexpected EOF on client connection' instances are identical.  In
>> other words, they are happening at the same time on each server.  I
>> think this further suggests that pgpool has to be doing it?
>
> Yes, I think so unless you set connection_life_time to other than 0 or
> the network connection between PostgreSQL and pgpool is unstable.

connection_life_time is currently 0 (since you recommended I change
it earlier).  I don't have any evidence to suggest that the network
connection is unstable.  There are 0 errors of any kind in ifconfig
output.

>
> Let me think how we can make further investigation...

ok, thanks.
___
Pgpool-general mailing list
Pgpool-general@pgfoundry.org
http://pgfoundry.org/mailman/listinfo/pgpool-general


Re: [Pgpool-general] unexpected EOF on client connection

2011-09-14 Thread Tatsuo Ishii
> On Wed, Sep 14, 2011 at 4:22 PM, Tatsuo Ishii  wrote:
>>>>> I'm pretty sure that's not the case, as the messages stop whenever
>>>>> pgpool isn't running, they were not present prior to using pgpool, and
>>>>> pg_hba.conf is set up such that the database servers only accept
>>>>> connections from each other and from the server running pgpool.  None of
>>>>> these servers have normal users connected directly to them (such as
>>>>> with ssh), nor are they running anything that would connect to the
>>>>> database as a client.  Also, the volume of these messages is such
>>>>> that something significant has to be causing them.  Last night, in the
>>>>> span of 5 minutes, there were 117 of these messages.
>>>>
>>>> Ok. I would like to narrow down the reason why we frequently see the
>>>> "unexpected EOF on client connection" message. I think there are
>>>> currently two possibilities:
>>>>
>>>> 1) a pgpool child was killed for some unknown reason (we can rule out
>>>>   the segfault case because you don't see it in the pgpool log)
>>>>
>>>> 2) a pgpool child disconnects from PostgreSQL in an ungraceful manner
>>>>
>>>> For 1) I would like to know whether the pgpool child processes have
>>>> been fine since they were spawned. Have you seen any pgpool child
>>>> process disappear since pgpool started?
>>>
>>> I assume this should be determined by num_init_children (which I've
>>> set to 195 in pgpool.conf)?  If so, then I currently have 195
>>> processes in either the "wait for connection request" state or
>>> actively connected state.
>>
>> No. The pgpool parent process automatically respawns a child process if
>> it dies, so having num_init_children child processes running does not
>> tell us anything useful. Record the 195 process IDs and later compare
>> them with the current process IDs. If some of them have changed, we can
>> assume that child processes are dying.
> 
> Ah, good point.  I just diffed the list of PIDs associated with pgpool
> processes before and after another EOF message in the log, and there
> were no differences.  So I think that rules out any processes dying?

Right.

> One other thing that I just noticed from comparing logs between all of
> the database servers is that the time stamps for every one of the
> 'unexpected EOF on client connection' instances are identical.  In
> other words, they are happening at the same time on each server.  I
> think this further suggests that pgpool has to be doing it?

Yes, I think so unless you set connection_life_time to other than 0 or
the network connection between PostgreSQL and pgpool is unstable.

Let me think how we can make further investigation...
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [Pgpool-general] unexpected EOF on client connection

2011-09-14 Thread Lonni J Friedman
On Wed, Sep 14, 2011 at 4:22 PM, Tatsuo Ishii  wrote:
>>>> I'm pretty sure that's not the case, as the messages stop whenever
>>>> pgpool isn't running, they were not present prior to using pgpool, and
>>>> pg_hba.conf is set up such that the database servers only accept
>>>> connections from each other and from the server running pgpool.  None of
>>>> these servers have normal users connected directly to them (such as
>>>> with ssh), nor are they running anything that would connect to the
>>>> database as a client.  Also, the volume of these messages is such
>>>> that something significant has to be causing them.  Last night, in the
>>>> span of 5 minutes, there were 117 of these messages.
>>>
>>> Ok. I would like to narrow down the reason why we frequently see the
>>> "unexpected EOF on client connection" message. I think there are
>>> currently two possibilities:
>>>
>>> 1) a pgpool child was killed for some unknown reason (we can rule out
>>>   the segfault case because you don't see it in the pgpool log)
>>>
>>> 2) a pgpool child disconnects from PostgreSQL in an ungraceful manner
>>>
>>> For 1) I would like to know whether the pgpool child processes have
>>> been fine since they were spawned. Have you seen any pgpool child
>>> process disappear since pgpool started?
>>
>> I assume this should be determined by num_init_children (which I've
>> set to 195 in pgpool.conf)?  If so, then I currently have 195
>> processes in either the "wait for connection request" state or
>> actively connected state.
>
> No. The pgpool parent process automatically respawns a child process if
> it dies, so having num_init_children child processes running does not
> tell us anything useful. Record the 195 process IDs and later compare
> them with the current process IDs. If some of them have changed, we can
> assume that child processes are dying.

Ah, good point.  I just diffed the list of PIDs associated with pgpool
processes before and after another EOF message in the log, and there
were no differences.  So I think that rules out any processes dying?

One other thing that I just noticed from comparing logs between all of
the database servers is that the time stamps for every one of the
'unexpected EOF on client connection' instances are identical.  In
other words, they are happening at the same time on each server.  I
think this further suggests that pgpool has to be doing it?
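
The cross-server timestamp comparison could be automated roughly as follows. This is only a sketch: it assumes each PostgreSQL log line has the "PID date time zone LOG: message" shape seen in this thread, and the server names (db1, db2) used to key the logs are made up for illustration.

```python
import re

# Matches lines such as:
# 26323 2011-09-13 09:28:19 PDT LOG:  unexpected EOF on client connection
EOF_LINE = re.compile(
    r"^\d+ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ "
    r"LOG:\s+unexpected EOF on client connection"
)


def eof_timestamps(log_text):
    """Return the set of timestamps at which one server logged the EOF message."""
    stamps = set()
    for line in log_text.splitlines():
        match = EOF_LINE.match(line)
        if match:
            stamps.add(match.group(1))
    return stamps


def common_timestamps(logs_by_server):
    """Return timestamps at which every server logged the EOF message."""
    per_server = [eof_timestamps(text) for text in logs_by_server.values()]
    if not per_server:
        return set()
    return set.intersection(*per_server)
```

Calling common_timestamps({"db1": log1, "db2": log2}) returns only the timestamps present in both logs; if that set covers nearly every EOF line, a single common source such as pgpool becomes more likely.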


Re: [Pgpool-general] unexpected EOF on client connection

2011-09-14 Thread Tatsuo Ishii
>>> I'm pretty sure that's not the case, as the messages stop whenever
>>> pgpool isn't running, they were not present prior to using pgpool, and
>>> pg_hba.conf is set up such that the database servers only accept
>>> connections from each other and from the server running pgpool.  None of
>>> these servers have normal users connected directly to them (such as
>>> with ssh), nor are they running anything that would connect to the
>>> database as a client.  Also, the volume of these messages is such
>>> that something significant has to be causing them.  Last night, in the
>>> span of 5 minutes, there were 117 of these messages.
>>
>> Ok. I would like to narrow down the reason why we frequently see the
>> "unexpected EOF on client connection" message. I think there are
>> currently two possibilities:
>>
>> 1) a pgpool child was killed for some unknown reason (we can rule out
>>   the segfault case because you don't see it in the pgpool log)
>>
>> 2) a pgpool child disconnects from PostgreSQL in an ungraceful manner
>>
>> For 1) I would like to know whether the pgpool child processes have
>> been fine since they were spawned. Have you seen any pgpool child
>> process disappear since pgpool started?
> 
> I assume this should be determined by num_init_children (which I've
> set to 195 in pgpool.conf)?  If so, then I currently have 195
> processes in either the "wait for connection request" state or
> actively connected state.

No. The pgpool parent process automatically respawns a child process if
it dies, so having num_init_children child processes running does not
tell us anything useful. Record the 195 process IDs and later compare
them with the current process IDs. If some of them have changed, we can
assume that child processes are dying.
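
A minimal sketch of that PID comparison, assuming the snapshots are taken as text from 'ps auxww' (PID in the second column) and that pgpool children can be recognised by "pgpool:" in the command column, as in the ps lines quoted elsewhere on this list. The function names are illustrative and not part of pgpool.

```python
def parse_pids(ps_output, marker="pgpool:"):
    """Collect PIDs from `ps auxww`-style lines whose text contains marker."""
    pids = set()
    for line in ps_output.splitlines():
        fields = line.split()
        # In `ps auxww` output the PID is the second column.
        if marker in line and len(fields) > 1 and fields[1].isdigit():
            pids.add(int(fields[1]))
    return pids


def diff_pids(before, after):
    """Return (disappeared, new) PID sets between two snapshots."""
    return before - after, after - before
```

Take one snapshot, wait for the next EOF message in the PostgreSQL log, take a second snapshot, and pass both to diff_pids. Two empty sets mean no child was respawned in between.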
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [Pgpool-general] seemingly hung pgpool process consuming 100% CPU

2011-09-14 Thread Lonni J Friedman
Thanks for your reply.  I'll do this the next time this happens (which
will likely be within a few days based on history).

On Wed, Sep 14, 2011 at 3:57 PM, Tatsuo Ishii  wrote:
> Please use gdb. For example,
>
> become postgres user (or root user)
> gdb pgpool 29191
> bt
> cont
> bt
> cont
> :
> :
> :
>
> This will give us an idea where it's looping.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
>> This problem has returned yet again:
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool
>>
>> postgres 29191  3.4  0.1  80192 14728 ?        R    Sep13  51:40
>> pgpool: lfriedman nightly 10.31.96.84(61698) idle
>>
>>
>> I'd really appreciate some input on how to debug this.
>>
>>
>> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman  wrote:
>>> No one else has experienced this or has suggestions how to debug it?
>>>
>>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman  
>>> wrote:
>>>> Greetings,
>>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>>>> standby).  I'm seeing strange behavior where a single pgpool process
>>>> seems to hang after some period of time, and then consume 100% of the
>>>> CPU.  I've seen this behavior happen twice since last Friday (when
>>>> pgpool was brought online in my production environment).  At the
>>>> moment the current hung process looks like this in 'ps auxww' output:
>>>>
>>>> postgres 19838 98.7  0.0  68856  2904 ?        R    Sep06 1027:36
>>>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>>>
>>>>
>>>> In top, I see:
>>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0   1027:29 pgpool
>>>>
>>>>
>>>> When I connect to the process with strace, there is no output, so I'm
>>>> guessing the process is stuck spinning somewhere:
>>>> # strace -p 19838
>>>> Process 19838 attached - interrupt to quit
>>>> ...
>>>> ^CProcess 19838 detached
>>>>
>>>> One thing that I'm certain of is that the client IP (10.31.45.20)
>>>> associated with the hung process has rebooted at least once since that
>>>> process was spawned.  So pgpool seems to be in some confused state, as
>>>> the client definitely severed the connection already.  I checked the
>>>> pgpool log and there are no explicit references to PID 19838.  I'm at
>>>> a loss how to debug this further, but clearly something is wrong
>>>> somewhere, and this isn't normal/expected behavior.


Re: [Pgpool-general] unexpected EOF on client connection

2011-09-14 Thread Lonni J Friedman
On Wed, Sep 14, 2011 at 3:56 PM, Tatsuo Ishii  wrote:
>> On Tue, Sep 13, 2011 at 8:48 PM, Tatsuo Ishii  wrote:
>>>> On Mon, Sep 12, 2011 at 6:47 PM, Lonni J Friedman wrote:
>>>>> On Mon, Sep 12, 2011 at 6:39 PM, Tatsuo Ishii wrote:
>>>>>>>> I couldn't find anything possibly related to your problem at first
>>>>>>>> glance (in theory client_idle_limit and authentication_timeout are not
>>>>>>>> related, but you might want to change them to see whether anything
>>>>>>>> changes).
>>>>>>>
>>>>>>> OK, I'll give that a try.  Should I just try increasing them by 10 or 20s?
>>>>>>
>>>>>> I'd suggest setting them to 0. This disables the functionality
>>>>>> that those directives control.
>>>>>>
>>>>>> Also, you have child_life_time set to 300. I don't expect this to be
>>>>>> related, but could you set it to 0 and see whether anything changes,
>>>>>> just in case?
>>>>>
>>>>> OK, I'll make those changes tomorrow (it's late in the day here, and I
>>>>> don't want to introduce potential problems in the middle of the night
>>>>> when no one is closely monitoring the server), and let you know if
>>>>> they have any impact.
>>>>
>>>>
>>>> client_idle_limit was already 0.  I set authentication_timeout=0 and
>>>> child_life_time=0, and restarted pgpool, however that had no impact.
>>>> I'm still seeing:
>>>> 26323 2011-09-13 09:28:19 PDT LOG:  unexpected EOF on client connection
>>>> 3933 2011-09-13 09:36:20 PDT LOG:  unexpected EOF on client connection
>>>
>>> Humm. Is it possible that those connections do not come from pgpool
>>> process?
>>
>> I'm pretty sure that's not the case, as the messages stop whenever
>> pgpool isn't running, they were not present prior to using pgpool, and
>> pg_hba.conf is set up such that the database servers only accept
>> connections from each other and from the server running pgpool.  None of
>> these servers have normal users connected directly to them (such as
>> with ssh), nor are they running anything that would connect to the
>> database as a client.  Also, the volume of these messages is such
>> that something significant has to be causing them.  Last night, in the
>> span of 5 minutes, there were 117 of these messages.
>
> Ok. I would like to narrow down the reason why we frequently see the
> "unexpected EOF on client connection" message. I think there are
> currently two possibilities:
>
> 1) a pgpool child was killed for some unknown reason (we can rule out
>   the segfault case because you don't see it in the pgpool log)
>
> 2) a pgpool child disconnects from PostgreSQL in an ungraceful manner
>
> For 1) I would like to know whether the pgpool child processes have
> been fine since they were spawned. Have you seen any pgpool child
> process disappear since pgpool started?

I assume this should be determined by num_init_children (which I've
set to 195 in pgpool.conf)?  If so, then I currently have 195
processes in either the "wait for connection request" state or
actively connected state.


Re: [Pgpool-general] seemingly hung pgpool process consuming 100% CPU

2011-09-14 Thread Tatsuo Ishii
Please use gdb. For example,

become postgres user (or root user)
gdb pgpool 29191
bt
cont
bt
cont
:
:
:

This will give us an idea where it's looping.
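
In an interactive session, "cont" resumes the process, so it has to be interrupted with Ctrl-C before each further "bt". An unattended variant could look like the sketch below: it attaches gdb in batch mode several times and collects one backtrace per attach. The helper names are made up for illustration, and it assumes gdb is installed and the script runs as a user allowed to attach to the process (postgres or root).

```python
import subprocess
import time


def gdb_backtrace_command(pid):
    """Build a gdb invocation that attaches to pid, prints a backtrace, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]


def sample_backtraces(pid, samples=5, interval=1.0, run=subprocess.run):
    """Collect several backtraces of a running process, pausing between samples.

    `run` is injectable so the function can be exercised without gdb.
    """
    traces = []
    for _ in range(samples):
        result = run(gdb_backtrace_command(pid), capture_output=True, text=True)
        traces.append(result.stdout)
        time.sleep(interval)
    return traces
```

For example, sample_backtraces(29191) would return five backtraces taken about a second apart. Frames that recur near the top of most samples are a reasonable first guess at where the process is spinning.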
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> This problem has returned yet again:
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool
> 
> postgres 29191  3.4  0.1  80192 14728 ?        R    Sep13  51:40
> pgpool: lfriedman nightly 10.31.96.84(61698) idle
> 
> 
> I'd really appreciate some input on how to debug this.
> 
> 
> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman  wrote:
>> No one else has experienced this or has suggestions how to debug it?
>>
>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman  wrote:
>>> Greetings,
>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>>> standby).  I'm seeing strange behavior where a single pgpool process
>>> seems to hang after some period of time, and then consume 100% of the
>>> CPU.  I've seen this behavior happen twice since last Friday (when
>>> pgpool was brought online in my production environment).  At the
>>> moment the current hung process looks like this in 'ps auxww' output:
>>>
>>> postgres 19838 98.7  0.0  68856  2904 ?        R    Sep06 1027:36
>>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>>
>>>
>>> In top, I see:
>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0   1027:29 pgpool
>>>
>>>
>>> When I connect to the process with strace, there is no output, so I'm
>>> guessing the process is stuck spinning somewhere:
>>> # strace -p 19838
>>> Process 19838 attached - interrupt to quit
>>> ...
>>> ^CProcess 19838 detached
>>>
>>> One thing that I'm certain of is that the client IP (10.31.45.20)
>>> associated with the hung process has rebooted at least once since that
>>> process was spawned.  So pgpool seems to be in some confused state, as
>>> the client definitely severed the connection already.  I checked the
>>> pgpool log and there are no explicit references to PID 19838.  I'm at
>>> a loss how to debug this further, but clearly something is wrong
>>> somewhere, and this isn't normal/expected behavior.


Re: [Pgpool-general] unexpected EOF on client connection

2011-09-14 Thread Tatsuo Ishii
> On Tue, Sep 13, 2011 at 8:48 PM, Tatsuo Ishii  wrote:
>>> On Mon, Sep 12, 2011 at 6:47 PM, Lonni J Friedman  
>>> wrote:
>>>> On Mon, Sep 12, 2011 at 6:39 PM, Tatsuo Ishii wrote:
>>>>>>> I couldn't find anything possibly related to your problem at first
>>>>>>> glance (in theory client_idle_limit and authentication_timeout are not
>>>>>>> related, but you might want to change them to see whether anything
>>>>>>> changes).
>>>>>>
>>>>>> OK, I'll give that a try.  Should I just try increasing them by 10 or 20s?
>>>>>
>>>>> I'd suggest setting them to 0. This disables the functionality
>>>>> that those directives control.
>>>>>
>>>>> Also, you have child_life_time set to 300. I don't expect this to be
>>>>> related, but could you set it to 0 and see whether anything changes,
>>>>> just in case?
>>>>
>>>> OK, I'll make those changes tomorrow (it's late in the day here, and I
>>>> don't want to introduce potential problems in the middle of the night
>>>> when no one is closely monitoring the server), and let you know if
>>>> they have any impact.
>>>
>>>
>>> client_idle_limit was already 0.  I set authentication_timeout=0 and
>>> child_life_time=0, and restarted pgpool, however that had no impact.
>>> I'm still seeing:
>>> 26323 2011-09-13 09:28:19 PDT LOG:  unexpected EOF on client connection
>>> 3933 2011-09-13 09:36:20 PDT LOG:  unexpected EOF on client connection
>>
>> Humm. Is it possible that those connections do not come from pgpool
>> process?
> 
> I'm pretty sure that's not the case, as the messages stop whenever
> pgpool isn't running, they were not present prior to using pgpool, and
> pg_hba.conf is set up such that the database servers only accept
> connections from each other and from the server running pgpool.  None of
> these servers have normal users connected directly to them (such as
> with ssh), nor are they running anything that would connect to the
> database as a client.  Also, the volume of these messages is such
> that something significant has to be causing them.  Last night, in the
> span of 5 minutes, there were 117 of these messages.

Ok. I would like to narrow down the reason why we frequently see the
"unexpected EOF on client connection" message. I think there are
currently two possibilities:

1) a pgpool child was killed for some unknown reason (we can rule out
   the segfault case because you don't see it in the pgpool log)

2) a pgpool child disconnects from PostgreSQL in an ungraceful manner

For 1) I would like to know whether the pgpool child processes have
been fine since they were spawned. Have you seen any pgpool child
process disappear since pgpool started?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


Re: [Pgpool-general] seemingly hung pgpool process consuming 100% CPU

2011-09-14 Thread Lonni J Friedman
This problem has returned yet again:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool

postgres 29191  3.4  0.1  80192 14728 ?        R    Sep13  51:40
pgpool: lfriedman nightly 10.31.96.84(61698) idle


I'd really appreciate some input on how to debug this.


On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman  wrote:
> No one else has experienced this or has suggestions how to debug it?
>
> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman  wrote:
>> Greetings,
>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>> standby).  I'm seeing strange behavior where a single pgpool process
>> seems to hang after some period of time, and then consume 100% of the
>> CPU.  I've seen this behavior happen twice since last Friday (when
>> pgpool was brought online in my production environment).  At the
>> moment the current hung process looks like this in 'ps auxww' output:
>>
>> postgres 19838 98.7  0.0  68856  2904 ?        R    Sep06 1027:36
>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>
>>
>> In top, I see:
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0   1027:29 pgpool
>>
>>
>> When I connect to the process with strace, there is no output, so I'm
>> guessing the process is stuck spinning somewhere:
>> # strace -p 19838
>> Process 19838 attached - interrupt to quit
>> ...
>> ^CProcess 19838 detached
>>
>> One thing that I'm certain of is that the client IP (10.31.45.20)
>> associated with the hung process has rebooted at least once since that
>> process was spawned.  So pgpool seems to be in some confused state, as
>> the client definitely severed the connection already.  I checked the
>> pgpool log and there are no explicit references to PID 19838.  I'm at
>> a loss how to debug this further, but clearly something is wrong
>> somewhere, and this isn't normal/expected behavior.


Re: [Pgpool-general] confirm 2b4736d3dbf2f7ccea62d713d3d64985a93c4c1a

2011-09-14 Thread Imre Facchin
I am looking for failover and load balancing in PostgreSQL.

My choice is very likely to be pgpool, but I have concerns about it
being a SPOF, so I found pgpool-HA.
But nowhere is there a description of what it actually does. I would like to
keep all my VMs the same (no dedicated DB load balancer), so there
would be a pgpool server on every database server, each knowing about all
the other databases. My goal is to be able to talk to any of the pgpool
instances and get the same result. The question is: will pgpool-HA keep the
information about which servers are available/disconnected synchronised
across all pgpool instances, or is it just a hot-standby solution where the
new pgpool server takes the place of the old one if it fails?

TL;DR: is an active:active configuration for pgpool instances possible with
pgpool-HA?


Re: [Pgpool-general] unexpected EOF on client connection

2011-09-14 Thread Lonni J Friedman
On Tue, Sep 13, 2011 at 8:48 PM, Tatsuo Ishii  wrote:
>> On Mon, Sep 12, 2011 at 6:47 PM, Lonni J Friedman  wrote:
>>> On Mon, Sep 12, 2011 at 6:39 PM, Tatsuo Ishii  wrote:
>>>>>> I couldn't find anything possibly related to your problem at first
>>>>>> glance (in theory client_idle_limit and authentication_timeout are not
>>>>>> related, but you might want to change them to see whether anything
>>>>>> changes).
>>>>>
>>>>> OK, I'll give that a try.  Should I just try increasing them by 10 or 20s?
>>>>
>>>> I'd suggest setting them to 0. This disables the functionality
>>>> that those directives control.
>>>>
>>>> Also, you have child_life_time set to 300. I don't expect this to be
>>>> related, but could you set it to 0 and see whether anything changes,
>>>> just in case?
>>>
>>> OK, I'll make those changes tomorrow (it's late in the day here, and I
>>> don't want to introduce potential problems in the middle of the night
>>> when no one is closely monitoring the server), and let you know if
>>> they have any impact.
>>
>>
>> client_idle_limit was already 0.  I set authentication_timeout=0 and
>> child_life_time=0, and restarted pgpool, however that had no impact.
>> I'm still seeing:
>> 26323 2011-09-13 09:28:19 PDT LOG:  unexpected EOF on client connection
>> 3933 2011-09-13 09:36:20 PDT LOG:  unexpected EOF on client connection
>
> Humm. Is it possible that those connections do not come from pgpool
> process?

I'm pretty sure that's not the case, as the messages stop whenever
pgpool isn't running, they were not present prior to using pgpool, and
pg_hba.conf is set up such that the database servers only accept
connections from each other and from the server running pgpool.  None of
these servers have normal users connected directly to them (such as
with ssh), nor are they running anything that would connect to the
database as a client.  Also, the volume of these messages is such
that something significant has to be causing them.  Last night, in the
span of 5 minutes, there were 117 of these messages.
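
A figure like "117 in 5 minutes" can be checked against the log with a sliding window. The sketch below assumes the timestamps have already been extracted from the log as "YYYY-MM-DD HH:MM:SS" strings (the format of the EOF lines quoted above). The function name is made up for illustration.

```python
from datetime import datetime, timedelta


def max_in_window(timestamps, window=timedelta(minutes=5)):
    """Return the largest number of events falling inside any window-long span."""
    times = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps)
    best = 0
    start = 0
    for end, current in enumerate(times):
        # Slide the window start forward until it is within `window` of current.
        while current - times[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best
```

Running it over the EOF timestamps of each database server shows how bursty the messages are and whether the bursts line up with anything else scheduled at that time.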


Re: [Pgpool-general] online recovery fails on HPUX

2011-09-14 Thread Sandeep Thakkar
I have set "enable_pool_hba = false" in pgpool.conf and do not use 
pool_hba.conf. Do we need to enable this? How does client authentication 
work when pool_hba.conf is disabled?





From: Sandeep Thakkar
To: Sandeep Thakkar; Jose Mendoza; "pgpool-general@pgfoundry.org"
Sent: Wednesday, September 14, 2011 4:33 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX


I still face this issue, and I wonder why I see the following error in the 
pgpool log:

2011-09-14 04:48:52 LOG:   pid 10268: starting recovering node 1
2011-09-14 04:48:53 ERROR: pid 10268: start_recover: could not connect master node.


Please help! Thanks

 



From: Sandeep Thakkar
To: Sandeep Thakkar; Jose Mendoza; "pgpool-general@pgfoundry.org"
Sent: Tuesday, September 13, 2011 4:52 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX


I mean, pcp_remote_start contains "$PGCTL -w -D $DESTDIR start", i.e. without SSH.
 
Sandeep.



From: Sandeep Thakkar
To: Jose Mendoza; "pgpool-general@pgfoundry.org"
Sent: Tuesday, September 13, 2011 4:24 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX


Well, actually my database servers and pgpool are running on the same host, so my 
pcp_remote_start does not contain "$PGCTL -w -D $DESTDIR start", i.e. without 
SSH. This works fine on Linux, though.

 
Sandeep.



From: Jose Mendoza 
To: pgpool-general@pgfoundry.org
Sent: Friday, September 2, 2011 12:47 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX


 
If you already have an SSH key defined and it's the same
user on both servers, then I am not sure what could be causing the failure, but
according to your log it's an SSH auth issue.
 
Have you tried running a telnet test to the port to check that
it is listening?
 
Jose
Autonomy Ops
verificare tua hinc
From: Sandeep Thakkar [mailto:sandee...@yahoo.com]
Sent: Friday, September 02, 2011 12:05 AM
To: Jose Mendoza; pgpool-general@pgfoundry.org
Subject: Re: [Pgpool-general] online recovery fails on HPUX
 
Sorry, I didn't get you. Actually,
both the server instances and pgpool are running on the same host. Could
this be a loopback issue?


 
From: Jose Mendoza
To: pgpool-general@pgfoundry.org
Sent: Friday, September 2, 2011 12:16 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX
SSH trust must be created on
both servers for all the accounts involved in the recovery process.
I would add the ssh-keys to
the user for pgpool and try again.
 
 
Jose
Autonomy Ops
verificare
tua hinc
From: Sandeep Thakkar [mailto:sandee...@yahoo.com]
Sent: Thursday, September 01, 2011 10:17 PM
To: Jose Mendoza; pgpool-general@pgfoundry.org
Subject: Re: [Pgpool-general] online recovery fails on HPUX
 
I can see the following lines in /tmp/pgpool.log:
..
2011-09-01 22:57:34 ERROR: pid 22975: check_replication_time_lag: DB node is valid but no persistent connection
2011-09-01 22:57:34 DEBUG: pid 22936: health_check: 1 th DB node status: 3
2011-09-01 22:57:36 LOG:   pid 23019: starting recovering node 1
2011-09-01 22:57:36 ERROR: pid 23019: start_recover: could not connect master node.
2011-09-01 22:57:38 DEBUG: pid 22936: starting health checking
2011-09-01 22:57:38 DEBUG: pid 22936: health_check: 0 th DB node status: 1
2011-09-01 22:57:38 DEBUG: pid 22975: pool_ssl: SSL requested but SSL support is not available
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: auth kind: 0
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: parameter status data received
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: parameter status data received
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: backend key data received
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: transaction state: I
2011-09-01 22:57:39 DEBUG: pid 22936: health_check: 1 th DB node status: 3
2011-09-01 22:57:39 ERROR: pid 22975: connect_inet_domain_socket: connect() failed: Connection refused
2011-09-01 22:57:39 ERROR: pid 22975: make_persistent_db_connection: connection to localhost(5445) failed
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: T
2011-09-01 22:57:39 DEBUG: pid 22975: num_fileds: 1
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: D
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: C
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: Z
2011-09-01 22:57:39 ERROR: pid 22975: check_replication_time_lag: DB node is valid but no persistent connection
2011-09-01 22:57:43 DEBUG: pid 22936: starting health checking
..
 
 


 
From: Sandeep Thakkar
To: Jose Mendoza; "pgpool-general@pgfoundry.org"
Sent: Friday, September 2, 2011 10:10 AM
Subject: Re: [Pgpool-general] online recovery fails on HPUX
# Logging directory
logdir = '/tmp'
 
 


 
From: Jose Mendoza
To: pgpool-general@pgfoundry.org
Sent: Wednesday, August 31, 2011 10:28 PM
Subject: Re: 

Re: [Pgpool-general] online recovery fails on HPUX

2011-09-14 Thread Sandeep Thakkar
I still face this issue, and I wonder why I see the following error in the
pgpool log:

2011-09-14 04:48:52 LOG:   pid 10268: starting recovering node 1
2011-09-14 04:48:53 ERROR: pid 10268: start_recover: could not connect master node.
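
For anyone triaging a log like the one above, here is a minimal Python sketch that folds wrapped lines back together and counts ERROR entries by the function that reported them. It is not part of pgpool; `parse_log` and `errors_by_function` are hypothetical helper names, and the regular expression only assumes the "timestamp LEVEL: pid N: message" layout visible in this thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   2011-09-14 04:48:52 LOG:   pid 10268: starting recovering node 1
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+):\s+pid (?P<pid>\d+): (?P<msg>.*)$"
)


def parse_log(text):
    """Return a list of entry dicts; wrapped continuation lines are
    appended to the message of the preceding entry."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and ".." elision markers.
        if not line or set(line) == {"."}:
            continue
        match = LOG_LINE.match(line)
        if match:
            entry = match.groupdict()
            entry["pid"] = int(entry["pid"])
            entries.append(entry)
        elif entries:
            # A line without a timestamp is treated as a wrapped continuation.
            entries[-1]["msg"] = entries[-1]["msg"].rstrip() + " " + line
    return entries


def errors_by_function(entries):
    """Count ERROR entries keyed by the text before the first colon
    of the message (the reporting function in the lines shown here)."""
    counts = Counter()
    for entry in entries:
        if entry["level"] == "ERROR":
            counts[entry["msg"].split(":", 1)[0]] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "2011-09-14 04:48:52 LOG:   pid 10268: starting recovering node 1\n"
        "2011-09-14 04:48:53 ERROR: pid 10268: start_recover: could not connect master \n"
        "node.\n"
    )
    for name, count in errors_by_function(parse_log(sample)).items():
        print(name, count)
```

Grouping by reporting function makes it easier to separate the recovery failure (`start_recover`) from the connection errors against the node that is already down.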


Please help! Thanks

 



From: Sandeep Thakkar 
To: Sandeep Thakkar; Jose Mendoza; "pgpool-general@pgfoundry.org"
Sent: Tuesday, September 13, 2011 4:52 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX


I mean, pcp_remote_start contains "$PGCTL -w -D $DESTDIR start", i.e. without SSH.
 
Sandeep.



From: Sandeep Thakkar 
To: Jose Mendoza; "pgpool-general@pgfoundry.org"

Sent: Tuesday, September 13, 2011 4:24 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX


Well, actually my database servers and pgpool are running on the same host, so my
pcp_remote_start does not contain "$PGCTL -w -D $DESTDIR start", i.e. without
SSH. This works fine on Linux, though.

 
Sandeep.



From: Jose Mendoza 
To: pgpool-general@pgfoundry.org
Sent: Friday, September 2, 2011 12:47 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX


 
If you already have an SSH key defined and it's the same
user on both servers, then I am not sure what could be causing the failure, but
according to your log it's an SSH auth issue.
 
Have you tried running a telnet test to the port to check that
it is listening?
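
The telnet check can also be scripted. This is a minimal Python sketch that uses only the standard library; `port_is_listening` is a hypothetical helper name, and 5445 in the usage note is simply the port that appears in the quoted log.

```python
import socket


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within
    the timeout, False if it is refused, unreachable, or times out."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For example, `port_is_listening("localhost", 5445)` returning False would match the "Connection refused" entry in the log, which means nothing was accepting connections on that port at that moment.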
 
Jose
Autonomy Ops
verificare tua hinc
From: Sandeep Thakkar [mailto:sandee...@yahoo.com]
Sent: Friday, September 02, 2011 12:05 AM
To: Jose Mendoza; pgpool-general@pgfoundry.org
Subject: Re: [Pgpool-general] online recovery fails on HPUX
 
Sorry, I didn't follow you. Actually,
both server instances and pgpool are running on the same host. Could
this be a loopback issue?


 
From: Jose Mendoza
To: pgpool-general@pgfoundry.org
Sent: Friday, September 2, 2011 12:16 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX
SSH trust must be created on
both servers for all the accounts involved in the recovery process.
I would add the SSH keys to
the pgpool user and try again.
 
 
Jose
Autonomy Ops
verificare tua hinc
From: Sandeep Thakkar [mailto:sandee...@yahoo.com]
Sent: Thursday, September 01, 2011 10:17 PM
To: Jose Mendoza; pgpool-general@pgfoundry.org
Subject: Re: [Pgpool-general] online recovery fails on HPUX
 
I can see the following lines in /tmp/pgpool.log:
..
2011-09-01 22:57:34 ERROR: pid 22975: check_replication_time_lag: DB node is valid but no persistent connection
2011-09-01 22:57:34 DEBUG: pid 22936: health_check: 1 th DB node status: 3
2011-09-01 22:57:36 LOG:   pid 23019: starting recovering node 1
2011-09-01 22:57:36 ERROR: pid 23019: start_recover: could not connect master node.
2011-09-01 22:57:38 DEBUG: pid 22936: starting health checking
2011-09-01 22:57:38 DEBUG: pid 22936: health_check: 0 th DB node status: 1
2011-09-01 22:57:38 DEBUG: pid 22975: pool_ssl: SSL requested but SSL support is not available
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: auth kind: 0
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: parameter status data received
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: parameter status data received
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: backend key data received
2011-09-01 22:57:38 DEBUG: pid 22975: s_do_auth: transaction state: I
2011-09-01 22:57:39 DEBUG: pid 22936: health_check: 1 th DB node status: 3
2011-09-01 22:57:39 ERROR: pid 22975: connect_inet_domain_socket: connect() failed: Connection refused
2011-09-01 22:57:39 ERROR: pid 22975: make_persistent_db_connection: connection to localhost(5445) failed
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: T
2011-09-01 22:57:39 DEBUG: pid 22975: num_fileds: 1
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: D
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: C
2011-09-01 22:57:39 DEBUG: pid 22975: do_query: kind: Z
2011-09-01 22:57:39 ERROR: pid 22975: check_replication_time_lag: DB node is valid but no persistent connection
2011-09-01 22:57:43 DEBUG: pid 22936: starting health checking
..
 
 


 
From: Sandeep Thakkar
To: Jose Mendoza; "pgpool-general@pgfoundry.org"
Sent: Friday, September 2, 2011 10:10 AM
Subject: Re: [Pgpool-general] online recovery fails on HPUX
# Logging directory
logdir = '/tmp'
 
 


 
From: Jose Mendoza
To: pgpool-general@pgfoundry.org
Sent: Wednesday, August 31, 2011 10:28 PM
Subject: Re: [Pgpool-general] online recovery fails on HPUX
What does your pgpool.conf say about logging:
# Logging directory
logdir = '/var/log'
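
To confirm which value a given pgpool.conf actually holds, the file can be read programmatically. This is a minimal Python sketch, assuming the simple `key = 'value'` layout with `#` comments shown here; `read_settings` is a hypothetical helper, not a pgpool tool, and it does not handle a `#` that appears inside a quoted value.

```python
import re

# key = value, with optional surrounding whitespace
SETTING = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$")


def read_settings(text):
    """Parse 'key = value' lines into a dict of strings.
    Comments starting with '#' are dropped and single quotes
    around a value are removed."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0]  # assumes no '#' inside values
        match = SETTING.match(line)
        if not match:
            continue
        key, value = match.groups()
        if len(value) >= 2 and value[0] == value[-1] == "'":
            value = value[1:-1]
        settings[key] = value
    return settings
```

Given the two lines quoted above, `read_settings(...)["logdir"]` would be `/var/log`.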
 
 
Jose
Autonomy Ops
verificare tua hinc
From: pgpool-general-boun...@pgfoundry.org [mailto:pgpool-general-boun...@pgfoundry.org] On Behalf Of Sandeep Thakkar
Sent: Wednesday, August 31, 2011 4:58 AM
To: Sandeep Thakkar; pgpool-general@pgfoundry.org
Subject: Re: [Pgpool-general] online r