Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Adrian Klaver

On 12/17/2015 07:56 AM, Will McCormick wrote:

Thanks a ton for the prompt response.

I've read most of this but some of it was not clear until we discussed.





See here for more detail: WLM: Reading now :)



While reading I would suggest having the postgresql.conf files on the 
master and the standby and the recovery.conf file on the standby open so 
you can see how your settings match up.


I would also take a look at what is in:

/opt/postgres/9.1/archive/


--
Adrian Klaver
adrian.kla...@aklaver.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Will McCormick
Thanks for the great assistance

On Thu, Dec 17, 2015 at 11:27 AM, Adrian Klaver 
wrote:

> On 12/17/2015 07:56 AM, Will McCormick wrote:
>
>> Thanks a ton for the prompt response.
>>
>> I've read most of this but some of it was not clear until we discussed.
>>
>> Updated with WLM:
>>
>> On 12/17/2015 07:17 AM, Will McCormick wrote:
>>
>> I inherited a 9.1 replication environment
>>
>> Few basic questions that I can't find clear answers / clarifications
>> for
>> if possible:
>>
>> 3 types of replication in 9.1 I've read about from the official docs:
>>
>> 1) warm standby
>> 2) hot standby
>> 3) streaming replication
>>
>> I'm using streaming replication I believe,  the only indication I have
>> is that  there is the primary_conninfo on the standby. Is this the
>> only
>> indication?
>>
>>
>> WLM: I also see streaming replication in the logs.
>>
>> On standby:
>>
>> http://www.postgresql.org/docs/9.1/interactive/functions-admin.html
>> "
>> pg_last_xlog_receive_location()
>>
>> Get last transaction log location received and synced to disk by
>> streaming replication. While streaming replication is in progress this
>> will increase monotonically. If recovery has completed this will remain
>> static at the value of the last WAL record received and synced to disk
>> during recovery. If streaming replication is disabled, or if it has not
>> yet started, the function returns NULL."
>>
>> WLM: When I do this on the standby I get an error:
>>
>>ERROR:  recovery is in progress
>>
>>HINT:  WAL control functions cannot be ...
>>
>
> What is the full hint message?
> The functions are supposed to be able to be run while the server is in
> recovery.
>
>
>
>>
>>
>> Is it possible to get the following if using streaming replication under
>> normal operations?
>>
>> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
>> No such file or directory/
>>
>> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
>> No such file or directory/
>>
>> /LOG:  streaming replication successfully connected to primary/
>>
>> /FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
>> segment 0007000F0057 has already been removed/
>>
>>
>> Assuming above is from standby log, correct? WLM: yes
>>
>> The cp lines would seem to indicate a restore_command in the standby
>> recovery.conf, is that the case?:
>>
>> http://www.postgresql.org/docs/9.1/static/archive-recovery-settings.html
>>
>> restore_command (string) WLM: Correct
>>
>>
>> The FATAL indicates that the WAL file has already been recycled on the
>> master.
>>
>> WLM: I had read this; what confuses me a bit is:
>>
>>/cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
>>No such file or directory/
>>
>> Does Streaming replication automatically use Archived WAL files when
>> WAL XLOG files don't contain a transaction?
>>
>> We did have wal_keep_segments set to 0. I changed this to 50 but
>> want to better understand this. Especially the correlation between the
>> Archived WALs and the XLOG WALs. My guess is the difference between
>> Streaming replication and the others is very simply that Streaming
>> replication can read the XLOG WALs as well? So if all the Archived WALs
>> have been shipped and processed to the Standby then the XLOGs are
>> processed but not shipped? This meaning at a transaction level "kindof"?
>>
>
> Not really, see the section below (#STREAMING-REPLICATION) I posted
> previously. It is either/or: if streaming is set up and the standby can
> reach the master xlog directory then it will stream the WAL files from
> there. If the standby cannot access the xlog directory and if you have WAL
> archiving set up on the master and archive restore setup on the standby it
> will switch to full WAL log shipping from the archived WAL directory,
> assuming the files it needs are there.
>
>
>
>> See:
>>
>>
>> http://www.postgresql.org/docs/9.1/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER
>>
>> "wal_keep_segments (integer)
>>
>>  Specifies the minimum number of past log file segments kept in the
>> pg_xlog directory, in case a standby server needs to fetch them for
>> streaming replication. Each segment is normally 16 megabytes. If a
>> standby server connected to the primary falls behind by more than
>> wal_keep_segments segments, the primary might remove a WAL segment still
>> needed by the standby, in which case the replication connection will be
>> terminated. (However, the standby server can recover by fetching the
>> segment from archive, if WAL archiving is in use.)
>>
>>  This sets only the minimum number of segments retained in pg_xlog;
>> the system might need to retain more segments for WAL archival or to
>> recover from a checkpoint. If wal_keep_segments is zero (the default),
>> the system doesn't keep any extra segments for standby purposes, 

Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread David Steele
Hi Will,

On 12/17/15 10:17 AM, Will McCormick wrote:
> I inherited a 9.1 replication environment
> 
> Few basic questions that I can't find clear answers / clarifications for
> if possible:
> 
> 3 types of replication in 9.1 I've read about from the official docs:
> 
> 1) warm standby 

This is a standby which is applying WAL segments from the master (via
restore_command log shipping or streaming replication).

> 2) hot standby

Just like a warm standby but adds the ability to do read-only queries
(and the master must know it is a hot standby).

> 3) streaming replication

Another way to deliver WAL segments from the master, but the advantage
over restore_command log shipping is that transactions are applied as
soon as they happen on the master, rather than when a WAL segment is
pushed from the master to the archive.

> I'm using streaming replication I believe,  the only indication I have
> is that  there is the primary_conninfo on the standby. Is this the only
> indication?
> 
> Is it possible to get the following if using streaming replication under
> normal operations?
> 
> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
> No such file or directory/
> 
> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
> No such file or directory/
> 
> /LOG:  streaming replication successfully connected to primary/
> 
> /FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
> segment 0007000F0057 has already been removed/

What this means is that your standby has been out of date for some time.  The
WAL segment it needs has been expired from both the archive
(/opt/postgres/9.1/archive) and the master's pg_xlog directory.

Your only option now (unless you can find the required WAL segments
somewhere) is to do a new backup using pg_basebackup or some other
mechanism to bring the standby up to date.
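For anyone following along, a minimal resync along those lines might look
like the sketch below. The host name, replication role, and data directory
are illustrative assumptions, not taken from this thread; only the archive
path echoes the one mentioned above.

    # On the standby: stop the server, then take a fresh base backup from
    # the master. Host, user, and directory names are assumptions.
    pg_ctl -D /opt/postgres/9.1/data stop -m fast

    # -x (--xlog) includes the WAL needed to make the backup consistent (9.1).
    pg_basebackup -h master-host -U replication -D /opt/postgres/9.1/data.new -x -P

    # Swap the new data directory into place, put recovery.conf back, and
    # start the standby again with pg_ctl start.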

> My understanding is that warm standby and hot standby do log shipping
> and there is a greater window for transactions not to be sent to the
> standby because WAL XLOG must be filled.

A hot or warm standby can be maintained with either log shipping or
streaming replication.

> Whereas Streaming replication basically sends at the transaction level?

But yes, this is the advantage of streaming replication.

I have written a tutorial that covers setting up a hot standby with or
without streaming replication using pgBackRest:

http://www.pgbackrest.org/user-guide.html#replication

You can replace backup/restore/archive with other methods but the
principle remains the same.  The tutorial is for 9.4 but should work
equally well with 9.1.

-- 
-David
da...@pgmasters.net





[GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Will McCormick
I inherited a 9.1 replication environment

Few basic questions that I can't find clear answers / clarifications for if
possible:

3 types of replication in 9.1 I've read about from the official docs:

1) warm standby
2) hot standby
3) streaming replication

I'm using streaming replication, I believe; the only indication I have is
that primary_conninfo is set on the standby. Is this the only
indication?

Is it possible to get the following if using streaming replication under
normal operations?

*cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057': No
such file or directory*

*cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057': No
such file or directory*

*LOG:  streaming replication successfully connected to primary*

*FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
segment 0007000F0057 has already been removed*


My understanding is that warm standby and hot standby do log shipping and
there is a greater window for transactions not to be sent to the standby
because a WAL XLOG segment must be filled first.


Whereas Streaming replication basically sends at the transaction level?


I'm sure this is somewhat misinformed!


Thanks,


Will


Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Adrian Klaver

On 12/17/2015 07:56 AM, Will McCormick wrote:

Thanks a ton for the prompt response.

I've read most of this but some of it was not clear until we discussed.

Updated with WLM:

On 12/17/2015 07:17 AM, Will McCormick wrote:

I inherited a 9.1 replication environment

Few basic questions that I can't find clear answers / clarifications for
if possible:

3 types of replication in 9.1 I've read about from the official docs:

1) warm standby
2) hot standby
3) streaming replication

I'm using streaming replication, I believe; the only indication I have
is that primary_conninfo is set on the standby. Is this the only
indication?


WLM: I also see streaming replication in the logs.

On standby:

http://www.postgresql.org/docs/9.1/interactive/functions-admin.html
"
pg_last_xlog_receive_location()

Get last transaction log location received and synced to disk by
streaming replication. While streaming replication is in progress this
will increase monotonically. If recovery has completed this will remain
static at the value of the last WAL record received and synced to disk
during recovery. If streaming replication is disabled, or if it has not
yet started, the function returns NULL."

WLM: When I do this on the standby I get an error:

   ERROR:  recovery is in progress

   HINT:  WAL control functions cannot be ...


What is the full hint message?
The functions are supposed to be able to be run while the server is in 
recovery.






Is it possible to get the following if using streaming replication under
normal operations?

/cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
No such file or directory/

/cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
No such file or directory/

/LOG:  streaming replication successfully connected to primary/

/FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
segment 0007000F0057 has already been removed/


Assuming above is from standby log, correct? WLM: yes

The cp lines would seem to indicate a restore_command in the standby
recovery.conf, is that the case?:

http://www.postgresql.org/docs/9.1/static/archive-recovery-settings.html

restore_command (string) WLM: Correct


The FATAL indicates that the WAL file has already been recycled on the
master.

WLM: I had read this; what confuses me a bit is:

   /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
   No such file or directory/

Does Streaming replication automatically use Archived WAL files when
WAL XLOG files don't contain a transaction?

We did have wal_keep_segments set to 0. I changed this to 50 but
want to better understand this. Especially the correlation between the
Archived WALs and the XLOG WALs. My guess is the difference between
Streaming replication and the others is very simply that Streaming
replication can read the XLOG WALs as well? So if all the Archived WALs
have been shipped and processed to the Standby then the XLOGs are
processed but not shipped? This meaning at a transaction level "kindof"?


Not really, see the section below (#STREAMING-REPLICATION) I posted 
previously. It is either/or: if streaming is set up and the standby can 
reach the master xlog directory then it will stream the WAL files from 
there. If the standby cannot access the xlog directory, and if you have 
WAL archiving set up on the master and archive restore set up on the 
standby, it will switch to full WAL log shipping from the archived WAL 
directory, assuming the files it needs are there.
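As a concrete sketch of those two paths, a 9.1 standby's recovery.conf
might combine them like this. The host name is an assumption; the archive
path follows the directory mentioned in this thread.

    standby_mode = 'on'

    # Preferred path: stream WAL directly from the master.
    primary_conninfo = 'host=master-host port=5432 user=replication'

    # Fallback path: fetch archived WAL segments when streaming cannot
    # supply them (e.g. after they are recycled out of the master's pg_xlog).
    restore_command = 'cp /opt/postgres/9.1/archive/%f "%p"'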




See:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER

"wal_keep_segments (integer)

 Specifies the minimum number of past log file segments kept in the
pg_xlog directory, in case a standby server needs to fetch them for
streaming replication. Each segment is normally 16 megabytes. If a
standby server connected to the primary falls behind by more than
wal_keep_segments segments, the primary might remove a WAL segment still
needed by the standby, in which case the replication connection will be
terminated. (However, the standby server can recover by fetching the
segment from archive, if WAL archiving is in use.)

 This sets only the minimum number of segments retained in pg_xlog;
the system might need to retain more segments for WAL archival or to
recover from a checkpoint. If wal_keep_segments is zero (the default),
the system doesn't keep any extra segments for standby purposes, so the
number of old WAL segments available to standby servers is a function of
the location of the previous checkpoint and status of WAL archiving.
This parameter has no effect on restartpoints. This parameter can only
be set in the postgresql.conf file or on the server command line.
"

http://www.postgresql.org/docs/9.1/interactive/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based 

Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Will McCormick
Almost forgot this:


SELECT pg_current_xlog_location();

ERROR:  recovery is in progress

HINT:  WAL control functions cannot be executed during recovery.

bms=> SELECT pg_current_xlog_location();

ERROR:  recovery is in progress

HINT:  WAL control functions cannot be executed during recovery.



On Thu, Dec 17, 2015 at 11:27 AM, Adrian Klaver 
wrote:

> On 12/17/2015 07:56 AM, Will McCormick wrote:
>
>> Thanks a ton for the prompt response.
>>
>> I've read most of this but some of it was not clear until we discussed.
>>
>> Updated with WLM:
>>
>> On 12/17/2015 07:17 AM, Will McCormick wrote:
>>
>> I inherited a 9.1 replication environment
>>
>> Few basic questions that I can't find clear answers / clarifications
>> for
>> if possible:
>>
>> 3 types of replication in 9.1 I've read about from the official docs:
>>
>> 1) warm standby
>> 2) hot standby
>> 3) streaming replication
>>
>> I'm using streaming replication I believe,  the only indication I have
>> is that  there is the primary_conninfo on the standby. Is this the
>> only
>> indication?
>>
>>
>> WLM: I also see streaming replication in the logs.
>>
>> On standby:
>>
>> http://www.postgresql.org/docs/9.1/interactive/functions-admin.html
>> "
>> pg_last_xlog_receive_location()
>>
>> Get last transaction log location received and synced to disk by
>> streaming replication. While streaming replication is in progress this
>> will increase monotonically. If recovery has completed this will remain
>> static at the value of the last WAL record received and synced to disk
>> during recovery. If streaming replication is disabled, or if it has not
>> yet started, the function returns NULL."
>>
>> WLM: When I do this on the standby I get an error:
>>
>>ERROR:  recovery is in progress
>>
>>HINT:  WAL control functions cannot be ...
>>
>
> What is the full hint message?
> The functions are supposed to be able to be run while the server is in
> recovery.
>
>
>
>>
>>
>> Is it possible to get the following if using streaming replication under
>> normal operations?
>>
>> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
>> No such file or directory/
>>
>> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
>> No such file or directory/
>>
>> /LOG:  streaming replication successfully connected to primary/
>>
>> /FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
>> segment 0007000F0057 has already been removed/
>>
>>
>> Assuming above is from standby log, correct? WLM: yes
>>
>> The cp lines would seem to indicate a restore_command in the standby
>> recovery.conf, is that the case?:
>>
>> http://www.postgresql.org/docs/9.1/static/archive-recovery-settings.html
>>
>> restore_command (string) WLM: Correct
>>
>>
>> The FATAL indicates that the WAL file has already been recycled on the
>> master.
>>
>> WLM: I had read this; what confuses me a bit is:
>>
>>/cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
>>No such file or directory/
>>
>> Does Streaming replication automatically use Archived WAL files when
>> WAL XLOG files don't contain a transaction?
>>
>> We did have wal_keep_segments set to 0. I changed this to 50 but
>> want to better understand this. Especially the correlation between the
>> Archived WALs and the XLOG WALs. My guess is the difference between
>> Streaming replication and the others is very simply that Streaming
>> replication can read the XLOG WALs as well? So if all the Archived WALs
>> have been shipped and processed to the Standby then the XLOGs are
>> processed but not shipped? This meaning at a transaction level "kindof"?
>>
>
> Not really, see the section below (#STREAMING-REPLICATION) I posted
> previously. It is either/or: if streaming is set up and the standby can
> reach the master xlog directory then it will stream the WAL files from
> there. If the standby cannot access the xlog directory and if you have WAL
> archiving set up on the master and archive restore setup on the standby it
> will switch to full WAL log shipping from the archived WAL directory,
> assuming the files it needs are there.
>
>
>
>> See:
>>
>>
>> http://www.postgresql.org/docs/9.1/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER
>>
>> "wal_keep_segments (integer)
>>
>>  Specifies the minimum number of past log file segments kept in the
>> pg_xlog directory, in case a standby server needs to fetch them for
>> streaming replication. Each segment is normally 16 megabytes. If a
>> standby server connected to the primary falls behind by more than
>> wal_keep_segments segments, the primary might remove a WAL segment still
>> needed by the standby, in which case the replication connection will be
>> terminated. (However, the standby server can recover by fetching the
>> segment from archive, if WAL archiving is in use.)
>>
>>  This sets only 

Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Adrian Klaver

On 12/17/2015 07:17 AM, Will McCormick wrote:

I inherited a 9.1 replication environment

Few basic questions that I can't find clear answers / clarifications for
if possible:

3 types of replication in 9.1 I've read about from the official docs:

1) warm standby
2) hot standby
3) streaming replication

I'm using streaming replication, I believe; the only indication I have
is that primary_conninfo is set on the standby. Is this the only
indication?


On standby:

http://www.postgresql.org/docs/9.1/interactive/functions-admin.html
"
pg_last_xlog_receive_location()

Get last transaction log location received and synced to disk by 
streaming replication. While streaming replication is in progress this 
will increase monotonically. If recovery has completed this will remain 
static at the value of the last WAL record received and synced to disk 
during recovery. If streaming replication is disabled, or if it has not 
yet started, the function returns NULL."




Is it possible to get the following if using streaming replication under
normal operations?

/cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
No such file or directory/

/cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
No such file or directory/

/LOG:  streaming replication successfully connected to primary/

/FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
segment 0007000F0057 has already been removed/


Assuming above is from standby log, correct?

The cp lines would seem to indicate a restore_command in the standby 
recovery.conf, is that the case?:


http://www.postgresql.org/docs/9.1/static/archive-recovery-settings.html

restore_command (string)


The FATAL indicates that the WAL file has already been recycled on the 
master.


See:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER

"wal_keep_segments (integer)

Specifies the minimum number of past log file segments kept in the 
pg_xlog directory, in case a standby server needs to fetch them for 
streaming replication. Each segment is normally 16 megabytes. If a 
standby server connected to the primary falls behind by more than 
wal_keep_segments segments, the primary might remove a WAL segment still 
needed by the standby, in which case the replication connection will be 
terminated. (However, the standby server can recover by fetching the 
segment from archive, if WAL archiving is in use.)


This sets only the minimum number of segments retained in pg_xlog; 
the system might need to retain more segments for WAL archival or to 
recover from a checkpoint. If wal_keep_segments is zero (the default), 
the system doesn't keep any extra segments for standby purposes, so the 
number of old WAL segments available to standby servers is a function of 
the location of the previous checkpoint and status of WAL archiving. 
This parameter has no effect on restartpoints. This parameter can only 
be set in the postgresql.conf file or on the server command line.

"

http://www.postgresql.org/docs/9.1/interactive/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous 
archiving, you have to set wal_keep_segments in the master to a value 
high enough to ensure that old WAL segments are not recycled too early, 
while the standby might still need them to catch up. If the standby 
falls behind too much, it needs to be reinitialized from a new base 
backup. If you set up a WAL archive that's accessible from the standby, 
wal_keep_segments is not required as the standby can always use the 
archive to catch up."
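Pulling those settings together, the master side of a 9.1 setup might look
like this in postgresql.conf. The values, including the 50 mentioned
earlier in the thread, are examples rather than recommendations.

    wal_level = hot_standby        # needed to feed a hot standby
    max_wal_senders = 3            # allow streaming replication connections
    wal_keep_segments = 50         # keep ~50 x 16 MB of WAL in pg_xlog
                                   # for standbys that fall behind

    # Optional: archive WAL so a lagging standby can still catch up after
    # segments are recycled out of pg_xlog.
    archive_mode = on
    archive_command = 'cp %p /opt/postgres/9.1/archive/%f'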





My understanding is that warm standby and hot standby do log shipping
and there is a greater window for transactions not to be sent to the
standby because a WAL XLOG segment must be filled first.


Hot versus warm standby refers to whether it is possible to run read-only 
queries on the standby: possible in the first case, not in the second.





Whereas Streaming replication basically sends at the transaction level?


The difference you are looking for is log shipping versus streaming, 
where log shipping moves complete WAL files and streaming delivers the 
same WAL content as it is generated.


See here for more detail:

http://www.postgresql.org/docs/9.1/interactive/high-availability.html




I'm sure this is somewhat misinformed!


Thanks,


Will




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Will McCormick
Thanks a ton for the prompt response.

I've read most of this but some of it was not clear until we discussed.

Updated with WLM:

On 12/17/2015 07:17 AM, Will McCormick wrote:
>
> I inherited a 9.1 replication environment
>
> Few basic questions that I can't find clear answers / clarifications for
> if possible:
>
> 3 types of replication in 9.1 I've read about from the official docs:
>
> 1) warm standby
> 2) hot standby
> 3) streaming replication
>
> I'm using streaming replication I believe,  the only indication I have
> is that  there is the primary_conninfo on the standby. Is this the only
> indication?
>

WLM: I also see streaming replication in the logs.

On standby:

http://www.postgresql.org/docs/9.1/interactive/functions-admin.html
"
pg_last_xlog_receive_location()

Get last transaction log location received and synced to disk by streaming
replication. While streaming replication is in progress this will increase
monotonically. If recovery has completed this will remain static at the
value of the last WAL record received and synced to disk during recovery.
If streaming replication is disabled, or if it has not yet started, the
function returns NULL."

WLM: When I do this on the standby I get an error:

  ERROR:  recovery is in progress

  HINT:  WAL control functions cannot be ...


> Is it possible to get the following if using streaming replication under
> normal operations?
>
> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
> No such file or directory/
>
> /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
> No such file or directory/
>
> /LOG:  streaming replication successfully connected to primary/
>
> /FATAL:  could not receive data from WAL stream: FATAL:  requested WAL
> segment 0007000F0057 has already been removed/
>

Assuming above is from standby log, correct? WLM: yes

The cp lines would seem to indicate a restore_command in the standby
recovery.conf, is that the case?:

http://www.postgresql.org/docs/9.1/static/archive-recovery-settings.html

restore_command (string) WLM: Correct


The FATAL indicates that the WAL file has already been recycled on the
master.

WLM: I had read this; what confuses me a bit is:

  /cp: cannot stat `/opt/postgres/9.1/archive/0007000F0057':
  No such file or directory/

   Does Streaming replication automatically use Archived WAL files when WAL
XLOG files don't contain a transaction?

   We did have wal_keep_segments set to 0. I changed this to 50 but want to
better understand this. Especially the correlation between the Archived
WALs and the XLOG WALs. My guess is the difference between Streaming
replication and the others is very simply that Streaming replication can
read the XLOG WALs as well? So if all the Archived WALs have been shipped
and processed to the Standby then the XLOGs are processed but not shipped?
This meaning at a transaction level "kindof"?

See:

http://www.postgresql.org/docs/9.1/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-MASTER

"wal_keep_segments (integer)

Specifies the minimum number of past log file segments kept in the
pg_xlog directory, in case a standby server needs to fetch them for
streaming replication. Each segment is normally 16 megabytes. If a standby
server connected to the primary falls behind by more than wal_keep_segments
segments, the primary might remove a WAL segment still needed by the
standby, in which case the replication connection will be terminated.
(However, the standby server can recover by fetching the segment from
archive, if WAL archiving is in use.)

This sets only the minimum number of segments retained in pg_xlog; the
system might need to retain more segments for WAL archival or to recover
from a checkpoint. If wal_keep_segments is zero (the default), the system
doesn't keep any extra segments for standby purposes, so the number of old
WAL segments available to standby servers is a function of the location of
the previous checkpoint and status of WAL archiving. This parameter has no
effect on restartpoints. This parameter can only be set in the
postgresql.conf file or on the server command line.
"

http://www.postgresql.org/docs/9.1/interactive/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous archiving,
you have to set wal_keep_segments in the master to a value high enough to
ensure that old WAL segments are not recycled too early, while the standby
might still need them to catch up. If the standby falls behind too much, it
needs to be reinitialized from a new base backup. If you set up a WAL
archive that's accessible from the standby, wal_keep_segments is not
required as the standby can always use the archive to catch up."


>
> My understanding is that warm standby and hot standby do log shipping
> and there is a greater window for transactions not to be sent to the
> standby because WAL XLOG must be filled.
>

Hot versus warm standby refer to 

Re: [GENERAL] - PostgreSQL Replication Types

2015-12-17 Thread Adrian Klaver

On 12/17/2015 08:43 AM, Will McCormick wrote:

Almost forgot this:


SELECT pg_current_xlog_location();


I was not paying attention earlier; that should be:

pg_last_xlog_receive_location() 

from:

http://www.postgresql.org/docs/9.1/interactive/functions-admin.html

Table 9-58. Recovery Information Functions
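On a 9.1 standby, the recovery information functions from that table should
work even while the WAL control functions fail, e.g.:

    -- Usable on the standby during recovery:
    SELECT pg_is_in_recovery();              -- true on a standby
    SELECT pg_last_xlog_receive_location();  -- last WAL location received and synced
    SELECT pg_last_xlog_replay_location();   -- last WAL location replayed

    -- pg_current_xlog_location() is a WAL control function and only works
    -- on the master, which is why it raises "recovery is in progress" below.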



ERROR:  recovery is in progress

HINT:  WAL control functions cannot be executed during recovery.

bms=> SELECT pg_current_xlog_location();

ERROR:  recovery is in progress

HINT:  WAL control functions cannot be executed during recovery.






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Postgresql replication assistance

2012-09-12 Thread Albe Laurenz
Gustav Potgieter wrote:
 Hope you can assist and that I am posting to the right forum.

Sending the question twice with the same wording does not
make it clearer...

 We currently have multiple Postgresql 9 instances running with warm
standby, and the replication works
 wonderfully.
 
 The problem is the following: we take the slave database out of
recovery and it works perfectly, but
 when we create a new database and set up streaming replication to point
to this database it is read-only
 and never starts up.
 
 I created a test database and it did replicate across, but I could
only see it once I removed
 recovery.conf and re-started obviously not in recovery anymore, I
attached the logs at the bottom FYI
 
 Is this even possible, to re-create a streaming database again from an
ex slave database with a base
 backup and simply re-creating recovery.conf before startup?

If I understand you correctly, there are three computers involved:
An original master, an original slave that is taken out of recovery
(promoted?), and a new slave that is set up using the original slave
as master.
Right?

That should work if you do it right.

You quote from a number of logs, but to me it is not clear which log
belongs to which server.

Let's say the original master is A, the original slave and new master
is B and the new slave is C.

It would be interesting to see B's log from when replication is set up
and starts working, B's log from the time it is promoted to master,
and C's log from the time when you try to set it up as B's slave.

Do you share WAL archives between the machines?
If yes, how?
How exactly did you promote B?

Yours,
Laurenz Albe




[GENERAL] Postgresql replication assistance

2012-09-11 Thread Gustav Potgieter
Hi All,

Hope you can assist and that I am posting to the right forum.

We currently have multiple PostgreSQL 9 instances running with warm standby, 
and the replication works wonderfully.

The problem is the following, we take the slave database out of recovery and it 
works perfectly, but when we create a new database and setup streaming 
replication to point to this database it is read only and never starts up.

I created a test database and it did replicate across, but I could only see it 
once I removed recovery.conf and re-started obviously not in recovery anymore, 
I attached the logs at the bottom FYI 

Is this even possible, to re-create a streaming database again from an ex slave 
database with a base backup and simply re-creating recovery.conf before startup?

Thank you,
Gustav

##

LOGS ON SLAVE
LOG:  entering standby mode
LOG:  record with zero length at 1D/3C000634
LOG:  streaming replication successfully connected to primary
LOG:  redo starts at 1D/3C000634
FATAL:  the database system is starting up

##
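
The positions in the log above, such as 1D/3C000634, are WAL locations (LSNs): a 64-bit byte position written as two hex halves. Comparing two of them gives an idea of replication lag in bytes. The helper below is a sketch for illustration, not an official PostgreSQL API, and the subtraction should be treated as approximate on releases this old (pre-9.3 servers left some bytes unused at segment boundaries):

```python
# Toy helpers for reading WAL locations ("LSNs") like the ones in the
# server log.  Illustrative sketch only, not a PostgreSQL API.

def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '1D/3C000634' to a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def lag_bytes(master_lsn: str, standby_lsn: str) -> int:
    """Approximate bytes of WAL the standby still has to receive/replay."""
    return lsn_to_int(master_lsn) - lsn_to_int(standby_lsn)

print(lag_bytes("1D/47002C04", "1D/47002BC8"))  # 60
```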

recovery.conf
standby_mode = 'on'
  primary_conninfo = 'host=emr-db-1 port=5432 user=postgres password=postgres'
  trigger_file = '/tmp/pgsql.trigger'

##

and postgresql.conf 
# - Standby Servers -
hot_standby = on# on allows queries during recovery
# (change requires restart)

##

Solaris 10# postmaster --version
postgres (PostgreSQL) 9.0.9

##

Solaris 10# uname -a
SunOS database-machine-1 4 5.10 Generic_142910-17 i86pc i386 i86pc


#

The database did not recover when we created a trigger file, and I had to 
manually remove recovery.conf and restart the database before it started up 
and I was able to run live queries again (obviously not in recovery anymore).
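
Conceptually, the trigger_file mechanism from recovery.conf behaves like the toy loop below: the startup process polls for the file and, once it appears, ends recovery and opens the database read-write. This is an illustration only, not the server's actual logic; in particular the real server must also have reached a consistent recovery point first, which is exactly what the "WAL ends before consistent recovery point" error in the log below is about:

```python
# Toy sketch of the recovery.conf trigger-file check.  Not the server's
# actual code: the real startup process also replays WAL and refuses to
# promote before reaching a consistent recovery point.
import os
import time

def wait_for_trigger(trigger_path: str, poll_seconds: float = 0.1,
                     max_polls: int = 50) -> bool:
    """Return True once the trigger file shows up, False on timeout."""
    for _ in range(max_polls):
        if os.path.exists(trigger_path):
            return True   # leave standby mode, open read-write
        time.sleep(poll_seconds)
    return False          # still a standby

if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        trig = os.path.join(d, "pgsql.trigger")
        open(trig, "w").close()            # simulate: touch /tmp/pgsql.trigger
        print(wait_for_trigger(trig))      # True
```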

LOG:  trigger file found: /tmp/pgsql.trigger
FATAL:  terminating walreceiver process due to administrator command
LOG:  redo done at 1D/47002BC8
LOG:  last completed transaction was at log time 2012-09-10 18:33:04.734928+02
FATAL:  WAL ends before consistent recovery point
LOG:  startup process (PID 28326) exited with exit code 1
LOG:  terminating any other active server processes
LOG:  database system was interrupted while in recovery at log time 2012-09-10 
18:32:21 SAST
HINT:  If this has occurred more than once some data might be corrupted and you 
might need to choose an earlier recovery target.
LOG:  entering standby mode
LOG:  redo starts at 1D/4520
LOG:  invalid record length at 1D/47002C04
LOG:  streaming replication successfully connected to primary
FATAL:  the database system is starting up




[GENERAL] Postgresql Replication Comparison Required

2011-12-29 Thread saurabh gupta
I am doing a POC on PostgreSQL replication. I am using the latest version of
PostgreSQL, i.e. 9.1. There are multiple replication solutions available on
the market (PGCluster, Pgpool-II, Slony-I). PostgreSQL also provides
built-in replication solutions (streaming replication, warm standby and hot
standby). I am confused about which solution is best for the financial
application for which I am doing the POC. The application will write around
160 million records with a row size of 2.5 KB to the database. My question
is, for the following scenarios, which replication solution will be suitable:

If I would require replication for backup purposes only
If I would require to scale reads
If I would require high availability and consistency

Also it will be very helpful if you can share performance figures or
experience with PostgreSQL replication solutions.

Thanks


Re: [GENERAL] Postgresql Replication Comparison Required

2011-12-29 Thread Gabriele Bartolini

Hello,

in general my advice would be to stick with native features, and therefore 
use either streaming replication or, alternatively, log-shipping 
replication. You might need some tools to help you manage the cluster, 
client routing and balancing, but I suggest you look into this later.


On Thu, 29 Dec 2011 16:55:00 +0530, saurabh gupta 
saurabh@gmail.com wrote:

If I would require replication for backup purpose only 


For disaster recovery, you need physical base backups with continuous 
archiving 
(http://www.postgresql.org/docs/9.1/interactive/continuous-archiving.html)
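
On the master, continuous archiving in 9.1 comes down to a few postgresql.conf settings. A sketch; the archive directory path is an example only:

```
# postgresql.conf on the master -- continuous archiving sketch.
# The archive directory is an example path only.
wal_level = archive       # or 'hot_standby' if standbys should accept queries
archive_mode = on         # change requires restart
archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'
```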



 If I would require to scale the reads 


You need Hot Standby here (and you might need to route read only 
applications to one of the slaves somehow - but you can worry about this 
later).



 If I would require High Avaliability and Consistency 


Streaming replication. With 9.1 you also have Synchronous Streaming 
Replication which means you have zero data loss of committed 
transactions within your PostgreSQL cluster. Another useful tool you 
might want to look into is repmgr (www.repmgr.org).
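
On the master side, 9.1 synchronous replication is enabled with a couple of postgresql.conf settings. A sketch; the standby name is an assumption and must match the application_name set in that standby's primary_conninfo:

```
# postgresql.conf on the master -- synchronous replication sketch (9.1).
# 'standby1' is an assumed name; it must match the application_name in
# the standby's primary_conninfo.
synchronous_standby_names = 'standby1'
synchronous_commit = on   # commits wait for the synchronous standby
```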



 Also it will be very helpful if you can share the performance or
experience with PostgreSQL replication solutions. 


I wish I could help you more here, but most of our professional work is 
performed under strict NDAs.


This documentation section is also interesting and useful: 
http://www.postgresql.org/docs/9.1/interactive/high-availability.html


I hope this helps.

Cheers,
Gabriele
--
 Gabriele Bartolini - 2ndQuadrant Italia
 PostgreSQL Training, Services and Support
 gabriele.bartol...@2ndquadrant.it - www.2ndQuadrant.it

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-31 Thread Vivek Khera


On Mar 25, 2008, at 4:28 PM, Jeff Davis wrote:
This obviously does not work in real time, but it may be useful. It does
not require a lot of additional space to do this because of the ZFS
copy-on-write implementation.


But what benefit does it give you if you're pounding on the same set  
of physical disks?  You might as well run it on the original, since  
your limit is the disk I/O.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-26 Thread Simon Riggs
On Tue, 2008-03-25 at 17:53 -0600, Keaton Adams wrote:

 That is an interesting question.  If our organization were to help fund the
 development of such a feature, would that be something taken into
 consideration by the development team?

Yes. Many of my major projects have been funded that way. I have funding
now for some projects, but not this one. Most of my smaller
contributions and advocacy work are not made for direct reward.

I'm trying to get in touch with Florian to discuss working together on
this.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] PostgreSQL Replication with read-only

2008-03-26 Thread Keaton Adams

Our organization is looking for a hot-standby option for PostgreSQL that
uses the WAL (transaction) data to keep the standby current and also allows
the standby to be read-only accessible for reporting.  We have implemented
WAL shipping through a set of scripts we developed and that works well to
have a standby DB on the ready in case we need to fail over, but we are
looking to increase the value of the standby server by making it available
for queries.  Because of the complexities of our environment using a
table/trigger-based replication method such as Slony won't work well.

It would be great if there was a solution (Open Source or Commercial) that
worked in a similar manner as Oracle Active Data Guard:

"Oracle Active Data Guard enables a physical standby database to be open for
read-only access - for reporting, simple or complex queries - while changes
from the production database are being applied to it. This means any
operation that requires up-to-date read-only access can be offloaded to the
replica, enhancing and protecting the performance of the production
database."

"All queries reading from the physical replica execute in real-time, and
return current results.  A Data Guard configuration consists of one
production (or primary) database and up to nine standby databases.  A
standby database is initially created from a backup copy of the primary
database.  Once created, Data Guard automatically maintains the standby
database as a synchronized copy of the primary database by transmitting
primary database redo data to the standby system and then applying the redo
data to the standby database."

Does anyone know of such a solution for PostgreSQL?

Thanks,

Keaton






Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-26 Thread Dimitri Fontaine
Le mercredi 26 mars 2008, Greg Smith a écrit :
 (My favorite acronym is TLA)

Hehe :)
I'd vote for A...
-- 
dim


signature.asc
Description: This is a digitally signed message part.


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-26 Thread Andrew Sullivan
On Wed, Mar 26, 2008 at 01:03:34AM -0400, Greg Smith wrote:

 against.  People who are using the current warm-standby code are already 
 grappling with issues like how to coordinate master/slave failover 
 (including my second favorite acronym, STONITH for shoot the other node 
 in the head).  I don't expect handling that sort of thing will ever be 
 integrated into the PostgreSQL database core.  

Note that most other database products don't integrate it in their core
either.  They package separate tools for it, and sell it as a single system,
but you can often buy the separate tools independently.  Oracle's RAC is an
exception, but it also works completely differently than any of this.

A


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-26 Thread Chris Browne
[EMAIL PROTECTED] (Keaton Adams) writes:
 That is an interesting question.  If our organization were to help fund the
 development of such a feature, would that be something taken into
 consideration by the development team?

I seem to recall there being a relevant Google Summer of Code project
about this, last year.

http://code.google.com/soc/2007/postgres/appinfo.html?csaid=6545828A8197EBC6

http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php

I do not recall how far it got.  It obviously didn't make it into 8.3
;-)!
-- 
(format nil [EMAIL PROTECTED] cbbrowne acm.org)
http://cbbrowne.com/info/rdbms.html
Why are they called apartments, when they're all stuck together? 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-26 Thread Alvaro Herrera
Chris Browne wrote:

 I seem to recall there being a relevant Google Summer of Code project
 about this, last year.

 I do not recall how far it got.  It obviously didn't make it into 8.3
 ;-)!

Some parts of it did -- for example we got read-only transactions
which were a step towards that goal.  (The point here is that a hot
standby needs to be able to execute readonly transactions.)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread salman



Keaton Adams wrote:

Our organization is looking for a hot-standby option for PostgreSQL that
uses the WAL (transaction) data to keep the standby current and also allows
the standby to be read-only accessible for reporting.  We have implemented
WAL shipping through a set of scripts we developed and that works well to
have a standby DB on the ready in case we need to fail over, but we are
looking to increase the value of the standby server by making it available
for queries.  Because of the complexities of our environment using a
table/trigger-based replication method such as Slony won't work well.

It would be great if there was a solution (Open Source or Commercial) that
worked in a similar manner as Oracle Active Data Guard:

"Oracle Active Data Guard enables a physical standby database to be open for
read-only access - for reporting, simple or complex queries - while changes
from the production database are being applied to it. This means any
operation that requires up-to-date read-only access can be offloaded to the
replica, enhancing and protecting the performance of the production
database."


"All queries reading from the physical replica execute in real-time, and
return current results.  A Data Guard configuration consists of one
production (or primary) database and up to nine standby databases.  A
standby database is initially created from a backup copy of the primary
database.  Once created, Data Guard automatically maintains the standby
database as a synchronized copy of the primary database by transmitting
primary database redo data to the standby system and then applying the redo
data to the standby database."

Does anyone know of such a solution for PostgreSQL?

Thanks,

Keaton



IIRC, it was mentioned previously in one posting that this is a TODO for a 
future version of Postgres, but not something that's expected soon.


Someone please correct me if I'm wrong.

-salman

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Richard Broersma
On Tue, Mar 25, 2008 at 1:17 PM, salman [EMAIL PROTECTED] wrote:
 IIRC, it was mentioned previously in one posting that this is a TODO for a
 future version of postgres but not something that's expected soon.

 Someone please correct me if I'm wrong.

This is what I saw on the TODO list:
Write-Ahead Log
  * Allow WAL traffic to be streamed to another server for stand-by replication


-- 
Regards,
Richard Broersma Jr.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Keaton Adams
But will that stand-by replication provide for a read-only slave?


On 3/25/08 2:26 PM, Richard Broersma [EMAIL PROTECTED] wrote:

 On Tue, Mar 25, 2008 at 1:17 PM, salman [EMAIL PROTECTED] wrote:
 IIRC, it was mentioned previously in one posting that this is a TODO for a
 future version of postgres but not something that's expected soon.
 
 Someone please correct me if I'm wrong.
 
 This is what I saw on the TODO list:
 Write-Ahead Log
   * Allow WAL traffic to be streamed to another server for stand-by
 replication
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Jeff Davis
On Tue, 2008-03-25 at 14:11 -0600, Keaton Adams wrote:
 “All queries reading from the physical replica execute in real-time,
 and return current results.  A Data Guard configuration consists of
 one production (or primary) database and up to nine standby
 databases.  A standby database is initially created from a backup copy
 of the primary database.  Once created, Data Guard automatically
 maintains the standby database as a synchronized copy of the primary
 database by transmitting primary database redo data to the standby
 system and then applying the redo data to the standby database.”
 
 Does anyone know of such a solution for PostgreSQL?
 

There has been work in this area, but unfortunately this feature has not
been completed yet. Many people would like this feature.

I have heard of a strategy used by some PostgreSQL users in which they
use something like ZFS (which allows fast filesystem snapshots/clones)
to quickly clone the data in a separate area, and then bring up the
database on the clone for reporting purposes.

This obviously does not work in real time, but it may be useful. It does
not require a lot of additional space to do this because of the ZFS
copy-on-write implementation.
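
In outline, that ZFS approach looks like the sketch below. Pool and dataset names are assumptions, and the cloned instance has to be started as a second postmaster on its own port:

```
# zfs snapshot tank/pgdata@report             (instant, copy-on-write)
# zfs clone tank/pgdata@report tank/pgreport
# pg_ctl -D /tank/pgreport -o "-p 5433" start (second instance for reporting)
```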

Regards,
Jeff Davis


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Keaton Adams

It is close, but has limitations that will be problematic for our
environment, such as:

Replicator will not replicate the schema. You must restore your schema to the
slaves from the master before you begin replication.

Replicator can only replicate one database. If you have multiple databases you
can either initialize clusters for each database or move all databases into a
single database using schemas/namespaces.

It is possible to add and drop columns to replicated tables within Replicator.
This type of change to your table structure will require a full sync and
therefore is best done in batch or after hours.

Thanks for the reply,

Keaton 



On 3/25/08 2:18 PM, Richard Broersma [EMAIL PROTECTED] wrote:

 On Tue, Mar 25, 2008 at 1:11 PM, Keaton Adams [EMAIL PROTECTED] wrote:
 Our organization is looking for a hot-standby option for PostgreSQL that
 uses the WAL (transaction) data to keep the standby current and also allows
 the standby to be read-only accessible for reporting.  We have implemented
 WAL shipping through a set of scripts we developed and that works well to
 have a standby DB on the ready in case we need to fail over, but we are
 looking to increase the value of the standby server by making it available
 for queries.  Because of the complexities of our environment using a
 table/trigger based replication method such as Slony won't work well.
 
 It would be great if there was a solution (Open Source or Commercial) that
 worked in a similar manner as Oracle Active Data Guard:
 
 Does anyone know of such a solution for PostgreSQL?
 
 I think this does what you want.
 
 http://commandprompt.com/products/mammothreplicator/



Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Simon Riggs
On Tue, 2008-03-25 at 14:11 -0600, Keaton Adams wrote:

 “Oracle Active Data Guard enables a physical standby database to be
 open for read-only access – for reporting, simple or complex queries –
 while changes from the production database are being applied to it.
 This means any operation that requires up-to-date read-only access can
 be offloaded to the replica, enhancing and protecting the performance
 of the production database.”  
 
 “All queries reading from the physical replica execute in real-time,
 and return current results.  A Data Guard configuration consists of
 one production (or primary) database and up to nine standby
 databases.  A standby database is initially created from a backup copy
 of the primary database.  Once created, Data Guard automatically
 maintains the standby database as a synchronized copy of the primary
 database by transmitting primary database redo data to the standby
 system and then applying the redo data to the standby database.”
 
 Does anyone know of such a solution for PostgreSQL?

Some funding would help that move forwards. If you or others would
consider that, it would help, even if just to provide the seed for
additional contributors.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Richard Broersma
On Tue, Mar 25, 2008 at 3:08 PM, Simon Riggs [EMAIL PROTECTED] wrote:
 On Tue, 2008-03-25 at 14:11 -0600, Keaton Adams wrote:
 Some funding would help that move forwards. If you or others would
 consider that, it would help, even if just to provide the seed for
 additional contributors.

That is an interesting thought, is it possible to earmark donations
for a specific (set of) todo(s)?


-- 
Regards,
Richard Broersma Jr.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Richard Broersma
On Tue, Mar 25, 2008 at 1:11 PM, Keaton Adams [EMAIL PROTECTED] wrote:
 Our organization is looking for a hot-standby option for PostgreSQL that
 uses the WAL (transaction) data to keep the standby current and also allows
 the standby to be read-only accessible for reporting.  We have implemented
 WAL shipping through a set of scripts we developed and that works well to
 have a standby DB on the ready in case we need to fail over, but we are
 looking to increase the value of the standby server by making it available
 for queries.  Because of the complexities of our environment using a
 table/trigger based replication method such as Slony won't work well.

 It would be great if there was a solution (Open Source or Commercial) that
 worked in a similar manner as Oracle Active Data Guard:

 Does anyone know of such a solution for PostgreSQL?

I think this does what you want.

http://commandprompt.com/products/mammothreplicator/
-- 
Regards,
Richard Broersma Jr.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Keaton Adams
That is an interesting question.  If our organization were to help fund the
development of such a feature, would that be something taken into
consideration by the development team?

-Keaton



On 3/25/08 4:32 PM, Richard Broersma [EMAIL PROTECTED] wrote:

 On Tue, Mar 25, 2008 at 3:08 PM, Simon Riggs [EMAIL PROTECTED] wrote:
 On Tue, 2008-03-25 at 14:11 -0600, Keaton Adams wrote:
 Some funding would help that move forwards. If you or others would
 consider that, it would help, even if just to provide the seed for
 additional contributors.
 
 That is an interesting thought, is it possible to earmark donations
 for a specific (set of) todo(s)?
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Tatsuo Ishii
A similar case has already happened.

For example, I proposed to implement the WITH RECURSIVE clause, and the
work was supported by Sumitomo Electric Information Systems Co.,
Ltd. (http://www.sei-info.co.jp/) and SRA OSS, Inc. Japan
(http://www.sraoss.co.jp).
--
Tatsuo Ishii
SRA OSS, Inc. Japan

 That is an interesting question.  If our organization were to help fund the
 development of such a feature, would that be something taken into
 consideration by the development team?
 
 -Keaton
 
 
 
 On 3/25/08 4:32 PM, Richard Broersma [EMAIL PROTECTED] wrote:
 
  On Tue, Mar 25, 2008 at 3:08 PM, Simon Riggs [EMAIL PROTECTED] wrote:
  On Tue, 2008-03-25 at 14:11 -0600, Keaton Adams wrote:
  Some funding would help that move forwards. If you or others would
  consider that, it would help, even if just to provide the seed for
  additional contributors.
  
  That is an interesting thought, is it possible to earmark donations
  for a specific (set of) todo(s)?
  
 
 
 -- 
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Jonathan Bond-Caron
I'm in the same boat, looking for master-slave replication with 1 master and 2
'standby' read-only servers (one would get promoted to master in case of
failure).

I recently read about WAL here:
http://developer.postgresql.org/pgdocs/postgres/warm-standby.html

The standby server is not available for access, since it is continually
performing recovery processing.

PostgreSQL does not provide the system software required to identify a
failure on the primary and notify the standby system and then the standby
database server. Many such tools exist and are well integrated with other
aspects required for successful failover, such as IP address migration.

In short, there's not much automation magic at the moment, and it doesn't seem
like what you're looking for.
Pgpool-II might be the best alternative. 

I know very little about postgreSQL internals but it would be great if:
- WAL files could be applied while the standby server is operational / allow
read-only queries
- Allow master server to send WAL files to standby servers / * WAL traffic
to be streamed to another server
- Allow master server to send list of all known standby servers
- Allow standby server to check if master server is alive and promote itself
as master (would need to ask / make sure other standby servers do not try to
promote themselves at the same time)

Then, either way, you can use a pool to query the read-only standby servers. 

As I was writing this out, I thought this would make a great SOC project,
but then found it already exists!
http://code.google.com/soc/2007/postgres/appinfo.html?csaid=6545828A8197EBC6

Great news, I'd be happy to pitch in any time to help design a solution like
this :)

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Keaton Adams
Sent: March 25, 2008 4:29 PM
To: Richard Broersma; salman
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] PostgreSQL Replication with read-only access to
standby DB

But will that stand-by replication provide for a read-only slave?


On 3/25/08 2:26 PM, Richard Broersma [EMAIL PROTECTED] wrote:

 On Tue, Mar 25, 2008 at 1:17 PM, salman [EMAIL PROTECTED]
wrote:
 IIRC, it was mentioned previously in one posting that this is a TODO for a
 future version of postgres but not something that's expected soon.
 
 Someone please correct me if I'm wrong.
 
 This is what I saw on the TODO list:
 Write-Ahead Log
   * Allow WAL traffic to be streamed to another server for stand-by
 replication
 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL Replication with read-only access to standby DB

2008-03-25 Thread Greg Smith

On Tue, 25 Mar 2008, Jonathan Bond-Caron wrote:


I know very little about postgreSQL internals but it would be great if:
- WAL files could be applied while the standby server is operational / allow
read-only queries


This is the part that requires modifying PostgreSQL, and that progress was 
made toward by Florian's GSoC project last summer.



- Allow master server to send WAL files to standby servers / * WAL traffic
to be streamed to another server
- Allow master server to send list of all known standby servers
- Allow standby server to check if master server is alive and promote itself
as master (would need to ask / make sure other standby servers do not try
promote themselves at the same time)


These parts you could build right now, except that there's not too much 
value to more than one standby if you're not using them to execute queries 
against.  People who are using the current warm-standby code are already 
grappling with issues like how to coordinate master/slave failover 
(including my second favorite acronym, STONITH, for "shoot the other node
in the head").  I don't expect handling that sort of thing will ever be
integrated into the PostgreSQL database core.  What is happening instead 
is that the appropriate interfaces to allow building higher-level tools 
are being designed and made available.


(My favorite acronym is TLA)

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [GENERAL] PostGreSQL Replication

2007-07-16 Thread Gabriele

 Sounds like something you'd want to handle within the application

I believe I will try to follow this path.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] PostGreSQL Replication

2007-07-10 Thread Andrew Sullivan
On Sat, Jul 07, 2007 at 05:16:56AM -0700, Gabriele wrote:
 Let's say we have a server which feeds data to multiple slaves, usually using
 direct online connections. Now, we may want to allow those clients to
 sync the data to a local replica, work offline and then resync the
 data back to the server. What is the easiest way to approach this
 problem?

Write yourself a new system that does it?  To my knowledge, nobody
has yet built a production-ready system that can do this sort of
multi-master merge-back replication.  I can think of possible ways
you could bodge this up using Slony-I and log shipping, but I don't
think it'd be pretty.
 
 resources on this topic? I've found people talking about Slony-I; what
 can you tell me about this tool? Are there other tools?

You can find out more about Slony-I at http://www.slony.info.

 Moreover, in addition to the previous question, let's take the case where I
 have multiple independent and separate servers and I want to
 upload the information from these servers (actually not everything but a
 subset of the data) to another server (here we have unidirectional
 communication, as opposed to the previous bidirectional communication, and an
 asymmetric structure as opposed to the previous exact replica). What are the
 tools, documentation, best practices or other resources to approach
 this problem?

This also sounds like a multimaster problem, and again I know of no
system that currently supports what you're talking about.  This is
easier to bodge together under Slony using views and such like, but
it's still not trivial.  If you wish to discuss how to do this with
Slony, I suggest taking it up on that list (available from the site
mentioned above).

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
--Alexander Hamilton



Re: [GENERAL] PostGreSQL Replication

2007-07-10 Thread Guido Neitzer

On 07.07.2007, at 06:16, Gabriele wrote:


Let's say we have a server which feeds data to multiple slaves, usually using
direct online connections. Now, we may want to allow those clients to
sync the data to a local replica, work offline and then resync the
data back to the server. What is the easiest way to approach this
problem?


I don't know of anything for PostgreSQL that does that.

FrontBase [1] might have something like that, if I understood Geert
correctly. But I'm not sure about that; maybe ask them directly.
Their DBMS is free but not open source, though.


cug

[1] http://www.frontbase.com



Re: [GENERAL] PostGreSQL Replication

2007-07-10 Thread Adrian von Bidder
On Saturday 07 July 2007 14.16:56 Gabriele wrote:
 I know this is a delicate topic which must be approached cautiously.

 Let's say we have a server which feeds data to multiple slaves, usually using
 direct online connections. Now, we may want to allow those clients to
 sync the data to a local replica, work offline and then resync the
 data back to the server. What is the easiest way to approach this
 problem?

Sounds like something you'd want to handle within the application, not at 
the database layer.  The application can know much more about how data is 
modified and how it can be modified safely so a later merge operation won't 
fail, or how merge conflicts (however these are defined) can be safely 
handled.  The database must assume that a user can modify anything during 
the offline period, and can't assume anything about what to do on merge 
operations.

cheers
-- vbi

-- 
The days just prior to marriage are like a snappy introduction to a
tedious book.




[GENERAL] PostGreSQL Replication

2007-07-09 Thread Gabriele
I know this is a delicate topic which must be approached cautiously.

Let's say we have a server which feeds data to multiple slaves, usually using
direct online connections. Now, we may want to allow those clients to
sync the data to a local replica, work offline and then resync the
data back to the server. What is the easiest way to approach this
problem?

Can you point me to tools, documentation, technical references or other
resources on this topic? I've found people talking about Slony-I; what
can you tell me about this tool? Are there other tools?

Moreover, in addition to the previous question, let's take the case where I
have multiple independent and separate servers and I want to
upload the information from these servers (actually not everything but a
subset of the data) to another server (here we have unidirectional
communication, as opposed to the previous bidirectional communication, and an
asymmetric structure as opposed to the previous exact replica). What are the
tools, documentation, best practices or other resources to approach
this problem?

Thank you!




Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-02 Thread Alexander Staubo

On 5/2/07, Jamie Deppeler [EMAIL PROTECTED] wrote:

You might want to check pgcluster out
http://pgcluster.projects.postgresql.org/ which does both.


That page will give you the impression that this project is dead and
abandoned -- the last update is from early 2005. PGCluster does seem
to be active on PgFoundry:

 http://pgfoundry.org/projects/pgcluster/

Mind you, PGCluster is synchronous replication. For updates it scales
even worse than Slony, since every update has to wait for all masters
to complete.

Alexander.



Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-02 Thread Alvaro Herrera
Alexander Staubo wrote:
 On 5/2/07, Jamie Deppeler [EMAIL PROTECTED] wrote:
 You might want to check pgcluster out
 http://pgcluster.projects.postgresql.org/ which does both.
 
 That page will give you the impression that this project is dead and
 abandoned -- the last update is from early 2005. PGCluster does seem
 to be active on PgFoundry:
 
  http://pgfoundry.org/projects/pgcluster/

Huh, that page (the one Jamie linked to) _is_ pgfoundry's project page.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-02 Thread Alexander Staubo

On 5/2/07, Alvaro Herrera [EMAIL PROTECTED] wrote:

Alexander Staubo wrote:
 On 5/2/07, Jamie Deppeler [EMAIL PROTECTED] wrote:
 You might want to check pgcluster out
 http://pgcluster.projects.postgresql.org/ which does both.

 That page will give you the impression that this project is dead and
 abandoned -- the last update is from early 2005. PGCluster does seem
 to be active on PgFoundry:

  http://pgfoundry.org/projects/pgcluster/

Huh, that page (the one Jamie linked to) _is_ pgfoundry's project page.


I don't know what's going on here, but:

$ curl -s http://pgcluster.projects.postgresql.org/ | md5
dad4aaf6659f1a65f228cee1ec71eba4
$ curl -s http://pgfoundry.org/projects/pgcluster/ | md5
3c6645ab3bbffa4e9f77a1ccab6a663a

They're different pages. The first one is horribly out of date;
unfortunately, it is (for me) the first hit on Google, whereas the
PgFoundry project page is the third.

Alexander.



Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-02 Thread Alvaro Herrera
Alexander Staubo wrote:
 On 5/2/07, Alvaro Herrera [EMAIL PROTECTED] wrote:
 Alexander Staubo wrote:
  On 5/2/07, Jamie Deppeler [EMAIL PROTECTED] wrote:
  You might want to check pgcluster out
  http://pgcluster.projects.postgresql.org/ which does both.
 
  That page will give you the impression that this project is dead and
  abandoned -- the last update is from early 2005. PGCluster does seem
  to be active on PgFoundry:
 
   http://pgfoundry.org/projects/pgcluster/
 
 Huh, that page (the one Jamie linked to) _is_ pgfoundry's project page.
 
 I don't know what's going on here, but:
 
 $ curl -s http://pgcluster.projects.postgresql.org/ | md5
 dad4aaf6659f1a65f228cee1ec71eba4
 $ curl -s http://pgfoundry.org/projects/pgcluster/ | md5
 3c6645ab3bbffa4e9f77a1ccab6a663a
 
 They're different pages. The first one is horribly out of date;
 unfortunately, it is (for me) the first hit on Google, whereas the
 PgFoundry project page is the third.

Sure, they are different pages, but the first one is supposed to be
maintained by the pgcluster guys.  It is linked from the second page,
and it is hosted in pgfoundry.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-02 Thread Alexander Staubo

On 5/2/07, Alvaro Herrera [EMAIL PROTECTED] wrote:

 They're different pages. The first one is horribly out of date;
 unfortunately, it is (for me) the first hit on Google, whereas the
 PgFoundry project page is the third.

Sure, they are different pages, but the first one is supposed to be
maintained by the pgcluster guys.  It is linked from the second page,
and it is hosted in pgfoundry.


My entire point was that the first one *isn't* maintained. So avoid it
if you're looking for up-to-date information about this project.

Alexander.



Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-02 Thread Alvaro Herrera
Alexander Staubo wrote:
 On 5/2/07, Alvaro Herrera [EMAIL PROTECTED] wrote:
  They're different pages. The first one is horribly out of date;
  unfortunately, it is (for me) the first hit on Google, whereas the
  PgFoundry project page is the third.
 
 Sure, they are different pages, but the first one is supposed to be
 maintained by the pgcluster guys.  It is linked from the second page,
 and it is hosted in pgfoundry.
 
 My entire point was that the first one *isn't* maintained. So avoid it
 if you're looking for up-to-date information about this project.

My entire point was that that page is also on pgfoundry :-)

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-01 Thread Jan Bilek

Hello,
We're building a database system with replication. Slony-I seems to be quite
a good solution for the replication, but besides the replication
(master to multiple slaves), we need load balancing as well - multiple users
will access the database at the same time, i.e. multiple queries.
Is Slony-I capable of load balancing, and how do we set it up? We searched the
web and some people mentioned that Slony-I could do load balancing, but we
haven't found how to make Slony-I do it.


Simple applications like pgpool could do load balancing, too, but they seem
to be too simple to be used as a replication system.


Thanks for any help.

J. 





Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-01 Thread Alexander Staubo

On 5/1/07, Jan Bilek [EMAIL PROTECTED] wrote:

Is Slony-I capable of load balancing and how to set it up? We searched the
web and some people mentioned that Slony-I could do load balancing, but
haven't found how to make Slony-I to do it.


Slony does not do load balancing. Personally, I recommend doing the
load balancing in the application if possible; you will need to funnel
updates to the master and distribute queries to the slaves, and the
app is in the best position to determine whether something is an
update or a query. Any intermediate middleware is going to add
additional latency and overhead and will add just another single point
of failure to an already complex system.

Alexander.
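The application-side split Alexander describes - funnel updates to the master, spread read-only queries across the slaves - can be sketched roughly like this. The connection names are placeholders, and the SQL keyword check is a deliberately crude stand-in for the application knowing its own intent:

```python
import itertools

class Router:
    """Route statements: writes to the master, reads round-robin to slaves."""
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = itertools.cycle(slaves)   # simple round-robin over replicas

    def pick(self, sql):
        # Crude classification; a real app knows intent and doesn't parse SQL.
        word = sql.lstrip().split(None, 1)[0].upper()
        if word in ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP"):
            return self.master
        return next(self.slaves)

r = Router("master", ["slave1", "slave2"])
print(r.pick("UPDATE t SET x = 1"))   # master
print(r.pick("SELECT * FROM t"))      # slave1
print(r.pick("SELECT 1"))             # slave2
```

Doing this in the application, as the post argues, avoids putting another single point of failure between the clients and the databases.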



Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-01 Thread Jan Bilek

On 5/1/07, Jan Bilek [EMAIL PROTECTED] wrote:
Is Slony-I capable of load balancing and how to set it up? We searched the

web and some people mentioned that Slony-I could do load balancing, but
haven't found how to make Slony-I to do it.


Slony does not do load balancing. Personally, I recommend doing the
load balancing in the application if possible; you will need to funnel
updates to the master and distribute queries to the slaves, and the
app is in the best position to determine whether something is an
update or a query. Any intermediate middleware is going to add
additional latency and overhead and will add just another single point
of failure to an already complex system.

Alexander.


Thanks for info.





Re: [GENERAL] PostgreSql replication and load balancing ( is Slony-I a solution?)

2007-05-01 Thread Jamie Deppeler

Alexander Staubo wrote:

On 5/1/07, Jan Bilek [EMAIL PROTECTED] wrote:
Is Slony-I capable of load balancing and how to set it up? We searched the

web and some people mentioned that Slony-I could do load balancing, but
haven't found how to make Slony-I to do it.


You might want to check pgcluster out 
http://pgcluster.projects.postgresql.org/ which does both.






Re: [GENERAL] Postgresql Replication

2006-02-22 Thread Richard Huxton

Nareen S wrote:

Current Setup
I have a mail server running on Postfix. For it I had configured a
High Availability solution using Heartbeat and DRBD for
replication. The setup is Primary/Secondary node servers and all
mails are replicated to the secondary using DRBD. If the primary server
fails, the secondary will come up with the mails.
The total number of users is 600, and the hardware used is two Compaq
servers with SCSI HDDs and 1 GB RAM each.
Network: 100 Mbps

New Planned Setup
Now I am planning for a PostgreSQL server in the same setup. I want to know
whether it will work, i.e. if I configure
a new device in DRBD, can I get HA for PostgreSQL also? If I use DRBD for
PostgreSQL too, will it affect the existing setup? Finally, if it is not
possible, what is the next option for this?


It *should* work as long as you don't start the PostgreSQL server on the 
second machine until you are sure that the first has failed. It will 
then think it is recovering from having crashed.


This is assuming that drbd doesn't mess with how fsync() works, and I
seem to remember it didn't, last time I looked.


The alternative (assuming you're using the database for user 
details/routing etc) would be replication (slony/dbmirror/etc) or 
perhaps even PITR depending on the rate of change of the database.


--
  Richard Huxton
  Archonet Ltd



Re: [GENERAL] Postgresql Replication

2006-02-22 Thread Alexander Schreiber
Richard Huxton dev@archonet.com wrote:
 Nareen S wrote:
 Current Setup
 I am having a Mailserver running on Postfix.For the same I had configured  a
 High Availability Solution using Heartbeat and DRBD for
 replication.Thesetup is like Primary/Secondary Node Servers and all
 mails are replicated to
 secondary using DRBD.If primary server fails secondary will be up with mails
 ..
 The total number of users is 600 users,and Hardware used is Two Compaq
 servers with SCSI HDD,1 GB RAM each,
 Network-100Mbps
 
 New Planned Setup
 Now I am planning for Postgresql server in the same setup.I want to know
 will it work in the same setup.ie if I configure
 a new device in drbd can I get HA for postgresql also.If I use DRBD for
 postgresql also ,will it affects the existing setups.Finally  if it is not
 possible what is the next option for this.

 It *should* work as long as you don't start the PostgreSQL server on the 
 second machine until you are sure that the first has failed. It will 
 then think it is recovering from having crashed.

This is simply not possible. DRBD allows access to the replicated device
only on the primary node and rejects any device access on the secondary
node. So the second machine doesn't even have access to the filesystem
with the PostgreSQL data files. Unless, of course, one accesses the
physical devices below the DRBD layer directly - but when you cheat on
your replication system you're bound to get what you deserve ;-)

Regards,
   Alex.
-- 
"Opportunity is missed by most people because it is dressed in overalls and
 looks like work."  -- Thomas A. Edison



[GENERAL] Postgresql Replication

2006-02-21 Thread Nareen S
Current Setup

I have a mail server running on Postfix. For it I had configured a
High Availability solution using Heartbeat and DRBD for
replication. The setup is Primary/Secondary node servers and
all mails are replicated to the secondary using DRBD. If the primary server
fails, the secondary will come up with the mails.

The total number of users is 600, and the hardware used is two Compaq servers with SCSI HDDs and 1 GB RAM each.

Network: 100 Mbps


New Planned Setup

Now I am planning for a PostgreSQL server in the same setup. I want to know whether it will work, i.e. if I configure
a new device in DRBD, can I get HA for PostgreSQL also? If I use DRBD for
PostgreSQL too, will it affect the existing setup? Finally, if
it is not possible, what is the next option for this?



Thanks for any Help



Regards

John


Re: [GENERAL] Postgresql replication

2005-08-30 Thread Andrew Sullivan
On Thu, Aug 25, 2005 at 01:44:15PM +0200, Bohdan Linda wrote:
 there are some other db solutions which have good performance when doing
 this kind of replication across the world.

Bluntly, No.


-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Information security isn't a technological problem.  It's an economics
problem.
--Bruce Schneier



Re: [GENERAL] Postgresql replication

2005-08-30 Thread Andrew Sullivan
On Wed, Aug 24, 2005 at 11:54:42AM -0400, Chris Browne wrote:
 There is a Slony-II project ongoing that is trying to construct a
 more-or-less synchronous multimaster replication system (where part of
 the cleverness involves trying to get as much taking place in an
 asynchronous fashion as possible) that would almost certainly be of no
 use to your use case.

Just to emphasise this point: assuming we ever get Slony-II to work,
it is all but guaranteed to be useless for cases like the one that
started this thread: it'll simply require very fast network
connections to work.  I've had more than one person ask me when
multi-site multimaster is coming, and my answer is always, "Have you
started work on it yet?"  I think there might be a way to hack up
Slony-I to do it -- Josh Berkus gave me a quick outline while at
OSCON that made me think it possible -- but AFAIK, nobody is actually
doing that work.

It's worth noting that single-origin master-slave async replication
is tricky, but by no means impossible.  Multi-master _anything_ is
hard, no question about it; and it more or less always imposes some
overhead that you won't like.  The question is merely whether you
want to pay that price.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
--Alexander Hamilton



Re: [GENERAL] Postgresql replication

2005-08-27 Thread William Yu

Chris Travers wrote:

1)  Efficiency of network throughput
2)  Tolerance to attempts at repeat transactions before replication 
(emptying an account multiple times)

3)  Availability of a transaction.


We ended up having to give up #1. It's possible to have our transactions 
routed to multiple servers before they become final transactions. User1
might request a payment on ServerA. User2 then marks the payment as 
approved on ServerB. ServerC is authoritative and checks the bank/budget 
balances before posting as final. Each of these steps then requires
replication of the latest changes to all other servers. (In theory, the
first 2 steps only require replication to the authoritative server but 
we do so anyways so all servers can act as backups for each other -- 
pending transactions still need to be restored in case of total DB failure.)


There's definitely a delay in terms of getting from point A to point Z; 
duplicate servers. But there's guaranteed financial integrity, users can 
connect to any server the load balancer picks and no server requires any 
other server/connection to be up for individual user transactions to occur.




Re: [GENERAL] Postgresql replication

2005-08-27 Thread Chris Travers

William Yu wrote:


Chris Travers wrote:


1)  Efficiency of network throughput
2)  Tolerance to attempts at repeat transactions before replication 
(emptying an account multiple times)

3)  Availability of a transaction.



We ended up having to give up #1. It's possible to have our 
transactions routed to multiple servers before they become final
transactions. User1 might request a payment on ServerA. User2 then
marks the payment as approved on ServerB. ServerC is authoritative and 
checks the bank/budget balances before posting as final. Each of
these steps then requires replication of the latest changes to all other
servers. (In theory, the first 2 steps only require replication to the
authoritative server but we do so anyways so all servers can act as 
backups for each other -- pending transactions still need to be 
restored in case of total DB failure.)


Ok.  I see where you are going with this.

It is an interesting problem.  Multimaster Async Replication will give 
you a problem in that it allows the attack you are describing due to the 
asynchronous nature of the replication.  If I were trying to prevent 
this sort of attack, I would try to build into this some sort of 
account authority which can manage these transactions.  Originally I
was thinking of the home server as the obvious place to start if it is 
available.  But if it is not, then you would need some infrastructure to 
track attempted withdrawals and handle them appropriately.  Such servers 
could cache requests and if they see duplicates or many requests coming 
from many servers on the same account could flag that.  One option might 
be to have a rotational authority (i.e. home-server, then next, then 
next) in a ring so that an unavailable server still allows reasonable 
precautions to be held against emptying the account many times.
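The rotational-authority ring described above - each account has a home server, and if that server is down, authority passes to the next live server in a fixed ring - might be sketched as follows. The server names and the `alive` set are hypothetical illustrations, not part of any actual tool:

```python
def authority_for(account_home, ring, alive):
    """Pick the first live server in the ring, starting at the account's home."""
    start = ring.index(account_home)
    for i in range(len(ring)):
        candidate = ring[(start + i) % len(ring)]
        if candidate in alive:
            return candidate
    return None  # no server alive: requests must queue

ring = ["A", "B", "C"]
print(authority_for("B", ring, alive={"A", "B", "C"}))  # B (home is up)
print(authority_for("B", ring, alive={"A", "C"}))       # C (next in ring)
```

Because every server computes the same ordering, all replicas agree on which node is currently responsible for flagging duplicate withdrawals on a given account, even while the home server is down.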


Basically, if the servers aren't talking to each other at the time of the
transaction, then they are not going to know about duplicates.  You have 
to have some setup beyond your replication to handle this.  Otherwise 
you will have some issues with delays causing the security risks you 
deem unacceptable.


Your question seems to be "How do I set up multimaster async replication
such that a person cannot empty his account on each server?" and the
answer is that this is an inherent limitation of multimaster async
replication.   This also means that you will have to have some sort of 
other verification process for such transactions beyond what is locally 
available on the replicants.


There's definitely a delay in terms of getting from point A to point 
Z; duplicate servers. But there's guaranteed financial integrity, 
users can connect to any server the load balancer picks and no server 
requires any other server/connection to be up for individual user 
transactions to occur.


The delay will by definition defeat any guarantee of financial integrity 
if you are allowing read-write operations to the replica without 
checking with some sort of central authority.  At very least, the 
central authority should look for suspicious patterns.  Again, it may be 
possible to do some failover here, but I don't think you can do without 
*some* sort of centralized control.


(Note, here load balancing is handled by the distribution of accounts.  
A down server simply means that the next server in the ring will take 
over its remote verification role).


This doesn't make the security issue go away, but it may reduce it to an
acceptable level. I.e. it is still possible for duplicates to be 
submitted just before and after a home server goes down, but this is a 
lot better than being able to have one transaction repeated on each 
server and then dealing with the massively overdrawn account.


Best Wishes,
Chris Travers
Metatron Technology Consulting



Re: [GENERAL] Postgresql replication

2005-08-27 Thread William Yu

Our own personal IM :)

Chris Travers wrote:
The delay will by definition defeat any guarantee of financial integrity 
if you are allowing read-write operations to the replica without 
checking with some sort of central authority.  At very least, the 
central authority should look for suspicious patterns.  Again, it may be 
possible to do some failover here, but I don't think you can do without 
*some* sort of centralized control.


Actually this is the easy part. When the home server finally issues 
payments, it only issues what it knows about and what can be verified as 
OK. Any transactions that are currently being entered on another server 
will appear after the next replication cycle and it will be verified 
afterwards. If the previous payment issuing cycle used up all the money, 
the new requests are kept in pending until money is put in. This does 
allow for newer requests that happen to be executed on home servers to 
possibly take precedence over old requests, but there is no requirement
in the business process that payments must come out in any specific order.



This doesn't make the security issue go away, but it may reduce it to an
acceptable level. I.e. it is still possible for duplicates to be 
submitted just before and after a home server goes down, but this is a 
lot better than being able to have one transaction repeated on each 
server and then dealing with the massively overdrawn account.


The home server going down is the trickiest issue. Because when a 
server disappears, is that because it went down temporarily? For good? A 
temporary internet problem where nobody can get access to it? Or an 
internet routing issue where just the connection between those two 
servers is severed? If it's the last, users might still be doing stuff 
on ServerA, with ServerA posting financials, but ServerB thinks the
server is down and decides to take over ServerA's duties. Of course, in 
ServerA's view, it's ServerB and ServerC that's down -- not itself.


Maybe we can mitigate this by having more servers at more data centers 
around the world so everybody can monitor everybody. At some point, if 
you have N servers and N-1 servers say ServerA is down, it probably is 
down. With a high enough N, ServerA could probably decisively decide it 
was the server severed from the internet and refuse to post any 
financials until connection to the outside world was restored, plus some
extra threshold.
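The majority rule William sketches - with N servers, believe the peers' combined view, and a node that cannot reach most of its peers should suspect its own link and freeze - might look roughly like this. The peer counts are hypothetical inputs; real monitoring would feed them from actual health checks:

```python
def consensus_down(reports_down, n_peers):
    """True if a strict majority of n_peers peers report the target as down."""
    return reports_down > n_peers / 2

def self_partitioned(peers_i_see_down, n_peers):
    """If I can't reach most peers, assume I'm the one cut off: stop posting."""
    return peers_i_see_down > n_peers / 2

# 5-server cluster: 4 peers judge ServerA's health.
print(consensus_down(4, 4))       # True: the cluster treats ServerA as down
print(self_partitioned(3, 4))     # True: ServerA should freeze itself
```

The strict majority means a two-datacenter split can leave neither side with a quorum, which matches the thread's observation that with only two sites the promotion decision still has to be made by hand.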


This last problem, which luckily occurs rarely, we do by hand right now. 
We're not ready to run this on full auto because we only have 2 data 
centers (with multiple servers within each data center). The servers do 
not have enough info to know which server is actually down in order to 
auto-promote/demote. It does require staff that's not just in 1 location 
though because our primary office going down w/ our local datacenter 
would mean nobody there could do the switchover. (Assuming major natural 
disaster that kept us from using our laptops at the local Starbucks to 
do the work.)




Re: [GENERAL] Postgresql replication

2005-08-27 Thread Chris Browne
William Yu [EMAIL PROTECTED] writes:
 Chris Browne wrote:
 I'm pretty sure that they _don't_ track balance updates for each
 transaction that applies to a customer's account.  You could, via one
 form of trickery or another, overdraw your account by a fairly hefty
 amount, and they probably won't notice for a day or even three.  But
 once they notice/estimate that the Gentle Caller has built up some
 dangerously high balance, they'll warn of impending discontinuation of
 service if some sum isn't paid by some date.

 This works for companies that have some degree of power over their
 customers. E.g. pay up or we disconnect your service. Return your
 overdrafts/pay your fees or we mess up your credit.

 This doesn't work if it's a small company whose account has been
 emptied. Sure, the bank will refuse to honor the check, but then that
 company will be hit with overdraw penalties and possible legal
 penalties to the payee for the bounced check.

 The warning threshold system is easy to implement but there will
 always be corner cases where the warning is not soon enough or a
 single payment wipes out ~ 100% of the account. Warn too often and
 it'll be ignored by people as a boy crying wolf alarm.

In a context where there is no degree of power over their customers,
I would hardly think that the presence/lack of automated controls or
presence/lack of balance synchronization is particularly material.

In other words, if trying to apply policies is going to forcibly
break, then building the data system may have been a futile exercise
in the first place.  And trying to distribute the system may again be
a *business* mistake that admits no technical solution.

If that's the case, then we can conclude that replication is no
solution, and that the organization should improve connectivity for
their single centralized system.

That may be the case, but it is the uninteresting case, as it amounts to
throwing our hands up, saying "no answers; you can't have a
distributed system," and giving up on trying anything further.  No
point to further discussion...
-- 
let name=cbbrowne and tld=acm.org in String.concat @ [name;tld];;
http://cbbrowne.com/info/linux.html
There is no  reason anyone would want a computer  in their home.  
-- Ken Olson, Pres. and founder of Digital Equipment Corp.  1977



Re: [GENERAL] Postgresql replication

2005-08-26 Thread William Yu

Chris Travers wrote:
Why not have the people who have rights to review this all write to the 
master database and have that replicated back?  It seems like latency is 
not really an issue.  Replication here is only going to complicate 


What master database? Having a single master defeats the purpose of load 
balancing to handle more users.



 things.  If it were me, I would be having my approval app pull data
 from *all* of the databases independently and not rely on the
 replication for this part.  The replication could then be used to
 replicate *approved* data back to the slaves.

If your app client happens to have high speed access to all servers, 
fine. And you can guarantee uptime connections to all servers except for 
the rare cases of hardware failure. The problem is if you don't, you end 
up with every transaction running at the speed of the slowest connection 
between a client and the farthest DB. While the final status of a 
transaction does not need to show up anytime soon on a user's screen, 
there still needs to be fast response for each individual user action.


How bad does the response get? I've done some simple tests comparing APP 
<-LAN-> DB versus APP <-cross-country VPN-> DB. Even simple actions like 
inserting a record and checking for a dupe key violation (i.e. almost 
no bandwidth needed) take about 10,000 times longer than over a 100mbit 
LAN.




I still don't understand the purpose of replicating the pending data...


Imagine a checking account. A request to make an online payment can be 
made on any server. The moment the user submits a request, he sees it on 
his screen. This request is considered pending and not a true 
transaction yet. All requests are collected together via replication, and 
the home server for that account can then check the account balance 
and decide whether there are enough funds to issue those payments.




Re: [GENERAL] Postgresql replication

2005-08-26 Thread Chris Travers

William Yu wrote:



This system sounds ok for documents and general data that can always 
be revived via version control/history. But I can't see how this would 
work for financial transactions where you're dealing with money and 
bank accounts. I decided to log in to multiple servers and wire transfer 
$100 to another account on every server, and I hit submit at exactly the 
same time on every server, so no single server can check. Sure, they can 
resolve the conflict afterwards in terms of deciding which transfer to 
kill off. But the fact is that my other account has that N x $100 
already, and I've just fleeced the bank.



Hmmm...

I think you should realize by now that no multimaster async replication 
solution is going to adequately work for something like these financial 
transactions.  You need either synchronous or single-master simply 
because you need to have a concept that there is a *single* 
authoritative source of the current status and balance on the account.  
You can't really expect async-replicated servers to know about 
transactions that haven't been replicated yet can you?


In other words this problem is inherent to multimaster async 
replication.  There is no way around it.  If there was it would either 
not be async or not be multimaster :-)


I have built my own async multimaster replication systems for some 
applications.  One of them is actually a system where you have two and 
only two masters which replicate back and forth.  The trick I used here 
was for each master to use a different SERIAL field as its primary key 
so there are no duplicates.  Another option is to have a compound 
primary key which is a serial and a server-hash (this would scale to a 
larger number of masters).
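
The two keying approaches described above can be sketched in SQL. This is a minimal sketch assuming PostgreSQL; all table, column, and sequence names are illustrative, not from the post:

```sql
-- (1) Two masters with non-overlapping SERIAL ranges: one common variant
--     is interleaved sequences, odd IDs on server A and even IDs on B.
CREATE SEQUENCE accounts_id_seq INCREMENT BY 2 START WITH 1;  -- START WITH 2 on B
CREATE TABLE accounts (
    id   bigint PRIMARY KEY DEFAULT nextval('accounts_id_seq'),
    name text NOT NULL
);

-- (2) Compound primary key: a per-server identifier plus a local serial,
--     which scales to a larger number of masters.
CREATE TABLE accounts2 (
    server_id smallint NOT NULL,  -- fixed per server: 1, 2, 3, ...
    local_id  serial   NOT NULL,
    name      text     NOT NULL,
    PRIMARY KEY (server_id, local_id)
);
```

Either way, a row created on one master can replicate to the other without its key ever colliding with a locally generated one.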


Best Wishes,
Chris Travers
Metatron Technology Consulting




Re: [GENERAL] Postgresql replication

2005-08-26 Thread Carlos Henrique Reimer
Hi,

I'm thinking of testing your suggestion, basically because there are only a few sites to connect, but there are some points that aren't very clear to me.

My doubts:

1. How do I make a view updatable? Using the rule system?
2. Why are inserts handled differently from updates?
3. Can I not use the rule system to make updates go to the base table too (the master piece), like the inserts?
4. Aren't there some performance issues with using a table that is a union instead of a normal table?
Thanks in advance!

Reimer
Jeff Davis [EMAIL PROTECTED] wrote:

Chris Browne wrote: Slony-I is a master/slave asynchronous replication system; if you already considered it unsuitable, then I see little likelihood of other systems with the same sorts of properties being suitable. What could conceivably be of use to you would be a *multimaster* asynchronous replication system. Unfortunately, multimaster *anything* is a really tough nut to crack.

In general that's a difficult problem, but in practice there may be a solution. For instance, perhaps the following configuration would be helpful:

Make a database for each physical server, called db1 ... dbN. Let your logical tables in each database be table1 ... tableM. Now, for each logical tableX (where 1 <= X <= M), make N physical tables, tableX_1 ... tableX_N. Now, make a view called tableX that is the UNION of tableX_1 ... tableX_N (tableX is not a real table, it's just a logical table).

Now, use Slony-I. For each dbY (where 1 <= Y <= N), make dbY a master for tableX_Y (for all X where 1 <= X <= M) and a slave for tableX_Z (for all X,Z where 1 <= X <= M, Z != Y).

Now, use a rule that replaces all INSERTs to tableX (where 1 <= X <= M) on dbY (where 1 <= Y <= N) with INSERTs to tableX_Y.

That was my attempt at being unambiguous. In general what I mean is that each database is master of one piece of a table, and slave to all the other pieces of that table, and then you have a view which is the union of those pieces. That view is the logical table. Then have a RULE which makes INSERTs go to the physical table for which that database is master.

The advantages: if one machine goes down, the rest keep going, and merely miss the updates from that one site to that table. If one machine makes an insert to the table, it quickly propagates to the other machines and transparently becomes a part of the logical tables on those machines.

The disadvantages: UPDATEs are difficult, and might end up with a complicated set of rules/procedures/triggers. You may have to program the application defensively in case the database is unable to update a remote database for various reasons (if the record to be updated is a part of a table for which another database is master). Also, since the solution is asynchronous, the databases may provide different results to the same query.

In general, this solution does not account for all kinds of data constraints. The conflict resolution is very simplified because it's basically just the union of data. If that union could cause a constraint violation itself, this solution might not be right for you. For instance, let's say you're tracking video rentals, and store policy says that you only rent one video per person. However, maybe they go to store 1 and rent a video, and run to store 2 and rent a video before store 1 sends the INSERT record over to store 2. Now, when they finally do attempt to UNION the data for the view, you have an inconsistent state. Many applications can get by just fine by UNIONing the data like that, and if not, perhaps work around it.

I hope this is helpful. Let me know if there's some reason my plan won't work.

Regards,
Jeff Davis
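
A two-server instance of the partitioned-union scheme quoted above might look like this in SQL. This is a sketch assuming PostgreSQL, with Slony handling the replication of each piece; table and column names are illustrative:

```sql
-- Each server is Slony master for its own physical piece of the
-- logical table "customers" and slave for the other piece.
-- (In practice the pieces would also need non-colliding keys,
-- as discussed elsewhere in the thread.)
CREATE TABLE customers_1 (id serial PRIMARY KEY, name text);  -- mastered on db1
CREATE TABLE customers_2 (id serial PRIMARY KEY, name text);  -- mastered on db2

-- The logical table is just the union of the pieces.
CREATE VIEW customers AS
    SELECT * FROM customers_1
    UNION ALL
    SELECT * FROM customers_2;

-- On db1, a rule redirects INSERTs on the view into the piece that
-- db1 masters (db2 would target customers_2 instead).
CREATE RULE customers_ins AS ON INSERT TO customers
    DO INSTEAD INSERT INTO customers_1 (name) VALUES (NEW.name);
```

Reads go through the view everywhere; writes always land in the locally mastered piece and propagate outward.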

Re: [GENERAL] Postgresql replication

2005-08-26 Thread Chris Browne
[EMAIL PROTECTED] (Chris Travers) writes:
 William Yu wrote:
 This system sounds ok for documents and general data that can
 always be revived via version control/history. But I can't see how
 this would work for financial transactions where you're dealing
 with money and bank accounts. Suppose I have $100 in my account. I
 decided to log in to multiple servers and wire transfer $100 to
 another account on every server, and hit submit at exactly the
 same time on every server, so no single server can check. Sure, they
 can resolve the conflict afterwards in terms of deciding which
 transfer to kill off. But the fact is that my other account has that
 N x $100 already, and I've just fleeced the bank.

 Hmmm...

 I think you should realize by now that no multimaster async
 replication solution is going to adequately work for something like
 these financial transactions.  You need either synchronous or
 single-master simply because you need to have a concept that there
 is a *single* authoritative source of the current status and balance
 on the account.  You can't really expect async-replicated servers to
 know about transactions that haven't been replicated yet can you?

Actually, I disagree.  There _is_ a way to cope with that issue.

The methodology involves not having a Balance Column anywhere in the
database, where activity *always* involves recording the deltas.

If the DEBIT and the CREDIT for each financial transaction are always
committed, in any given place, as part of a single transaction, then
there is no reason for the system to fall out of balance.
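
The delta-only bookkeeping described above can be sketched as a double-entry ledger. This assumes PostgreSQL; the schema and names are illustrative:

```sql
-- No balance column anywhere: only deltas are recorded, and the DEBIT
-- and CREDIT of each transfer commit together in one transaction.
CREATE TABLE ledger (
    entry_id serial  PRIMARY KEY,
    txn_id   integer NOT NULL,
    account  text    NOT NULL,
    delta    numeric NOT NULL  -- negative = debit, positive = credit
);

BEGIN;
INSERT INTO ledger (txn_id, account, delta) VALUES (42, 'alice', -100.00);
INSERT INTO ledger (txn_id, account, delta) VALUES (42, 'bob',    100.00);
COMMIT;

-- A balance is always derived, never stored, so replicas cannot
-- disagree about a stored total -- they can only lag on deltas.
SELECT account, sum(delta) AS balance FROM ledger GROUP BY account;
```

Because each replicated transaction carries a zero-sum pair of deltas, the books stay in balance no matter the order in which transactions arrive.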

This doesn't prevent the scenario of someone logging into many servers
simultaneously and emptying their account multiple times.  But there
are other ways of addressing that problem.

This sort of scenario is almost certainly the case for the
stupendously enormous quantities of billing transactions that result
from long distance and cellular activity in the telco industry.

I'm pretty sure that they _don't_ track balance updates for each
transaction that applies to a customer's account.  You could, via one
form of trickery or another, overdraw your account by a fairly hefty
amount, and they probably won't notice for a day or even three.  But
once they notice/estimate that the Gentle Caller has built up some
dangerously high balance, they'll warn of impending discontinuation of
service if some sum isn't paid by some date.

This sort of analysis does not require that any location is considered
authoritative for the balance of the account.  It suffices to have
some sort of transaction cutoff, that the analysis is sure to
include all transactions up to *yesterday* at midnight UTC.

Some bad things might take place during the up-to-24h lag; the
assumption is that that won't be material, or that you can do other
sorts of traffic analysis to warn of impending problem accounts...
-- 
output = (cbbrowne @ ntlug.org)
http://www.ntlug.org/~cbbrowne/sap.html
... The book [CLtL1] is about  400 pages of 8.5 by 11 Dover output.
Apparently  the publisher and  typesetter decided  that this  made the
lines too wide for  easy reading, so they will use a  6 by 9 format.
This  will make  the shape  of the  book approximately  cubical.  Now,
there are  26 chapters counting the  index, and a Rubik's  cube has 26
exterior cubies.  I'll let  you individually extrapolate and fantasize
from there.  -- GLS



Re: [GENERAL] Postgresql replication

2005-08-26 Thread Chris Travers

William Yu wrote:


Chris Travers wrote:

Why not have the people who have rights to review this all write to 
the master database and have that replicated back?  It seems like 
latency is not really an issue.  Replication here is only going to 
complicate 



What master database? Having a single master defeats the purpose of 
load balancing to handle more users.


I guess I am thinking along different lines than you.  I was thinking 
that the simplest solution would be to have master/slave replication for 
*approved* transactions only and no replication for initial commits 
prior to approval.  This makes the assumption that a single transaction 
will be committed on a single server, and that a single transaction will 
not be split over multiple servers.  In this way, you can commit a 
pending transaction to any single server, and when it is approved, it 
gets replicated via the master.  See below for more.





 things.  If it were me, I would be having my approval app pull data
 from *all* of the databases independently and not rely on the
 replication for this part.  The replication could then be used to
 replicate *approved* data back to the slaves.

If your app client happens to have high speed access to all servers, 
fine. And you can guarantee uptime connections to all servers except 
for the rare cases of hardware failure. The problem is if you don't, 
you end up with every transaction running at the speed of the slowest 
connection between a client and the farthest DB. While the final 
status of a transaction does not need to show up anytime soon on a 
user's screen, there still needs to be fast response for each 
individual user action.


Well...  It depends on how it is implemented, I guess.  If you pull 
transactional information in the background while the user is doing 
other things, then it shouldn't matter.  Besides, what should actually 
happen is that your connection is only as slow as the connection to the 
server which hosts the pending transaction you are trying to commit at 
the moment.  In this way, each request only goes to one server (the one 
which has the connection).  You could probably use DBI-Link and some 
clever materialized views to maintain the metadata at each location 
without replicating the whole transaction.  You could probably even use 
DBI-Link or dblink to pull the transactions in a transparent way.  Or 
you could replicate transactions into a pending queue dynamically...  
There are all sorts of ways you could make this respond well over slow 
connections.  Remember, PostgreSQL allows you to separate storage from 
presentation of the data, and this is quite powerful.




How bad does the response get? I've done some simple tests comparing 
APP <-LAN-> DB versus APP <-cross-country VPN-> DB. Even simple 
actions like inserting a record and checking for a dupe key 
violation (i.e. almost no bandwidth needed) take about 10,000 times 
longer than over a 100mbit LAN.


I think you could design a database such that duplicate keys are not an 
issue: they only get checked on the master, and then should never be a 
problem.


Thinking about it...  It seems here that one ends up with a sort of 
weird multi-master replication based on master/slave replication if 
you replicate these changes in the background (via another process, 
NOTIFY, etc.).






I still don't understand the purpose of replicating the pending data...



Imagine a checking account. A request to make an online payment can be 
made on any server. The moment the user submits a request, he sees it 
on his screen. This request is considered pending and not a true 
transaction yet. All requests are collected together via replication 
and the home server for that account then can check the account 
balance and decide whether there's enough funds to issue those payments.


Thinking about this...  The big issue is that you only want to 
replicate the deltas, not the entire account.  I am still thinking 
master/slave, but something where the deltas are replicated in the 
background or where the user, in checking his account, is actually 
querying the home server.  This second issue could be done via dblink or 
DBI-Link and would simply require that a master table linking the 
accounts with home servers be replicated (this should, I think, be 
fairly low-overhead).
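
The replicated lookup-table idea can be sketched with the dblink contrib module. This assumes PostgreSQL with dblink installed; the connection strings, tables, and columns are illustrative:

```sql
CREATE EXTENSION dblink;

-- Small, cheap-to-replicate table mapping each account to its home server.
CREATE TABLE account_homes (
    account text PRIMARY KEY,
    connstr text NOT NULL  -- e.g. 'host=db2 dbname=bank'
);

-- Checking a balance queries the home server directly instead of
-- replicating the whole account everywhere.
SELECT r.balance
FROM account_homes h,
     dblink(h.connstr,
            'SELECT balance FROM accounts WHERE owner = ''alice''')
         AS r(balance numeric)
WHERE h.account = 'alice';
```

Only the tiny mapping table needs replication; the authoritative account data stays on its home server.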


Best Wishes,
Chris Travers
Metatron Technology Consulting



Re: [GENERAL] Postgresql replication

2005-08-26 Thread Matt Miller
On Thu, 2005-08-25 at 17:45 -0700, Jeff Davis wrote:
 The replicator surely is not optional, and must be centralized.

From http://pgcluster.projects.postgresql.org/1_3/index.html:

Several replication servers can be set up. When a problem occurs at the
replication server, Cluster DB automatically changes connection to the
standby replication server. When all replication servers stop,
Cluster DB operates in stand-alone mode. There are two modes for
stand-alone mode: read-only mode, which only permits references, and
read-write mode, which permits updating.

 PGCluster is query-based, right?

Yes.



Re: [GENERAL] Postgresql replication

2005-08-26 Thread William Yu

Chris Travers wrote:
I guess I am thinking along different lines than you.  I was thinking 
that the simplest solution would be to have master/slave replication for 
*approved* transactions only and no replication for initial commits 
prior to approval.  This makes the assumption that a single transaction 
will be committed on a single server, and that a single transaction will 
not be split over multiple servers.  In this way, you can commit a 
pending transaction to any single server, and when it is approved, it 
gets replicated via the master.  See below for more.


This works if you don't care that multiple servers commit transactions 
that force a budget or bank account to be exceeded.



Thinking about this  The big issue is that you only want to 
replicate the deltas, not the entire account.  I am still thinking 
master/slave, but something where the deltas are replicated in the 
background or where the user, in checking his account, is actually 
querying the home server.  This second issue could be done via dblink or 
DBI-Link and would simply require that a master table linking the 
accounts with home servers be replicated (this should, I think, be 
fairly low-overhead).


Except what you now have is a system that fails if any server fails or is 
inaccessible.




Re: [GENERAL] Postgresql replication

2005-08-26 Thread Chris Travers

William Yu wrote:


Chris Travers wrote:

I guess I am thinking along different lines than you.  I was thinking 
that the simplest solution would be to have master/slave replication 
for *approved* transactions only and no replication for initial 
commits prior to approval.  This makes the assumption that a single 
transaction will be committed on a single server, and that a single 
transaction will not be split over multiple servers.  In this way, 
you can commit a pending transaction to any single server, and when 
it is approved, it gets replicated via the master.  See below for more.



This works if you don't care that multiple servers commit transactions 
that force a budget or bank account to be exceeded.


Ok, then let's look at possibility B (async multimaster replication is 
out).




Thinking about this  The big issue is that you only want to 
replicate the deltas, not the entire account.  I am still thinking 
master/slave, but something where the deltas are replicated in the 
background or where the user, in checking his account, is actually 
querying the home server.  This second issue could be done via dblink 
or DBI-Link and would simply require that a master table linking the 
accounts with home servers be replicated (this should, I think, be 
fairly low-overhead).



 Except what you now have is a system that fails if any server fails or 
 is inaccessible.


Ok.  If you have a system where each location is authoritative for its 
customers and the server transparently queries that server where needed 
(via a view), then when any server becomes inaccessible, the 
customers whose accounts are on that server become inaccessible.  This 
may not be acceptable.  But if this is the case, then you could treat 
this as a partitioned table, where each partition is authoritative for 
one location (see a previous post on how to do this) and then use Slony 
to replicate.  Again, this does not get around your objection above, 
namely that it is possible to do duplicate transactions at multiple 
locations.  For this you would need an explicit call to the 
authoritative server.  I see no other way around that.  This might allow 
people to view tentative balances from other branches if the home server 
(or its connection) is down, but they would not be able to withdraw 
funds.


But if you do this, you have a different problem: replication will be 
occurring over your slower-than-desired links.  As 
the number of branches grow, so will the bandwidth demands on every 
branch.  This may not therefore be scalable.


Unfortunately there is no free lunch here.  And I think that at some 
point you are going to have to choose between:


1)  Efficiency of network throughput
2)  Tolerance to attempts at repeat transactions before replication 
(emptying an account multiple times)

3)  Availability of a transaction.

You can pick any two.  I think that this is the way it will work with 
any other database system as well.


Best Wishes,
Chris Travers
Metatron Technology Consulting




Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Travers

Carlos Henrique Reimer wrote:

I read some documents about replication and realized that if you plan 
on using asynchronous replication, your application should be designed 
from the outset with that in mind because asynchronous replication is 
not something that can be easily “added on” after the fact.

Am I right?


Depending on your needs, you may find pgpool and Slony to be a workable 
combination. This is better when you have a lot of reads and only 
occasional writes. This way writes get redirected back to the master, 
and read-only transactions get run on the slaves.


Best Wishes,
Chris Travers
Metatron Technology Consulting



Re: [GENERAL] Postgresql replication

2005-08-25 Thread William Yu
It provides pseudo relief if all your servers are in the same building. 
Having a front-end pgpool connector pointing to servers across the world 
is not workable -- performance ends up being completely decrepit due to 
the high latency.


Which is the problem we face. Great, you've got multiple servers for 
failover. Too bad it doesn't do much good if your building gets hit by 
fire/earthquake/hurricane/etc.



Aly Dharshi wrote:
I know I am wading into this discussion as a beginner compared to the 
rest who have answered this thread, but doesn't something like pgpool 
provide relief for pseudo-multimaster replication? And what about 
software like sqlrelay; wouldn't these suites help to some extent? 
Looking forward to being enlightened.


Cheers,

Aly.

William Yu wrote:


Carlos Henrique Reimer wrote:

I read some documents about replication and realized that if you plan 
on using asynchronous replication, your application should be 
designed from the outset with that in mind because asynchronous 
replication is not something that can be easily “added on” after the 
fact.



Yes, it requires a lot of foresight to do multi-master replication -- 
especially across high-latency connections. I do that now for 2 
different projects. We have servers across the country replicating 
data every X minutes, with custom app logic that resolves conflicting data.


Allocation of unique IDs that don't collide across servers is a must. 
For one project, instead of using numeric IDs, we use CHAR and 
prepend a unique server code so record #1 on server A is 
A01 versus ?x01 on other servers. For the other 
project, we were too far along in development to change all our 
numerics into chars, so we wrote custom sequence logic to divide our 
10-billion ID space into 1-X billion for server 1, X-Y billion for server 
2, etc.


With this step taken, we then had to isolate (1) transactions that could 
run on any server w/o issue (where we always take the newest record), 
(2) transactions that required an amalgam of all actions, and (3) 
transactions that had to be limited to home servers. Record-keeping stuff, 
where we keep a running history of all changes, fell into the first 
category. It would have been no different than 2 users on the same 
server updating the same object at different times during the day. 
Updating of summary data fell into category #2 and required parsing the 
change history of individual elements. Category #3 would be financial 
transactions requiring strict locks; these were divided up by client/user 
space and restricted to the user's home server. This case would not 
allow auto-failover. Instead, it would require some prolonged 
threshold of downtime for a server before full financials are allowed 
on backup servers.




Re: [GENERAL] Postgresql replication

2005-08-25 Thread Bohdan Linda

I have a slightly off-topic question: is this an issue only for pgsql, or
are there other DB solutions that perform well when doing
this kind of replication across the world?

Regards,
Bohdan 

On Thu, Aug 25, 2005 at 09:01:49AM +0200, William Yu wrote:
 It provides pseudo relief if all your servers are in the same building. 
 Having a front-end pgpool connector pointing to servers across the world 
 is not workable -- performance ends up being completely decrepit due to 
 the high latency.
 
 Which is the problem we face. Great, you've got multiple servers for 
 failover. Too bad it doesn't do much good if your building gets hit by 
 fire/earthquake/hurricane/etc.
 
 



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Tino Wildenhain

Bohdan Linda schrieb:

I have a slightly off-topic question: is this an issue only for pgsql, or
are there other DB solutions that perform well when doing
this kind of replication across the world?


It depends entirely on your application. There is no
one-size-fits-all.

For example to have an online backup, WAL archiving to remote
sites is often sufficient.

However you cannot have synchronous multimaster replication
over slow lines and high performance with updates the same
time.

There is always a tradeoff, in any solution (even high-cost
commercial ones), that you have to consider carefully.



Regards,
Bohdan 


On Thu, Aug 25, 2005 at 09:01:49AM +0200, William Yu wrote:

It provides pseudo relief if all your servers are in the same building. 
Having a front-end pgpool connector pointing to servers across the world 
is not workable -- performance ends up being completely decrepit due to 
the high latency.


Which is the problem we face. Great, you've got multiple servers for 
failover. Too bad it doesn't do much good if your building gets hit by 
fire/earthquake/hurricane/etc.


This would remove the application using that data too, or not? ;)




Re: [GENERAL] Postgresql replication

2005-08-25 Thread William Yu
As far as I know, nobody has a generic solution for multi-master 
replication where the servers are not in close proximity. Single-master 
replication? Doable. Application-specific conflict resolution? Doable. 
Off-the-shelf package that somehow knows financial transactions on one 
server shouldn't be duplicated on another? Uhh... I'd be wary of trying 
it out myself.



Bohdan Linda wrote:

I have a slightly off-topic question: is this an issue only for pgsql, or
are there other DB solutions that perform well when doing
this kind of replication across the world?

Regards,
Bohdan 


On Thu, Aug 25, 2005 at 09:01:49AM +0200, William Yu wrote:

It provides pseudo relief if all your servers are in the same building. 
Having a front-end pgpool connector pointing to servers across the world 
is not workable -- performance ends up being completely decrepit due to 
the high latency.


Which is the problem we face. Great, you've got multiple servers for 
failover. Too bad it doesn't do much good if your building gets hit by 
fire/earthquake/hurricane/etc.




Re: [GENERAL] Postgresql replication

2005-08-25 Thread William Yu
Another tidbit I'd like to add. What has helped a lot in implementing 
high-latency master-master replication is writing our software with a 
business-process model in mind, where data is not posted directly to the 
final tables. Instead, users are generally allowed to enter anything -- 
could be incorrect, incomplete or the user does not have rights -- the 
data is still dumped into pending tables for people with rights to 
fix/review/approve later. Only after that process is the data posted to 
the final tables. (Good data entered on the first try still gets pended 
-- validation phase simply assumes the user who entered the data is also 
the one who fixed/reviewed/approved.)


In terms of replication, this model allows for users to enter data on 
any server. The pending records then get replicated to every server. 
Each specific server then looks at its own dataset of pendings to post 
to final tables. Final data is then replicated back to all the 
participating servers.
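
The pending/post flow described above can be sketched as follows. This assumes PostgreSQL; the table names and the home_server column are illustrative:

```sql
-- Pending rows replicate to every server; only the home server for a
-- row may post it to the final table, which then replicates back.
CREATE TABLE payments_pending (
    id          bigint   PRIMARY KEY,
    home_server smallint NOT NULL,  -- which server may post this row
    account     text     NOT NULL,
    amount      numeric  NOT NULL
);

CREATE TABLE payments (LIKE payments_pending INCLUDING ALL);

-- Run on server 2 only: post the pendings it is authoritative for.
BEGIN;
INSERT INTO payments SELECT * FROM payments_pending WHERE home_server = 2;
DELETE FROM payments_pending WHERE home_server = 2;
COMMIT;
```

Posting and clearing in one transaction means a pending row is either still awaiting its home server or posted exactly once, never both.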


There may be a delay for the user if he/she is working on a server that 
doesn't have rights to post his data. However, the pending-post model 
gets users used to the idea of (1) entering all data in one large swoop 
and validating/posting it afterwards, and (2) data can/will sit in pending 
for a period of time until it is acted upon by somebody/some server 
with the proper authority. Hence users aren't expecting results to pop 
up on the screen the moment they press the submit button.
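The pending/post data flow described above can be sketched roughly as follows. This is a hedged illustration, not the poster's actual code: the server names, the authority map, and the record shape are all invented for the example.

```python
# Hedged sketch of the pending/post flow: pending records replicate to
# every server, but each server only posts the subset it has authority
# over. AUTHORITY, server names, and record fields are invented.

AUTHORITY = {"west": {"client-1"}, "east": {"client-2"}}  # who may post what

def post_pendings(server: str, pending: list, final: list) -> list:
    """Post the pendings this server has authority over; return the rest.

    Records this server cannot approve stay pending until their home
    server acts on them, which is where the user-visible delay comes from.
    """
    still_pending = []
    for rec in pending:
        if rec["client"] in AUTHORITY[server]:
            final.append(rec)           # reviewed/approved here
        else:
            still_pending.append(rec)   # replicated, but not ours to post
    return still_pending
```

Approved rows in `final` would then be replicated back to all participating servers, closing the loop described above.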





William Yu wrote:
Yes, it requires a lot of foresight to do multi-master replication -- 
especially across high-latency connections. I do that now for 2 
different projects. We have servers across the country replicating data 
every X minutes, with custom app logic that resolves conflicting data.


Allocation of unique IDs that don't collide across servers is a must. 
For one project, instead of using numeric IDs, we used CHAR and 
prepended a unique server code, so record #1 on server A is A01 
versus ?x01 on other servers. For the other project, we were too 
far along in development to change all our numerics into chars, so we 
wrote custom sequence logic to divide our 10-billion ID space into 
1-X billion for server 1, X-Y billion for server 2, etc.
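The two collision-avoidance schemes described above can be sketched as follows. The server codes, range size, and function names are illustrative assumptions, not taken from the poster's systems:

```python
# Hypothetical sketch of the two ID-allocation schemes: per-server
# prefixes on CHAR keys, and disjoint numeric ranges carved out of a
# shared ID space. All names and sizes here are invented.

def prefixed_id(server_code: str, local_seq: int) -> str:
    """Scheme 1: CHAR keys built from a server prefix plus a local counter."""
    return f"{server_code}{local_seq:02d}"

def ranged_sequence(server_index: int, local_seq: int,
                    range_size: int = 1_000_000_000) -> int:
    """Scheme 2: give each server its own billion-wide slice of IDs."""
    if not 0 <= local_seq < range_size:
        raise ValueError("server exhausted its ID range")
    return server_index * range_size + 1 + local_seq

# Record #1 on server A never collides with record #1 on server B:
assert prefixed_id("A", 1) == "A01"
assert prefixed_id("B", 1) == "B01"
# Numeric ranges are disjoint across servers:
assert ranged_sequence(0, 0) == 1
assert ranged_sequence(1, 0) == 1_000_000_001
```

In a real deployment the same effect is usually achieved with sequence settings (distinct start values and a common increment, or distinct MIN/MAX ranges) rather than application code.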


With this step taken, we then had to isolate (1) transactions that could run 
on any server w/o issue (where we always take the newest record), (2) 
transactions that required an amalgam of all actions, and (3) transactions 
that had to be limited to home servers. Record-keeping stuff, where we keep a 
running history of all changes, fell into the first category. It would 
have been no different than 2 users on the same server updating the same 
object at different times during the day. Updating of summary data fell 
into category #2 and required parsing the change history of individual 
elements. Category #3 would be financial transactions requiring strict 
locks; these were divided up by client/user space and restricted to the 
user's home server. This case would not allow auto-failover. Instead, it 
would require some prolonged threshold of downtime for a server before 
full financials are allowed on backup servers.




Re: [GENERAL] Postgresql replication

2005-08-25 Thread William Yu

Tino Wildenhain wrote:
Which is the problem we face. Great, you've got multiple servers for 
failover. Too bad it doesn't do much good if your building gets hit 
by fire/earthquake/hurricane/etc.



This would remove the application using that data too, or not? ;)


Yes and no. If your DB is an internal app for a company where the users 
are in the same building as the servers, doesn't matter really I guess. 
Meteor hitting the building would kill the users in addition to the 
server so nobody will be calling you to complain about system downtime.


If your app is used by external customers who are all across the 
country, they want to continue to still use your software even though 
you and data center #1 are 6 feet under due to an 8.0 earthquake. They 
want auto-failover to data center #2 which is in close proximity to CIA 
headquarters and other juicy terrorist targets.




Re: [GENERAL] Postgresql replication

2005-08-25 Thread David Goodenough
On Thursday 25 August 2005 13:03, William Yu wrote:
 As far as I know, nobody has a generic solution for multi-master
 replication where servers are not in close proximity. Single master
 replication? Doable. Application specific conflict resolution? Doable.
 Off the shelf package that somehow knows financial transactions on a
 server shouldn't be duplicated on another? Uhh...I'd be wary of trying
 it out myself.

The most obvious one that does exactly this (generic multi-master
replication) is Lotus Domino.  It is not a relational DB, but not sufficiently
far off to stop the analogy.

Domino marks each document with a binary value which identifies the
server (built from a hash of the server name and the time the DB was
created) and a timestamp when it was last modified, and each document
(record) also has an ID (like OIDs).  More recent versions also do this at a
field level to avoid conflicts and speed replication.  When two servers
replicate, they look for all documents modified since the last replication
time and compare the lists.  Those modified on only one server are copied
across to the other server, replacing the old record and carrying the
updated-on server and timestamp with them.  When a document is deleted in
Domino it does not actually disappear; it is reduced to a deletion stub, and
this gets replicated as it has the same ID as the original record.  Those
that have been modified on both sides are copied to the other DB, but both
records remain and it is left to the user to resolve conflicts.  Field-level
replication reduces the need for this considerably.  Periodically the
deletion stubs are purged, once all known replicas have replicated.
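The merge rules described above can be sketched in a few lines. This is an illustrative model, not Domino's actual implementation: documents carry a last-modified timestamp, deletes survive as stub records, and double-sided edits are kept as conflicts for the user.

```python
# Illustrative sketch (not Domino's real code) of the described merge:
# one-sided changes win, both-sided changes become conflicts, and a
# deletion is just a document replaced by a small stub record, so it
# propagates through the same one-sided-change path.

def replicate(a: dict, b: dict, last_sync: int) -> list:
    """Merge two replicas (doc_id -> document) in place.

    Returns the IDs left in conflict for the user to resolve.
    """
    conflicts = []
    for doc_id in set(a) | set(b):
        da, db = a.get(doc_id), b.get(doc_id)
        if da is None:                 # only B has it (new doc or stub)
            a[doc_id] = db
        elif db is None:
            b[doc_id] = da
        else:
            a_new = da["modified"] > last_sync
            b_new = db["modified"] > last_sync
            if a_new and b_new:        # modified on both sides
                conflicts.append(doc_id)
            elif a_new:                # one-sided change replaces old copy
                b[doc_id] = da
            elif b_new:
                a[doc_id] = db
    return conflicts
```

Field-level replication refines the same idea by running this comparison per field instead of per document, which is why it shrinks the conflict set.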

Domino has absolutely no concept of a master DB.

Obviously this scheme would be difficult to do on a pure relational system.
But with triggers and a few standard fields it would not be impossible to
do for a limited application set.  How the user would resolve conflicts would
also be application-specific, I suspect, and how one would handle having two
versions of a record in the DB when they both have a field which is supposed
to be unique is also a problem that would have to be resolved (Domino does
not have the concept of unique keys).

David
  

 Bohdan Linda wrote:
  I have a slightly offtopic question: is this an issue only with pgsql,
  or are there other db solutions that perform well when doing this kind
  of replication across the world?
 
  Regards,
  Bohdan
 
  On Thu, Aug 25, 2005 at 09:01:49AM +0200, William Yu wrote:
 It provides pseudo relief if all your servers are in the same building.
 Having a front-end pgpool connector pointing to servers across the world
 is not workable -- performance ends up being completely decrepit due to
 the high latency.
 
 Which is the problem we face. Great, you've got multiple servers for
 failover. Too bad it doesn't do much good if your building gets hit by
 fire/earthquake/hurricane/etc.



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Brad Nicholson

Chris Travers wrote:


Carlos Henrique Reimer wrote:

I read some documents about replication and realized that if you plan 
on using asynchronous replication, your application should be 
designed from the outset with that in mind because asynchronous 
replication is not something that can be easily “added on” after the 
fact.

Am I right?



Depending on your needs, you may find pgpool and Slony to be a 
workable combination. This is better when you have a lot of reads and 
only occasional writes. This way writes get redirected back to the 
master, and read-only transactions get run on the slaves.



As long as you don't have any functions that write to the db.  pgpool 
could (and likely would) redirect some of these to the subscriber.  
Slony would prevent the data from being written (which  would prevent 
the subscriber from being corrupted).


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.





Re: [GENERAL] Postgresql replication

2005-08-25 Thread Brad Nicholson

Bohdan Linda wrote:


I have a slightly offtopic question: is this an issue only with pgsql,
or are there other db solutions that perform well when doing this kind
of replication across the world?

 

It all depends on the quality of the connection Node A to Node B.  If 
connectivity is poor, then it is impossible to have good performance 
doing anything across that connection.


--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.





Re: [GENERAL] Postgresql replication

2005-08-25 Thread Tino Wildenhain

William Yu schrieb:

Tino Wildenhain wrote:

Which is the problem we face. Great, you've got multiple servers for 
failover. Too bad it doesn't do much good if your building gets hit 
by fire/earthquake/hurricane/etc.




This would remove the application using that data too, or not? ;)



Yes and no. If your DB is an internal app for a company where the users 
are in the same building as the servers, doesn't matter really I guess. 
Meteor hitting the building would kill the users in addition to the 
server so nobody will be calling you to complain about system downtime.


If your app is used by external customers who are all across the 
country, they want to continue to still use your software even though 
you and data center #1 are 6 feet under due to an 8.0 earthquake. They 
want auto-failover to data center #2 which is in close proximity to CIA 
headquarters and other juicy terrorist targets.


Sure, but in this case a simple async master-slave (slony-1)
and the usual failover (also DNS-failover) should be sufficient.




Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Browne
[EMAIL PROTECTED] (Aly Dharshi) writes:
 I know I am wading into this discussion as a beginner compared to
 the rest who have answered this thread, but doesn't something like
 pgpool provide relief for pseudo-multimaster replication, and wouldn't
 software like sqlrelay help to some extent? Looking forward to being
 enlightened.

pgpool and sqlrelay provide you a loose equivalent to synchronous
replication, as they have to submit the queries to *all* of the nodes.

If you can live with the long latency times that result if the nodes
are widely separated, that's great.

Unfortunately, the reason for people to want *asynchronous*
multimaster replication is that they /cannot/ afford that latency
time.  They want to submit updates only to the local database, and
have the updates head to the other server some time later.

That's why pgpool/sqlrelay aren't an answer.
-- 
output = (cbbrowne @ acm.org)
http://cbbrowne.com/info/languages.html
Rules of the  Evil Overlord #46. If an advisor says  to me My liege,
he  is but  one man.  What can  one man  possibly do?,  I  will reply
This. and kill the advisor. http://www.eviloverlord.com/



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Browne
[EMAIL PROTECTED] (David Goodenough) writes:
 On Thursday 25 August 2005 13:03, William Yu wrote:
 As far as I know, nobody has a generic solution for multi-master
 replication where servers are not in close proximity. Single master
 replication? Doable. Application specific conflict resolution? 
 Doable.  Off the shelf package that somehow knows financial
 transactions on a server shouldn't be duplicated on another? 
 Uhh...I'd be wary of trying it out myself.

 The most obvious one that does exactly this (generic multi-master
 replication) is Lotus Domino.  It is not a relational DB, but not
 sufficiently far off to stop the analogy.

There's a better known case, and that's PalmOS PalmSync.

It does fairly much the same thing as Lotus Notes(tm).

In both cases, they are doing something that works reasonably well for
data that looks somewhat like documents.  Conflict resolution at the
document level is something that users can handle pretty well by
hand.

Unfortunately, for data that's more at the hordes of little facts
level, conflict resolution is a tough problem :-(.
-- 
let name=cbbrowne and tld=cbbrowne.com in String.concat @ [name;tld];;
http://www.ntlug.org/~cbbrowne/internet.html
Rules of the  Evil Overlord #223. I will  install a fire extinguisher
in  every room  --  three, if  the  room contains  vital equipment  or
volatile chemicals. http://www.eviloverlord.com/



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Browne
[EMAIL PROTECTED] (Bohdan Linda) writes:
 I have a slightly offtopic question: is this an issue only with pgsql,
 or are there other db solutions that perform well when doing this kind
 of replication across the world?

Asynchronous multimaster replication is pretty much a generally tough
problem.

Oracle, Sybase, and DB2 all have methods of doing it; none are either
simple, straightforward, or transparent to use.

It's a tough problem, in general.
-- 
(reverse (concatenate 'string gro.mca @ enworbbc))
http://cbbrowne.com/info/rdbms.html
I think we might have been better off with a slide rule.
-- Zaphod Beeblebrox



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Jeff Davis
Jim C. Nasby wrote:
 Or, for something far easier, try
 http://pgfoundry.org/projects/pgcluster/ which provides synchronous
 multi-master clustering.
 

He specifically said that pgcluster did not work for him because the
databases would be at physically separate locations. PGCluster requires
that there be a load balancer and a replicator centrally located
managing the cluster. If a network problem happens at the centralized
location, it would bring down all locations completely.

I think he's looking for an async solution because of that. In my
solution, if one location goes down, the others keep going.

Regards,
Jeff Davis



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Carlos Henrique Reimer
Exactly

Jeff Davis [EMAIL PROTECTED] wrote:

 Jim C. Nasby wrote:
  Or, for something far easier, try
  http://pgfoundry.org/projects/pgcluster/ which provides synchronous
  multi-master clustering.

 He specifically said that pgcluster did not work for him because the
 databases would be at physically separate locations. PGCluster requires
 that there be a load balancer and a replicator centrally located
 managing the cluster. If a network problem happens at the centralized
 location, it would bring down all locations completely.

 I think he's looking for an async solution because of that. In my
 solution, if one location goes down, the others keep going.

 Regards,
 Jeff Davis

Re: [GENERAL] Postgresql replication

2005-08-25 Thread Matt Miller
  http://pgfoundry.org/projects/pgcluster/ which provides synchronous
  multi-master clustering.

 He specifically said that pgcluster did not work for him
 because ...PGCluster requires that there be a load balancer and a
 replicator centrally located managing the cluster. If a network
 problem happens at the centralized location, it would bring down all
 locations completely.

I think the load balancer is an optional component.  Clients can connect
either to a load balancer, or directly to a specific machine.

Also, I think pgcluster can operate async.  If the system runs in
normal mode then the client gets a response as soon as the transaction
is committed on the local machine.  In reliable mode the client waits
for the commit to happen on all machines.

See
http://pgcluster.projects.postgresql.org/structure_of_replication.html



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Steve Atkins
On Thu, Aug 25, 2005 at 12:20:52PM -0400, Chris Browne wrote:

 [EMAIL PROTECTED] (Bohdan Linda) writes:
  I have a slightly offtopic question: is this an issue only with pgsql,
  or are there other db solutions that perform well when doing this kind
  of replication across the world?
 
 Asynchronous multimaster replication is pretty much a generally tough
 problem.
 
 Oracle, Sybase, and DB2 all have methods of doing it; none are either
 simple, straightforward, or transparent to use.
 
 It's a tough problem, in general.

I've been involved in one project that did multi-master async replication
across geographically separated servers.

It worked but was operationally so horrible that I've sworn never to
architect a system that requires multi-master replication at the
database ever again. Application level? Possibly. But not at the DB
level.

Cheers,
  Steve



Re: [GENERAL] Postgresql replication

2005-08-25 Thread William Yu

David Goodenough wrote:

The most obvious one that does exactly this (generic multi-master
replication) is Lotus Domino.  It is not a relational DB, but not sufficiently
far off to stop the analogy.

Domino marks each document with a binary value which identifies the
server (built from a hash of the server name and the time the DB was
created) and a timestamp when it was last modified, and also each document
(record) has an ID (like OIDs).  More recent versions also do this at a field
level to avoid conflicts and speed replication.  When two servers replicate


This system sounds OK for documents and general data that can always be 
revived via version control/history. But I can't see how this would work 
for financial transactions where you're dealing with money and bank 
accounts. Suppose I have $100 in my account. I decide to log in to 
multiple servers and wire transfer $100 to another account on every 
server, and I hit submit at exactly the same time on every server. Sure, 
they can resolve the conflict afterwards in terms of which transfer to 
kill off. But the fact is that my other account has that N x $100 
already and I've just fleeced the bank.




Re: [GENERAL] Postgresql replication

2005-08-25 Thread Jeff Davis
Carlos Henrique Reimer wrote:
 Exactly
 

Was there something lacking in my suggested solution at:
http://archives.postgresql.org/pgsql-general/2005-08/msg01240.php

It's a little complicated to administer, but it seems well-suited to a
company that has several locations that want to share data without being
too dependent upon eachother.

Regards,
Jeff Davis



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Browne
[EMAIL PROTECTED] (Brad Nicholson) writes:
 Bohdan Linda wrote:

I have a slightly offtopic question: is this an issue only with pgsql,
or are there other db solutions that perform well when doing this kind
of replication across the world?



 It all depends on the quality of the connection Node A to Node B.  If
 connectivity is poor, then it is impossible to have good performance
 doing anything across that connection.

The nifty magic of asynchronous multimaster would be that you'd be 
able to have fast access at Site Alpha to Node Alpha as well as 
fast access at Site Beta to Node Beta. 
 
That would, indeed, be valuable.  That's why it's one of those Holy 
Grail things...  It's Rather Hard, which is why there isn't a 
Slony-III: The Wrath of Async MultiMaster just yet. 
 
It would be valuable for someone to look into a replication system to 
support that sort of scenario.  With the stipulation that there be 
some big, fiery dragons there ready to eat you and your data :-). 
-- 
(format nil [EMAIL PROTECTED] cbbrowne ntlug.org)
http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
I'm guilty of a lot of things, but I didn't ever do that.''  
-- Bill Clinton, on why he had astroturf lining the back of his pickup
   truck
[In fact, he _DID_ do this, thus making life creepier than fiction...]



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Browne
[EMAIL PROTECTED] (Steve Atkins) writes:
 On Thu, Aug 25, 2005 at 12:20:52PM -0400, Chris Browne wrote:
 [EMAIL PROTECTED] (Bohdan Linda) writes:
  I have a slightly offtopic question: is this an issue only with pgsql,
  or are there other db solutions that perform well when doing this kind
  of replication across the world?
 
 Asynchronous multimaster replication is pretty much a generally tough
 problem.
 
 Oracle, Sybase, and DB2 all have methods of doing it; none are either
 simple, straightforward, or transparent to use.
 
 It's a tough problem, in general.

 I've been involved in one project that did multi-master async replication
 across geographically separated servers.

 It worked but was operationally so horrible that I've sworn never to
 architect a system that requires multi-master replication at the
 database ever again. Application level? Possibly. But not at the DB
 level.

I have heard something of the same argument being thrown at attempts
to use things like CORBA to allow you to pretend that there is no
difference between local and remote access to things.

There are some good arguments to be made, there.

You really need to access remote data in different ways than local
data because the latency *will* kill you if you ignore it.

It's a well and neat idea to try to find ways to hide those
differences under the carpet; it is quite likely that the better way
will involve addressing that somewhere in the application, as opposed
to trying to get the DB (or the ORB or other 'distributed thingie') to
do it for you.

Tables that contain balances (e.g. - credit balances, account
balances, and such) would be a good example.  You do NOT want to
distribute the process of updating balances across some
hideous-latency Great Divide link.  That heads, of course, to
application design, not to pure DB level activity...
-- 
(format nil [EMAIL PROTECTED] cbbrowne cbbrowne.com)
http://www.ntlug.org/~cbbrowne/advocacy.html
Tools that are no good require more skill.



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Browne
William Yu [EMAIL PROTECTED] writes:
 David Goodenough wrote:
 The most obvious one that does exactly this (generic multi-master
 replication) is Lotus Domino.  It is not a relational DB, but not 
 sufficiently
 far off to stop the analogy.
 Domino marks each document with a binary value which identifies the
 server (built from a hash of the server name and the time the DB was
 created) and a timestamp when it was last modified, and also each document
 (record) has an ID (like OIDs).  More recent versions also do this at a field
 level to avoid conflicts and speed replication.  When two servers replicate

 This system sounds ok for documents and general data that can always
 be revived via version control/history. But I can't see how this
 would work for financial transactions where you're dealing with
 money and bank accounts. Suppose I have $100 in my account. I
 decided to login to multiple servers and wire transfer $100 to
 another account on every server. And I hit submit exactly at the
 same time for every server so check. Sure they can resolve the
 conflict afterwards in terms of saying in terms of which transfer to
 kill off. But the fact is that my other account has that N x $100
 already and I've just fleeced the bank.

There are two pretty reasonable ways to address this:

1.  Your application does not replicate balances; those are always
computed locally.  Only *transactions* are relayed.

2.  Balance updates take place as a special kind of delta update
where replication transfers around (old.balance - new.balance) instead
of either of the values.

Either of these leads to the nifty case where that bank discovers that
you have overdrawn your account, and then takes whatever actions they
deem are appropriate next.

Such as billing you $29 for each overdraft transaction.

You're hardly going to be retiring to the Bahamas on Nx$100, are you
:-).
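Option 2 above can be sketched in a few lines. This is a hedged illustration of delta replication (using a new-minus-old delta convention and invented values, not the poster's code): replicating the change to a balance, rather than the balance itself, lets concurrent updates on different servers compose instead of overwriting each other.

```python
# Sketch of replicating balance *deltas* rather than absolute balances.
# Values and the delta sign convention (new - old) are invented for
# illustration.

def apply_delta(balance: int, delta: int) -> int:
    return balance + delta

# Two servers each accept a $100 withdrawal against the same $100 balance.
# Replicating the new balance (0) from either side would hide one
# withdrawal; replicating the deltas (-100 each) exposes the overdraft:
balance = 100
for delta in (-100, -100):     # both servers' deltas, applied in any order
    balance = apply_delta(balance, delta)
assert balance == -100         # overdrawn -- and the bank can now act
```

Option 1 (replicate only transactions and compute balances locally) has the same commutative property: the transaction rows arrive in any order and the local sum still converges.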
-- 
select 'cbbrowne' || '@' || 'ntlug.org';
http://cbbrowne.com/info/internet.html
Rules of the Evil Overlord  #211. If my chief engineer displeases me,
he will be shot, not imprisoned  in the dungeon or beyond the traps he
helped design. http://www.eviloverlord.com/



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Chris Travers

William Yu wrote:

Another tidbit I'd like to add. What has helped a lot in implementing 
high-latency master-master replication writing our software with a 
business process model in mind where data is not posted directly to 
the final tables. Instead, users are generally allowed to enter 
anything -- could be incorrect, incomplete or the user does not have 
rights -- the data is still dumped into pending tables for people 
with rights to fix/review/approve later. Only after that process is 
the data posted to the final tables. (Good data entered on the first 
try still gets pended -- validation phase simply assumes the user who 
entered the data is also the one who fixed/reviewed/approved.)



In this case, why have multimaster replication at all?

Why not have the people who have rights to review this all write to the 
master database and have that replicated back?  It seems like latency is 
not really an issue.  Replication here is only going to complicate 
things.  If it were me, I would be having my approval app pull data from 
*all* of the databases independently and not rely on the replication for 
this part.  The replication could then be used to replicate *approved* 
data back to the slaves.




There may be a delay for the user if he/she is working on a server 
that doesn't have rights to post his data. However, the pending-post 
model gets users used to the idea of (1) entering all data in large 
swoop and validating/posting it afterwards and (2) data can/will sit 
in pending for a period of time until it is acted upon with 
somebody/some server with the proper authority. Hence users aren't 
expecting results to pop up on the screen the moment they press the 
submit button.



I still don't understand the purpose of replicating the pending data...

Best Wishes,
Chris Travers
Metatron Technology Consulting



Re: [GENERAL] Postgresql replication

2005-08-25 Thread Jeff Davis
Matt Miller wrote:
http://pgfoundry.org/projects/pgcluster/ which provides synchronous
multi-master clustering.

He specifically said that pgcluster did not work for him
because ...PGCluster requires that there be a load balancer and a
replicator centrally located managing the cluster. If a network
problem happens at the centralized location, it would bring down all
locations completely.
 
 
 I think the load balancer is an optional component.  Clients can connect
 either to a load balancer, or directly to a specific machine.

The replicator surely is not optional, and must be centralized.

 Also, I think pgcluster can operate async.  If the system runs in
 normal mode then the client gets a response as soon as the transaction
 is committed on the local machine.  In reliable mode the client waits
 for the commit to happen on all machines.
 

Interesting. I suppose whatever works for your application is the right
answer for replication. PGCluster is query-based, right? I suppose the
question would then be, would he rather use a trigger-based replication
 solution, like Slony, or a query-based replication solution.

Regards,
Jeff Davis



[GENERAL] Postgresql replication

2005-08-24 Thread Carlos Henrique Reimer
Hello,

Currently we have only one database, accessed by the headquarters and two branches, but the performance in the branches is very poor and I was asked to find a way to improve it.

One possible solution is to replicate the headquarters DB into the two branches.

I read about slony-i, but then the replicated DBs will be read-only.

Pgcluster is a sync solution and I think it is not fitted for us because the replicated DBs will be located remotely and we have a lot of updates on the DBs.

I think I'm looking for a master-slave asynchronous solution. I know pgReplicator can do it, but I think the project is not active any more.

Are there another solutions?

Thanks in advance!

Reimer

Re: [GENERAL] Postgresql replication

2005-08-24 Thread Michael Meskes
On Wednesday, 24 August 2005 at 14:21, Carlos Henrique Reimer wrote:
 One possible solution is to replicate the headquarters DB into the two
 branches.

 I read about slony-i, but then the replicated DBs will be read-only.

That's because it's a master-slave replication. If you could sync the slave 
back to the master it would be a master itself.

 I think I'm looking for a master-slave asynchronous solution. I know
 pgReplicator can do it, but I think the project is not active any more.

But Slony does master/slave replication. 

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



Re: [GENERAL] Postgresql replication

2005-08-24 Thread Welty, Richard
[mailto:[EMAIL PROTECTED] Behalf Of Michael Meskes

On Wednesday, 24 August 2005 at 14:21, Carlos Henrique Reimer wrote:
 One possible solution is to replicate the headquarters DB into the two
 branches.

 I read about slony-i, but then the replicated DBs will be read-only.

That's because it's a master-slave replication. If you could sync the slave 
back to the master it would be a master itself.

 I think I'm looking for a master-slave asynchronous solution. I know
 pgReplicator can do it, but I think the project is not active any more.

But Slony does master/slave replication. 

I think Carlos is confused about master-slave vs multi-master.

Carlos,
master-slave async is much easier than multi-master async. Yes, pgreplicator
can in theory do multi-master async in a restricted sort of way (the
generalized multi-master async problem is fairly intractable), but you're
right in that pgreplicator is not maintained anymore (and moreover it depends
on a no-longer-maintained dialect of Tcl). If I had infinite time and energy,
I'd be working on a reimplementation of pgreplicator in C, but I don't have
either, and I don't see anyone offering to pay me to do it, so it'll stay on
the list of wanna-do projects for the time being.

richard



Re: [GENERAL] Postgresql replication

2005-08-24 Thread Chris Browne
[EMAIL PROTECTED] (Carlos Henrique Reimer) writes:
 Currently we have only one database accessed by the headquarter and
 two branches but the performance in the  branches is very poor  and
 I was invited to discover a way to increase it.

 One possible solution is replicate the headquarter DB into the two
 branches.

 I read about slony-i, but then the replicated DBs will be read-only.

Correct.

 Pgcluster is a sync solution and I think it is not suited for us because
 the replicated DBs will be located remotely and we have a lot of
 updates on the DBs.

Unfortunately, pgcluster isn't much maintained anymore.

 I think I'm looking for a master-slave asynchronous solution. I
 know pgReplicator can do it, but I think the project is not active
 any more.

Slony-I is a master/slave asynchronous replication system; if you
already considered it unsuitable, then I see little likelihood of
other systems with the same sorts of properties being suitable.

What could conceivably be of use to you would be a *multimaster*
asynchronous replication system.  Unfortunately, multimaster
*anything* is a really tough nut to crack.

There is a Slony-II project ongoing that is trying to construct a
more-or-less synchronous multimaster replication system (where part of
the cleverness involves trying to get as much taking place in an
asynchronous fashion as possible) that would almost certainly be of no
use to your use case.

The most successful multimaster asynchronous replication system that
I am aware of is the PalmComputing PalmSync system.  

It would presumably be possible to use some of the components of
Slony-I to construct a multimaster async replication system.  A
pre-requisite would be the creation of some form of distributed
sequence which would try to minimize the conflicts that arise out of
auto-generation of sequence numbers.
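
The idea usually reduces to interleaved sequences, which can be sketched
in plain SQL (the sequence and table names here are illustrative, not
from Slony-I or this thread): give every node the same INCREMENT, equal
to the maximum cluster size, but a distinct START, so the nodes draw
from disjoint ID spaces and never collide.

```sql
-- Hypothetical sketch for node 2 of an (at most) 10-node cluster:
-- node N uses START WITH N, so node 2 generates 2, 12, 22, 32, ...
-- and can never mint an ID that collides with another node's.
CREATE SEQUENCE order_id_seq START WITH 2 INCREMENT BY 10;

CREATE TABLE orders (
    id      bigint PRIMARY KEY DEFAULT nextval('order_id_seq'),
    payload text
);
```

The price is that the cluster size must be bounded up front, and IDs no
longer reflect insertion order across nodes.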

But beyond that lies the larger challenge of conflict resolution.

Slony-I, as a single-master system, does not need to address
conflicts, as changes must be made on the master and propagate
elsewhere.

Synchronous multimaster systems address conflicts by detecting them
when they occur and rejecting one or another of the conflicting
transactions.

Asynchronous multimaster systems require some sort of conflict
management/resolution system for situations where tuples are being
concurrently updated on multiple nodes.  How that is managed is, well,
troublesome :-(.  The PalmSync approach is that if it finds conflicts,
it duplicates records and leaves you, the user, to clean things up.
That may not be suitable for every kind of application...
-- 
(reverse (concatenate 'string gro.gultn @ enworbbc))
http://cbbrowne.com/info/slony.html
((LAMBDA (X) (X X)) (LAMBDA (X) (X X)))



Re: [GENERAL] Postgresql replication

2005-08-24 Thread Carlos Henrique Reimer
I read some documents about replication and realized that if you plan on using asynchronous replication, your application should be designed from the outset with that in mind, because asynchronous replication is not something that can be easily “added on” after the fact.

Am I right?

Reimer


Re: [GENERAL] Postgresql replication

2005-08-24 Thread Welty, Richard



Carlos Henrique Reimer wrote:
 I read some documents about replication and realized that if you plan
 on using asynchronous replication, your application should be designed
 from the outset with that in mind because asynchronous replication is
 not something that can be easily "added on" after the fact.

 Am I right?

certainly, if your goal is a pgreplicator style multi-master async, this
is correct, as you have to make decisions about the direction of
data flow, id generation, and conflict resolution up front.

if you want slony-I style single master/multi slave, you don't have to
do so much advance thinking as records are only being inserted into
the system on the single master.

richard



Re: [GENERAL] Postgresql replication

2005-08-24 Thread Jeff Davis
Chris Browne wrote:
 
 Slony-I is a master/slave asynchronous replication system; if you
 already considered it unsuitable, then I see little likelihood of
 other systems with the same sorts of properties being suitable.
 
 What could conceivably be of use to you would be a *multimaster*
 asynchronous replication system.  Unfortunately, multimaster
 *anything* is a really tough nut to crack.
 

In general that's a difficult problem, but in practice there may be a
solution.

For instance, perhaps the following configuration would be helpful:

Make a database for each physical server, called db1 ... dbN. Let your
logical tables in each database be table1 ... tableM. Now, for each
logical tableX (where 1 <= X <= M), make N physical tables, tableX_1 ...
tableX_N. Now, make a view called tableX that is the UNION of tableX_1
... tableX_N (tableX is not a real table, it's just a logical table).

Now, use Slony-I. For each dbY (where 1 <= Y <= N), make dbY a master
for tableX_Y (for all X where 1 <= X <= M) and a slave for tableX_Z (for
all X,Z where 1 <= X <= M, Z != Y).

Now, use a rule that replaces all INSERTs to tableX (where 1 <= X <= M)
on dbY (where 1 <= Y <= N) with INSERTs to tableX_Y.

That was my attempt at being unambiguous. In general what I mean is that
each database is master of one piece of a table, and slave to all the
other pieces of that table, and then you have a view which is the union
of those pieces. That view is the logical table. Then have a RULE which
makes INSERTs go to the physical table for which that database is master.
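
A minimal two-node sketch of that union-view-plus-rule layout, as it
might look on db1 (all names are illustrative; this is the scheme
described above, not code from the thread):

```sql
-- On db1: the locally-mastered piece and the piece replicated from db2.
CREATE TABLE accounts_1 (id int PRIMARY KEY, balance numeric);  -- db1 is master
CREATE TABLE accounts_2 (id int PRIMARY KEY, balance numeric);  -- slave of db2

-- The logical table is just the union of the pieces (UNION ALL is safe
-- here because each piece's rows originate on a different master).
CREATE VIEW accounts AS
    SELECT * FROM accounts_1
    UNION ALL
    SELECT * FROM accounts_2;

-- Redirect INSERTs against the logical table into db1's own piece.
CREATE RULE accounts_insert AS ON INSERT TO accounts
    DO INSTEAD INSERT INTO accounts_1 VALUES (NEW.id, NEW.balance);
```

On db2 the rule would instead target accounts_2, and Slony-I would ship
accounts_1 from db1 to db2 and accounts_2 the other way.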

The advantages: if one machine goes down, the rest keep going, and
merely miss the updates from that one site to that table. If one machine
makes an insert to the table, it quickly propagates to the other
machines and transparently becomes a part of the logical tables on those
machines.

The disadvantages: UPDATEs are difficult, and might end up with a
complicated set of rules/procedures/triggers. You may have to program
the application defensively in case the database is unable to update a
remote database for various reasons (if the record to be updated is a
part of a table for which another database is master). Also, since the
solution is asynchronous, the databases may provide different results to
the same query.

In general, this solution does not account for all kinds of data
constraints. The conflict resolution is very simplified because it's
basically just the union of data. If that union could cause a constraint
violation itself, this solution might not be right for you. For
instance, let's say you're tracking video rentals, and store policy says
that you only rent one video per person. However, maybe they go to store
1 and rent a video, and run to store 2 and rent a video before store 1
sends the INSERT record over to store 2. Now, when they finally do
attempt to UNION the data for the view, you have an inconsistent state.

Many applications can get by just fine by UNIONing the data like that,
and if not, perhaps work around it.

I hope this is helpful. Let me know if there's some reason my plan won't
work.

Regards,
Jeff Davis






Re: [GENERAL] Postgresql replication

2005-08-24 Thread Welty, Richard
Jeff Davis writes:
   I hope this is helpful. Let me know if there's some reason my plan won't
   work.

  look at the solution in pgreplicator. site ids are embedded in the
  id columns in the tables, so there are only m tables, and a bit less insanity.

 That doesn't work with Slony-I unfortunately. I don't know much about
 pgreplicator, but if it does something similar to what I'm talking
 about, maybe it's a good thing to look into.

it'd be an excellent thing to look into if it were in any way supported or
maintained. it's a dead project (unfortunately.)

richard



Re: [GENERAL] Postgresql replication

2005-08-24 Thread Jeff Davis
Welty, Richard wrote:
 Jeff Davis writes:
   The disadvantages:

 one more: if you actually have m tables and n servers, you have
 m x n tables in reality, which is pretty miserable scaling behavior.
 i should think that rules, triggers, and embedded procedures would
 explode in complexity rather rapidly.

 i know i wouldn't want to administer one of these if there were a lot
 of sites.

True, but in practice n will usually be fairly reasonable. In
particular, his setup sounded like it would be only a few.

Also, you're really talking about scalability of administration. I don't
think performance will be significantly impacted.

   I hope this is helpful. Let me know if there's some reason my plan won't
   work.

 look at the solution in pgreplicator. site ids are embedded in the
 id columns in the tables, so there are only m tables, and a bit less insanity.

That doesn't work with Slony-I unfortunately. I don't know much about
pgreplicator, but if it does something similar to what I'm talking
about, maybe it's a good thing to look into.

Regards,
Jeff Davis


