[GENERAL] SSDs - SandForce or not?

2012-11-13 Thread Toby Corkindale

Hi,
I'm wondering which type of SSDs would be better for use with PostgreSQL.

Background:
At the moment, SSD drives fall into two categories: those that use
internal compression on the SandForce controller, which gives very fast
speeds for compressible data; and those that don't.


In benchmarks, the compressing style of drive does extremely well at 
random writes as long as semi-compressible data is involved. They 
still do well if incompressible data is used, just usually not quite as 
well as the competitors.


When it comes to reading data, there's no real difference.

So I just wondered how this might apply to PostgreSQL's workload?

I think the on-disk data is going to consist of a lot of random reads 
and writes, with what I suspect is data that *does* compress quite well. 
(At least on my data sets, that is. If I use gzip or lzma on the 
postgres data directly, it gets MUCH smaller)
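
For reference, this is roughly how I checked the compressibility, as a
read-only test against the data directory (the path is just an example
from my setup):

du -sb /var/lib/postgresql/9.1/main
tar cf - /var/lib/postgresql/9.1/main 2>/dev/null | gzip | wc -c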


So on the face of it, I think the SandForce-based drives are probably a 
winner here, so I should look at the Intel 520s for evaluation, and 
whatever the enterprise equivalents are for production.


I wondered if anyone wiser than me has thought about this already, 
though. Are there any downsides to that combination?


cheers,
Toby




Re: [GENERAL] general fear question about move PGDATA from one Disc to another

2012-11-13 Thread Craig Ringer
On 11/13/2012 11:26 PM, Aleksandar Lazic wrote:
> Dear listmembers,
>
> I need to move
>
> /var/lib/postgresql/8.4/main
>
> from the / partition to another disk.

Since you're on Ubuntu, you're probably using `pg_wrapper` for cluster
management. Confirm that with `pg_lsclusters`. If the command exists and
it shows an 8.4 installation with the data directory you mentioned
above, then you're using pg_wrapper to manage Pg.
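
On a stock Ubuntu install the output would look something like this
(illustrative; your port and log path may differ):

$ pg_lsclusters
Ver Cluster Port Status Owner    Data directory               Log file
8.4 main    5432 online postgres /var/lib/postgresql/8.4/main /var/log/postgresql/postgresql-8.4-main.log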

pg_wrapper reads /etc/postgresql/[version]/[clustername]/postgresql.conf
to locate the cluster. For example, yours will be
/etc/postgresql/8.4/main/postgresql.conf . This means you don't have to
edit any init script settings on your system to get Pg to start properly
next boot.

> 0.) Mount new pgroot, are there any hints for the mount command?
>
> mount -t ext4 -o noatime,nodiratime,nodev,noexec,nosuid /dev/sde1
> /pgroot
>
> output from mount
>
> /dev/sde1 on /pgroot type ext4
> (rw,nosuid,nodev,noexec,noatime,nodiratime,barrier=1,data=ordered)

Unless you add this to fstab as well, the file system won't mount at the
next boot and PostgreSQL will fail to start.

> 3.) copy the current /var/lib/postgresql/8.4/main to the new dir as
> user postgres
> cd /var/lib/postgresql/
> tar cvf - . | (cd /pgroot/pgdata && tar xvf -)

What an odd way to do the copy. I'd use `cp -aR`, or at least use the
preserve flag (-p) to tar.


If you like you can have the new file system (assuming it's dedicated to
just PostgreSQL) mount where the old data directory was, so there's no
change visible in the system.

edit /etc/fstab and add a line like:

UUID=b4d54649-a9b5-4a57-aa22-291791ad7a3c /var/lib/postgresql/ ext4
defaults,noatime  0 0

Replace the UUID shown with the UUID of your new file system, determined
with the vol_id command on older systems, or blkid on newer ones. Or
just use the device node for the partition, like /dev/sdx9
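
For example, to get the UUID of the partition from your mount output:

blkid /dev/sde1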

Make a full pg_dumpall backup.

Now stop all your PostgreSQL clusters with pg_ctlcluster and:

pg_ctlcluster 8.4 main stop
mv /var/lib/postgresql /var/lib/postgresql.old
mkdir /var/lib/postgresql
mount /var/lib/postgresql
chown postgres:postgres /var/lib/postgresql
shopt -s dotglob
cp -aR /var/lib/postgresql.old/* /var/lib/postgresql/
pg_ctlcluster 8.4 main start

You've just migrated the files from the old file system to the new one
without having to change the logical location, by mounting the new file
system where the system expected it to be already.

Again, you can remove /var/lib/postgresql.old once you're sure
everything went fine.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





[GENERAL] window functions in an UPDATE

2012-11-13 Thread Paul Jungwirth
Is there a trick to using window functions to SET columns in an UPDATE?
Here is the query I'd like to run:

  UPDATE  profiles
  SET score_tier = percent_rank()
OVER (PARTITION BY site_id ORDER BY score ASC)
  WHERE   score IS NOT NULL

But that gives me an error on Postgres 9.1:

ERROR:  cannot use window function in UPDATE

This alternate version works, but is messier and slower:

  UPDATE  profiles p
  SET score_tier = x.perc
  FROM    (SELECT id,
                  percent_rank() OVER (PARTITION BY site_id
                                       ORDER BY score ASC) AS perc
           FROM   profiles p2
           WHERE  score IS NOT NULL) AS x
  WHERE   p.id = x.id
  AND     p.score IS NOT NULL

That second version is also prone to deadlocks if another job is updating
the profiles table at the same time, even with a query like this:

UPDATE "profiles" SET "updated_at" = '2012-11-13 21:53:23.840976' WHERE
"profiles"."id" = 219474

Is there any way to reformulate this query so it is cleaner, faster, and
not prone to deadlocks?

Thanks,
Paul

-- 
_
Pulchritudo splendor veritatis.


Re: [GENERAL] Running out of memory while making a join

2012-11-13 Thread Tom Lane
Carlos Henrique Reimer  writes:
> That is what I got from gdb:

>   ExecutorState: 11586756656 total in 1391 blocks; 4938408 free (6
> chunks); 11581818248 used

So, query-lifespan memory leak.  After poking at this for a bit, I think
the problem has nothing to do with joins; more likely it's because you
are returning a composite column:

select wm_nfsp from "5611_isarq".wm_nfsp ...

I found out that record_out() leaks sizable amounts of memory, which
won't be recovered till end of query.  You could work around that by
returning "select wm_nfsp.*" instead, but if you really want the result
in composite-column notation, I'd suggest applying this patch:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c027d84c81d5e07e58cd25ea38805d6f1ae4dfcd
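
To spell out the two forms (the trailing dots stand for the rest of the
original query):

select wm_nfsp from "5611_isarq".wm_nfsp ...    -- one composite column per row, formatted by record_out()
select wm_nfsp.* from "5611_isarq".wm_nfsp ...  -- expands to the individual columns, avoiding record_out()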

regards, tom lane




[GENERAL] Re: Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

2012-11-13 Thread Jeff Janes
On Mon, Nov 12, 2012 at 12:04 PM, Lists  wrote:
>
>
> Should I increase the max_workers field from the default of 3 to (perhaps)
> 10?

I would not.  You report that the reason you turned off autovac is
because it made your database crawl when it kicked in.  That suggests
that if anything you should reduce that parameter (assuming you are
still doing manual vacuums at off-peak hours, leaving autovacuum to
only mop up what is left).

>
> Why would I want to reduce the cost delay to 0, and how does this relate to
> cost_limit? Careful reading of the docs:
> http://www.postgresql.org/docs/9.1/static/runtime-config-resource.html makes
> me believe that, given my substantial I/O subsystem, I'd want to drop
> cost_delay to near zero and set the cost_limit really high, which is a rough
> restatement of the last quoted paragraph above. (I think)

Given that autovac kicking in destroys your performance, I think that
your I/O subsystem may not be all that you think it is.  Do you have a
test/dev/QA system with the same subsystem that you can use for
investigation?  If so, do you have a vaguely realistic load generator
to drive those systems?

> Assuming that I make these suggestions and notice a subsequent system load
> problem, what information should I be gathering in order to provide better
> post-incident forensics?

If you are going to be focusing your undivided attention on monitoring
the system during the period, just keeping a window open with "top"
running is invaluable. (On most implementations, if you hit 'c' it
will toggle the command display so you can see the results of
"update_process_title=on")

Also, "sar" is useful, and on most systems has the advantage that its
stats are always being gathered without you having to do anything, so
it works well for unexpected problems arising.  I often just have
"vmstat 1 -t" running in the background streaming into a log file, for
the same reason.

Within pgsql itself, set log_min_duration_statement to a value which
few statements will exceed under normal operation, but many will when
things bog down.  That way you can figure out exactly when things
bogged down after unattended operation, and correlate it with the
sar/vmstat/etc. reports.
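
For example (the threshold is a guess; tune it to your workload):

log_min_duration_statement = 2000   # milliseconds; log anything slower than 2s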

I'd also set for the probationary period (if you haven't already):

log_lock_waits = on
log_checkpoints = on
log_autovacuum_min_duration = 0 (or some smallish positive value)

The last one only logs when it finishes vacuuming a table.  I wish
there was a way to make it log when it started as well, but I don't
think there is.

Cheers,

Jeff




[GENERAL] Using window functions to get the unpaginated count for paginated queries

2012-11-13 Thread Clemens Park
Hi all,

Recently, during a performance improvement sweep for an application at my
company, one of the hotspots that was discovered was pagination.

In order to display the correct pagination links on the page, the
pagination library we used (like most pagination libraries, for that
matter) ran the query with OFFSET and LIMIT to get the paginated
results, and then re-ran the query without the OFFSET and LIMIT,
wrapped in a SELECT COUNT(*), to get the total number of rows.

In an attempt to optimize this, we used a window function. Given a query
that looked like this:

SELECT a,b,c
FROM table
WHERE clauses
OFFSET x LIMIT y;

we added a total_entries_count column as follows:

SELECT COUNT(table.id) OVER () AS total_entries_count, a,b,c
FROM table
WHERE clauses
OFFSET x LIMIT y;

This calculates the total number of unpaginated rows correctly, without
affecting the runtime of the query.  At least as far as I can tell.

The questions I have are:

1) Are there any adverse effects that the above window function can have?
2) Are there any cases where the count would return incorrectly?
3) In general, is this an appropriate use-case for using window functions?

Thanks,
Clemens


Re: [GENERAL] Understanding streaming replication

2012-11-13 Thread Philippe Amelant


On 13/11/2012 14:57, Albe Laurenz wrote:

> Philippe Amelant wrote:
>> So i was thinking it was just a reconnect to the sender (and I can see
>> the standby trying to reconnect in the log)
>
> Hmmm.  I think I was too quick when I said no.
>
> If you ship the WAL archives including the "history" file to the
> standby, then the standby should be able to recover across the
> timeline change from the archives (if you have recovery_target_timeline
> set to "latest" in recovery.conf) and then reestablish streaming
> replication.
>
> I never tried that though.
>
> (The patch I quoted above would allow the timeline change via
> streaming replication.)
>
> Yours,
> Laurenz Albe

You're right. I added

recovery_target_timeline='latest'

in the recovery.conf, then I promoted the standby.

The replication on the second standby stopped with a message
complaining about the timeline.

Then I copied the archived WAL from the new master to the (stopped)
standby (into pg_xlog).

The standby restarted on the new timeline and the data seems to be OK.

I also tried just copying the last 00X.history into pg_xlog, and that
worked too.

I suppose this could fail if wal_keep_segments is too low.

Thank you very much for your help.
Could you just point me to where you found this information in the docs?

Regards




[GENERAL] general fear question about move PGDATA from one Disc to another

2012-11-13 Thread Aleksandar Lazic

Dear list members,

I need to move

/var/lib/postgresql/8.4/main

from the / partition to another disk.

Based on the

http://wiki.postgresql.org/wiki/Simple_Configuration_Recommendation#File_System_Layouts

I plan the following.

0.) Mount new pgroot, are there any hints for the mount command?

mount -t ext4 -o noatime,nodiratime,nodev,noexec,nosuid /dev/sde1 
/pgroot


output from mount

/dev/sde1 on /pgroot type ext4 
(rw,nosuid,nodev,noexec,noatime,nodiratime,barrier=1,data=ordered)



1.) create a full backup of the current DB

/usr/bin/pg_dumpall --oids --verbose --username=postgres \
  > /tmp/backup_before_part_move.sql


2.) Stop the database and the Apps which use the DB
/etc/init.d/postgresql stop


3.) copy the current /var/lib/postgresql/8.4/main to the new dir as 
user postgres

cd /var/lib/postgresql/
tar cvf - . | (cd /pgroot/pgdata && tar xvf -)

4.) sync filesystems
sync

5.) change data_directory to /pgroot/pgdata/8.4/main
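
(I assume this means editing /etc/postgresql/8.4/main/postgresql.conf
so that it reads:

data_directory = '/pgroot/pgdata/8.4/main'
)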

6.) Start postgres
/etc/init.d/postgresql start

OS: ubuntu 11.04
PG: postgresql-8.4 8.4.14-0ubuntu11.04

Please can anybody take a look at my planned process, tell me if I have
forgotten something, and maybe point me to the right doc?

Thanks for any feedback.

Best regards
Aleks




Re: [GENERAL] PG under OpenVZ?

2012-11-13 Thread Benjamin Henrion
On Tue, Nov 13, 2012 at 2:53 PM, François Beausoleil
 wrote:
> Hi!
>
> I've found an old thread on OpenVZ:
>
> (2008): http://archives.postgresql.org/pgsql-performance/2008-03/msg00076.php
>
> And a more recent question that scared me a bit:
>
> (2011): http://serverfault.com/questions/281783/running-mongodb-with-openvz
>
> On the PostgreSQL general mailing list, I've only found 54 results when 
> searching for OpenVZ. I'm wondering if OpenVZ is simply unpopular, or not 
> used at all for PG. What experiences do you have with OpenVZ? Any performance 
> problems?
>
> We're buying bare metal to run our clusters on, and the supplier is late 
> delivering the machines. They suggested lending us a machine and run 
> PostgreSQL under OpenVZ. When the real hardware is ready, we'd migrate the VZ 
> over to the new physical servers. Thoughts on this?
>
> I have no experience with OpenVZ itself, so if you have general comments 
> about its stability and/or performance, even unrelated to PostgreSQL, I'd 
> appreciate it.

I run all my PG production/test servers on OpenVZ; you just have to
be careful to have swap activated, and SHMPAGES tuned, as well as
SHMMAX.

Otherwise it runs like a charm.
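
A quick sanity check of those limits from inside a container (a sketch;
/proc/user_beancounters is only there on OpenVZ kernels):

sysctl kernel.shmmax        # SysV shared memory ceiling
cat /proc/user_beancounters # OpenVZ limits incl. shmpages, plus failcnt columns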

--
Benjamin Henrion 
FFII Brussels - +32-484-566109 - +32-2-3500762
"In July 2005, after several failed attempts to legalise software
patents in Europe, the patent establishment changed its strategy.
Instead of explicitly seeking to sanction the patentability of
software, they are now seeking to create a central European patent
court, which would establish and enforce patentability rules in their
favor, without any possibility of correction by competing courts or
democratically elected legislators."




[GENERAL] Detect the side effect of 'using' clause and adding coulms

2012-11-13 Thread salah jubeh
Hello Guys;

Today I noticed that my dumps are not working, due to a bad practice in
writing SQL queries.

In the past there was a scenario with two tables, one of them
completely dependent on the other, i.e. the foreign key and the primary
key constraints are assigned to the same column. Please have a look at
this fictional example:


CREATE TABLE a
(
  a_id serial NOT NULL,
  a_name text,
  CONSTRAINT a_pkey PRIMARY KEY (a_id)
)
WITH (
  OIDS=FALSE
);

INSERT INTO a VALUES (1, 'Big design up front');
INSERT INTO a VALUES (2, 'iterative and incremental');
INSERT INTO a VALUES (3, 'OR mappers are slow');

DROP TABLE IF EXISTS b CASCADE;
CREATE TABLE b
(
  b_id serial NOT NULL,
  b_name text,
  CONSTRAINT b_pkey PRIMARY KEY (b_id),
  CONSTRAINT b_b_id_fkey FOREIGN KEY (b_id)
  REFERENCES a (a_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY IMMEDIATE
)
WITH (
  OIDS=FALSE
);

INSERT INTO B VALUES (1, 'waterfall');
INSERT INTO B VALUES (2, 'XP'); 


Now I also have a table which maps A and B in a many-to-many relation, such as:


CREATE TABLE c
(
  c_id serial NOT NULL,
  a_id integer NOT NULL,
  b_id integer NOT NULL,
  CONSTRAINT c_pkey PRIMARY KEY (c_id),
  CONSTRAINT c_a_id_fkey FOREIGN KEY (a_id)
  REFERENCES a (a_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT c_b_id_fkey FOREIGN KEY (b_id)
  REFERENCES b (b_id) MATCH SIMPLE
  ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
  OIDS=FALSE
);

INSERT INTO c VALUES (1,1,1);
INSERT INTO c VALUES (2,2,2);
INSERT INTO c VALUES (3,2,1); -- iterative and incremental waterfall model

The problem I had is that many queries are written using the USING
clause, such as:

CREATE VIEW c_a_b AS
    SELECT * FROM
    C JOIN B USING (b_id)
    JOIN A USING (a_id);

Up till now, no problems. But if I change the relationship between A and
B by adding another column called a_id to table B, which references
a (a_id) (please see the code below), I get problems on restore, because
the view joins using a field which is ambiguous, i.e. it exists in two
tables.


ALTER TABLE B ADD COLUMN a_id INTEGER;
ALTER TABLE B DROP CONSTRAINT b_b_id_fkey;
ALTER TABLE B ADD CONSTRAINT b_a_id_fkey FOREIGN KEY (a_id) REFERENCES a (a_id)
MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE;


This is the error:

pg_restore: [archiver (db)] Error from TOC entry VIEW c_a_b 
postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  common column 
name "a_id" appears more than once in left table


I have a question:

1. How can we detect these errors, and how are views stored and
manipulated in the database server? If I run SELECT * FROM c_a_b,
everything goes fine; I discovered this only by doing a dump and
restore.


Regards

Re: [GENERAL] Enabling Autovacuum Postgres 9.1 (was Unexpectedly high disk space usage)

2012-11-13 Thread Kevin Grittner
Greg Williamson wrote:

>>> running transactions can cause autovacuum processes to stall
>>> out or be autocancelled. "Long running transactions" - is how
>>> long? In our system it's rare to have a transaction (even a
>>> prepared transaction) last much longer than a few minutes. Is
>>> that enough time to cause problems with AutoVacuum?
>>
>> The only situation where I would expect that to be a problem is in
>> a very small table which is updated hundreds of times per second.
> 
> Could you elaborate on this, or point me to a previous thread ?

I had a situation where there was a need for consecutive numbers
(i.e., no gaps, to satisfy financial auditors), so we needed
assignment of these numbers to be transactional rather than using
SEQUENCE or SERIAL objects. There was a very small table for
assigning these numbers, which was very frequently updated.  In a
quiescent state all rows in the table would fit in one page. Before
tuning autovacuum to be more aggressive, the table bloated, causing
performance to fall off. Then autovacuum would kick in and it would
get even worse. So naturally, my first response was to make
autovacuum less aggressive, which caused the table to bloat even more
under normal load, and caused autovacuum to have an even bigger
impact when it did kick in. The table bloated to thousands of pages.

Then I tried the opposite approach: I cranked up autovacuum to be
very aggressive. Under normal load the table settled in at five to
ten pages and performance was great. However, any long-running
transaction could cause some bloat, so a big report could still cause
this one table to become a performance problem. I found that a
CLUSTER ran sub-second, because autovacuum did pare the index down to
just the needed entries, so I set up a crontab job to CLUSTER this
one table a few times per day. That worked out great for me.
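
A sketch of that crontab job, with a made-up table name and schedule:

# crontab for the postgres user: re-CLUSTER the hot table a few times a day
0 6,12,18 * * * psql -d mydb -c 'CLUSTER receipt_number USING receipt_number_pkey;'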

I think the trick is to try to make autovacuum keep up as much as
possible, identify any issues it is not handling, and narrowly target
those particular areas with extraordinary maintenance.

-Kevin




Re: [GENERAL] Out of Shared Memory: max_locks_per_transaction

2012-11-13 Thread Merlin Moncure
On Fri, Nov 9, 2012 at 2:50 PM, Eliot Gable
 wrote:
>>> one thing that can cause this unfortunately is advisory locks eating
>>> up exactly the amount of shared memory you have.  that's another thing
>>> to rule out.
>>
>> How would I rule this out?
>
> It really was filling the locks table.
>
> Using your suggestion, I managed to catch it in the process of the bad
> behavior, before it exhausted all lock entries. After some sleuthing through
> the resulting pg_locks output and my other code, I was able to isolate and
> resolve the issue. Basically, there was a call going on which tried to
> materialize a stats table based on thousands of records instead of 10 at a
> time. It was supposed to just be materializing the base rows in that table,
> all zeroed out, not based on any of the records. However, it does so using
> the same function which actually crunches numbers for the records, and it
> was coded to try all records from start of day until the function was run!

awesome...glad I could help.  in the case of advisory locks, to help
reduce the likelihood of things like this happening, it's always
better to use the recently added 'xact' flavor of the functions that
release the lock at 'end of transaction' when possible.
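
for example, a minimal sketch (the key 42 is arbitrary):

begin;
select pg_advisory_xact_lock(42);  -- blocks until the lock on key 42 is granted
-- ... crunch the numbers ...
commit;  -- lock is released automatically at end of transaction, commit or rollback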

merlin




Re: [GENERAL] Understanding streaming replication

2012-11-13 Thread Albe Laurenz
Philippe Amelant wrote:
>>> I'm setting up a 3 node cluster, and after some tests
>>> I just discovered that the cascading slave does not recover.

>> Right, switching timeline over streaming replication
>> is not supported yet.  There's a patch by Heikki in
>> the pipeline for this, so it will probably work in 9.3.
> 
> So if I understand it, I need to rebuild the cascading slave if I
> promote the first standby.
> Is there a way to follow the new master without a rebuild?

>>> As far as I can see in the 9.2 documentation it should work after
>>> an automatic reconnect to the new master.
>> Where did you see that?
> 
> I found this
> 
> http://www.postgresql.org/docs/9.2/static/warm-standby.html
> 25.2.6. Cascading Replication
> Promoting a cascading standby terminates the immediate downstream
> replication connections which it serves. This is because the
> timeline becomes different between standbys, and they can no longer
> continue replication. The affected standby(s) may reconnect to
> reestablish streaming replication.
> 
> 
> So i was thinking it was just a reconnect to the sender (and I can see
> the standby trying to reconnect in the log)

Hmmm.  I think I was too quick when I said no.

If you ship the WAL archives including the "history" file to the
standby, then the standby should be able to recover across the
timeline change from the archives (if you have recovery_target_timeline
set to "latest" in recovery.conf) and then reestablish streaming
replication.

I never tried that though.

(The patch I quoted above would allow the timeline change via
streaming replication.)

Yours,
Laurenz Albe



[GENERAL] PG under OpenVZ?

2012-11-13 Thread François Beausoleil
Hi!

I've found an old thread on OpenVZ:

(2008): http://archives.postgresql.org/pgsql-performance/2008-03/msg00076.php

And a more recent question that scared me a bit:

(2011): http://serverfault.com/questions/281783/running-mongodb-with-openvz

On the PostgreSQL general mailing list, I've only found 54 results when 
searching for OpenVZ. I'm wondering if OpenVZ is simply unpopular, or not used 
at all for PG. What experiences do you have with OpenVZ? Any performance 
problems?

We're buying bare metal to run our clusters on, and the supplier is late 
delivering the machines. They suggested lending us a machine and run PostgreSQL 
under OpenVZ. When the real hardware is ready, we'd migrate the VZ over to the 
new physical servers. Thoughts on this?

I have no experience with OpenVZ itself, so if you have general comments about 
its stability and/or performance, even unrelated to PostgreSQL, I'd appreciate it.

Thanks!
François



Re: [GENERAL] Understanding streaming replication

2012-11-13 Thread Philippe Amelant

Hello,
Thanks for all this information.

On 13/11/2012 09:31, Albe Laurenz wrote:

> Philippe Amelant wrote:
>> I'm setting up a 3 node cluster, and after some tests
>> I just discovered that the cascading slave does not recover.
>
> Right, switching timeline over streaming replication
> is not supported yet.  There's a patch by Heikki in
> the pipeline for this, so it will probably work in 9.3.

So if I understand it, I need to rebuild the cascading slave if I
promote the first standby.

Is there a way to follow the new master without a rebuild?

>> As far as I can see in the 9.2 documentation it should work after
>> an automatic reconnect to the new master.
>
> Where did you see that?

I found this:

   http://www.postgresql.org/docs/9.2/static/warm-standby.html
   25.2.6. Cascading Replication
   Promoting a cascading standby terminates the immediate downstream
   replication connections which it serves. This is because the
   timeline becomes different between standbys, and they can no longer
   continue replication. The affected standby(s) may reconnect to
   reestablish streaming replication.

So I was thinking it was just a reconnect to the sender (and I can see
the standby trying to reconnect in the log).

>> Is there any chance to get this fixed in 9.2.x?
>
> No.  It is a new feature, and those aren't backpatched.

>> In case of disaster on the master and on the standby, can I just restart
>> the cascading slave after removing recovery.conf?
>
> The correct way is to "pg_ctl promote".

>> Would it be better to copy all archived logs from the master into pg_xlog
>> on the third node and then restart it?
>> What is the best way to get this node back with minimal loss?
>
> You can copy the archives, then wait until replication has
> caught up, then promote the standby.

Ok, thanks, I will work on this.

Regards





Re: [GENERAL] Understanding streaming replication

2012-11-13 Thread Albe Laurenz
Pawel Veselov wrote:
>> From your later comments, it's also apparent that these archived WALs will 
>> be useless after
>> failover (for the purpose of recovery), so there is no reason to send them 
>> to all the nodes after all.
> 
> I obviously lost it here. The archives do need to be synchronized, for the 
> purpose of recovering
> slaves. If a slave dies, and I want to recover it, it may need the archived 
> WALs, and for this, the
> archives should be available on the node. So, rsync (or something like that) 
> is necessary. But it's a
> bad idea to run the rsync from the archive command itself.

Right, that's exactly what I tried to say.

Yours,
Laurenz Albe



Re: [GENERAL] Understanding streaming replication

2012-11-13 Thread Albe Laurenz
Pawel Veselov wrote:
>>> QUESTION: If multiple pgpools are running, and if there are no network 
>>> problems,
>>> and configuration
>>> files are identical, is there any guarantee that the same stand-by node 
>>> will be selected for
>>> promotion? Concern here is that with configuration of (M-SB0-SB1) one 
>>> pgpool decides
>>> to promote SB0
>>> and another - SB1, causing both of them to enter master mode, and splitting 
>>> the cluster.
>>> It does look
>>> that pgpool will always select next "alive" node for promotion, but I 
>>> couldn't find
>>> a definitive statement on that.
>> 
>> I don't know about pgpool and its abilities to handle
>> cluster failover, but I wouldn't go this way at all.
>> Even if the answer were that in the circumstances you
>> describe things would work, you can depend on it that
>> things will go wrong in ways different from what you
>> expect, e.g. a broken network card.
>> The consequences would be worse than I'd like to imagine.
> 
> I would imagine this situation will happen in any case, I don't logically see 
> how it's avoidable. If
> you only have one agent that has power to promote a node to be a new master, 
> you have SPF. If you have
> multiple agents that can do the promotion, there is always a risk that they 
> fall out of sync.

Cluster software usually has the cluster nodes communicate
regularly, and if anything fails, the nodes try to form
groups where everybody can reach everybody else.
The group that is bigger than half of the original
nodes wins, turns off the others and takes over their
services.

>> If you want reliable automatic failover, consider cluster
>> software.
> 
> Anything you could please recommend?

The only thing I have seen is RedHat's Cluster Suite, which
is commercial.  I would recommend having at least three nodes,
though, because the two-node cluster we had was subject to
spurious failovers on short quorum-disk hiccups.

Yours,
Laurenz Albe



Re: [GENERAL] Understanding streaming replication

2012-11-13 Thread Albe Laurenz
Philippe Amelant wrote:
> I'm setting up a 3 node cluster, and after some tests
> I just discovered that the cascading slave does not recover.

Right, switching timeline over streaming replication
is not supported yet.  There's a patch by Heikki in
the pipeline for this, so it will probably work in 9.3.

> As far as I can see in the 9.2 documentation it should work after
> an automatic reconnect to the new master.

Where did you see that?

> Is there any chance to get this fixed in 9.2.x?

No.  It is a new feature, and those aren't backpatched.

> In case of disaster on the master and on the standby, can I just restart
> the cascading slave after removing recovery.conf?

The correct way is to "pg_ctl promote".

> Would it be better to copy all archived logs from the master into pg_xlog
> on the third node and then restart it?
> What is the best way to get this node back with minimal loss?

You can copy the archives, then wait until replication has
caught up, then promote the standby.

Yours,
Laurenz Albe
