Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow

2011-07-08 Thread David Hartveld

Op 07-07-11 18:23, Tom Lane schreef:

Karsten Hilbertkarsten.hilb...@gmx.net  writes:

On Thu, Jul 07, 2011 at 11:14:05AM -0400, Tom Lane wrote:

I'm betting Debian hasn't fixed that bug yet either and so you need this
post-beta2 patch:

http://git.postgresql.org/gitweb/?p=postgresql.gita=commitdiffh=c2ba0121c73b7461331104a46d140156e847572a



Do they know about this gcc bug ?


Can't say about Debian in particular, but upstream gcc certainly knows
about it.
https://bugzilla.redhat.com/show_bug.cgi?id=712480
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49390


I'm guessing this is probably the cause of my problem, then. I'll try to 
rebuild with an earlier version of GCC, and let you know the results.


Thanks to all of you!
David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow

2011-07-08 Thread David Hartveld

Op 08-07-11 10:19, David Hartveld schreef:

Op 07-07-11 18:23, Tom Lane schreef:

Karsten Hilbertkarsten.hilb...@gmx.net writes:

On Thu, Jul 07, 2011 at 11:14:05AM -0400, Tom Lane wrote:

I'm betting Debian hasn't fixed that bug yet either and so you need
this
post-beta2 patch:

http://git.postgresql.org/gitweb/?p=postgresql.gita=commitdiffh=c2ba0121c73b7461331104a46d140156e847572a




Do they know about this gcc bug ?


Can't say about Debian in particular, but upstream gcc certainly knows
about it.
https://bugzilla.redhat.com/show_bug.cgi?id=712480
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49390


I'm guessing this is probably the cause of my problem, then. I'll try to
rebuild with an earlier version of GCC, and let you know the results.


I've rebuilt with debian gcc 4.4.5-8 and am running 9.1 now without any 
of the above problems. I've reported a bug for debian postgresql-9.1. 
I've also seen that the current gcc-4.6 version for debian is 4.6.1, so 
that shouldn't create any problems anymore (or at least the gcc bug is 
marked as applied in 4.6.1 in the gcc bugtracker).


Thanks for all your time, guys!
Greetings, David

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow

2011-07-07 Thread David Hartveld
 -Oorspronkelijk bericht-
 Van: Simon Riggs [mailto:si...@2ndquadrant.com]
 Verzonden: donderdag 7 juli 2011 01:07
 Aan: David Hartveld
 CC: pgsql-general@postgresql.org
 Onderwerp: Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore
 is very slow
 
 On Wed, Jul 6, 2011 at 4:54 PM, David Hartveld david.hartv...@mendix.com
 wrote:
 
  Is there possibly a known issue with the beta, or do I have to
  configure my cluster differently for 9.1?
 
 Thanks for trying 9.1beta
 
 No known bugs, no differences in configuration.
 
 You haven't enabled any of the new 9.1 features either so they aren't likely 
 to be
 at issue.
 
 So there's something pretty badly screwed up somewhere, though that looks like
 pilot error at the moment, sorry.
 
 I'd suggest starting again and see if you can get a reproduceable bug.
 I'd be very grateful if you can narrow things down to produce a tight bug 
 report.

I've just submitted bug report 6094, with a complete description of what I have 
done. The replication stream is reproducibly very slow. If you need more 
information, let me know.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow

2011-07-07 Thread David Hartveld
  Your output indicates that there is a problem in your replication
  setup and this is why the slave does not catch up.
 
  This is not a performance issue. It is either a bug in replication, or
  a user configuration issue. Since few things have changed in 9.1 in
  this area, at the moment the balance of probability is towards user
  error. If you can provide a more isolated bug report we may be able to
 investigate.
 
 Apologies for the double post, I thought to have understood that in your
 previous message.
 
 I've read the online 9.1 manual and configured the clusters based on that
 information (and on the defaults provided by debian). I've attached the
 postgresql.conf files I'm using for master and slave. Do you need other
 information from my final setup? Log files, configuration files, the SQL 
 script fed
 to psql, which shows the slow replication...?

I've been looking at my log files on master and slave a bit better, after 
having set log_min_messages = debug5. I can see that somehow the master and 
slave don't properly work together: the slave attempts to send some data 
('sending write/flush/apply') (I'm assuming this is the slaves current location 
in the WAL?) and then 'terminates process due to administrator command', while 
the master is sending data ('write/flush/apply') (the next part of the WAL?), 
and then 'could not send data to the client: Connection reset by peer', after 
which the server process exits. I'm hoping this provides you with more 
information on what is going on. Do point me in the right direction if you need 
me to investigate further. I have attached two pieces of the master and slave 
log files, which should correspond w.r.t. their interaction, where you can see 
the above behavior.

Hoping that this will bring me a bit closer to a solution or a proper bug 
report,
David Hartveld


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Streaming replication on 9.1-beta2 after pg_restore is very slow

2011-07-06 Thread David Hartveld
Hi all,

I am experimenting with (synchronous) streaming replication on postgresql 9.1 
beta 2 and am having performance problems. I have initially set up an 
(asynchronous) streaming replication master cluster with postgresql 9.0, which 
streamed to a single slave cluster. This seemed to work quite well. Then I've 
mostly copied the configuration to a 9.1 beta 2 cluster (master and slave) to 
see how synchronous replication would behave.

The master cluster, when empty after an initdb (pg_createcluster on debian) 
seems to properly stream changes to one or more slave clusters when correctly 
set up. I watch the master and slave with pg_current_xlog_location() on master 
and pg_last_xlog_receive_location() and pg_last_xlog_replay_location(). It 
seems that slaves pick up changes, such as a simple database creation, or 
updating a role password, or adding a role. But when I then do a restoredb on 
the master,  the slaves quickly fall behind and only very slowly catch up 
(after maybe an hour or something...)

The log on the slave is filled with statements similar to the following:
LOG: streaming replication successfully connected to primary
LOG:  record with zero length at 0/9B7A010
FATAL:  terminating walreceiver process due to administrator command

The slave log file also contains the following line a number of times (with the 
numbers of course a bit different every time):
   LOG: invalid magic number  in log file 0, segment 9, offset 
10878976

The log on the master contains several lines with:
   LOG:  could not send data to client: Connection reset by peer

Is there possibly a known issue with the beta, or do I have to configure my 
cluster differently for 9.1? I'm a bit lost, and would appreciate any comments. 
Below, I've added server configurations. I'm running postgresql from debian 
packages from the experimental suite.

Thanks,
David Hartveld
--
== Master configuration ==
version;PostgreSQL 9.1beta2 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real 
(Debian 4.6.0-12) 4.6.1 20110608 (prerelease), 64-bit
archive_command;cp %p /walshipping/9.1/sr-master/%f
archive_mode;on
bytea_output;escape
client_encoding;UNICODE
external_pid_file;/var/run/postgresql/9.1-sr-master.pid
lc_collate;en_US.UTF-8
lc_ctype;en_US.UTF-8
listen_addresses;*
log_line_prefix;%t 
max_connections;100
max_stack_depth;2MB
max_wal_senders;3
port;5434
server_encoding;UTF8
shared_buffers;96MB
ssl;on
synchronous_standby_names;*
TimeZone;localtime
unix_socket_directory;/var/run/postgresql
wal_buffers;3MB
wal_keep_segments;32
wal_level;hot_standby

== Slave configuration ==
version;PostgreSQL 9.1beta2 on x86_64-pc-linux-gnu, compiled by gcc-4.6.real 
(Debian 4.6.0-12) 4.6.1 20110608 (prerelease), 64-bit
bytea_output;escape
client_encoding;UNICODE
external_pid_file;/var/run/postgresql/9.1-sr-slave0.pid
hot_standby;on
lc_collate;en_US.UTF-8
lc_ctype;en_US.UTF-8
listen_addresses;*
log_line_prefix;%t 
max_connections;100
max_stack_depth;2MB
port;5434
server_encoding;UTF8
shared_buffers;96MB
ssl;on
TimeZone;localtime
unix_socket_directory;/var/run/postgresql
wal_buffers;3MB