Re: [GENERAL] Shared Buffer Size

2011-05-28 Thread Carl von Clausewitz
Hi Preetika,

a few months ago, when I installed my first PostgreSQL, I had the same
problem. I tried to find information about an optimal memory config that
worked, but there wasn't any optimal memory-setting calculator on the
internet, just some guidance in the PostgreSQL documentation (
http://www.postgresql.org/docs/9.0/interactive/kernel-resources.html#SYSVIPC).
I run FreeBSD 8.2 AMD64 with 8 GB of memory (this server is just for
PostgreSQL and a little PHP app with 2 users), and I have these settings in
postgresql.conf (which are not the default):

listen_addresses = '192.168.1.1' # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 200 # (There are 20 users, with Microsoft Access clients and
ODBC connections... (min 6 connections / user))

shared_buffers = 1900MB # min 128kB
temp_buffers = 64MB # min 800kB
work_mem = 64MB # min 64kB
maintenance_work_mem = 1024MB # min 1MB
max_stack_depth = 64MB # min 100kB

shared_preload_libraries = '$libdir/plpgsql.so' # (change requires restart)

checkpoint_segments = 32 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 15min # range 30s-1h
checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0

effective_cache_size = 4096MB

track_activities = on
track_counts = on
#track_functions = none # none, pl, all
#track_activity_query_size = 1024 # (change requires restart)

update_process_title = off
deadlock_timeout = 1s
max_locks_per_transaction = 256 # min 10

And the sysctl.conf settings from BSD that are relevant for these PostgreSQL
settings:
kern.ipc.shmall=524288
kern.ipc.shmmax=2147483648
kern.ipc.semmap=512
kern.ipc.shm_use_phys=1
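
(As a rough sanity check, assuming 4 kB pages: kern.ipc.shmall = 524288 pages
x 4096 bytes = 2147483648 bytes = 2 GB, which matches kern.ipc.shmmax and
comfortably covers shared_buffers = 1900MB plus PostgreSQL's other shared
memory structures.)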

And the last one is loader.conf from BSD, again only the settings relevant for
PostgreSQL:
kern.ipc.semmni=512
kern.ipc.semmns=1024
kern.ipc.semmnu=512

These settings are based on my experience, after a lot of reboots, restarts and
config reloads - I hope this can help you, and I welcome any comments if I
need to set anything else :-)

Thanks,
Carl
2011/5/27 preetika tyagi preetikaty...@gmail.com

 Hi Derrick,

 Thank you for your response.
 I saw this document and am trying to understand the "Interaction with the
 Operating System Cache" section mentioned in it.

 I have the following question-
 How does the shared buffer in Postgres rely on the operating system
 cache?
 Suppose my RAM is 8 GB and shared_buffers is 24 MB in Postgres, and there
 are some dirty pages in the shared buffer, and I need to write a dirty page back
 to the disk to bring in a new page. What happens in this case? Will the dirty
 page be written to the disk, given the shared_buffers size of 24 MB? Or will it
 not be written and stay in RAM, which is 8 GB?

 Thanks,
 Preetika


 On Fri, May 27, 2011 at 2:11 PM, Derrick Rice derrick.r...@gmail.com wrote:

 Check out the "Inside the PostgreSQL Buffer Cache" link here:

 http://projects.2ndquadrant.com/talks

 Thanks to Greg Smith (active here).

 Derrick


 On Fri, May 27, 2011 at 3:36 PM, preetika tyagi 
 preetikaty...@gmail.com wrote:

 Hi All,

 I am a little confused about the internal workings of PostgreSQL. There is a
 parameter shared_buffers in postgresql.conf and I am assuming that it is used
 for buffer management in PostgreSQL. If there is a need to bring a new
 page into the buffer and the size exceeds the shared_buffers limit, a victim dirty
 page will be written back to the disk.

 However, I have read on many links that PostgreSQL depends on the OS for
 caching. (http://www.varlena.com/GeneralBits/Tidbits/perf.html#shbuf)

 So my question is: when deciding whether a victim dirty page needs to be
 selected for a disk write, is the actual limit of the shared buffer defined by
 the OS or by the shared_buffers parameter in postgresql.conf?

 Thanks!






Re: [GENERAL] Postgres questions

2011-05-28 Thread Trenta sis
DBI-Link was configured following the instructions from dbi-link. Where can I
find scripts for these tests:
Can you connect to Cache from a stand-alone Perl script using Perl DBI?

If so, can you connect to Cache from a PL/Perlu script using Perl DBI?

Is MS-SQL performance OK when connecting from a stand-alone Perl script
outside the database using Perl DBI?

Is MS-SQL performance OK when connecting from a custom test plperlu script
using Perl DBI?


The queries against MS-SQL are plain SQL SELECTs from a table with about
400,000 rows.

About replication: I can't upgrade to 9.0 right now, so with 8.3, what options
exist? Active/passive that is not real HA would also be OK.

Thanks

2011/5/24 Craig Ringer cr...@postnewspapers.com.au

 On 05/24/2011 05:15 PM, Trenta sis wrote:

  What I need is to connect to another database (Intersystems Cache) to run
 SELECTs from Postgres. I have tried to configure DBI-Link but it is not
 working with this database; with SQL Server it seems to work, but with poor
 performance.


 You really need to be specific about things like "not working" and "poor
 performance". Nobody can help you if that's all you say; you need to give
 error messages, versions, commands, configurations, etc. You haven't shown
 any queries, any EXPLAIN ANALYZE output, any schema, any timing data, or
 pretty much anything else that'd allow anyone to help you. It might be worth
 remedying that. See:

 In addition to that, some obvious things to test and report back on
 include:

 Can you connect to Cache from a stand-alone Perl script using Perl DBI?

 If so, can you connect to Cache from a PL/Perlu script using Perl DBI?

 Is MS-SQL performance OK when connecting from a stand-alone Perl script
 outside the database using Perl DBI?

 Is MS-SQL performance OK when connecting from a custom test plperlu script
 using Perl DBI?

 What do you expect performance to be like? Why? What kind of queries are
 you executing? Which ones perform badly?

  - What kind of guarantees do I need about data loss windows at failover

  time? Can I afford to lose the last x transactions / seconds worth
  of transactions? Or must absolutely every transaction be retained
  at all costs?
 A data loss window of some seconds/minutes could be acceptable, but if it is
 not very complex, no data loss would be excellent.


 It sounds like your requirements can probably be satisfied by PostgreSQL
 9.0's built-in replication combined with the use of repmgr and heartbeat.

 For anything like this, though, I STRONGLY suggest that you hire a
 consultant who has done what you need before and has learned the traps and
 pitfalls. See:

 http://www.postgresql.org/support/professional_support

 I don't do PostgreSQL support commercially and am not experienced with
 failover/HA setups, so I can't help you much there and suggest you find
 someone who _really_ knows what they are doing. Getting HA right is *hard*.

 --
 Craig Ringer



Re: [GENERAL] Postgres questions

2011-05-28 Thread Craig Ringer

On 05/28/2011 06:08 PM, Trenta sis wrote:


Dbi link was configured with instruction form dbi-link. Where I can find
this scripts:


Write them.


The queries against MS-SQL are plain SQL SELECTs from a table with about 400,000 rows.


Well, no wonder it's slow. As far as I know DBI-Link can't push 
predicates down into queries to the remote database. If you write:


SELECT * FROM some_remote_table t WHERE t.some_field > 42;

then (as far as I know) the whole of some_remote_table is retrieved, 
then the filter is applied. It executes:

SELECT * FROM underlying_table
on the remote server, not
SELECT * FROM underlying_table WHERE underlying_table.some_field > 42

For more information, read:

http://stackoverflow.com/questions/1328601/postgresql-and-sql-server-2008-dbi-link

A full SQL/MED implementation *might* be able to push down predicates, 
but even then it's hard to do reliably because of possible differences 
in locales, collation rules, etc.


For that kind of data size you're usually better off periodically 
copying the data from the remote table to a local table with the same 
structure. Then query the local table. Think of it as a materialized 
view of the remote table. You can use DBI-Link to refresh the local copy 
and PgAgent to schedule periodic refreshes.
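
A minimal sketch of such a refresh (the names here are made up; this assumes
you already have a DBI-Link-backed view over the remote table and a local
table with the same columns):

BEGIN;
TRUNCATE local_mssql_copy;
INSERT INTO local_mssql_copy SELECT * FROM remote_mssql_view;
COMMIT;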



About replication: I can't upgrade to 9.0 right now, so with 8.3, what
options exist? Active/passive that is not real HA would also be OK.


Use Google. Type in "postgresql replication" and the first result should be:

http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Regular disk activity of an idle DBMS

2011-05-28 Thread Andrej Podzimek

Hello,

after configuring a new home server with PostgreSQL 9.0.4, I observe some 
regular disk activity, even though the server is completely idle (disconnected 
from the network, no users but one logged in). There are very short write 
bursts about once every 3 seconds. This does not affect performance in any way, 
but I would like to know whether this is normal. Perhaps I misconfigured 
something. With the 8.4.x version I used before, there were no such regular 
disk writes.

I used the following approach to detect who is writing to disk: 
http://www.xaprb.com/blog/2009/08/23/how-to-find-per-process-io-statistics-on-linux/
 This is what I obtained after about 3 minutes of observation:

TASK          PID    TOTAL  READ  WRITE  DIRTY  DEVICES
postgres      10437  10736  0     10736  0      dm-2
md127_raid5   630    648    0     648    0      sdc, sda, sdb, sdd
flush-253:3   29302  553    0     553    0      dm-3
jbd2/dm-2-8   3411   62     0     62     0      dm-2
flush-253:2   3835   35     0     35     0      dm-2
jbd2/dm-3-8   3413   20     0     20     0      dm-3
jbd2/dm-1-8   3409   12     0     12     0      dm-1
flush-253:1   465    11     0     11     0      dm-1
postgres      10434  9      0     9      0      dm-2
jbd2/dm-5-8   789    6      0     6      0      dm-5
postgres      850    4      0     4      0      dm-2
bash          400    4      0     4      0      dm-5
flush-253:5   398    4      0     4      0      dm-5

These are my (non-default) PostgreSQL settings:

# grep -Pv '^[ \t]*#|^[ \t]*$' /var/lib/postgres/data/postgresql.conf
listen_addresses = '::1,2002:53f0:5de8::1,2002:53f0:5de8:1::1,2002:53f0:5de8:2::1,2002:53f0:5de8:3::1,127.0.0.1,83.240.93.232,10.0.1.1,10.0.2.1,10.0.3.1'
max_connections = 128                   # (change requires restart)
ssl = on                                # (change requires restart)
shared_buffers = 512MB                  # min 128kB
temp_buffers = 64MB                     # min 800kB
max_prepared_transactions = 128         # zero disables the feature
work_mem = 16MB                         # min 64kB
maintenance_work_mem = 128MB            # min 1MB
max_stack_depth = 16MB                  # min 100kB
effective_io_concurrency = 3            # 1-1000. 0 disables prefetching
checkpoint_segments = 16                # in logfile segments, min 1, 16MB each
log_destination = 'syslog'              # Valid values are combinations of
autovacuum_max_workers = 8              # max number of autovacuum subprocesses
datestyle = 'iso, dmy'
lc_messages = 'cs_CZ.UTF-8'             # locale for system error message
lc_monetary = 'cs_CZ.UTF-8'             # locale for monetary formatting
lc_numeric = 'cs_CZ.UTF-8'              # locale for number formatting
lc_time = 'cs_CZ.UTF-8'                 # locale for time formatting
default_text_search_config = 'pg_catalog.cs'

The machine runs ArchLinux. It is a standard piece of x86_64 commodity 
hardware. There are four SATA drives configured as a RAID5 array. The file 
system is ext4.

Is there an easy way to detect what exactly causes PostgreSQL to write these 
small amounts of data on an idle machine on a regular basis? Stopping all 
daemons that connect to PostgreSQL (OpenFire, Apache, Courier-MTA) does not 
change anything. Any hints would be very helpful. There is actually no 
performance or usability issue. I just want to *understand* what is going on.

Andrej





[GENERAL] How to check a table content efficiently? With LIMIT and OFFSET?

2011-05-28 Thread Stefan Keller
Hi,

I’d like to monitor a table in a read-only Postgres database by
writing a PL/pgSQL function.
The database gets periodically overwritten by a mirroring loading process.
The success criterion is that the table contains at least some records
(actually, more than 10).

The first idea which comes to one’s mind is probably a count(*):

SELECT CASE WHEN (count(*) > 1) THEN 'yes' ELSE 'no' END
FROM planet_osm_point;

But my goal is to consume as few Postgres server resources as possible.
So my idea is to retrieve an arbitrary record, like this:

SELECT CASE WHEN (osm_id::text <> '') THEN 'yes' ELSE 'no' END
FROM planet_osm_point LIMIT 1 OFFSET 100;

Does anyone have an even better idea?

Yours, Stefan

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to check a table content efficiently? With LIMIT and OFFSET?

2011-05-28 Thread Stefan Keller
Hi,

That's my solution candidate:

CREATE OR REPLACE FUNCTION isnotempty() RETURNS boolean AS '
  SELECT (count(*) = 1)
  FROM (SELECT osm_id FROM planet_osm_point LIMIT 1 OFFSET 10) tmp
' LANGUAGE SQL;

Yours, Stefan

2011/5/28 Stefan Keller sfkel...@gmail.com:
 Hi,

 I’d like to monitor a table in a read-only Postgres database by
 writing a PL/pgSQL function.
 The database gets periodically overwritten by a mirroring loading process.
 The success criterion is that the table contains at least some records
 (actually, more than 10).

 The first idea which comes to one’s mind is probably a count(*):

 SELECT CASE WHEN (count(*) > 1) THEN 'yes' ELSE 'no' END
 FROM planet_osm_point;

 But my goal is to consume as few Postgres server resources as possible.
 So my idea is to retrieve an arbitrary record, like this:

 SELECT CASE WHEN (osm_id::text <> '') THEN 'yes' ELSE 'no' END
 FROM planet_osm_point LIMIT 1 OFFSET 100;

 Does anyone have an even better idea?

 Yours, Stefan


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Shared Buffer Size

2011-05-28 Thread Greg Smith

On 05/28/2011 04:42 AM, Carl von Clausewitz wrote:
I tried to find information about an optimal memory config that 
worked, but there wasn't any optimal memory-setting calculator on 
the internet, just some guidance in the PostgreSQL documentation 


There's a simple one at https://github.com/gregs1104/pgtune and the 
guide at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server 
covers many of the most popular questions here too.  The right 
proportions for the memory settings are very dependent on workload, which 
makes any sort of calculator hard to create.  pgtune takes a workload type 
as an input to help with that, but the settings that come out should be 
considered starting values only.  You'll need to monitor how much 
memory is actually being used by the server, as well as the output from 
parameters like log_temp_files, to know for sure if things are working well.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Shared Buffer Size

2011-05-28 Thread Greg Smith

On 05/27/2011 05:33 PM, preetika tyagi wrote:
How does the shared buffer in Postgres rely on the operating system 
cache?
Suppose my RAM is 8 GB and shared_buffers is 24 MB in Postgres, and 
there are some dirty pages in the shared buffer, and I need to write a 
dirty page back to the disk to bring in a new page. What happens in 
this case? Will the dirty page be written to the disk, given the 
shared_buffers size of 24 MB? Or will it not be written and stay 
in RAM, which is 8 GB?




In this case, the dirty page will be written to the operating system, 
which will initially store it in its own write cache.  Eventually the 
operating system will write that page out to disk from that cache.  The 
delay before that happens can easily be 30 seconds or more on operating 
systems like Linux.  If PostgreSQL needs to read that data again, it may 
find it still in the OS cache, in which case the read will happen very 
quickly.  Eventually, the dirty data will be written to disk, and if 
it's not used for long enough the OS cache memory will be re-used for 
something else.


When you read a page into the database, and you don't dirty it, it might 
be evicted from the database cache without a write.  If the database 
needs that page again, it will ask the OS for it.  If the OS still has 
it in its own read cache, it may just read it from the cache again, 
without a real disk read happening.
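
If you want to see the database side of this, the pg_buffercache contrib 
module (assuming you have it installed) lets you look inside shared_buffers; 
for example, counting clean versus dirty buffers:

SELECT isdirty, count(*) FROM pg_buffercache GROUP BY isdirty;

The OS cache side is harder to inspect directly; tools like free and iostat 
only give a rough picture.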


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to check a table content efficiently? With LIMIT and OFFSET?

2011-05-28 Thread Craig Ringer

On 05/29/2011 05:45 AM, Stefan Keller wrote:

Hi,

That's my solution candidate:

CREATE OR REPLACE FUNCTION isnotempty() RETURNS boolean AS '
   SELECT (count(*) = 1)
   FROM (SELECT osm_id FROM planet_osm_point LIMIT 1 OFFSET 10) tmp
' LANGUAGE SQL;


LIMIT and OFFSET are often no more efficient than count(*). You're still 
likely to need a full table scan.


Here's how I'd do it: I'd ANALYZE the table, then check the table 
statistics to see that they looked to be within reasonable bounds. That 
way you not only check the import, but in the process you ensure the 
statistics used by the query planner are up to date. Since ANALYZE only 
tests a sampling of records it does pretty much what you want, something 
that it's not so easy to do in SQL.
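
A sketch of that approach (reltuples is only an estimate, refreshed by 
ANALYZE and VACUUM, but for a "contains more than 10 rows" check that's 
plenty):

ANALYZE planet_osm_point;

SELECT reltuples > 10 AS isnotempty
  FROM pg_class
 WHERE relname = 'planet_osm_point';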


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] max_connections proposal

2011-05-28 Thread Greg Smith

On 05/25/2011 10:58 PM, Craig Ringer wrote:

max_connections = 100   # (change requires restart)
# WARNING: If you're about to increase max_connections above 100, you
# should probably be using a connection pool instead. See:
# http://wiki.postgresql.org/max_connections
#
# Note:  Increasing max_connections costs ~400 bytes of shared memory
# per connection slot, plus lock space (see max_locks_per_transaction).


While tempting to do something this simple, the most useful path to 
follow is probably to nail this head-on and comprehensively in the docs 
instead.  Discussion of this topic on the hackers list seems to have 
concluded that building connection pooling into the database isn't a 
high priority relative to other development, because it's addressed so 
well by external projects.  Pointing people toward them 
seems quite appropriate given that position.  Really addressing this 
well would take the following steps:


-Add a section to the external projects section of the documentation:  
http://www.postgresql.org/docs/current/interactive/external-projects.html introducing 
connection pooling as a useful type of additional software to add.  
Shouldn't be controversial to suggest pgbouncer and pgpool-II as 
examples there.


-Expand the documentation on max_connections to warn about how snapshot 
visibility overhead makes extremely large numbers of connections impractical


-Also expand the documentation to suggest that CPU switching 
inefficiency may make a much smaller number of connections than expected 
optimal, and point toward the external project section for more 
information about pooling.


-Add a warning to the postgresql.conf suggesting people read the 
documentation for max_connections before increasing this value.


This area pops up enough that I've made a discussion of it part of even 
my shortest talk about PostgreSQL performance issues to be wary of.  
There's a good documentation patch project for somebody here, I just 
haven't had time to get to it yet.
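
As a taste of the external route, a minimal pgbouncer.ini sketch might look 
like this (the names and sizes are illustrative only, not recommendations):

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20

The application connects to port 6432, and PostgreSQL itself only ever sees 
up to default_pool_size connections per database/user pair.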


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] max_connections proposal

2011-05-28 Thread Edison So
Can anyone tell me whether, if max_connections is set above 100, the server
will use pooling instead?

For all participants in this particular discussion: what is a reasonable
value for max_connections without causing any harm to a Postgres 9.0
server?

I am a novice Postgres user, so any advice is always welcome.

Thanks,

On Wed, May 25, 2011 at 10:58 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:

 There might be a very cheap and simple way to help reduce the number of
 people running into problems because they set massive max_connections values
 that their server cannot cope with instead of using pooling.

 In the default postgresql.conf, change:

 max_connections = 100   # (change requires restart)
 # Note:  Increasing max_connections costs ~400 bytes of shared memory
 # per connection slot, plus lock space (see max_locks_per_transaction).

 to:

 max_connections = 100   # (change requires restart)
 # WARNING: If you're about to increase max_connections above 100, you
 # should probably be using a connection pool instead. See:
 # http://wiki.postgresql.org/max_connections
 #
 # Note:  Increasing max_connections costs ~400 bytes of shared memory
 # per connection slot, plus lock space (see max_locks_per_transaction).
 #


 ... where wiki.postgresql.org/max_connections (which doesn't yet exist)
 explains the throughput costs of too many backends and the advantages of
 configuring a connection pool instead.

 Sure, this somewhat contravenes the "users don't read - ever" principle,
 but we can hope that _some_ people will read a comment immediately beside
 the directive they're modifying.

 --
 Craig Ringer

 --
 Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-general




-- 
Edison


Re: [GENERAL] Inspecting a DB - psql or system tables ?

2011-05-28 Thread Greg Smith

On 05/27/2011 01:24 PM, Andre Majorel wrote:

While parsing the output of psql is cumbersome, accessing the
system tables seems more likely to break whenever a new version
of PostgreSQL comes out.
   


I think you have this backwards.  If there's a change in this area big 
enough to justify changing the format of the system tables, odds are the 
text output from psql is going to be changed too.  psql gets tweaked to 
display information better more often than the internals are altered.


Approaches you can take here, from most robust in the face of changes to 
most fragile, in my mind are:


1) Use information_schema.  If all the info you need is in here, great; 
it may not be though.

2) Use the system catalog data directly
3) Parse text output from psql.
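
For example, to list the columns of a (hypothetical) table both ways:

-- 1) information_schema
SELECT column_name, data_type
  FROM information_schema.columns
 WHERE table_schema = 'public' AND table_name = 'mytable'
 ORDER BY ordinal_position;

-- 2) system catalogs
SELECT a.attname, format_type(a.atttypid, a.atttypmod) AS data_type
  FROM pg_attribute a
  JOIN pg_class c ON c.oid = a.attrelid
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname = 'public' AND c.relname = 'mytable'
   AND a.attnum > 0 AND NOT a.attisdropped
 ORDER BY a.attnum;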

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SELECT COUNT(*) execution time on large tables (v9.0.4-1)

2011-05-28 Thread Greg Smith

On 05/27/2011 12:41 PM, Carlos Sotto Maior (SIM) wrote:

I have browsed catalog tables, digging for a real time Row.count but  so far
did not find any.
   


See http://wiki.postgresql.org/wiki/Slow_Counting which shows where the 
one built-in count estimate lives, and links to alternate approaches.  
If you need an exact count and can't afford to generate a full query to 
find one, some sort of trigger-based approach is likely where you'll 
need to go.
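
A rough sketch of the trigger-based idea, with made-up names (note that a 
single counter row serializes concurrent writers, so it isn't free):

CREATE TABLE mytable_rowcount (n bigint NOT NULL);
INSERT INTO mytable_rowcount SELECT count(*) FROM mytable;

CREATE FUNCTION mytable_count_trig() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE mytable_rowcount SET n = n + 1;
  ELSE
    UPDATE mytable_rowcount SET n = n - 1;
  END IF;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_count
  AFTER INSERT OR DELETE ON mytable
  FOR EACH ROW EXECUTE PROCEDURE mytable_count_trig();

SELECT n FROM mytable_rowcount;  -- the cheap real-time count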


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regular disk activity of an idle DBMS

2011-05-28 Thread Greg Smith

On 05/28/2011 11:02 AM, Andrej Podzimek wrote:
after configuring a new home server with PostgreSQL 9.0.4, I observe 
some regular disk activity, even though the server is completely idle 
(disconnected from the network, no users but one logged in). There are 
very short write bursts once in about 3 seconds.


There are a couple of things that can cause unexpected disk activity:

-autovacuum running in the background.  Setting 
log_autovacuum_min_duration  may help you determine when this is happening.
-checkpoint activity.  Turning on log_checkpoints, as well as looking 
for changes in the pg_stat_bgwriter view, may help explain if this is 
the case.
-Hint bit updates.  Even if you are only reading from a table, in some 
situations write activity can be generated.  See 
http://wiki.postgresql.org/wiki/Hint_Bits for more information.
-Statistics collector updates.  If the one logged in user is doing 
anything at all, they might be generating something here.
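
For the first two, turning on some logging in postgresql.conf makes them 
easy to spot (the values shown are just an example):

log_autovacuum_min_duration = 0   # log every autovacuum run
log_checkpoints = on              # log each checkpoint start/end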


Figuring out if the writes are happening from a regular PostgreSQL 
process, or if they are happening via the background writer, might also 
be useful here.  Saving the output from top -b -c can be useful for 
this.  The iotop command is very helpful for tracking down this sort of 
problem too.  The background writer process, which also handles 
checkpoints, will have the same process ID once it's started.  So will 
the statistics collector.  If you track I/O to one of those two, it 
should narrow possible causes quite a bit.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] timeouts on transactions etc?

2011-05-28 Thread Darren Duncan
Does Postgres have any mechanisms where one can set an activity timeout, say 
either dynamically thru SQL to affect a current session, or alternately in a 
configuration file so to take effect globally?


I mean for example so we can tell Postgres to automatically abort/rollback a 
current statement or transaction if it is still running after 5 seconds?  It 
would return an error / throw an exception at the same time, as if there was a 
failure or constraint violation for some other reason, so the user would know.


Or a generalization of this would be the DBMS enforcing particular resource 
limits, but I suspect that just clock time is a relatively easy one to do, as it 
could be implemented with ordinary timers and signals/interrupts.


Purposes of this feature include coping with applications that are not 
well-behaved such as by failing to explicitly end transactions or by asking the 
DBMS to do too much at once.


If so, where is this documented?  If not, how much work might it be to add this?

I'm looking for something enforced by the DBMS itself, not something that an 
application or bridge layer should do.


-- Darren Duncan

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] timeouts on transactions etc?

2011-05-28 Thread Steve Atkins

On May 28, 2011, at 7:55 PM, Darren Duncan wrote:

 Does Postgres have any mechanisms where one can set an activity timeout, say 
 either dynamically thru SQL to affect a current session, or alternately in a 
 configuration file so to take effect globally?
 
 I mean for example so we can tell Postgres to automatically abort/rollback a 
 current statement or transaction if it is still running after 5 seconds?  It 
 would return an error / throw an exception at the same time, as if there was 
 a failure or constraint violation for some other reason, so the user would 
 know.
 
 Or a generalization of this would be the DBMS enforcing particular resource 
 limits, but I suspect that just clock time is a relatively easy one to do, as 
 it could be implemented with ordinary timers and signals/interrupts.
 
 Purposes of this feature include coping with applications that are not 
 well-behaved such as by failing to explicitly end transactions or by asking 
 the DBMS to do too much at once.
 
 If so, where is this documented?  If not, how much work might it be to add 
 this?
 
 I'm looking for something enforced by the DBMS itself, not something that an 
 application or bridge layer should do.

You're looking for statement_timeout, I think. You can set that globally, but 
it's better to set it just in the sessions where you want it.

http://www.postgresql.org/docs/9.0/static/runtime-config-client.html
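
For example (the role name here is made up):

SET statement_timeout = '5s';                      -- this session only
ALTER ROLE app_user SET statement_timeout = '5s';  -- every new session of that role
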

There's also the ability to log long statements, so you can identify and fix 
bad queries without breaking functionality - log_min_duration_statement and 
friends.

Cheers,
  Steve



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] timeouts on transactions etc?

2011-05-28 Thread Darren Duncan

Thank you, Steve,

That answers the part about statement times.

But what about timeouts for transactions as a whole, ensuring that any 
transaction, once started, is ended one way or another within X time?


-- Darren Duncan

Steve Atkins wrote:

On May 28, 2011, at 7:55 PM, Darren Duncan wrote:


Does Postgres have any mechanisms where one can set an activity timeout, say 
either dynamically thru SQL to affect a current session, or alternately in a 
configuration file so to take effect globally?

I mean for example so we can tell Postgres to automatically abort/rollback a 
current statement or transaction if it is still running after 5 seconds?  It 
would return an error / throw an exception at the same time, as if there was a 
failure or constraint violation for some other reason, so the user would know.

Or a generalization of this would be the DBMS enforcing particular resource 
limits, but I suspect that just clock time is a relatively easy one to do, as 
it could be implemented with ordinary timers and signals/interrupts.

Purposes of this feature include coping with applications that are not 
well-behaved such as by failing to explicitly end transactions or by asking the 
DBMS to do too much at once.

If so, where is this documented?  If not, how much work might it be to add this?

I'm looking for something enforced by the DBMS itself, not something that an 
application or bridge layer should do.


You're looking for statement_timeout, I think. You can set that globally, but 
it's better to set it just in the sessions where you want it.

http://www.postgresql.org/docs/9.0/static/runtime-config-client.html

There's also the ability to log long statements, so you can identify and fix 
bad queries without breaking functionality - log_min_duration_statement and 
friends.

Cheers,
  Steve




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bidirectional replication

2011-05-28 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


 Yeah.  One nasty property that async multi master solutions share is
 that they change the definition of what 'COMMIT' means -- the database
 can't guarantee the transaction is valid because not all the
 supporting facts are necessarily known.  Even after libpq gives you
 the green light that transaction could fail an arbitrary length of
 time later, and you can't rely in the assumption it's valid until
 you've done some synchronizing with the other 'masters'.  Maybe you
 don't need to rely on that assumption so a 'fix it later, or possibly
  never' methodology works well. Those cases are unfortunately fairly rare
 in the real world.

 I don't quite follow you here. Are you talking about *synchronous* 
 multi-master?
 Async multi-master works just fine, as long as you are not expecting the
 servers to give the exact same answer at the exact same time. But certainly
 transactions are valid.

 Let's say you have a foreign key constraint on delete restrict. On one 
 master you delete the key as there are no child entities. On the other 
 master you add a child entity, which should prevent deleting the parent 
 record. Both masters allowed the transaction to be committed, which 
 means that the users have both been given acknowledgement that their 
 actions are valid. If the rules are that the guy who put in the child 
 wins that means the committed delete never happened. If the parent wins 
 that means that the insert of the child was illegal.

Well, that's one way to look at it, but you have to remember to treat the 
async replication as the invisible hand of another session, that may 
change what you have just committed, just like any other user may. If I 
add a child entry, then user X deletes said entry, and then user Y deletes 
the parent entry, that is for all intents and purposes the same as what happens 
in a replication scenario. The difference is that technically I add the child 
entry, user Y deletes said entry, and /then/ user R (replication) deletes both 
the parent and child (or inserts the parent back in). But in both cases, both 
the child creator and the parent deleter receive back an OK commit. If you 
have a very large async response time, and your application has a very tight 
control over things, it may cause a problem, but in real life the syncing 
happens quite quickly, and the window for even catching both writes, not to 
mention sorting it out, is quite small. And I would expect an application 
running against a MM database would be able to handle such events anyway.

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201105282339
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAk3hwKIACgkQvJuQZxSWSsgu9gCgpBrlVa5xvmRNdIdcstlv60oJ
tQsAn0sPvDHNZI+CVIT46SP4mEP7aGLM
=4c4P
-END PGP SIGNATURE-



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general