[PERFORM] Re: [pgsql-cluster-hackers][performance] fast reads on a busy server

2012-06-27 Thread Hannu Krosing
On Wed, 2012-06-27 at 00:16 +0200, Willy-Bas Loos wrote:
 Hi,
 
 I've read this:
 http://wiki.postgresql.org/wiki/Prioritizing_databases_by_separating_into_multiple_clusters
 
 But it doesn't really say anything about memory.
 If I can fit an extra cluster into its shared buffers, it should have
 fast reads, right?
 Even if I don't have separate spindles and the disks are busy.

Check whether you are CPU-bound. On a database which fits fully in memory
you may already be.

 This is on a Debian server, postgres 8.4

And if possible, upgrade to the latest PostgreSQL (9.1). On some operations
this alone may give you a considerable performance boost.

 Cheers,
 
 WBL
 -- 
 Quality comes from focus and clarity of purpose -- Mark Shuttleworth
 

-- 
---
Hannu Krosing
PostgreSQL Unlimited Scalability and Performance Consultant
2ndQuadrant Nordic
PG Admin Book: http://www.2ndQuadrant.com/books/


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Testing Sandforce SSD

2010-08-04 Thread Hannu Krosing
On Tue, 2010-08-03 at 10:40 +0200, Yeb Havinga wrote:
 Please note that the 10% was on a slower CPU. On a more recent CPU the 
 difference was 47%, based on tests that ran for an hour.

I am not surprised at all that reading and writing almost twice as much
data from/to disk takes 47% longer. If less time is spent on seeking, the
amount of data starts playing a bigger role.

  That's why I 
 absolutely agree with Merlin Moncure that more testing in this 
 department is welcome, preferably by others since after all I could be 
 on the pay roll of OCZ :-)

:)


 I looked a bit into Bonnie++ but fail to see how I could do a test that 
 somehow matches the PostgreSQL setup during the pgbench tests (db that 
 fits in memory, 

Did it fit in shared_buffers, or system cache ?

Once we are in high-tps territory, the time it takes to move pages between
userspace and the system cache starts to play a bigger role.

I first noticed this several years ago, when doing a COPY to a large
table with indexes took noticeably longer (2-3 times longer) when the
indexes were in the system cache than when they were in shared_buffers.

 so the test is actually how fast the ssd can capture 
 sequential WAL writes and fsync without barriers, mixed with an 
 occasional checkpoint with random write IO on another partition). Since 
 the WAL writing is the same for both block_size setups, I decided to 
 compare random writes to a file of 5GB with Oracle's Orion tool:

Are you sure that you are not writing full WAL pages ?

Do you have any stats on how much WAL is written for 8kb and 4kb test
cases ?

And for other disk i/o during the tests ?
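
One way to collect such numbers (just a sketch, assuming a server where
pg_current_xlog_location() is available, i.e. 8.2 or later) is to note the
WAL insert location before and after each run and compare:

select pg_current_xlog_location();  -- before the run
-- ... run the benchmark ...
select pg_current_xlog_location();  -- after the run; the difference between
                                    -- the two locations is the WAL volume written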



-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-04 Thread Hannu Krosing
On Mon, 2010-08-02 at 14:23 -0700, Peter Hussey wrote:
 I already had effective_cache_size set to 500MB.
 
 I experimented with lowering  random_page_cost to 3 then 2.  

In the case of a fully cached database it is closer to 1.

 2) Why is the setting of work_mem something left to the admin and/or
 developer?  Couldn't the optimizer say how much it thinks it needs to
 build a hash table based on size of the keys and estimated number of
 rows?

Yes, it can say how much it thinks it needs to build a hash table; the
part it can't figure out is how much it can afford, based on things like
the number of concurrent queries and how much work_mem they are using. Any
work_mem used is subtracted from the total memory pool, which also affects
how much of the data files the system can cache.

 It is difficult for a software development platform like ours to take
 advantage of suggestions to set work_mem, or to change the cost
 function, or turn on/off join strategies for individual queries.  The
 SQL we issue is formed by user interaction with the product and rarely
 static.  How would we know when to turn something on or off?  That's
 why I'm looking for a configuration solution that I can set on a
 database-wide basis and have it work well for all queries.

Keep trying. The closer your configuration gets to the real conditions, the
better choices the optimiser can make ;)
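
For example, a minimal sketch (the database name and values are purely
illustrative) of applying such settings once, database-wide, instead of
per query:

ALTER DATABASE mydb SET random_page_cost = 2;
ALTER DATABASE mydb SET work_mem = '32MB';

New sessions connecting to mydb then start with these values, with no
per-query tuning needed.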



-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-04 Thread Hannu Krosing
On Wed, 2010-08-04 at 09:14 -0400, Robert Haas wrote:
 On Tue, Aug 3, 2010 at 3:03 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
  In the case of a fully cached database it is closer to 1.
 
 In the case of a fully cached database I believe the correct answer
 begins with a decimal point.

The number 1 here was suggested in relation to seq_page_cost, which is
1. 

For a fully cached db there is no additional seek time for random access,
so seq_page_cost == random_page_cost.

Of course there are more variables than just *_page_cost, so if you nail
down any other one, you may end up with less than 1 for both page costs.

I have always used seq_page_cost = 1 in my thinking and adjusted others
relative to it.
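
For example (a sketch; the exact values depend on how the rest of the cost
model is balanced), for a fully cached database one could keep the usual
reference point and just pull random_page_cost down to it:

SET seq_page_cost = 1;
SET random_page_cost = 1;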

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-04 Thread Hannu Krosing
On Wed, 2010-08-04 at 14:00 -0400, Tom Lane wrote:
 Hannu Krosing ha...@2ndquadrant.com writes:
  Of course there are more variables than just *_page_cost, so if you nail
  down any other one, you may end with less than 1 for both page costs.
 
  I have always used seq_page_cost = 1 in my thinking and adjusted others
  relative to it.
 
 Right, seq_page_cost = 1 is sort of the traditional reference point,
 but you don't have to do it that way.  The main point here is that for
 an all-in-RAM database, the standard page access costs are too high
 relative to the CPU effort costs:
 
 regression=# select name, setting from pg_settings where name like '%cost';
           name         | setting 
  ----------------------+---------
   cpu_index_tuple_cost | 0.005
   cpu_operator_cost    | 0.0025
   cpu_tuple_cost       | 0.01
   random_page_cost     | 4
   seq_page_cost        | 1
  (5 rows)
 
 To model an all-in-RAM database, you can either dial down both
 random_page_cost and seq_page_cost to 0.1 or so, or set random_page_cost
 to 1 and increase all the CPU costs.  The former is less effort ;-)
 
 It should be noted also that there's not all that much evidence backing
 up the default values of the cpu_xxx_cost variables.  In the past those
 didn't matter much because I/O costs always swamped CPU costs anyway.
 But I can foresee us having to twiddle those defaults and maybe refine
 the CPU cost model more, as all-in-RAM cases get more common.

Especially the context switch + copy between shared buffers and system
disk cache will become noticeable at these speeds.

An easy way to test it is loading a table with a few indexes, once with
a shared_buffers value that is enough for only the main table, and once
with one that fits both the table and the indexes.


   regards, tom lane


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-04 Thread Hannu Krosing
On Wed, 2010-08-04 at 21:41 +0300, Hannu Krosing wrote:
 On Wed, 2010-08-04 at 14:00 -0400, Tom Lane wrote:

  regression=# select name, setting from pg_settings where name like '%cost';
            name         | setting 
   ----------------------+---------
    cpu_index_tuple_cost | 0.005
    cpu_operator_cost    | 0.0025
    cpu_tuple_cost       | 0.01
    random_page_cost     | 4
    seq_page_cost        | 1
   (5 rows)
  
  To model an all-in-RAM database, you can either dial down both
  random_page_cost and seq_page_cost to 0.1 or so, or set random_page_cost
  to 1 and increase all the CPU costs.  The former is less effort ;-)
  
  It should be noted also that there's not all that much evidence backing
  up the default values of the cpu_xxx_cost variables.  In the past those
  didn't matter much because I/O costs always swamped CPU costs anyway.
  But I can foresee us having to twiddle those defaults and maybe refine
  the CPU cost model more, as all-in-RAM cases get more common.
 
 Especially the context switch + copy between shared buffers and system
 disk cache will become noticeable at these speeds.
 
 An easy way to test it is loading a table with a few indexes, once with
 a shared_buffers value that is enough for only the main table, and once
 with one that fits both the table and the indexes.

OK, just to back this up, I ran the following test with 28MB and 128MB
shared buffers.

create table sbuf_test(f1 float, f2 float, f3 float);
create index sbuf_test1 on sbuf_test(f1);
create index sbuf_test2 on sbuf_test(f2);
create index sbuf_test3 on sbuf_test(f3);

and then ran the following 3 times for each shared_buffers setting:

truncate sbuf_test;
insert into sbuf_test 
select random(), random(), random() from generate_series(1,60);

The main table size was 31MB and the indexes were 18MB each, for a total
size of 85MB.

In the case of 128MB shared buffers, the insert ran in 14 sec (+/- 1 sec).

In the case of 28MB shared buffers, the insert ran between 346 and 431 sec,
that is 20-30 _times_ slower.

There was ample space for keeping the indexes in the Linux cache (it has 1GB
cached currently), though the system may have decided to start writing them
to disk, so I suspect that most of the time was spent copying random
index pages back and forth between shared buffers and the disk cache.

I did not verify this, so there may be some other factors involved, but
this seems like the most obvious suspect.

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-04 Thread Hannu Krosing
On Wed, 2010-08-04 at 22:03 +0300, Hannu Krosing wrote:
 On Wed, 2010-08-04 at 21:41 +0300, Hannu Krosing wrote:
  On Wed, 2010-08-04 at 14:00 -0400, Tom Lane wrote:
 
   regression=# select name, setting from pg_settings where name like '%cost';
             name         | setting 
    ----------------------+---------
     cpu_index_tuple_cost | 0.005
     cpu_operator_cost    | 0.0025
     cpu_tuple_cost       | 0.01
     random_page_cost     | 4
     seq_page_cost        | 1
    (5 rows)
   
   To model an all-in-RAM database, you can either dial down both
   random_page_cost and seq_page_cost to 0.1 or so, or set random_page_cost
   to 1 and increase all the CPU costs.  The former is less effort ;-)
   
   It should be noted also that there's not all that much evidence backing
   up the default values of the cpu_xxx_cost variables.  In the past those
   didn't matter much because I/O costs always swamped CPU costs anyway.
   But I can foresee us having to twiddle those defaults and maybe refine
   the CPU cost model more, as all-in-RAM cases get more common.
  
  Especially the context switch + copy between shared buffers and system
  disk cache will become noticeable at these speeds.
  
  An easy way to test it is loading a table with a few indexes, once with
  a shared_buffers value that is enough for only the main table, and once
  with one that fits both the table and the indexes.

I re-ran the test and checked idx_blks_read for the 28MB case:

hannu=# select * from pg_statio_user_indexes where relname = 'sbuf_test';
 schemaname |  relname  | indexrelname | idx_blks_read | idx_blks_hit 
------------+-----------+--------------+---------------+--------------
 hannu      | sbuf_test | sbuf_test1   |         71376 |      1620908
 hannu      | sbuf_test | sbuf_test2   |         71300 |      1620365
 hannu      | sbuf_test | sbuf_test3   |         71436 |      1619619


This means that there were a total of 214112 index blocks read back from
the disk cache (obviously at least some of these had to be copied the other
way as well).

This seems to indicate about 1 ms for moving pages over user/system
boundary. (Intel Core2 Duo T7500 @ 2.20GHz, Ubuntu 9.10, 4GB RAM)
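
Rough arithmetic behind that estimate, using the midpoint of the slow runs
above and assuming each block read back also implies an earlier write-out
(i.e. roughly two user/system crossings per block):

(390 s - 14 s) / (2 * 214112 crossings) ~= 0.9 ms per crossing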

For 128MB shared buffers the total idx_blks_read for the 3 indexes was about
6300.


 ok, just to back this up I ran the following test with 28MB and 128MB
 shared buffers.
 
 create table sbuf_test(f1 float, f2 float, f3 float);
 create index sbuf_test1 on sbuf_test(f1);
 create index sbuf_test2 on sbuf_test(f2);
 create index sbuf_test3 on sbuf_test(f3);
 
 and then did 3 times the following for each shared_buffers setting
 
 truncate sbuf_test;
 insert into sbuf_test 
 select random(), random(), random() from generate_series(1,60);
 
 the main table size was 31MB, indexes were 18MB each for total size of
 85MB
 
 In the case of 128MB shared buffers, the insert ran in 14 sec (+/- 1 sec).
 
 In the case of 28MB shared buffers, the insert ran between 346 and 431 sec,
 that is 20-30 _times_ slower.
 
 There was ample space for keeping the indexes in linux cache (it has 1GB
 cached currently) though the system may have decided to start writing it
 to disk, so I suspect that most of the time was spent copying random
 index pages back and forth between shared buffers and disk cache.
 
 I did not verify this, so there may be some other factors involved, but
 this seems like the most obvious suspect.
 
 -- 
 Hannu Krosing   http://www.2ndQuadrant.com
 PostgreSQL Scalability and Availability 
Services, Consulting and Training
 
 
 



-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Questions on query planner, join types, and work_mem

2010-08-04 Thread Hannu Krosing
On Wed, 2010-08-04 at 15:16 -0400, Greg Smith wrote:
 Hannu Krosing wrote:
  There was ample space for keeping the indexes in linux cache (it has 1GB
  cached currently) though the system may have decided to start writing it
  to disk, so I suspect that most of the time was spent copying random
  index pages back and forth between shared buffers and disk cache.

 
 Low shared_buffers settings will result in the same pages more often 
 being written multiple times per checkpoint,

Do you mean written to disk, or written out from shared_buffers to
disk cache ?

  particularly index pages, 
 which is less efficient than keeping in the database cache and updating 
 them there.  This is a slightly different issue than just the overhead 
 of copying them back and forth; by keeping them in cache, you actually 
 reduce writes to the OS cache. 

That's what I meant. Both writes to and reads from the OS cache take a
significant amount of time once you are no longer doing real disk I/O.

 What I do to quantify that is...well, 
 the attached shows it better than I can describe; only works on 9.0 or 
 later as it depends on a feature I added for this purpose there.  It 
 measures exactly how much buffer cache churn happened during a test, in 
 this case creating a pgbench database.
 
 -- 
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us
 
 



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Hannu Krosing
On Fri, 2010-07-23 at 09:52 -0700, Joshua D. Drake wrote:
 On Thu, 2010-07-22 at 20:56 +0100, Hannu Krosing wrote:
   
   Let's extend this shall we:
   
   Avoid adding yet another network hop
  
  PostgreSQL is multi-process, so you either have a separate pooler
  process or need to put pooler functionality in postmaster; both ways
  you still have a two-hop scenario for connect. You may be able to pass
  the socket to the child process and also keep it, but doing this for both
  client and db sides seems really convoluted.
 
 Which means, right now there is three hops. Reducing one is good.

No, it is still two, as the postmaster passes the socket to the spawned
child PostgreSQL process after login.

The process is as follows:

Client --connects-- postmaster --spawns-- PostgreSQL server process

then the socket is passed to be used directly, so subsequent use is:

Client --talks-to--- PostgreSQL server process

When using a pooler it becomes:

Client --connects-to-- pooler --passes-requests-to-- PostgreSQL

I see no way to have the pooler select the PostgreSQL process, pass the
client connection in a way that talks directly to the PostgreSQL server
process, AND be able to get the server connection back once the client is
finished with either the request, transaction or connection (depending on
the pooling mode).



 
  Or is there a portable way to pass sockets back and forth between
  parent and child processes ?
  
  If so, then pgbouncer could use it as well.
  
   Remove of a point of failure
  
  rather move the point of failure from external pooler to internal
  pooler ;)
 
 Yes but at that point, it doesn't matter. 
 
  
   Reduction of administrative overhead
  
  Possibly. But once you start actually using it, you still need to
  configure and monitor it and do other administrator-y tasks.
 
 Yes, but it is inclusive.
 
  
   Integration into our core authentication mechanisms
  
  True, although for example having SSL on client side connection will be
  so slow that it hides any performance gains from pooling, at least for
  short-lived connections.
 
 Yes, but right now you can't use *any* pooler with LDAP for example. We
 could if pooling was in core. Your SSL argument doesn't really work
 because its true with or without pooling.

As the main slowdown in SSL is connection setup, you can get both the
network security and the pooling speedup if you run the pool on the client
side and make the pooler-to-server connection over SSL.


   Greater flexibility in connection control
  
  Yes, poolers can be much more flexible than default postgresql. See for
  example pgbouncers PAUSE , RECONFIGURE and RESUME commands 
 
 :D
 
  
   And, having connection pooling in core does not eliminate the use of an
   external pool where it makes since.
  
  Probably the easiest way to achieve pooling in core would be adding an
  option to start pgbouncer under postmaster control.
 
 Yeah but that won't happen. 

I guess it could happen as part of opening up the postmaster-controlled
process machinery to be configurable and able to run third-party stuff.

Another thing to run under postmaster control would be pgqd.

 Also I think we may have a libevent
 dependency that we have to work out.
 
  
  You probably can't get much leaner than pgbouncer.
 
 Oh don't get me wrong. I love pgbouncer. It is my recommended pooler but
 even it has limitations (such as auth).

As pgbouncer is single-threaded and the main goal has been performance,
there is not much enthusiasm about having _any_ auth method included
which can't be completed in a few CPU cycles. It may be possible to add
threads to wait for LDAP/Kerberos/... responses or to do SSL handshakes, but
I have not seen any interest from Marko in doing it himself.

Maybe there is a way to modularise the auth part of postmaster in a way
that could be used from third party products through some nice API which
postmaster-controlled pgbouncer can start using.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Hannu Krosing
On Sat, 2010-07-24 at 14:36 +0800, Craig Ringer wrote:
 On 24/07/10 13:23, Greg Smith wrote:
  Joshua Tolley wrote:
  Relatively minor, but it would be convenient to avoid having to query
  $external_pooler to determine the client_addr of an incoming connection.

  
  You suggest this as a minor concern, but I consider it to be one of the
  most compelling arguments in favor of in-core pooling.  A constant pain
  with external poolers is the need to then combine two sources of data in
  order to track connections fully, which is something that everyone runs
  into eventually and finds annoying.  It's one of the few things that
  doesn't go away no matter how much fiddling you do with pgBouncer, it's
  always getting in the way a bit.  And it seems to seriously bother
  systems administrators and developers, not just the DBAs.
 
 
 Putting a pooler in core won't inherently fix this, and won't remove the
 need to solve it for cases where the pooler can't be on the same machine.
 
 9.0 has application_name to let apps identify themselves. Perhaps a
 pooled_client_ip, to be set by a pooler rather than the app, could be
 added to address this problem in a way that can be used by all poolers
 new and existing, not just any new in-core pooling system.
 
 If a privileged set of pooler functions was considered, as per my
 other recent mail, the pooler could use a management connection to set
 the client ip before handing the connection to the client, so the client
 couldn't change pooled_client_ip itself by accident or through malice.
 But even without that, it'd be awfully handy.

Or maybe we can add some command extensions to the protocol for passing
extra info, so that instead of sending just the (run_query:query)
command over the socket we could send both the extra info and the query,
e.g. (set_params:(proxy_client_ip:a.b.c.d)(proxy_client_port:n)(something
else))(run_query:query), in one packet (for performance) and have these
things be available in logging and pg_stat_activity.

I see no need to try to somehow restrict these if you can always be sure
that they are set by the direct client. The proxy can decide to pass some of
these on from the real client, but it would be a decision made by the proxy,
not mandated by some proxying rules.






-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Testing Sandforce SSD

2010-07-27 Thread Hannu Krosing
On Mon, 2010-07-26 at 14:34 -0400, Greg Smith wrote:
 Matthew Wakeling wrote:
  Yeb also made the point - there are far too many points on that graph 
  to really tell what the average latency is. It'd be instructive to 
  have a few figures, like only x% of requests took longer than y.
 
 Average latency is the inverse of TPS.  So if the result is, say, 1200 
 TPS, that means the average latency is 1 / (1200 transactions/second) = 
 0.83 milliseconds/transaction. 

This is probably only true if you run all transactions sequentially in
one connection.

If you run 10 parallel threads and get 1200 tps, the average transaction
time (latency?) is probably closer to 8.3 ms.
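
The back-of-the-envelope relation is: average latency ~= number of
concurrent clients / TPS. A single connection doing 1200 tps averages
1/1200 s ~= 0.83 ms per transaction, while 10 parallel clients producing
the same 1200 tps average 10/1200 s ~= 8.3 ms each.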

  The average TPS figure is normally on a 
 more useful scale as far as being able to compare them in ways that make 
 sense to people.
 
 pgbench-tools derives average, worst-case, and 90th percentile figures 
 for latency from the logs.  I have 37MB worth of graphs from a system 
 showing how all this typically works for regular hard drives I've been 
 given permission to publish; just need to find a place to host it at 
 internally and I'll make the whole stack available to the world.  So far 
 Yeb's data is showing that a single SSD is competitive with a small 
 array on average, but with better worst-case behavior than I'm used to 
 seeing.
 
 -- 
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us
 
 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Hannu Krosing
On Thu, 2010-07-22 at 20:57 -0400, Robert Haas wrote:
 On Thu, Jul 22, 2010 at 3:15 PM, Hannu Krosing ha...@krosing.net wrote:
  On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote:
  On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
  cr...@postnewspapers.com.au wrote:
   So rather than asking should core have a connection pool perhaps
   what's needed is to ask what can an in-core pool do that an external
   pool cannot do?
 
  Avoid sending every connection through an extra hop.
 
  not really. in-core != magically-in-right-backend-process
 
 Well, how about if we arrange it so it IS in the right backend
 process?  I don't believe magic is required.

Do you have any design in mind for how you can make it so ?

---
Hannu




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Hannu Krosing
On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote:
 On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
  So rather than asking should core have a connection pool perhaps
  what's needed is to ask what can an in-core pool do that an external
  pool cannot do?
 
 Avoid sending every connection through an extra hop.

Not really: in-core != magically-in-right-backend-process.


There will still be an extra hop, only it will be local, between the pooler
and the backend process.

This is similar to what currently happens with pgbouncer when you deploy it
on the same server and use unix sockets.

 
 -- 
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise Postgres Company
 



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Hannu Krosing
On Thu, 2010-07-22 at 12:15 -0700, Joshua D. Drake wrote:
 On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote: 
  On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
  cr...@postnewspapers.com.au wrote:
   So rather than asking should core have a connection pool perhaps
   what's needed is to ask what can an in-core pool do that an external
   pool cannot do?
  
  Avoid sending every connection through an extra hop.
 
 Let's extend this shall we:
 
 Avoid adding yet another network hop

PostgreSQL is multi-process, so you either have a separate pooler
process or need to put pooler functionality in postmaster; both ways
you still have a two-hop scenario for connect. You may be able to pass
the socket to the child process and also keep it, but doing this for both
client and db sides seems really convoluted.

Or is there a portable way to pass sockets back and forth between
parent and child processes ?

If so, then pgbouncer could use it as well.

 Remove of a point of failure

rather move the point of failure from external pooler to internal
pooler ;)

 Reduction of administrative overhead

Possibly. But once you start actually using it, you still need to
configure and monitor it and do other administrator-y tasks.

 Integration into our core authentication mechanisms

True, although for example having SSL on client side connection will be
so slow that it hides any performance gains from pooling, at least for
short-lived connections.

 Greater flexibility in connection control

Yes, poolers can be much more flexible than default PostgreSQL. See for
example pgbouncer's PAUSE, RECONFIGURE and RESUME commands.

 And, having connection pooling in core does not eliminate the use of an
 external pool where it makes since.

Probably the easiest way to achieve pooling in core would be adding an
option to start pgbouncer under postmaster control.

You probably can't get much leaner than pgbouncer.

 -- 
 PostgreSQL.org Major Contributor
 Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
 Consulting, Training, Support, Custom Development, Engineering
 http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
 
 
 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Hannu Krosing
On Sun, 2010-07-18 at 21:48 +0530, Rajesh Kumar Mallah wrote:
 Hi,
 
 Sorry, if posting here was not proper instead of starting new thread
 (I am really not sure if its bad thing to do)
 
 I would like to share my recent experience on implementation of
 client side pooling using pgbouncer. By client side I mean that
 the pgbouncer process is not on the same machine as the postgresql server.
 In the first trial, pgbouncer and postgresql were on the same machine;
 pgbouncer was connecting to postgresql using unix domain sockets. But I
 shifted it later owing to high CPU usage > 50% (using top).
 Now i have shifted pgbouncer into a virtual machine (openvz container)
 in the application server hardware 

Why in VM (openvz container) ?

Did you also try it in the same OS as your appserver ?

Perhaps even connecting from the appserver via unix sockets ?

 and all my applications on other virtual machines
 (web applications) connect to pgbouncer on this virtual machine.
 
 I tested the setup with pgbench in two scenarios 
 
 1. connecting to DB server directly
 2. connecting to DB via pgbouncer
 
 the no of clients was 10 ( -c 10)  carrying out 1 transactions
 each (-t 1) .
 pgbench db was initialised with scaling factor -s 100. 
 
 since client count was less there was no queuing of requests in
 pgbouncer
 i would prefer to say  it was in 'passthrough' mode. 
 
 the result was that 
 
 1. direct ~ 2000 tps
 2. via pgbouncer ~ 1200 tps

Are you sure you are not measuring how much running pgbouncer slows down
pgbench directly, by competing for CPU resources and not by adding
latency to requests ?


 --
 Experience on deploying to production environment with real world
 load/usage pattern
 --
 
 Pgbouncer was put in same machine as postgresql connecting via unix
 domain
 to server and tcp sockets with clients.
 
 1. There was drastic reduction in CPU loads  from  30 to 10 ldavg
 2. There were no clients waiting, pool size was 150 and number of
 active
 connections was 100-120.
 3. Application performance was worse (inspite of 0 clients waiting ) 
 
 
 I am still waiting to see what is the effect of shifting pgbouncer out
 from the dbserver to the appserver, but with the pgbench results I am not
 very hopeful. I am curious why, in spite of 0 clients waiting, pgbouncer
 introduces a drop in tps.

If you have fewer clients than pgbouncer connections, you can't have any
clients waiting in pgbouncer, as each of them is allocated its own
connection right away.

What you were measuring was:

1. pgbench and pgbouncer competing for the same CPU
2. overhead from 2 hops to db (app-proxy-db) instead of 1 (app-db)

 Warm Regds
 Rajesh Kumar Mallah.
 CTO - tradeindia.com.
 
 
 
 Keywords: pgbouncer performance
 
 
 
 
 
 
 
 
 
 
 On Mon, Jul 12, 2010 at 6:11 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
  Craig Ringer cr...@postnewspapers.com.au wrote:

   So rather than asking should core have a connection pool perhaps
   what's needed is to ask what can an in-core pool do that an
   external pool cannot do?

  (1)  It can prevent the most pessimal performance problems resulting
  from lack of an external connection pool (or a badly configured one)
  by setting a single GUC.  Configuration tools could suggest a good
  value during initial setup.

  (2)  It can be used without installing and configuring a more
  sophisticated and complex product.

  (3)  It might reduce latency because it avoids having to receive,
  parse, and resend data in both directions -- eliminating one hop.
  I know the performance benefit would usually accrue to the external
  connection pooler, but there might be some circumstances where a
  built-in pool could win.

  (4)  It's one more checkbox which can be ticked off on some RFPs.

  That said, I fully agree that if we can include good documentation
  on the external poolers and we can get packagers to include poolers
  in their distribution, that gets us a much bigger benefit.  A
  built-in solution would only be worthwhile if it was simple enough
  and lightweight enough not to be a burden on execution time or
  maintenance.  Maybe that's too big an if.

  -Kevin
 
 
 --
 Sent via pgsql-performance mailing list
 (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 

Re: [PERFORM] Need help in performance tuning.

2010-07-14 Thread Hannu Krosing
On Fri, 2010-07-09 at 00:42 -0400, Tom Lane wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
  On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
  cr...@postnewspapers.com.au wrote:
  If you're not using a connection pool, start using one.
 
  I see this issue and subsequent advice cross this list awfully
  frequently.  Is there in architectural reason why postgres itself
  cannot pool incoming connections in order to eliminate the requirement
  for an external pool?
 
 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

One example where you need a separate connection pool is pooling a really
large number of connections, which you may want to do on a host other than
the one the database itself is running on.

For example, pgbouncer had to add an option to accept incoming connections
on unix sockets, because they ran into the IP socket port number limit (a
little above 31k, or more exactly 63k/2). And unix sockets can be used only
on the local host.

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-14 Thread Hannu Krosing
On Wed, 2010-07-14 at 08:58 -0500, Kevin Grittner wrote:
 Scott Marlowe scott.marl...@gmail.com wrote:
  Hannu Krosing ha...@2ndquadrant.com wrote:
  One example where you need a separate connection pool is pooling
  really large number of connections, which you may want to do on
  another host than the database itself is running.
  
  Definitely.  Often it's best placed on the individual webservers
  that are making requests, each running its own pool.
  
 Each running its own pool?  You've just made a case for an
 admissions policy based on active database transactions or active
 queries (or both) on the server having a benefit when used with this
 pooling arrangement.  This collection of pools can't know when the
 CPUs have enough to keep them busy and adding more will degrade
 performance.

I guess this setup is for an OLTP load (read: lots of short transactions
with low timeout limits), where you can just open 2-5 connections per
CPU for a mostly-in-memory database, maybe a little more when disk
accesses are involved. If you have more, then they just wait a few
milliseconds; if you have fewer, you don't have anything else to run
anyway.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-12 Thread Hannu Krosing
On Mon, 2010-07-12 at 18:58 +0800, Craig Ringer wrote:
 On 12/07/10 17:45, Matthew Wakeling wrote:
  
  I'm surprised. Doesn't apache httpd do this? Does it have to do a whole
  load of non-portable stuff? It seems to work on a whole load of platforms.
 
 A lot of what Apache HTTPd does is handled via the Apache Portable
 Runtime (APR). It contains a lot of per-platform handlers for various
 functionality.
 
 http://apr.apache.org/docs/apr/1.4/modules.html
 
 I don't know if the socket passing is provided as part of APR or is part
 of Apache HTTPd its self, but I wouldn't be at all surprised if it was
 in APR.
 
 Personally I'm now swayed by arguments presented here that trying to
 push pooling into core isn't really desirable, and that better
 packaging/bundling of existing solutions would be better.

Better packaging/bundling of existing solutions is good in its own
right, whether there will eventually be some support for pooling in core
or not.

 Perhaps documenting the pluses/minuses of the current pooling options
 and providing clear recommendations on which to use for different use
 cases would help, since half the trouble is users not knowing they need
 a pool or being confused as to which to select.

 This discussion reminds me a bit of Hibernate's built-in client-side
 connection pool. It has one, but it's a unloved stepchild that even the
 Hibernate devs suggest should be avoided in favour of a couple of
 external 3rd party options.

Yes, pooling _is_ often better handled as a (set of) separate options,
simply because here one size definitely does not fit all.

An efficient in-core pooler will probably look very much like pgbouncer
running in a separate thread spawned by the postmaster anyway.

Let's hope there will be some support in core for having user defined
helper processes soon(ish), so tweaking pgbouncer to run as one will be
reasonably easy :)

 A built-in pool seems like a great idea, but there are multiple existing
 ones because they solve different problems in different ways. Unless a
 built-in one could solve ALL those needs, or be so vastly simpler (due
 to code re-use, easier configuration, etc) that it's worth building one
 that won't fit everyone's needs, then it's best to stick to the existing
 external options.
 
 So rather than asking should core have a connection pool perhaps
 what's needed is to ask what can an in-core pool do that an external
 pool cannot do?

Probably nothing. OTOH there are some things that an external pool can
do that a built-in one can't, like running on a separate host and
pooling more than 32000 client connections there.

Cascaded pooling also seems impossible with built-in pooling.

 Admission control / resource limit features would be great to have in
 core, and can't really be done fully in external modules ... but could
 be designed in ways that would allow external poolers to add
 functionality on top. Josh Berkus has made some good points on why this
 isn't as easy as it looks, though:
 
 
 http://it.toolbox.com/blogs/database-soup/admission-control-and-its-discontents-39895
 
 --
 Craig Ringer
 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-20 Thread Hannu Krosing
On Fri, 2010-04-16 at 11:02 +0300, Віталій Тимчишин wrote:
 Hello.
 
 
 I have a query that performs very poor because there is a limit on
 join column that is not applied to other columns:
 
 
 select * from company this_ left outer join company_tag this_1_ on
 this_.id=this_1_.company_id left outer join company_measures
 companymea2_ on this_.id=companymea2_.company_id left outer join
 company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
 = 7 and this_.id5000 
 and this_1_.company_id5000
 order by this_.id asc limit 1000;
 
 
 (plan1.txt)
 Total runtime: 7794.692 ms
 
 
 At the same time if I apply the limit (5000) to other columns in
 query itself it works like a charm:
 
 
 select * from company this_ left outer join company_tag this_1_ on
 this_.id=this_1_.company_id left outer join company_measures
 companymea2_ on this_.id=companymea2_.company_id left outer join
 company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
 = 7 and this_.id5000 
 and this_1_.company_id5000
 and companymea2_.company_id5000 and ces3_.company_id5000
 order by this_.id asc limit 1000;

The queries are not the same.

The 2nd variant will not return the rows where there are no matching rows
in this_1_, companymea2_ or ces3_.company_id.

A query equivalent to the first one would be:


select * from company this_ 
  left outer join company_tag this_1_ 
   on (this_.id=this_1_.company_id 
   and this_1_.company_id5000)
  left outer join company_measures companymea2_ 
   on (this_.id=companymea2_.company_id 
   and companymea2_.company_id5000)
  left outer join company_descr ces3_ 
   on (this_.id=ces3_.company_id 
   and ces3_.company_id5000)
 where this_1_.tag_id = 7 
   and this_.id5000 
 order by this_.id asc 
 limit 1000;


I'm not sure that the planner considers the above form of plan rewrite, nor
that it would make much sense to do so unless there was a really small
number of rows where x_.company_id5000.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] planer chooses very bad plan

2010-04-12 Thread Hannu Krosing
On Sun, 2010-04-11 at 23:12 +0200, Corin wrote:
 Hi,
 
 I'm having a query where the planer chooses a very bad plan.
 
 explain analyze SELECT * FROM telegrams WHERE ((recipient_id=508933 
 AND recipient_deleted=FALSE) OR (user_id=508933 AND user_deleted=FALSE)) 
 ORDER BY id DESC LIMIT 10 OFFSET 0
 
 Limit  (cost=0.00..1557.67 rows=10 width=78) (actual 
 time=0.096..2750.058 rows=5 loops=1)
   -  Index Scan Backward using telegrams_pkey on telegrams  
 (cost=0.00..156545.47 rows=1005 width=78) (actual time=0.093..2750.052 
 rows=5 loops=1)
 Filter: (((recipient_id = 508933) AND (NOT recipient_deleted)) 
 OR ((user_id = 508933) AND (NOT user_deleted)))
 Total runtime: 2750.124 ms

You could check if creating special deleted_x indexes helps

do

CREATE INDEX tgrm_deleted_recipent_index ON telegrams(recipient_id)
 WHERE recipient_deleted=FALSE;

CREATE INDEX tgrm_deleted_user_index ON telegrams(user_id) 
 WHERE user_deleted=FALSE;

(if on live system, use CREATE INDEX CONCURRENTLY ...)
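
Once they are in place, it is worth re-checking with EXPLAIN ANALYZE that
the planner actually uses them for the query above (just a sketch of the
verification step):

EXPLAIN ANALYZE SELECT * FROM telegrams
 WHERE ((recipient_id=508933 AND recipient_deleted=FALSE)
     OR (user_id=508933 AND user_deleted=FALSE))
 ORDER BY id DESC LIMIT 10 OFFSET 0;

With the partial indexes available one would hope to see a BitmapOr of the
two new indexes feeding a sort, instead of the backward scan on
telegrams_pkey.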

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Does the psql executable support a fetch many approach when dumping large queries to stdout?

2010-04-06 Thread Hannu Krosing
On Fri, 2010-04-02 at 16:28 -0400, Beaver, John E wrote:
...

 I know that the query used here could have been a COPY statement, which I 
 assume would
  be better-behaved, but I'm more concerned about the case in which the query 
 is more complex.

COPY can copy out results of a SELECT query as well.
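
For example (a sketch; the table, condition and file name are made up),
both of these stream the result out row by row instead of building it all
up on the client first:

COPY (SELECT * FROM big_table WHERE created_on > '2010-01-01') TO STDOUT;

or, from psql, writing to a client-side file:

\copy (SELECT * FROM big_table WHERE created_on > '2010-01-01') TO 'result.txt'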

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] How to fast the REINDEX

2010-04-06 Thread Hannu Krosing
On Thu, 2010-04-01 at 04:27 +0530, raghavendra t wrote:
 I'm sorry I couldn't come up with more, but what you've provided so
 far is roughly equivalent to me telling you that it takes over four
 hours to travel to see my Uncle Jim, and then asking you how I can
 find out how he's doing in less time than that.  There's just not
 much to go on.  :-(

 If you proceed with the course suggested in the URL I referenced,
 people on the list have a chance to be more helpful to you.
 Instead of looking into the priority of the question or where it has
 to be posted, it would be appreciated to keep the discussion to the
 point mentioned.  Truly this question belongs to some other place, as
 you have mentioned in the URL.  But an answer for Q1 might be expected
 at least.

OK, here is my answer to your Q1:

Q1. What are the parameters that will have an effect when issuing the
REINDEX command?

A: Assuming you meant which parameters affect the performance of the
REINDEX command:

Most parameters that affect general performance also affect the REINDEX
command.

Some that affect more are:

* amount of RAM in your server - the most important thing

* speed of disk subsystem - next most important in case not all of
active data fits in memory 

Tunables

*  maintenance_work_mem - affects how much of the sorting can be done in
memory; if you can afford to have maintenance_work_mem > largest index
size, then sorting for index creation can be done in RAM only and is
significantly faster than doing a tape sort with intermediate files
on disk (see the example after this list).

* wal_buffers - the bigger the better here, but it competes with how big
you can make maintenance_work_mem. If more of the heap and the created
indexes can be kept in shared memory, everything runs faster.

* checkpoint_segments - affects how often the whole wal_buffers is synced to
disk; if done too often it wastes a lot of disk bandwidth for no good
reason.

* other checkpoint_* settings - tune to avoid excessive checkpointing.
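
A minimal sketch of the maintenance_work_mem point above (the table name
and the value are illustrative, and assume the server has that much RAM to
spare per concurrent maintenance operation):

SET maintenance_work_mem = '1GB';
REINDEX TABLE my_big_table;
RESET maintenance_work_mem;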

 Hope i could get the information from the other Thread in other
 catagory.

Nah, actually [PERFORM] is the right place to ask. 

Just most people got the impression that you may be doing unnecessary
REINDEXing, and the best way to speed up unneeded things is not to do
them ;)

 Thank you
  
 Regards
 Raghavendra



-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] mysql to postgresql, performance questions

2010-03-25 Thread Hannu Krosing
On Wed, 2010-03-24 at 09:55 +0100, Yeb Havinga wrote:
 Greg Smith wrote:
  Tom Lane wrote:
  So has anyone looked at porting MythTV to PG?

 
  Periodically someone hacks together something that works, last big 
  effort I'm aware of was in 2006, and then it bit rots away.  I'm sure 
  we'd get some user uptake on the result--MySQL corruption is one of 
  the top ten cause of a MythTV system crashing.
 It would be the same with PG, unless the pg cluster configuration with 
 mythtv would come with a properly configured WAL - I had corrupted 
 tables (and a personal wiki entry  (the other mysql database in my 
 house) *only* when I sometimes took the risk of not shutting down the 
 machine properly when e.g. the remote was missing).

Pulling the plug should not corrupt a PostgreSQL database, unless it was
using disks which lie about write caching.

No need for a WAL replica for that.

 regards,
 Yeb Havinga
 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Building multiple indexes concurrently

2010-03-22 Thread Hannu Krosing
On Thu, 2010-03-18 at 16:12 -0400, Justin Pitts wrote:
 It seems to me that a separate partition / tablespace would be a much simpler 
 approach.

Do you mean a separate partition/ tablespace for _each_ index built
concurrently ?

 On Mar 17, 2010, at 5:18 PM, Hannu Krosing wrote:
 
  On Wed, 2010-03-17 at 16:49 -0400, Greg Smith wrote:
  Alvaro Herrera wrote:
  Andres Freund escribió:
  
  
  I find it way much easier to believe such issues exist on a tables in 
  constrast to indexes. The likelihood to get sequential accesses on an 
  index is 
  small enough on a big table to make it unlikely to matter much.
  
  
  Vacuum walks indexes sequentially, for one.
  
  
  That and index-based range scans were the main two use-cases I was 
  concerned would be degraded by interleaving index builds, compared with 
  doing them in succession. 
  
  I guess that tweaking file systems to allocate in bigger chunks help
  here ? I know that xfs can be tuned in that regard, but how about other
  common file systems like ext3 ?
  
  - 
  Hannu Krosing   http://www.2ndQuadrant.com
  PostgreSQL Scalability and Availability 
Services, Consulting and Training
  
  
  
  -- 
  Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
  To make changes to your subscription:
  http://www.postgresql.org/mailpref/pgsql-performance
 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Building multiple indexes concurrently

2010-03-18 Thread Hannu Krosing
On Wed, 2010-03-17 at 16:49 -0400, Greg Smith wrote:
 Alvaro Herrera wrote:
  Andres Freund escribió:
 

  I find it way much easier to believe such issues exist on tables in 
  contrast to indexes. The likelihood to get sequential accesses on an 
  index is 
  small enough on a big table to make it unlikely to matter much.
  
 
  Vacuum walks indexes sequentially, for one.

 
 That and index-based range scans were the main two use-cases I was 
 concerned would be degraded by interleaving index builds, compared with 
 doing them in succession. 

I guess that tweaking file systems to allocate in bigger chunks helps
here ? I know that xfs can be tuned in that regard, but how about other
common file systems like ext3 ?

- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Is DBLINK transactional

2010-03-15 Thread Hannu Krosing
On Sat, 2010-03-13 at 20:10 +0800, Craig Ringer wrote:
 On 13/03/2010 5:54 AM, Jeff Davis wrote:
  On Fri, 2010-03-12 at 12:07 -0500, Merlin Moncure wrote:
  of course.  You can always explicitly open a transaction on the remote
  side over dblink, do work, and commit it at the last possible moment.
  Your transactions aren't perfectly synchronized...if you crash in the
  precise moment between committing the remote and the local you can get
  in trouble.  The chances of this are extremely remote though.
 
  If you want a better guarantee than that, consider using 2PC.
 
 Translation in case you don't know: 2PC = two phase commit.
 
 Note that you have to monitor lost transactions that were prepared for 
 commit then abandoned by the controlling app and periodically get rid of 
 them or you'll start having issues.

And you still have the problem of committing one 2PC transaction and
then crashing before committing the other and then crashing the
transaction monitor before being able to record what crashed :P, though
this possibility is even more remote than just crashing between the 2
original commits (dblink and local).

To get around this fundamental problem, you can actually use async queues
and remember what got replayed on the remote side, so that if you have
crashes on either side, you can simply replay again.
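
A minimal sketch of that idea (all names are made up): give every queued
event a monotonically increasing id on the sending side, record on the
receiving side the last id applied, and make applying an event and
advancing that marker one remote transaction, so replaying after any crash
is harmless:

-- on the sending side
CREATE TABLE event_queue (
    ev_id   bigserial PRIMARY KEY,
    payload text NOT NULL
);

-- on the receiving side
CREATE TABLE applied_marker (
    last_ev_id bigint NOT NULL
);

-- replay logic (described, not shown): fetch rows from event_queue with
-- ev_id greater than the remote last_ev_id, apply each one and update
-- applied_marker in the same remote transaction; re-running this after a
-- crash redoes nothing.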

  The problem with things that are extremely remote possibilities are
  that they tend to be less remote than we expect ;)
 
 ... and they know just when they can happen despite all the odds to 
 maximise the pain and chaos caused.
 
 --
 Craig Ringer
 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] PG 8.3 and large shared buffer settings

2009-10-03 Thread Hannu Krosing
On Mon, 2009-09-28 at 10:36 -0700, Josh Berkus wrote:
 On 9/26/09 8:19 AM, Greg Smith wrote:
  This means that the question you want an answer to is if the OS cache
  isn't really available, where does giving memory to shared_buffers
  becomes less efficient than not caching things at all?  My guess is
  that this number is much larger than 10GB, but I don't think anyone has
  done any tests to try to quantify exactly where it is.  Typically when
  people are talking about systems as large as yours, they're dedicated
  database servers at that point, so the OS cache gets considered at the
  same time. If it's effectively out of the picture, the spot where
  caching still helps even when it's somewhat inefficient due to buffer
  contention isn't well explored.
 
 It also depends on the filesystem.  In testing at Sun and on this list,
 people have found that very large s_b (60% of RAM) plus directIO was
 actually a win on Solaris UFS, partly because UFS isn't very agressive
 or smart about readahead and caching.  On Linux/Ext3, however, it was
 never a win.

Again, it depends. 

On my recent testing of a simple seqscan on a 1-int table that was
entirely in cache (either syscache or shared buffers), the
shared-buffers-only scan was 6% to 10% faster than when the relation
was entirely in system cache and each page had to be switched in via a
syscall / context switch.

This was on Linux/Ext3 but I suspect this to be mostly independent of
file system.
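
If someone wants to redo this kind of test, one way to confirm that the
relation really sits in shared_buffers is contrib/pg_buffercache. A
sketch, with a made-up table name and assuming the default 8kB block
size:

SELECT count(*) * 8192 AS bytes_in_shared_buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = c.relfilenode
 WHERE c.relname = 'onemillionints';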

Also, in ancient times, when I used Slony (an early version of Slony at
that, which did not drop and recreate indexes around the initial copy),
the copy time could be 2 to 3 _times_ slower for large tables with lots
of indexes when the indexes were in system cache vs. when they were in
shared buffers (if I remember correctly, it was 1G shared buffers vs. 3G
on a 4G machine). It was probably due to all kinds of index page splits
etc. which shuffled index pages back and forth a lot between userspace
and syscache. So this is not an entirely read-only thing either.

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-05 Thread Hannu Krosing
On Mon, 2005-10-03 at 14:16 -0700, Josh Berkus wrote:
 Jeff,
 
   Nope, LOTS of testing, at OSDL, GreenPlum and Sun.   For comparison, A
   Big-Name Proprietary Database doesn't get much more than that either.
 
  I find this claim very suspicious.  I get single-threaded reads in
  excess of 1GB/sec with XFS and > 250MB/sec with ext3.
 
 Database reads?  Or raw FS reads?  It's not the same thing.

Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and
it ran in 163 sec. (Dual Opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in
RAID10, reiserfs). A little less than 100MB/sec.

After this I ran count(*) over a 2.4GB file from another tablespace on
another device (4x142GB 10k disks in RAID10) and it ran in 22.5 sec on
the first run and 12.5 on the second.

db=# show shared_buffers ;
 shared_buffers
----------------
 196608
(1 row)

db=# select version();
                               version
--------------------------------------------------------------------------
 PostgreSQL 8.0.3 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 3.3.6
(Debian 1:3.3.6-7)
(1 row)


-- 
Hannu Krosing [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] What needs to be done for real Partitioning?

2005-03-22 Thread Hannu Krosing
On Mon, 2005-03-21 at 09:55 -0800, Josh Berkus wrote:
 Stacy,
 
  Luckily they that had the chance to work with a truly fantastic DBA (the
  author of an Oracle Press performance tuning book even) before they could
  switch back.  He convinced them to make some of their indexes global.
  Performance dramatically improved (compared with both the unpartitioned
  schema, and the partitioned-and-locally-indexed schema), and they've since
  stayed with partitioned tables and a mix of local and global indexes.
 
 Hmmm.  Wouldn't Greg's suggestion of a bitmap index which holds information
 on what values are found in what partition also solve this? Without 1/2 of
 the overhead imposed by global indexes?

 I can actually see such a bitmap as being universally useful to the 
 partitioning concept ... for one, it would resolve the whole partition on 
 {value} issue.

I once (maybe about a year ago) tried to elaborate on using bitmap
index(es) with page granularity as a tool for simultaneous clustering
and lookup for data warehousing using postgres. The main idea was to
determine the storage location from the AND of all clustered bitmap
indexes, giving correspondingly fast and clustered lookups.

This could/should/maybe :) possibly be combined with clustering as well.

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] What needs to be done for real Partitioning?

2005-03-22 Thread Hannu Krosing
On Sun, 2005-03-20 at 00:52 +0100, PFC wrote:
  tableoid would accomplish that already, assuming that the partitioned
  table is effectively a view on separate physical tables.
 
  regards, tom lane
 
   Very good.
 
   Also note the possibility to mark a partition READ ONLY. Or even a 
 table.
   It does not seem very useful but just think that for instance the 
 1999,  
 2000 ... 2004 partitions of a big archive probably never change.  
 READONLY means we're sure they never change, thus no need to backup them  
 every time. Keeping the example of some DB arranged by years / current  
 year / current month, Just backup the current month part every day and  
 the current year every month when you switch partitions.
   This could be achieved also by storing the time of last modification of 
 a  
 table somewhere.

Would we still need regular VACUUMing of a read-only table to avoid
transaction ID wraparound?

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] What needs to be done for real Partitioning?

2005-03-22 Thread Hannu Krosing
On Sat, 2005-03-19 at 19:03 -0500, Tom Lane wrote:
 Josh Berkus josh@agliodbs.com writes:
  -- INSERT INTO should automatically create new partitions where necessary
  -- DELETE FROM should automatically drop empty partitions
  
  I am not sure I agree with either of those, and the reason is that they
  would turn low-lock operations into high-lock operations.  
 
  For INSERT, I think that's a problem we need to work through.
 
 Possibly, but I'm concerned about locking and deadlock issues.  The
 reason that this is iffy is you would start the operation with only
 an INSERT-grade lock, and then discover that you needed to add a
 partition, which is surely something that needs an exclusive-grade
 lock (consider two sessions trying to add the same partition at the
 same time).  So I don't see how to do it without lock upgrading,
 and lock upgrading is always a recipe for deadlocks.
 
 The DELETE case is even worse because you can't physically release
 storage until you're sure nothing in it is needed anymore by any open
 transaction --- that introduces VACUUM-like issues as well as the
 deadlock problem.
 

If we go with my proposal (other post in this thread) of doing most of
the partitioning at the level between the logical file and the physical
1GB storage files, then adding a partition should be nearly the same as
crossing the 1GB boundary is now.

Removing a partition would then be just plain vacuum (if we can make pg
shrink each 1GB subfile independently).

 ---(end of broadcast)---
 TIP 8: explain analyze is your friend
-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] What needs to be done for real Partitioning?

2005-03-22 Thread Hannu Krosing
On Sat, 2005-03-19 at 23:47 -0500, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Hmm.  You are right, but without that we won't be able to enforce
  uniqueness on the partitioned table (we could only enforce it on each
  partition, which would mean we can't partition on anything else than
  primary keys if the tables have one).  IMHO this is something to
  consider.
 
 Well, partitioning on the primary key would be Good Enough for 95% or
 99% of the real problems out there.  I'm not excited about adding a
 large chunk of complexity to cover another few percent.

That automatically means that the partitioning expression has to be a
range over the PK (you don't want to have every tuple in a separate
table :)

And it also means that you have to automatically create new partitions.

Are you sure that partitioning on anything other than the PK would be
significantly harder?

I have a case where I do manual partitioning over start_time
(timestamp), but the PK is an id from a sequence. They are almost, but
not exactly, in the same order. And I don't think that moving the PK to
be (start_time, id) just because of partitioning on the PK only would be
a good design in any way.

So please don't design the system to partition on PK only.
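
For the record, the manual setup looks roughly like the sketch below
(made-up names, using the usual inheritance + CHECK constraint approach;
inserts are routed to the right child by the application or a trigger,
reads go through the parent):

CREATE TABLE events (
    id          serial PRIMARY KEY,
    start_time  timestamp NOT NULL,
    payload     text
);

CREATE TABLE events_2005_02 (
    CHECK (start_time >= '2005-02-01' AND start_time < '2005-03-01')
) INHERITS (events);

CREATE TABLE events_2005_03 (
    CHECK (start_time >= '2005-03-01' AND start_time < '2005-04-01')
) INHERITS (events);

SELECT * FROM events WHERE start_time >= '2005-03-10';

The id stays a plain sequence value, so nothing there forces the
partitioning key to be the PK.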

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] One tuple per transaction

2005-03-18 Thread Hannu Krosing
On Sat, 2005-03-12 at 14:05 -0800, Josh Berkus wrote:
 Tambet,
 
  In one of our applications we have a database function, which
  recalculates COGS (cost of good sold) for certain period. This involves
  deleting bunch of rows from one table, inserting them again in correct
  order and updating them one-by-one (sometimes one row twice) to reflect
  current state. The problem is, that this generates an enormous amount of
  tuples in that table.
 
 Sounds like you have an application design problem ...  how about re-writing 
 your function so it's a little more sensible?

Also, you could at least use a temp table for the intermediate steps.
This will save WAL traffic.
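
Something along these lines (a rough sketch, table and column names made
up; temp tables are not WAL-logged, so the churn in the middle is cheap,
and the real table only gets one round of dead tuples from the final
write-back):

BEGIN;

CREATE TEMP TABLE cogs_work AS
    SELECT * FROM cogs WHERE period = '2005-02';

-- run the delete/insert/update-one-by-one recalculation against
-- cogs_work instead of cogs

DELETE FROM cogs WHERE period = '2005-02';
INSERT INTO cogs SELECT * FROM cogs_work;

DROP TABLE cogs_work;
COMMIT;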

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-26 Thread Hannu Krosing
On Tue, 2005-01-25 at 10:41 -0500, Tom Lane wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
  Why is removing index entries essential ?
 
 Because once you re-use the tuple slot, any leftover index entries would
 be pointing to the wrong rows.

That much I understood ;)

But can't cleaning up the index be left for later?

An index scan has to check the data tuple anyway, at least for
visibility. Would adding a check for field sameness between the index
and data tuples be too big a performance hit?

   regards, tom lane
 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq
-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-25 Thread Hannu Krosing
On Mon, 2005-01-24 at 11:52 +0900, Tatsuo Ishii wrote:
  Tatsuo Ishii [EMAIL PROTECTED] writes:
   Probably VACUUM works well for small to medium size tables, but not
   for huge ones. I'm considering about to implement on the spot
   salvaging dead tuples.
  
  That's impossible on its face, except for the special case where the
  same transaction inserts and deletes a tuple.  In all other cases, the
  transaction deleting a tuple cannot know whether it will commit.
 
 Of course. We need to keep a list of such that tuples until commit or
 abort.

What about other transactions, which may have started before the
current one and still be running when the current one commits?


I once proposed an extra parameter for VACUUM FULL which determines how
much free space to leave in each page vacuumed. If there were room, the
new tuple could be placed near the old one in most cases and thus avoid
lots of disk head movement when updating huge tables in one go.



Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-25 Thread Hannu Krosing
On Sun, 2005-01-23 at 15:40 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Changing the idea slightly might be better: if a row update would cause
  a block split, then if there is more than one row version then we vacuum
  the whole block first, then re-attempt the update.
 
 Block split?  I think you are confusing tables with indexes.
 
 Chasing down prior versions of the same row is not very practical
 anyway, since there is no direct way to find them.
 
 One possibility is, if you tried to insert a row on a given page but
 there's not room, to look through the other rows on the same page to see
 if any are deletable (xmax below the GlobalXmin event horizon).  This
 strikes me as a fairly expensive operation though, especially when you
 take into account the need to get rid of their index entries first.

Why is removing index entries essential?

In pg you always have to visit the data page, so finding the wrong tuple
there could just produce the same result as a deleted tuple (which in
this case it actually is). The cleaning of index entries could be left
to the real vacuum.

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-25 Thread Hannu Krosing
On Thu, 2005-01-20 at 16:00 +0100, Hervé Piedvache wrote:

  Will both do what you want. Replicator is easier to setup but
  Slony is free.
 
 No ... as I have said ... how I'll manage a database getting a table of may 
 be 
 250 000 000 records ? I'll need incredible servers ... to get quick access or 
 index reading ... no ?
 
 So what we would like to get is a pool of small servers able to make one 
 virtual server ... for that is called a Cluster ... no ?
 
 I know they are not using PostgreSQL ... but how a company like Google do to 
 get an incredible database in size and so quick access ?

They use lots of boxes and lots of custom software to implement a very
specific kind of cluster.

 regards,
-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-25 Thread Hannu Krosing
On Thu, 2005-01-20 at 11:02 -0500, Rod Taylor wrote:


 Slony has some other issues with databases > 200GB in size as well
 (well, it hates long running transactions -- and pg_dump is a regular
 long running transaction)

IIRC it hates pg_dump mainly on the master. If you are able to run
pg_dump from a slave, it should be ok.

-- 
Hannu Krosing [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] pg_restore taking 4 hours!

2004-12-07 Thread Hannu Krosing
On Sun, 2004-12-05 at 21:43, Rodrigo Carvalhaes wrote:
 Hi !
 
 Thanks for the lots of tips that I received on this matter.
 
...
 There is something more that I can try to improve this performance?

Check the speed of your IDE drive; maybe tweak some params with
/sbin/hdparm. Sometimes the defaults result in 2MB/sec r/w speeds
(instead of 30-70 MB/sec).


Hannu


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Slow response of PostgreSQL

2004-02-19 Thread Hannu Krosing
On Thu, 2004-02-19 at 11:01, Saleem Burhani Baloch wrote:
 Hi,
 
 Thanks every one for helping me. I have upgraded to 7.4.1 on 
 redhat 8 ( rh 9 require a lot of lib's) and set the configuration 
 sent by Chris. Now the query results in 6.3 sec waooo. I m thinking 
 that why the 7.1 process aggregate slowly. Anyway.
 
 I still have to go for 2 sec result 

What is the plan now ?


Hannu


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] database performance and query performance question

2004-01-22 Thread Hannu Krosing
On Thu, 2004-01-22 at 22:35, Shea,Dan [CIS] wrote:
 Something that I do not understand is why if you use a valid_time =
 '2004-01-22 00:00:00' the query will use the index but if you do a
 valid_time > '2004-01-22 00:00:00' it does not use the index?

It probably can't tell if > is selective enough to justify using the
index.

Together with LIMIT 10 it may be.

You could try 

explain analyze select * from forecastelement where valid_time between 
'2004-01-22'::date and '2004-01-22'::date limit 10;

to see if this is considered good enough.

--
Hannu


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] database performance and query performance question

2004-01-22 Thread Hannu Krosing
On Thu, 2004-01-22 at 22:46, Hannu Krosing wrote:
 On Thu, 2004-01-22 at 22:35, Shea,Dan [CIS] wrote:
  Something that I do not understand is why if you use a valid_time =
  '2004-01-22 00:00:00' the query will use the index but if you do a
  valid_time > '2004-01-22 00:00:00' it does not use the index?
 
 It probably can't tell if > is selective enough to justify using the index.
 
 Together with limit 10 it may be.
 
 You could try 
 
 explain analyze select * from forecastelement where valid_time between 
 '2004-01-22'::date and '2004-01-22'::date limit 10;

Sorry, that should have been:

between '2004-01-22'::date and '2004-01-23'::date


 to see if this is considered good enough.
 
 --
 Hannu
 
 
 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] database performance and query performance question

2004-01-22 Thread Hannu Krosing
On Thu, 2004-01-22 at 23:32, Shea,Dan [CIS] wrote:
 The end date in the previous example was actually invalid  between
 '2004-01-12'::date and '2003-01-12'::date;
 There have been multiple inserts since I recreated the index but it took
 quite some time to complete the following
 PWFPM_DEV=# explain analyze select * from forecastelement where valid_time
 between '2004-01-12'::date and '2004-01-13'::date;

You could try ORDER BY to bias the optimiser towards using an index:

explain analyze
 select *
  from forecastelement
 where valid_time > '2004-01-12'::date
 order by valid_time
 limit 10;

This also may be more close to what you are expecting :)

--
Hannu


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] failures on machines using jfs

2004-01-13 Thread Hannu Krosing
On Mon, 2004-01-12 at 19:03, Greg Spiegelberg wrote:
 Hannu Krosing wrote:
  On Sun, 2004-01-11 at 18:21, Spiegelberg, Greg wrote:
  
 It would seem we're experiencing somthing similiar with our scratch
 volume (JFS mounted with noatime).
  
  
  Which files/directories do you keep on scratch volume ?
  
  All postgres files or just some (WAL, tmp) ?
 
 No Postgres files are kept in scratch only the files being loaded
 into the database via COPY or lo_import.

Then the speedup does not make any sense!

Is reading from a jfs filesystem also 5 times faster than reading from
ext3?

The only explanation I can give for filling the database from a jfs
volume being so much faster would be some strange filesystem cache
interactions.


Hannu



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] failures on machines using jfs

2004-01-12 Thread Hannu Krosing
On Sun, 2004-01-11 at 18:21, Spiegelberg, Greg wrote:
 It would seem we're experiencing somthing similiar with our scratch
 volume (JFS mounted with noatime).

Which files/directories do you keep on scratch volume ?

All postgres files or just some (WAL, tmp) ?

-
Hannu


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Excessive rows/tuples seriously degrading query

2003-12-16 Thread Hannu Krosing
On Sat, 2003-12-13 at 00:40, Chadwick, Russell wrote:
  
 Hello everyone.
 Can anyone explain why this table which has never had more than a
 couple rows in it shows > 500k in the query planner even after running
 vacuum full.  Its terribly slow to return 2 rows of data.  The 2 rows
 in it are being updated a lot but I couldn't find any explanation for
 this behavior. 

It can be that there is an idle transaction somewhere that has locked a
lot of rows (i.e. all your updates have been running inside the same
transaction for hours or days).

try: 
$ ps ax | grep post

on my linux box this gives

 1683 ?        S      0:00 /usr/bin/postmaster -p 5432
 1704 ?        S      0:00 postgres: stats buffer process
 1705 ?        S      0:00 postgres: stats collector process
 5520 ?        S      0:00 postgres: hu hannu [local] idle in transaction
 5524 pts/2    S      0:00 grep post

where backend 5520 seems to be the culprit.
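
If stats_command_string is turned on you can also look from inside the
database; a sketch (on recent versions such backends show up with an
'<IDLE> in transaction' query string, I am not sure how far back that
goes):

SELECT procpid, usename, current_query
  FROM pg_stat_activity
 WHERE current_query LIKE '%IDLE% in transaction%';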

  Anything I could try besides droping db and recreating?  

make sure that no other backend is connected to db and do your 
 vacuum full; analyze;


Or if there seems to be something unidentifiable making your table
unusable, then just recreate that table:

begin;
create table stock_log_positions_tmp 
as select * from stock_log_positions;
drop table stock_log_positions;
alter table stock_log_positions_tmp
 rename to stock_log_positions;
-- if you have any constraints, indexes or foreign keys
-- then recreate them here as well
commit;

 Thanks - Russ
  
---
hannu


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] cross table indexes or something?

2003-11-26 Thread Hannu Krosing
On Wed, 2003-11-26 at 22:14, Jeremiah Jahn wrote:
 I was wondering if there is something I can do that would act similar to
 a index over more than one table. 
 
 I have about 3 million people in my DB at the moment, they all have
 roles, and many of them have more than one name. 
 
 for example, a Judge will only have one name, but a Litigant could have
 multiple aliases. Things go far to slow when I do a query on a judge
 named smith.

If you don't need all the judges named Smith you could try to use LIMIT.

Have you run ANALYZE? Why does the DB think that there is only one judge
with a name like SMITH%?

-
Hannu

P.S. 
Always send EXPLAIN ANALYZE output if asking for advice on [PERFORM]

-
Hannu

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Interbase/Firebird - any users out there - what's

2003-11-06 Thread Hannu Krosing
On Thu, 2003-11-06 at 20:23, Christopher Browne wrote:
 Private [EMAIL PROTECTED] writes:
  Try this benchmark on PostgreSQL, MySQL, FireBird, Oracle:
 
  http://go.jitbot.com/dbbench-pg-fb-mys-orcl
 
 It looks like a good candidate for adding in a plpgsql stored
 procedure to get similar speedups to what was gotten with the Oracle
 benchmark.

It would also be interesting to see the same test run on PostgreSQL on
Linux/UNIX. PgSQL on win2000 (most likely using Cygwin) is probably not
the best you can get out of that hardware.

---
Hannu


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] PostgreSQL data on a NAS device ?

2003-10-20 Thread Hannu Krosing
On Mon, 2003-10-20 at 16:04, Alexander Priem wrote:
 Even better than the four-disk NAS I mentioned earlier is the following:
 
 Promise UltraTrak RM8000. This is a so-called SCSI-to-IDE RAID system.

While you are at it, you could also check out http://www.3ware.com/

I guess one of these with 10k rpm 36GB SATA drives would be pretty
fast and possibly cheaper than SCSI raid.

--
Hannu

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Low Insert/Update Performance

2003-10-20 Thread Hannu Krosing
On Mon, 2003-10-20 at 17:13, Rhaoni Chiu Pereira wrote:
 Hi List,
 
I got a P4 1.7Ghz , 512MB RAM , HD 7200 RPM, on RED HAT 9 running PostgreSQL 
 7.3.2-3 Database.
I have a Delphi aplication that updates the Oracle database using .dbf 
 file's information  ( converting the data from the old clipper aplication ) and 
 it takes about 3min and 45 seconds to update Jan/2003 .

Have you tried contrib/dbase to do the same?

How fast does this run?

 My problem is that I must substitute this Oracle for  a PostgreSQL database 
 and this same Delphi aplication takes 45 min to update Jan/2003.
 All delphi routines are converted and optmized to work with PgSQL.

Could it be that you run each insert in a separate transaction in the
PgSQL version?
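
Without an explicit transaction every INSERT is its own commit (and its
own fsync). A sketch of what the converted loop should send instead
(table and columns made up):

BEGIN;
INSERT INTO sales (id, amount) VALUES (1, 10.00);
INSERT INTO sales (id, amount) VALUES (2, 12.50);
-- ... the rest of the month's rows ...
COMMIT;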

Another possibility is that a primary key index created on the empty
tables is not used in subsequent UNIQUE checks once the tables start to
fill and using the index would be useful. An ANALYZE in a parallel
backend could help here. The same can be true for foreign keys and
unique constraints.

---
Hannu


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Compare rows

2003-10-09 Thread Hannu Krosing
On Thu, 2003-10-09 at 08:36, Josh Berkus wrote:
 Chris,

  The need to do a lot of joins would likely hurt performance somewhat,
  as well as the way that it greatly increases the number of rows.
  Although you could always split it into several tables, one for each
  value_type, and UNION them into a view...
 
 It increases the number of rows, yes, but *decreases* the storage size of data 
 by eliminating thousands ... or millions ... of NULL fields. 

I'm not sure I buy that.

Null fields take exactly 1 *bit* to store (or more exactly, if you have
any null fields in a tuple then one 32-bit int per 32 fields is used for
the NULL bitmap), whereas the same fields in a vertical table take 4
bytes for the primary key and 1-4 bytes for the category key + a tuple
header per value + the necessary indexes. So if you have more than one
non-null field per tuple you will certainly lose on storage.

 How would splitting the vertical values into dozens of seperate tables help things?

If you put each category in a separate table you save the 1-4 bytes for
the category key per value, but still store the primary key and tuple
header *per value*.

You may still get better performance for single-column comparisons, as
fewer pages must be touched.

 Personally, I'd rather have a table with 3 columns and 8 million rows than a 
 table with 642 columns and 100,000 rows.  Much easier to deal with.

Same here ;)

--
Hannu


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] [HACKERS] Index/Function organized table layout (from Re:

2003-10-04 Thread Hannu Krosing
On Fri, 2003-10-03 at 00:57, Christopher Browne wrote:
 [EMAIL PROTECTED] (Jean-Luc Lachance) writes:
  That's one of the draw back of MVCC.  
  I once suggested that the transaction number and other house keeping
  info be included in the index, but was told to forget it...
  It would solve once and for all the issue of seq_scan vs index_scan.
  It would simplify the aggregate problem.
 
 It would only simplify _one_ case, namely the case where someone cares
 about the cardinality of a relation, and it would do that at
 _considerable_ cost.
 
 A while back I outlined how this would have to be done, and for it to
 be done efficiently, it would be anything BUT simple.  

Could this be made a TODO item, perhaps with your attack plan? Of
course as a strictly optional feature useful only for special situations
(see below).

I am cross-posting this to [HACKERS] as it seems relevant to a problem
recently discussed there.

 It would be very hairy to implement it correctly, and all this would
 cover is the single case of SELECT COUNT(*) FROM SOME_TABLE;

Not really. Just yesterday there was a discussion on [HACKERS] about
implementing btree-organized tables, which would be much less needed if
the visibility info were kept in indexes. 

 If you had a single WHERE clause attached, you would have to revert to
 walking through the tuples looking for the ones that are live and
 committed, which is true for any DBMS.

If the WHERE clause could use the same index (or any index with
visibility info) then there would be no need to walk through the tuples
in the data relation.

The typical use case cited on [HACKERS] was time series data, where
inserts are roughly in (timestamp, id) order but queries are in
(id, timestamp) order. Now if the index included all relevant fields
(id, timestamp, data1, data2, ..., dataN) then the query could run on
the index only, touching just a few pages and thus vastly improving
performance. I agree that this is not something everybody needs, but
when it is needed it is needed badly.
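
To make the use case concrete, a hypothetical sketch (names made up):

CREATE TABLE readings (
    id     int,
    ts     timestamp,
    data1  float,
    data2  float
);

CREATE INDEX readings_id_ts ON readings (id, ts);

-- today the index only locates the rows; every matching heap page still
-- has to be visited for the visibility check.  With visibility info (and
-- the needed columns) in the index, this could be answered from a
-- handful of index pages:
SELECT ts, data1
  FROM readings
 WHERE id = 42 AND ts >= '2003-09-01'
 ORDER BY ts;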

 And it still begs the same question, of why the result of this query
 would be particularly meaningful to anyone.  I don't see the
 usefulness; I don't see the value of going to the considerable effort
 of fixing this purported problem.

Being able to do fast count(*) is just a side benefit.


Hannu


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] COUNT(*) again (was Re: [HACKERS] Index/Function

2003-10-04 Thread Hannu Krosing
On Sat, 2003-10-04 at 19:07, Tom Lane wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
  On Fri, 2003-10-03 at 00:57, Christopher Browne wrote:
  A while back I outlined how this would have to be done, and for it to
  be done efficiently, it would be anything BUT simple.  
 
  Could this be made a TODO item, perhaps with your attack plan. 
 
 If I recall that discussion correctly, no one including Christopher
 thought the attack plan was actually reasonable.
 
 What this keeps coming down to is that an optimization that helps only
 COUNT(*)-of-one-table-with-no-WHERE-clause would be too expensive in
 development and maintenance effort to justify its existence.

Please read further in my email ;)

The point I was trying to make was that faster count(*) is just a side
effect. If we could (conditionally) keep visibility info in indexes,
then this would also solve the much more tricky problem of
index-structured tables.

Count(*) is *not* the only query that could benefit from not needing to
go to the actual data table for visibility info. The much more needed
case would be the inverse time series type of queries, which would
otherwise thrash cache pages badly.


Hannu


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] inferior SCSI performance

2003-10-02 Thread Hannu Krosing
On Wed, 2003-10-01 at 19:21, Christopher Browne wrote:

 
 The FS-related result appeared surprising, as the stories I had
 heard suggested that JFS hadn't been particularly heavily tuned on
 Linux, whereas XFS was supposed to be the speed demon.

Gentoo linux recommends XFS only for SAN+fibre channel + good ups for
anything but database use ;)

 It is entirely possible that the result I saw was one that would
 reverse partially or even totally on a system LACKING that cache.  XFS
 might play better when we're cacheless; the (perhaps only fabled)
 demerits of JFS being more than totally hidden if we add the cache.
 
 What I find disappointing is that it isn't possible to get SSD cards
 that are relatively inexpensive.  A similarly fabulous performance
 increase _ought_ to be attainable if you could stick pg_xlog and
 pg_clog on a 256MB (or bigger!) battery-backed SSD, ideally one that
 plugs into a PCI slot.

For really cheap and for small-size transactions you could experiment
with USB2 memory sticks (some of them claim 34MB/s write speed), perhaps
in striped/mirrored configuration. You would just need something
counting writes in the driver layer to alert you when you are reaching
the x00k writes limit and have to plug in new sticks :)

OTOH, articles I found through quick googling suggest only 2.4MB/s write
and 7MB/s read speeds for USB 2.0 memory sticks, so the 34MB/s is
probably just a sales pitch and refers to bus speed, not actual write
speed ;(

 Unfortunately, while there are companies hawking SSDs, they are in the
 you'll have to talk to our salescritter for pricing category, which
 means that they must be ferociously expensive.  :-(.

The cheapest I found, one with external backup power, was ~1.8k$ for a
2GB PCI device:

http://www.cdw.com/shop/search/Results.aspx?key=platypus&x=0&y=0

An external 16GB one with battery backup and
write-to-small-ide-drives-on-power-failure was ~25k$.

-
Hannu

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Need advice about triggers

2003-09-10 Thread Hannu Krosing
On Wed, 2003-09-10 at 13:21, Mindaugas Riauba wrote:

 
 router_db=# explain analyze update ifdata set ifspeed=256000,
 ifreason='12121', iflastupdate=CURRENT_TIMESTAMP WHERE clientid='#0003904#';
                                   QUERY PLAN
 ---------------------------------------------------------------------------
  Index Scan using ifdata_clientid_key on ifdata  (cost=0.00..5.64 rows=1
 width=116) (actual time=0.17..0.36 rows=1 loops=1)
Index Cond: (clientid = '#0003904#'::character varying)
  Total runtime: 1.70 msec
 (3 rows)

Could you try the same query on a similar table where clientid is int4?

Is it faster?

Does the performance degrade at a slower rate?
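
Something like this would do for a quick comparison (a sketch; it assumes
all clientid values look like '#0003904#' so the middle digits can be
cast to int4):

CREATE TABLE ifdata_int AS
    SELECT substr(clientid, 2, 7)::int4 AS clientid,
           ifspeed, ifreason, iflastupdate
      FROM ifdata;

CREATE UNIQUE INDEX ifdata_int_clientid_key ON ifdata_int (clientid);
ANALYZE ifdata_int;

EXPLAIN ANALYZE UPDATE ifdata_int
   SET ifspeed = 256000, ifreason = '12121',
       iflastupdate = CURRENT_TIMESTAMP
 WHERE clientid = 3904;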

---
Hannu


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] SELECT's take a long time compared to other DBMS

2003-09-04 Thread Hannu Krosing
On Thu, 2003-09-04 at 03:28, Relaxin wrote:
 I have a table with 102,384 records in it, each record is 934 bytes.

I created a test database on my Linux (RH9) laptop with 30GB/4200RPM ide
drive and P3-1133Mhz, 768MB, populated it with 128000 rows of 930 bytes
each and did 

[EMAIL PROTECTED] hannu]$ time psql test100k -c 'select * from test' > /dev/null

real    0m3.970s
user    0m0.980s
sys     0m0.570s

So it seems to be definitely not a problem with postgres as such, but
perhaps with Cygwin and/or the ODBC driver.

I also ran the same query using the standard pg adapter:

>>> import pg, time
>>> con = pg.connect('test100k')
>>> def getall():
...     t1 = time.time()
...     res = con.query('select * from test')
...     t2 = time.time()
...     list = res.getresult()
...     t3 = time.time()
...     print t2 - t1, t3-t2
...
>>> getall()
3.27637195587 1.10105705261
>>> getall()
3.07413101196 0.996125936508
>>> getall()
3.03377199173 1.07322502136

which gave similar results

--
Hannu 




---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html