[PERFORM] t1000/t2000 sun-servers

2006-03-06 Thread Claus Guttesen
Hi.

Has anybody tried the new Sun cool-thread servers t1000/t2000 from
Sun? I'd love to see benchmarks with Solaris 10 and pg 8.1.

regards
Claus



[PERFORM] Postgres on VPS - how much is enough?

2006-03-06 Thread Nagita Karunaratne
How big a VPS would I need to run a Postgres DB.

I need a Postgres database with about 15 tables that will run on a
single virtual private server.

The 15 tables will be spread over three tablespaces (5 tables per
tablespace) and be accessed by three different applications running on
different machines.

One application will add about 500 orders per day.
Another will access this data to create and send about 500 emails per day.
A third will access this data to create an after-sales survey at most 500
times per day.

What type of VPS would I need to run a database with this type of load?
Is 128 MB of RAM enough?
What percentage of a 2.8 GHz CPU would be required?

It's been a long time since I used Postgres.

Thanks for any help,
Nagita



Re: [PERFORM] t1000/t2000 sun-servers

2006-03-06 Thread Neil Saunders
I may be able to organize a test on a T2000 if someone could give
advice as to an appropriate test to run...

Cheers,

Neil

On 3/6/06, Claus Guttesen [EMAIL PROTECTED] wrote:
 Hi.

 Has anybody tried the new Sun cool-thread servers t1000/t2000 from
 Sun? I'd love to see benchmarks with Solaris 10 and pg 8.1.

 regards
 Claus





[PERFORM] Hanging queries and I/O exceptions

2006-03-06 Thread Jan de Visser
Hello,

While doing performance tests on Windows Server 2003 we observed the following 
two problems.

Environment: J2EE application running in JBoss application server, against 
pgsql 8.1 database. Load is caused by a smallish number of (very) complex 
transactions, typically about 5-10 concurrently.

The first one, which bothers me the most, is that after about 6-8 hours the 
application stops processing. No errors are reported, neither by the JDBC 
driver nor by the server, but when I kill the application server, I see that 
all my connections hang in SQL statements (which never seem to return):

2006-03-03 08:17:12 4504 6632560 LOG:  duration: 45087000.000 ms  statement: 
EXECUTE unnamed  [PREPARE:  SELECT objID FROM objects WHERE objID = $1 FOR 
UPDATE]

I think I can reliably reproduce this by loading the app, and waiting a couple 
of hours.
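
A quick check I plan to run next time it happens, to see whether those
sessions are actually waiting on locks (a rough sketch, assuming superuser
access; current_query is only populated when stats_command_string is on):

SELECT a.procpid, l.mode, l.granted, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted;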



The second problem is less predictable:

JDBC exception:

An I/O error occured while sending to the backend.
org.postgresql.util.PSQLException: An I/O error occured while sending to the 
backend.
at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:214)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:430)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:346)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:250)


In my server log, I have:

2006-03-02 12:31:02 5692 6436342 LOG:  could not receive data from client: A 
non-blocking socket operation could not be completed immediately.

At the time my box is fairly heavily loaded, but still responsive. The server 
and the JBoss appserver live on the same dual 2GHz Opteron.

A quick Google told me that:

1. More people have seen this.
2. No solutions.
3. The server message appears to indicate an unhandled WSAEWOULDBLOCK winsock 
error on recv(), which MSDN says is to be expected and should be retried.

Is this a known bug?

jan 


-- 
--
Jan de Visser                     [EMAIL PROTECTED]

                Baruk Khazad! Khazad ai-menu!
--



Re: [PERFORM] Postgres on VPS - how much is enough?

2006-03-06 Thread Matthew Nuzum
On 3/6/06, Nagita Karunaratne [EMAIL PROTECTED] wrote:
 How big a VPS would I need to run a Postgres DB.


 One application will add about 500 orders per day
 Another will  access this data to create and send about 500 emails per day
 A third will access this data to create an after-sales survey for at
 most 500 times per day.

 What type of VPS would I need to run a database with this type of load?
 Is 128 MB ram enough?
 What percentage of a 2.8 GHz CPU would be required?

My problem with running PG inside of a VPS was that the VPS used a
virtual filesystem... basically, a single file that had been formatted
and loop mounted so that it looked like a regular hard drive.
Unfortunately, it was very slow. The difference between my application
and yours is that mine well more than filled the 1GB of RAM that I had
allocated. If your data will fit comfortably into RAM then you may be
fine.

If you really want to know how it will work, try running it yourself.
Two projects that make this really easy and free are the coLinux
project[1], which allows you to run a Linux VPS on Windows, and the
linux-vserver project[2], which is free software that works on pretty
much any Linux OS.

Try it out, tinker with the values and that way you won't have to
guess when making your purchase decision.

[1] http://www.colinux.org/ Cooperative Linux
[2] http://linux-vserver.org/ Linux-vserver project

--
Matthew Nuzum
www.bearfruit.org



Re: [PERFORM] Postgres on VPS - how much is enough?

2006-03-06 Thread Kevin

Nagita Karunaratne wrote:

How big a VPS would I need to run a Postgres DB.

I need a Postgres database with about 15 tables that will run on a
single virtual private server.

The 15 tables will be spread over three tablespaces (5 tables per
tablespace) and be accessed by three different applications running on
different machines.

One application will add about 500 orders per day
Another will  access this data to create and send about 500 emails per day
A third will access this data to create an after-sales survey for at
most 500 times per day.

What type of VPS would I need to run a database with this type of load?
Is 128 MB ram enough?
What percentage of a 2.8 GHz CPU would be required?
  
If the database is going to be larger than the allocated memory, disk 
I/O is very important. Not all VPS technologies are equal in this 
regard (see the link below). As someone else suggested, the best way to 
know what VPS specs you need is to run your own tests/benchmarks.


http://www.cl.cam.ac.uk/Research/SRG/netos/xen/performance.html

-Kevin




[PERFORM] Can anyone explain this pgbench results?

2006-03-06 Thread Joost Kraaijeveld
Hi,

Below are some results of running pgbench, run on a machine that is doing 
nothing else than running PostgreSQL with pgbench. The strange thing is that 
the results are *constantly alternating* high (750-850 transactions) and low 
(50-80 transactions), no matter how many tests I run. If I wait a long time 
(> 5 minutes) after running the test, I always get a high score, followed by a 
low one, followed by a high one, a low one, etc.

I was expecting a low(ish) score on the first run (because the tables are not 
loaded in the cache yet), followed by continued high(ish) scores, but not an 
alternating pattern. I also did not expect so much difference, given the 
hardware I have (dual Opteron, 4GB memory, 3Ware SATA RAID5 with 5 disks, 
separate swap and pg_log disks).

Anyone any idea?

Results of pgbench:

[EMAIL PROTECTED]:/usr/lib/postgresql/8.1/bin$ ./pgbench  -c 10 -t 150 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 150
number of transactions actually processed: 1500/1500
tps = 50.651705 (including connections establishing)
tps = 50.736338 (excluding connections establishing)
[EMAIL PROTECTED]:/usr/lib/postgresql/8.1/bin$ ./pgbench  -c 10 -t 150 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 150
number of transactions actually processed: 1500/1500
tps = 816.972995 (including connections establishing)
tps = 836.951755 (excluding connections establishing)
[EMAIL PROTECTED]:/usr/lib/postgresql/8.1/bin$ ./pgbench  -c 10 -t 150 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 150
number of transactions actually processed: 1500/1500
tps = 42.924294 (including connections establishing)
tps = 42.986747 (excluding connections establishing)
[EMAIL PROTECTED]:/usr/lib/postgresql/8.1/bin$ ./pgbench  -c 10 -t 150 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 150
number of transactions actually processed: 1500/1500
tps = 730.651970 (including connections establishing)
tps = 748.538852 (excluding connections establishing)


TIA

Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: [EMAIL PROTECTED]
web: www.askesis.nl 



Re: [PERFORM] Postgres on VPS - how much is enough?

2006-03-06 Thread Nagita Karunaratne
Thanks for the replies,

From personal experience, would you run Postgres on a Linux machine
(NOT a VPS) with 512MB of RAM?

Assuming I can keep all my data in memory.

Thanks,
Nagita

 My problem with running PG inside of a VPS was that the VPS used a
 virtual filesystem... basically, a single file that had been formatted
 and loop mounted so that it looked like a regular hard drive.
 Unfortunately, it was very slow. The difference between my application
 and yours is that mine well more than filled the 1GB of RAM that I had
 allocated. If your data will fit comfortably into RAM then you may be
 fine.




Re: [PERFORM] Postgres on VPS - how much is enough?

2006-03-06 Thread Marc G. Fournier

On Mon, 6 Mar 2006, Matthew Nuzum wrote:


On 3/6/06, Nagita Karunaratne [EMAIL PROTECTED] wrote:

How big a VPS would I need to run a Postgres DB.




One application will add about 500 orders per day
Another will  access this data to create and send about 500 emails per day
A third will access this data to create an after-sales survey for at
most 500 times per day.

What type of VPS would I need to run a database with this type of load?
Is 128 MB ram enough?
What percentage of a 2.8 GHz CPU would be required?


My problem with running PG inside of a VPS was that the VPS used a
virtual filesystem... basically, a single file that had been formatted
and loop mounted so that it looked like a regular hard drive.
Unfortunately, it was very slow. The difference between my application
and yours is that mine well more than filled the 1GB of RAM that I had
allocated. If your data will fit comfortably into RAM then you may be
fine.


We host VPSs here (http://www.hub.org) and don't use the 'single file, 
virtual file system' to put them into ... it must depend on where you 
host?



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664



Re: [PERFORM] Can anyone explain this pgbench results?

2006-03-06 Thread Joost Kraaijeveld
Hi Michael,

Thanks for your response.

Michael Fuhr wrote:
 On Mon, Mar 06, 2006 at 04:29:49PM +0100, Joost Kraaijeveld wrote:
 Below are some results of running pgbench, run on a machine that
 is doing nothing else than running PostgreSQL woth pgbench. The
 strange thing is that the results are *constantly alternating* hight
 (750-850 transactions)and low (50-80 transactions), no matter how
 many test I run. If I wait a long time (> 5 minutes) after running
 the test, I always get a hight score, followed by a low one, followed
 by a high one, low one etc.
 
 The default checkpoint_timeout is 300 seconds (5 minutes).  Is it
 coincidence that the long time between fast results is about the
 same? 
I have not measured the long wait time, but I can run multiple tests in 3 
minutes: the fast test lasts 3 secs, the long one 40 secs (see below). During 
the tests there is not much activity on the partition where the logfiles are 
(a different controller and disk than the database and swap).

[EMAIL PROTECTED]:/usr/lib/postgresql/8.1/bin$ time ./pgbench  -c 10 -t 150 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 150
number of transactions actually processed: 1500/1500
tps = 531.067258 (including connections establishing)
tps = 541.694790 (excluding connections establishing)

real    0m2.892s
user    0m0.105s
sys     0m0.145s


[EMAIL PROTECTED]:/usr/lib/postgresql/8.1/bin$ time ./pgbench  -c 10 -t 150 test
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 10
number of transactions per client: 150
number of transactions actually processed: 1500/1500
tps = 37.064000 (including connections establishing)
tps = 37.114023 (excluding connections establishing)

real    0m40.531s
user    0m0.088s
sys     0m0.132s

What's your setting?  
Default.

 Are your test results more consistent
 if you execute CHECKPOINT between them?
Could you tell me how I could do that?



Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: [EMAIL PROTECTED]
web: www.askesis.nl



Re: [PERFORM] Postgres on VPS - how much is enough?

2006-03-06 Thread Matthew Nuzum
On 3/6/06, Marc G. Fournier [EMAIL PROTECTED] wrote:
 On Mon, 6 Mar 2006, Matthew Nuzum wrote:
  My problem with running PG inside of a VPS was that the VPS used a
  virtual filesystem... basically, a single file that had been formatted
  and loop mounted so that it looked like a regular hard drive.
  Unfortunately, it was very slow. The difference between my application
  and yours is that mine well more than filled the 1GB of RAM that I had
  allocated. If your data will fit comfortably into RAM then you may be
  fine.

 We host VPSs here (http://www.hub.org) and don't use the 'single file,
 virtual file system' to put them into ... it must depend on where you
 host?

That's true... I hope I didn't imply that I am anti-VPS; I run my own
servers and one of them is dedicated to doing VPS for different
applications. I think they're wonderful.


On 3/6/06, Nagita Karunaratne [EMAIL PROTECTED] wrote:
 From personal experience, would you run Postgres on a linux machine
 (NOT a  vps) with 512MB of ram?

 Assumining I can keep all my data in memory.

Nagita,

It all depends on performance... I have one postgres database that
runs on a Pentium 350MHz with 128MB of RAM. It does 1 insert per
minute 24 hours per day. Because the load is so low, I can get away
with minimal hardware.

If your application has a lot of inserts/updates then disk speed is
important and can vary greatly from one VPS to another.

If your application is not time-critical then this may be a moot point anyway.

--
Matthew Nuzum
www.bearfruit.org



[PERFORM] Postgres and Ingres R3 / SAN

2006-03-06 Thread Jeremy Haile
Clustering solutions for PostgreSQL are currently pretty limited.  Slony
could be a good option in the future, but it currently only supports
Master-Slave replication (not true clustering) and in my experience is a
pain to set up and administer.  Bizgres MPP has a lot of promise,
especially for data warehouses, but it currently doesn't have the best
OLTP database performance.  

So, I had a couple of questions:
1) I have heard bad things from people on this list regarding SANs - but
is there a better alternative for a high performance database cluster?
(both for redundancy and performance)  I've heard internal storage
touted before, but then you have to do something like master-master
replication to get horizontal scalability and write performance will
suffer.

2) Has anyone on this list had experience using Ingres R3 in a clustered
environment?  I am considering using Ingres R3's built-in clustering
support with a SAN, but am interested to know other people's experiences
before we start toying with this possibility.  Any experience with the
Ingres support from Computer Associates?  Good/bad?

Jeremy



Re: [PERFORM] t1000/t2000 sun-servers

2006-03-06 Thread Jignesh K. Shah

Suggestions for benchmarks on Sun Fire T2000...

* Don't try DSS or TPC-H types of tests with Postgres on a Sun Fire T2000

Since such queries tend to have one connection, they will perform badly 
with Postgres, which will use only one of the 32 available hardware virtual 
CPUs on the Sun Fire T2000. (Oracle/DB2 have ways of breaking such queries 
into multiple processes and hence can use multiple virtual CPUs on the Sun 
Fire T2000; PostgreSQL cannot do the same in such cases.)


* Use an OLTP type of benchmark

Where you have more than 30 simultaneous users/connections doing work on 
Postgres without bottlenecking on datafiles, of course :-) (see the sample 
pgbench invocation below)


* Use multiple databases or instances of PostgreSQL

For example, migrate all your PostgreSQL databases to one T2000. Your average 
response time may not be faster, but one T2000 can probably handle all of 
your databases migrated onto it.


In essence, your single-thread performance will not speed up on a Sun Fire 
T2000, but you can certainly use it to replace all your individual 
PostgreSQL servers in your organization, or see higher scalability in 
terms of the number of users handled by one server.
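
As a starting point for the OLTP-style run suggested above, something along
these lines could work (a sketch only; the scale factor and client count are
illustrative, and the scale factor should be at least as large as the number
of clients so the clients don't all contend on the same branches row):

pgbench -i -s 100 test
pgbench -c 32 -t 1000 test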



For your /etc/system use the parameters as mentioned in
http://www.sun.com/servers/coolthreads/tnb/parameters.jsp

For hints on setting it up for Postgresql refer to other databases setup on
http://www.sun.com/servers/coolthreads/tnb/applications.jsp

If you get specific performance problems, send email to 
pgsql-performance@postgresql.org


Regards,
Jignesh





Neil Saunders wrote:


I may be able to organize a test on a T2000 if someone could give
advice as to an appropriate test to run...

Cheers,

Neil

On 3/6/06, Claus Guttesen [EMAIL PROTECTED] wrote:
 


Hi.

Has anybody tried the new Sun cool-thread servers t1000/t2000 from
Sun? I'd love to see benchmarks with Solaris 10 and pg 8.1.

regards
Claus



Re: [PERFORM] Can anyone explain this pgbench results?

2006-03-06 Thread Michael Fuhr
On Mon, Mar 06, 2006 at 07:46:05PM +0100, Joost Kraaijeveld wrote:
 Michael Fuhr wrote:
  What's your setting?  

 Default.

Have you tweaked postgresql.conf at all?  If so, what non-default
settings are you using?

  Are your test results more consistent
  if you execute CHECKPOINT between them?

 Could you tell me how I could do that?

Connect to the database as a superuser and execute a CHECKPOINT
statement.

http://www.postgresql.org/docs/8.1/interactive/sql-checkpoint.html

From the command line you could do something like

psql -c checkpoint
pgbench -c 10 -t 150 test
psql -c checkpoint
pgbench -c 10 -t 150 test
psql -c checkpoint
pgbench -c 10 -t 150 test

-- 
Michael Fuhr



[PERFORM] Sequential scan instead of using index

2006-03-06 Thread Harry Hehl
There seem to be many posts on this issue, but I have not yet found an answer 
to the seq scan issue.

I am having an issue with joins. I am using 8.0.3 on FC4.

Query: select * from ommemberrelation where srcobj='somevalue' and dstobj in 
(select objectid from omfilesysentry where name='dir15_file80');

Columns srcobj, dstobj and name are all indexed.

I ran a test adding records to ommemberrelation and omfilesysentry, up to 32K 
in each, and measured query times. The graph looks O(n²)-like, i.e. a 
sequential scan.

The columns in the where clauses are indexed, and yes, I did VACUUM ANALYZE 
FULL. I even tried a backup/restore of the entire db. No difference.

Turning sequential scan off results in an O(n log n)-like graph.

Explain analyze confirms the sequential scan. A majority (70ms) of the 91ms 
query is the result of the Seq Scan on ommemberrelation.

                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=486.19..101533.99 rows=33989 width=177) (actual time=5.493..90.682 rows=1 loops=1)
   Join Filter: ("outer".dstobj = "inner".objectid)
   ->  Seq Scan on ommemberrelation  (cost=0.00..2394.72 rows=33989 width=177) (actual time=0.078..70.887 rows=100 loops=1)
         Filter: (srcobj = '3197a4e6-abf1-11da-a0f9-000fb05ab829'::text)
   ->  Materialize  (cost=486.19..487.48 rows=129 width=16) (actual time=0.004..0.101 rows=26 loops=100)
         ->  Append  (cost=0.00..486.06 rows=129 width=16) (actual time=0.063..1.419 rows=26 loops=1)
               ->  Index Scan using omfilesysentry_name_idx on omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.019..0.019 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omfile_name_idx on omfile omfilesysentry  (cost=0.00..393.85 rows=101 width=16) (actual time=0.033..0.291 rows=26 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Seq Scan on omdirectory omfilesysentry  (cost=0.00..24.77 rows=11 width=16) (actual time=0.831..0.831 rows=0 loops=1)
                     Filter: (name = 'dir15_file80'::text)
               ->  Index Scan using omfilesequence_name_idx on omfilesequence omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.014..0.014 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omclipfile_name_idx on omclipfile omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.008..0.008 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omimagefile_name_idx on omimagefile omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.008..0.008 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omcollection_name_idx on omcollection omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.008..0.008 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omhomedirectory_name_idx on omhomedirectory omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.007..0.007 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Seq Scan on omrootdirectory omfilesysentry  (cost=0.00..1.05 rows=1 width=16) (actual time=0.013..0.013 rows=0 loops=1)
                     Filter: (name = 'dir15_file80'::text)
               ->  Index Scan using omwarehousedirectory_name_idx on omwarehousedirectory omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.007..0.007 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omtask_name_idx on omtask omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual time=0.009..0.009 rows=0 loops=1)
                     Index Cond: (name = 'dir15_file80'::text)
 Total runtime: 91.019 ms
(29 rows)

So why is the planner not using the index? Everything I have read indicates 
that sequential scanning should be left on and the planner should do the 
right thing.

This is a quote from one web site:

These options are pretty much only for use in query testing; frequently one 
sets enable_seqscan = false in order to determine if the planner is 
unnecessarily discarding an index, for example. However, it would require very 
unusual circumstances to change any of them to false in the .conf file.

So how do I determine why the planner is unnecessarily discarding the index? 

Thanks





Re: [PERFORM] Sequential scan instead of using index

2006-03-06 Thread Ragnar
On mán, 2006-03-06 at 13:46 -0500, Harry Hehl wrote:
 Query: select * from ommemberrelation where srcobj='somevalue' 
 and dstobj in (select objectid from omfilesysentry where name='dir15_file80');
 
 Columns srcobj, dstobj and name are all indexed.

 --
 Nested Loop IN Join  (cost=486.19..101533.99 rows=33989 width=177) (actual time=5.493..90.682 rows=1 loops=1)
   Join Filter: ("outer".dstobj = "inner".objectid)
   ->  Seq Scan on ommemberrelation  (cost=0.00..2394.72 rows=33989 width=177) (actual time=0.078..70.887 rows=100 loops=1)
         Filter: (srcobj = '3197a4e6-abf1-11da-a0f9-000fb05ab829'::text)
   ->  Materialize  (cost=486.19..487.48 rows=129 width=16) (actual time=0.004..0.101 rows=26 loops=100)

Looks like the planner is expecting 33989 rows, making 
an index scan a poor choice, but in fact only 100 rows
actually match your srcobj value.

Could we see the explain analyze with enable_seqscan
= false, please?
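
For example (a minimal sketch, reusing the query from the original post):

  SET enable_seqscan = false;
  EXPLAIN ANALYZE
    SELECT * FROM ommemberrelation
    WHERE srcobj = 'somevalue'
      AND dstobj IN (SELECT objectid FROM omfilesysentry
                     WHERE name = 'dir15_file80');
  SET enable_seqscan = true;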

Possibly you might want to try increasing the statistics
target for these columns, as in:
  ALTER TABLE ommemberrelation ALTER COLUMN srcobj
  SET STATISTICS 1000;
  ANALYZE;
and try again (with enable_seqscan=true).

A target of 1000 is most probably overkill, but
start with this value, and if it improves matters,
you can experiment with lower settings.

gnari





Re: [PERFORM] t1000/t2000 sun-servers

2006-03-06 Thread Guido Neitzer

On 06.03.2006, at 21:10 Uhr, Jignesh K. Shah wrote:

Like migrate all your postgresql databases to one T2000. You might  
see that your average response time may not be faster but it can  
handle probably all your databases migrated to one T2000.


In essence, your single thread performance will not speed up on Sun  
Fire T2000  but you can certainly use it to replace all your  
individual postgresql servers in your organization or see higher  
scalability in terms of number of users handled with 1 server with  
Sun Fire T2000.


How good is a pgbench test for evaluating things like this? I have  
used it to compare several machines, operating systems and PostgreSQL  
versions - but it was more or less just out of curiosity. The real  
evaluation was made with real life tests - mostly scripts which  
also tested the application server itself.


But as it is easy to compare several machines with pgbench, I just did the 
tests, and they were interesting and reflected the real world better than I 
had expected from a benchmark.


So, personally I'm interested in a simple pgbench test - perhaps with 
some more (> 50) clients simulated ...


cug



Re: [PERFORM] t1000/t2000 sun-servers

2006-03-06 Thread Jignesh K. Shah


pgbench, in my opinion, is more of an I/O-write-intensive benchmark.

A T2000 with its internal drives may not perform well with pgbench under a 
high load. If you are using external storage, try it out.


I haven't tried it out yet, but let me know what you see.


-Jignesh


Guido Neitzer wrote:


On 06.03.2006, at 21:10 Uhr, Jignesh K. Shah wrote:

Like migrate all your postgresql databases to one T2000. You might  
see that your average response time may not be faster but it can  
handle probably all your databases migrated to one T2000.


In essence, your single thread performance will not speed up on Sun  
Fire T2000  but you can certainly use it to replace all your  
individual postgresql servers in your organization or see higher  
scalability in terms of number of users handled with 1 server with  
Sun Fire T2000.



How good is a pgbench test for evaluating things like this? I have  
used it to compare several machines, operating systems and PostgreSQL  
versions - but it was more or less just out of curiosity. The real  
evaluation was made with real life tests - mostly scripts which  
also tested the application server itself.


But as it was it's easy to compare several machines with pgbench, I  
just did the tests and they were interesting and reflected the real  
world not as bad as I had thought from a benchmark.


So, personally I'm interested in a simple pgbench test - perhaps with  
some more (> 50) clients simulated ...


cug





[PERFORM] Help understanding indexes, explain, and optimizing a query

2006-03-06 Thread i.v.r.

Hi everyone,

I'm experimenting with PostgreSQL, but since I'm no expert DBA, I'm 
experiencing some performance issues.


Please take a look at the following query:

SELECT
 /*groups.name AS t2_r1,
 groups.id AS t2_r3,
 groups.user_id AS t2_r0,
 groups.pretty_url AS t2_r2,
 locations.postal_code AS t0_r6,
 locations.pretty_url AS t0_r7,
 locations.id AS t0_r8,
 locations.colony_id AS t0_r0,
 locations.user_id AS t0_r1,
 locations.group_id AS t0_r2,
 locations.distinction AS t0_r3,
 locations.street AS t0_r4,
 locations.street_2 AS t0_r5,
 schools.updated AS t1_r10,
 schools.level_id AS t1_r4,
 schools.pretty_url AS t1_r11,
 schools.user_id AS t1_r5,
 schools.id AS t1_r12,
 schools.type_id AS t1_r6,
 schools.distinction AS t1_r7,
 schools.cct AS t1_r8,
 schools.created_on AS t1_r9,
 schools.location_id AS t1_r0,
 schools.service_id AS t1_r1,
 schools.sustentation_id AS t1_r2,
 schools.dependency_id AS t1_r3*/
 groups.*,
 locations.*,
 schools.*
FROM locations
LEFT OUTER JOIN groups ON groups.id = locations.group_id
LEFT OUTER JOIN schools ON schools.location_id = locations.id
WHERE (colony_id = 71501)
ORDER BY groups.name, locations.distinction, schools.distinction

As you can see, I've commented out some parts. I did that as an 
experiment, and it improved the query by 2x. I really don't understand 
how that is possible... I also tried changing the second join to an 
INNER join, and that improves it a little bit too.


Anyway, the main culprit seems to be that second join. Here's the output 
from EXPLAIN:


Sort  (cost=94315.15..94318.02 rows=1149 width=852)
  Sort Key: groups.name, locations.distinction, schools.distinction
  ->  Merge Left Join  (cost=93091.96..94256.74 rows=1149 width=852)
        Merge Cond: ("outer".id = "inner".location_id)
        ->  Sort  (cost=4058.07..4060.94 rows=1148 width=646)
              Sort Key: locations.id
              ->  Hash Left Join  (cost=1.01..3999.72 rows=1148 width=646)
                    Hash Cond: ("outer".group_id = "inner".id)
                    ->  Index Scan using locations_colony_id on locations  (cost=0.00..3992.91 rows=1148 width=452)
                          Index Cond: (colony_id = 71501)
                    ->  Hash  (cost=1.01..1.01 rows=1 width=194)
                          ->  Seq Scan on groups  (cost=0.00..1.01 rows=1 width=194)
        ->  Sort  (cost=89033.90..89607.67 rows=229510 width=206)
              Sort Key: schools.location_id
              ->  Seq Scan on schools  (cost=0.00..5478.10 rows=229510 width=206)


I don't completely understand what that output means, but it would seem 
that the first join costs about 4000, but if I remove that join from the 
query, the performance difference is negligible. So as I said, it seems 
the problem is the join on the schools table.


I hope it's ok for me to post the relevant tables here, so here they are 
(I removed some constraints and indexes that aren't relevant to the 
query above):


CREATE TABLE groups
(
 user_id int4 NOT NULL,
 name varchar(50) NOT NULL,
 pretty_url varchar(50) NOT NULL,
 id serial NOT NULL,
 CONSTRAINT groups_pk PRIMARY KEY (id)
)

CREATE TABLE locations
(
 colony_id int4 NOT NULL,
 user_id int4 NOT NULL,
 group_id int4 NOT NULL,
 distinction varchar(60) NOT NULL,
 street varchar(60) NOT NULL,
 street_2 varchar(50) NOT NULL,
 postal_code varchar(5) NOT NULL,
 pretty_url varchar(60) NOT NULL,
 id serial NOT NULL,
 CONSTRAINT locations_pk PRIMARY KEY (id),
 CONSTRAINT colony FOREIGN KEY (colony_id)
 REFERENCES colonies (id) MATCH SIMPLE
 ON UPDATE NO ACTION ON DELETE NO ACTION,
 CONSTRAINT "group" FOREIGN KEY (group_id)
 REFERENCES groups (id) MATCH SIMPLE
 ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE INDEX locations_fki_colony
 ON locations
 USING btree
 (colony_id);
CREATE INDEX locations_fki_group
 ON locations
 USING btree
 (group_id);

CREATE TABLE schools
(
 location_id int4 NOT NULL,
 service_id int4 NOT NULL,
 sustentation_id int4 NOT NULL,
 dependency_id int4 NOT NULL,
 level_id int4 NOT NULL,
 user_id int4 NOT NULL,
 type_id int4 NOT NULL,
 distinction varchar(25) NOT NULL,
 cct varchar(20) NOT NULL,
 created_on timestamp(0) NOT NULL,
 updated timestamp(0),
 pretty_url varchar(25) NOT NULL,
 id serial NOT NULL,
 CONSTRAINT schools_pk PRIMARY KEY (id),
 CONSTRAINT location FOREIGN KEY (location_id)
 REFERENCES locations (id) MATCH SIMPLE
 ON UPDATE NO ACTION ON DELETE NO ACTION
)
CREATE INDEX schools_fki_location
 ON schools
 USING btree
 (location_id);

So I'm wondering what I'm doing wrong. I migrated this database from 
MySQL, and on there it ran pretty fast.


Kind regards,
Ivan V.




Re: [PERFORM] Planner enhancement suggestion.

2006-03-06 Thread Jim C. Nasby
On Sun, Mar 05, 2006 at 10:00:25PM +0100, PFC wrote:
 
   Bitmap index scan is bliss. Many thanks to the postgres team ! Now  
 searching in tables with a lot of fields and conditions is no longer a  
 pain.
 
   And just a thought :
 
   SELECT * FROM table WHERE category IN (1,2,3) ORDER BY price LIMIT 
   10;
 
   Suppose you have an index on category, and another index on price.  
 Depending on the stats postgres has about the values, you'll either get :
 
   0- seq scan + sort
   1- Plain or Bitmap Index scan using category, then sort by price
   2- Index scan on price, Filter on category IN (1,2,3), no sort.
 
   1 is efficient if the category is rare. Postgres knows this and uses 
   this  plan well.
   Without a LIMIT, option 1 should be preferred.
 
   2 is efficient if the items in the categories 1,2,3 are cheap (close 
   to  the start of the index on price). However if the items in question 
 are 
 on  the other side of the index, it will index-scan a large part of the 
 table.  This can be a big hit. Postgres has no stats about the correlation 
 of  category and price, so it won't know when there is going to be a  
 problem.
 
   Another option would be interesting. It has two steps :
 
   - Build a bitmap using the index on category (just like in case 1)
   so we know which pages on the table have relevant rows
 
   - Index scan on price, but only looking in the heap for pages 
   which are  flagged in the bitmap, and then Recheck Cond on category.
   In other words, do an index scan to get the rows in the right order, 
   but  don't bother to check the heap for pages where the bitmap says 
 there 
 are  no rows.
   In the worst case, you still have to run through the entire index, 
   but at  least not through the entire table !
 
   It can also speed up some merge joins.

The problem is that you're now talking about doing 2 index scans instead
of just one and a sort. If the correlation on price is high, it could
still win. As the cost estimator for index scan stands right now,
there's no way such a plan would be chosen unless correlation was
extremely high, however.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Help understanding indexes, explain, and optimizing

2006-03-06 Thread Chris

i.v.r. wrote:

[snip]


Have you done an 'analyze' or 'vacuum analyze' over these tables?
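
For example (a minimal sketch using the table names from your post):

  ANALYZE groups;
  ANALYZE locations;
  ANALYZE schools;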

A left outer join gets *everything* from the second table:

 LEFT OUTER JOIN groups ON groups.id = locations.group_id
 LEFT OUTER JOIN schools ON schools.location_id = locations.id

So they will load everything from groups and schools. Maybe they should 
be left joins, not left outer joins?



--

Re: [PERFORM] Help understanding indexes, explain, and optimizing

2006-03-06 Thread i.v.r.

Chris wrote:

Have you done an 'analyze' or 'vacuum analyze' over these tables?

A left outer join gets *everything* from the second table:

 LEFT OUTER JOIN groups ON groups.id = locations.group_id
 LEFT OUTER JOIN schools ON schools.location_id = locations.id

So they will load everything from groups and schools. Maybe they 
should be left join's not left outer joins?



Yes, I did that. I tried your other suggestion and it did improve it by 
about 200ms.

I also restructured the query by selecting first from the groups table and 
joining with the locations and schools tables, and that made all the 
difference. Now it's down to 32ms. Yippee!

Thanks!

Ivan V.




Re: [PERFORM] Can anyone explain this pgbench results?

2006-03-06 Thread Michael Fuhr
[Please copy the mailing list on replies.]

On Mon, Mar 06, 2006 at 09:38:20PM +0100, Joost Kraaijeveld wrote:
 Michael Fuhr wrote:
  Have you tweaked postgresql.conf at all?  If so, what non-default
  settings are you using? 
 
 Yes, I have tweaked the following settings:
 
 shared_buffers = 4
 work_mem = 512000
 maintenance_work_mem = 512000
 max_fsm_pages = 4
 effective_cache_size = 131072

Are you sure you need work_mem that high?  How did you decide on
that value?  Are all other settings at their defaults?  No changes
to the write ahead log (WAL) or background writer (bgwriter) settings?
What version of PostgreSQL are you running?  The paths in your
original message suggest 8.1.x.

  Are your test results more consistent
  psql -c checkpoint
  pgbench -c 10 -t 150 test
  psql -c checkpoint
  pgbench -c 10 -t 150 test
  psql -c checkpoint
  pgbench -c 10 -t 150 test

 OK, that leads to a consistent high score. I also noticed that
 psql -c checkpoint results in I/O on the database partition but
 not on the partition that has the logfiles (pg_xlog directory). Do
 you know if that is how it should be?

A checkpoint updates the database files with the data from the
write-ahead log; you're seeing those writes to the database partition.
The postmaster does checkpoints every checkpoint_timeout seconds
(default 300) or every checkpoint_segments log segments (default 3);
it also uses a background writer to trickle pages to the database
files between checkpoints so the checkpoints don't have as much
work to do.  I've been wondering if your pgbench runs are being
affected by that background activity; the fact that you get
consistently good performance after forcing a checkpoint suggests
that that might be the case.

If you run pgbench several times without intervening checkpoints,
do your postmaster logs have any messages like "checkpoints are
occurring too frequently"?  It might be useful to increase
checkpoint_warning up to the value of checkpoint_timeout and then
see if you get any such messages during pgbench runs.  If checkpoints
are happening a lot more often than every checkpoint_timeout seconds
then try increasing checkpoint_segments (assuming you have the disk
space).  After doing so, restart the database and run pgbench several
times without intervening checkpoints and see if performance is
more consistent.
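
For example, something along these lines in postgresql.conf (illustrative
values only, not a recommendation; adjust checkpoint_segments to your
available disk space):

checkpoint_segments = 10
checkpoint_timeout = 300
checkpoint_warning = 300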

Note that tuning PostgreSQL for pgbench performance might be
irrelevant for your actual needs unless your usage patterns happen
to resemble what pgbench does.

-- 
Michael Fuhr



Re: [PERFORM] Sequential scan instead of using index

2006-03-06 Thread Mark Kirkwood

Harry Hehl wrote:

There seems to be many posts on this issue but I not yet found an answer to the 
seq scan issue.

I am having an issue with a joins. I am using 8.0.3 on FC4 


Query: select * from ommemberrelation where srcobj='somevalue' and dstobj in 
(select objectid from omfilesysentry where name='dir15_file80');

Columns srcobj, dstobj and name are all indexed.




The planner is over-estimating the number of rows here (33989 vs 100):

->  Seq Scan on ommemberrelation  (cost=0.00..2394.72 rows=33989 width=177) (actual time=0.078..70.887 rows=100 loops=1)


The usual way to attack this is to up the sample size for ANALYZE:

ALTER TABLE ommemberrelation ALTER COLUMN srcobj SET STATISTICS 100;
ALTER TABLE ommemberrelation ALTER COLUMN dstobj SET STATISTICS 100;
-- or even 1000.
ANALYZE ommemberrelation;

Then try EXPLAIN ANALYZE again.


If you can upgrade to 8.1.(3), then the planner can consider paths that 
use *both* the indexes on srcobj and dstobj (which would probably be the 
business!).


Cheers

Mark
