Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Chris Mair

  Yeah - looks good! (is the default still open_datasync?). Might be worth
  trying out the fdatasync method too (ISTR this being quite good... again,
  on Solaris 8, so things might have changed)!
 
 I was just talking to a member of the Solaris-UFS team who recommended
 that we test fdatasync.

Ok, so I did a few runs for each of the sync methods, keeping all the
rest constant and got this:

open_datasync       0.7
fdatasync           4.6
fsync               4.5
fsync_writethrough  not supported
open_sync           0.6

in arbitrary units - higher is faster.

Quite impressive!
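
(For anyone repeating this test: the active method can be checked from
psql - a minimal sketch; changing it means editing wal_sync_method in
postgresql.conf and reloading, since it is not a per-session setting.)

SHOW wal_sync_method;
-- e.g. "open_datasync", the Solaris default under discussion here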

Bye, Chris.






Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Luke Lonergan
Chris,

On 4/5/06 2:31 PM, Chris Mair [EMAIL PROTECTED] wrote:

 Doing what http://blogs.sun.com/roller/page/jkshah suggests:
   wal_sync_method = fsync (unchanged)
   wal_buffers = 128 (was 8)
   checkpoint_segments = 128 (was 3)
   bgwriter_all_percent = 0 (was 0.333)
   bgwriter_all_maxpages = 0 (was 5)
 and leaving everything else default (solarispackages from pgfoundry)
 increased performance ~ 7 times!
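
(Those settings can be verified from psql - a sketch against the
pg_settings view; the bgwriter_all_* parameters exist in 8.1-era servers:)

SELECT name, setting
FROM pg_settings
WHERE name IN ('wal_sync_method', 'wal_buffers', 'checkpoint_segments',
               'bgwriter_all_percent', 'bgwriter_all_maxpages');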

In the recent past, Jignesh Shah of Sun MDE discovered that changing the
bgwriter_* parameters to zero had a dramatic positive impact on
performance.

There are also some critical UFS kernel tuning parameters to set; you
should find those in his blog.

We found and fixed some libpq issues with Solaris that were also critical -
they should be in 8.1.3 I think.

- Luke





Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Juan Casero \(FL FLC\)
Ok that is beginning to become clear to me.  Now I need to determine if
this server is worth the investment for us.  Maybe it is not a speed
demon but to be honest the licensing costs of an SMP-aware RDBMS are
outside our budget.  When postgresql starts, does it start up a super
server process and then fork copies of itself to handle incoming
requests?  Or do I have to specify how many server processes should be
started up?  I figured maybe I can take advantage of the multiple CPUs
on this system by starting up enough postgres server processes to handle
large numbers of incoming connections.  I have this server available for
sixty days, so I may as well explore the performance of postgresql on it.



Thanks,
Juan 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Luke
Lonergan
Sent: Wednesday, April 05, 2006 5:37 PM
To: Juan Casero (FL FLC); pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

Juan,

On 4/5/06 1:54 PM, Juan Casero (FL FLC) [EMAIL PROTECTED]
wrote:

 I am not sure about this.  I mean I have postgresql 8.1.3 running on
 my Windows XP P4 HT laptop that I use for testing my webapps.  When I
 hit pgsql on this laptop with a large query I can see the load
 spike up really high on both of my virtual processors.  Whatever
 pgsql is doing, it looks like both CPUs are being used independently.
 The usage curve is not identical on both of them, which makes me think
 that parts of the server are multithreaded.  Admittedly I am not
 familiar with the source code of postgresql, so I was hoping maybe one
 of the developers who is could definitively answer this question.

There's no part of the Postgres backend that is threaded or
multi-processed.  A reasonable explanation for your Windows experience is
that your web server or the psql client may be taking some CPU cycles
while the backend is processing your query.  Also, depending on how the
CPU load is reported, if the OS is doing prefetching of I/O, it might
show up as load.

- Luke





Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Josh Berkus
Chris,

 Remounting the fs where $PGDATA lives with forcedirectio
 (together with logging, that is default) did not help
 (if not harm...) performance.

Not all of PG.  JUST pg_xlog.  forcedirectio is only a good idea for the xlog.

 Quickly playing around with wal_buffers on Linux and Mac OS X
 I see it influences the performance of my test a bit, maybe in the
 10-20% range (I'm really doing quick tests, nothing systematic),
 but nowhere near as spectacularly as on Solaris.

 I'm happy so far, but I find it very surprising that this single
 parameter has such an impact (only on) Solaris 10.

That *is* interesting.  I hadn't tested this previously specifically on 
Solaris.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Josh Berkus
Mark, Chris,

 Yeah - looks good! (is the default still open_datasync?). Might be worth
 trying out the fdatasync method too (ISTR this being quite good... again,
 on Solaris 8, so things might have changed)!

I was just talking to a member of the Solaris-UFS team who recommended
that we test fdatasync.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Mark Kirkwood

Leigh Dyer wrote:

Luke Lonergan wrote:


Leigh,

On 4/5/06 9:23 PM, Leigh Dyer [EMAIL PROTECTED] wrote:


We've got a Sun Fire V40z and it's quite a nice machine -- 6x 15krpm
drives, 4GB RAM, and a pair of Opteron 850s. This gives us more than
enough power now for what we need, but it's nice to know that we can
shoehorn a lot more RAM, and up it to eight CPU cores if needed.



We have one of these too - ours is signed by Scott McNealy.
 


Nice :)


The newer Sun Opteron systems look nice too, but unless you're using
external storage, their little 2.5 hard drives may not be ideal.



Yes - but they end-of-lifed the V20z and V40z!

That's quite disappointing to hear -- our V40z isn't even six months 
old! We're not a big company, so external storage solutions are outside 
our price range, but we still wanted a nice brand-name box, and the V40z 
was a great deal compared to smaller boxes like the HP DL385.



One big problem with the sun line in general is the tiny internal storage
capacity - already too small on the V40z at 5/6 drives, now ridiculous at
4 SAS drives on the galaxy series.


I'm sure those little SAS drives would be great for web servers and 
other non-IO-intensive tasks though -- I'd love to get some X4100s in to 
replace our Poweredge 1750s for that. It's a smart move overall IMHO, 
but it's certainly not great for database serving.





I notice that Supermicro have recently brought out some Opteron systems;
they are hiding them here:


http://www.supermicro.com/Aplus/system/


The 4U's have 8 SATA/SCSI drive bays - maybe still not enough, but 
better than 6!


Cheers

Mark






Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Josh Berkus
Juan,

 Ok that is beginning to become clear to me.  Now I need to determine if
 this server is worth the investment for us.  Maybe it is not a speed
 demon but to be honest the licensing costs of an SMP-aware RDBMS are
 outside our budget.

You still haven't explained why you want multi-threaded queries.  This is 
sounding like keeping up with the Joneses.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Miguel

Mark Kirkwood wrote:




The newer Sun Opteron systems look nice too, but unless you're using
external storage, their little 2.5 hard drives may not be ideal.




Yes - but they end-of-lifed the V20z and V40z!

That's quite disappointing to hear -- our V40z isn't even six months 
old! We're not a big company, so external storage solutions are 
outside our price range, but we still wanted a nice brand-name box, 
and the V40z was a great deal compared to smaller boxes like the HP 
DL385.


One big problem with the sun line in general is the tiny internal storage
capacity - already too small on the V40z at 5/6 drives, now ridiculous at
4 SAS drives on the galaxy series.



I'm sure those little SAS drives would be great for web servers and 
other non-IO-intensive tasks though -- I'd love to get some X4100s in 
to replace our Poweredge 1750s for that. It's a smart move overall 
IMHO, but it's certainly not great for database serving.







Excuse me for going off topic, but I notice that you are very excited
about Sun's hardware. What OS do you install on them - Slowlaris? Has
that OS improved in some spectacular way that means I should take a look
again? I used it until Solaris 9, and the performance was horrible.

I'm a happy FreeBSD user now (using HP and Dell hardware, though).

---
Miguel



Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Robert Lor

Chris Mair wrote:


Ok, so I did a few runs for each of the sync methods, keeping all the
rest constant and got this:

open_datasync       0.7
fdatasync           4.6
fsync               4.5
fsync_writethrough  not supported
open_sync           0.6

in arbitrary units - higher is faster.

Quite impressive!


 


Chris,
Just to make sure the X4100 config is similar to your Linux system, can
you verify the default setting for the disk write cache and make sure it
is the same (enabled or disabled) on both?  Here's how to check in
Solaris.

As root, run format -e, then: pick a disk - cache - write_cache - display

Not sure how to do it on Linux though!

Regards,
-Robert



Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Leigh Dyer

Miguel wrote:


Excuse me for going off topic, but I notice that you are very excited
about Sun's hardware. What OS do you install on them - Slowlaris? Has
that OS improved in some spectacular way that means I should take a look
again? I used it until Solaris 9, and the performance was horrible.

I'm a happy FreeBSD user now (using HP and Dell hardware, though).


I'm running Debian Sarge AMD64 on mine, and it works wonderfully. I'm 
not a Solaris person, and I never plan on becoming one, but Sun's 
Opteron hardware is quite nice. The remote management was one of the 
features that sold me -- full control over power, etc. and 
serial-over-LAN, through an SSH interface.


Sun don't support Debian officially (but we don't have a software 
support contract anyway, so I'm not too fussed), but I'm pretty sure 
they support at least SLES and RHEL.


Thanks
Leigh




[PERFORM] Query planner is using wrong index.

2006-04-06 Thread Brian Herlihy
Hi,

I have a problem with the choice of index made by the query planner.

My table looks like this:

CREATE TABLE t
(
  p1 varchar not null,
  p2 varchar not null,
  p3 varchar not null,
  i1 integer,
  i2 integer,
  i3 integer,
  i4 integer,
  i5 integer,
  d1 date,
  d2 date,
  d3 date,
  PRIMARY KEY (p1, p2, p3)
);

I have also created an index on (p2, p3), as some of my lookups are on these
only.
All the integers and dates are data values.
The table has around 9 million rows.
I am using postgresql 7.4.7.

I have set statistics to 1000 on the p1, p2 and p3 columns, and run
vacuum full analyse.  However, I still see query plans like this:

db=# explain select * from t where p1 = 'something' and p2 = 'fairly_common'
and p3 = 'fairly_common';  
QUERY PLAN  
---
 Index Scan using p2p3 on t  (cost=0.00..6.01 rows=1 width=102)
   Index Cond: (((p2)::text = 'fairly_common'::text) AND ((p3)::text =
'fairly_common'::text))
   Filter: ((p1)::text = 'something'::text)
(3 rows)

The problem appears to be this:

db=# explain select * from t where p2 = 'fairly_common' and p3 =
'fairly_common';  
QUERY PLAN  
---
 Index Scan using p2p3 on t  (cost=0.00..6.01 rows=1 width=102)
   Index Cond: (((p2)::text = 'fairly_common'::text) AND ((p3)::text =
'fairly_common'::text))
(3 rows)

The query planner thinks that this will return only 1 row.
In fact, these index lookups sometimes return up to 500 rows, which then must
be filtered by p1.
This can take 2 or 3 seconds to execute for what should be a simple primary key
lookup.

For VERY common values of p2 and p3, the query planner chooses the primary key,
because these values are stored
explicitly in the analyse results.  For rare values there is no problem,
because the query runs quickly.
But for fairly common values, there is a problem.

I would like the query planner to use the primary key for all of these lookups.
 How can I enforce this?

Thanks,
Brian



Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Ragnar
On fim, 2006-04-06 at 12:35 +1000, Brian Herlihy wrote:

 I have a problem with the choice of index made by the query planner.
 
 My table looks like this:
 
 CREATE TABLE t
 (
   p1 varchar not null,
   p2 varchar not null,
   p3 varchar not null,
   i1 integer,
   i2 integer,
   i3 integer,
   i4 integer,
   i5 integer,
   d1 date,
   d2 date,
   d3 date,
   PRIMARY KEY (p1, p2, p3)
 );
 
 I have also created an index on (p2, p3), as some of my lookups are on these
 only.

 All the integers and dates are data values.
 The table has around 9 million rows.
 I am using postgresl 7.4.7
 
 I have set statistics to 1000 on the p1, p2 and p3 columns, and run vacuum 
 full
 analyse.  However, I still see
 query plans like this:
 
...
 db=# explain select * from t where p2 = 'fairly_common' and p3 =
 'fairly_common';  
 QUERY PLAN  
 ---
  Index Scan using p2p3 on t  (cost=0.00..6.01 rows=1 width=102)
Index Cond: (((p2)::text = 'fairly_common'::text) AND ((p3)::text =
 'fairly_common'::text))
 (3 rows)

Please show us an actual EXPLAIN ANALYZE;
this will show us more.

 I would like the query planner to use the primary key for all of these 
 lookups.
  How can I enforce this?

How would that help?  Have you tested to see if it would
actually be better?

gnari





[PERFORM] Maintenance_work_mem influence on queries

2006-04-06 Thread luchot
Hello,

I am doing some tests with different values for the maintenance_work_mem
parameter, in order to improve the time taken by index creation, vacuum
and analyze.  I read in the docs that this parameter is only used for
CREATE INDEX, VACUUM, ANALYZE and foreign keys.  But when I tested two
queries with different values, the results differed by a factor of two:

with maintenance_work_mem at 64 MB, query 1 took 34 min and query 2 41 min;
with maintenance_work_mem at 512 MB, query 1 took 17 min and query 2 21 min.

So my question is: under what conditions does maintenance_work_mem
influence the execution of queries?

Thanks,
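
(For reference, the parameter can be raised per session just before a
maintenance command - a sketch; the table and index names are
hypothetical, and the value is in kB on servers of this era:)

SET maintenance_work_mem = 524288;  -- 512 MB, this session only
CREATE INDEX sales_order_date_idx ON sales (order_date);
VACUUM ANALYZE sales;
-- Plain SELECTs are governed by work_mem (sort_mem in 7.x) instead,
-- which may be where differences seen in ordinary queries come from.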

Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Markus Schaber
Hi, Juan,

Juan Casero (FL FLC) wrote:
 Ok that is beginning to become clear to me.  Now I need to determine if
 this server is worth the investment for us.  Maybe it is not a speed
 demon but to be honest the licensing costs of an SMP-aware RDBMS are
 outside our budget.  When postgresql starts, does it start up a super
 server process and then forks copies of itself to handle incoming
 requests? 

It starts a super server process (Postmaster) and some background
processes (background writer, stats collector). For each incoming
connection, the postmaster forks a single-threaded backend process,
which handles all queries and transactions on this connection, and
terminates when the connection terminates.

So as a rule of thumb, each connection can utilize only a single CPU.  You
can utilize a few more CPUs than you have simultaneous connections, due
to the background processes and the OS needing CPU for I/O, but that's
rather marginal.
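
(This is visible from SQL - a sketch using the 8.1-era pg_stat_activity
view, one row per backend process; current_query requires
stats_command_string = on, and the columns were renamed in later releases:)

SELECT procpid, usename, current_query
FROM pg_stat_activity;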

AFAIK, Bizgres MPP has extended the backend processes to be
multi-threaded, so a single connection can utilize several CPUs for some
types of queries (large data sets, sorting/joining/hashing etc.).  Btw, I
presume that they might offer you a free test license, and I also presume
their license fee is much lower than Oracle's or DB2's.

 Or do I have to specify how many server processes should be
 started up? 

You can limit the number of server processes by setting the maximum
connection limit.

 I figured maybe I can take advantage of the multiple CPUs
 on this system by starting up enough postgres server processes to handle
 large numbers of incoming connections.  I have this server available for
 sixty days, so I may as well explore the performance of postgresql on it.

Yes, you can take advantage if you have multiple clients (e.g. a web
app - that's what the T2000 / Niagara was made for). If you have a Tomcat
or JBoss sitting on it, each http connection gets its own thread, and each
customer has its own CPU :-)

Then use a connection pool to PostgreSQL, and you're fine. The more
customers, the more CPUs are utilized.

But beware: if you have floating point maths, it will be very slow. All
8 CPUs / 32 threads share a single FPU. So if you need floating point
(e.g. MapServer, PostGIS geoprocessing, Java2D chart drawing or
something), the T2000 is not the right thing for you.


HTH,
Markus

-- 
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org



Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Brian Herlihy

--- Ragnar [EMAIL PROTECTED] wrote:

 On fim, 2006-04-06 at 12:35 +1000, Brian Herlihy wrote:
 
  I have a problem with the choice of index made by the query planner.
  
  My table looks like this:
  
  CREATE TABLE t
  (
p1 varchar not null,
p2 varchar not null,
p3 varchar not null,
i1 integer,
i2 integer,
i3 integer,
i4 integer,
i5 integer,
d1 date,
d2 date,
d3 date,
PRIMARY KEY (p1, p2, p3)
  );
  
  I have also created an index on (p2, p3), as some of my lookups are on
 these
  only.
 
  All the integers and dates are data values.
  The table has around 9 million rows.
  I am using postgresl 7.4.7
  
  I have set statistics to 1000 on the p1, p2 and p3 columns, and run vacuum
 full
  analyse.  However, I still see
  query plans like this:
  
 ...
  db=# explain select * from t where p2 = 'fairly_common' and p3 =
  'fairly_common';   
   
  QUERY PLAN  
 

---
   Index Scan using p2p3 on t  (cost=0.00..6.01 rows=1 width=102)
 Index Cond: (((p2)::text = 'fairly_common'::text) AND ((p3)::text =
  'fairly_common'::text))
  (3 rows)
 
 please show us an actual EXPLAIN ANALYZE
 this will show us more.
 
  I would like the query planner to use the primary key for all of these
 lookups.
   How can I enforce this?
 
 How would that help? have you tested to see if it would 
 actualy be better?
 
 gnari
 

Yes, the primary key is far better.  I gave it the ultimate test - I dropped
the (p2, p3) index.  It's blindingly fast when using the PK, which is what I
expect from Postgresql :)  This query is part of an import process, which has
been getting increasingly slow as the table has grown.

I first discovered the problem when I noticed queries which should be simple PK
lookups taking up to 2.5 seconds on an idle system.  I discussed this problem
in the Postgres IRC channel, and it turns out to be due to an inaccurate
selectivity estimate.

The columns p2 and p3 are highly correlated, which is why I often get
hundreds of rows even after specifying values for both these columns.
However, the query optimizer assumes the columns are not correlated.  It
calculates the selectivity for each column separately, then multiplies
them to get the combined selectivity for specifying both p2 and p3.  This
results in an estimate of 1 row, which makes the (p2,p3) index look as
good as the (p1,p2,p3) index.

I'm aware now that there is no way to force use of a particular index in
Postgres.  I've also been told that there is no way to have the optimizer take
into account correlation between column values.

My options seem to be
  - Fudge the analysis results so that the selectivity estimate changes.  I
have tested reducing n_distinct, but this doesn't seem to help.
  - Combine the columns into one column, allowing postgres to calculate the
combined selectivity.
  - Drop the (p2, p3) index.  But I need this for other queries.

None of these are good solutions.  So I am hoping that there is a better way to
go about this!
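
(A sketch of the second option - the separator, here '|', must never
occur in p2, and the new column has to be kept in sync on insert/update,
e.g. with a trigger:)

ALTER TABLE t ADD COLUMN p2p3 varchar;
UPDATE t SET p2p3 = p2 || '|' || p3;
CREATE INDEX t_p2p3_idx ON t (p2p3);
ANALYZE t;

-- Statistics on the single column now reflect the combined selectivity:
SELECT * FROM t
 WHERE p2p3 = 'uk.altavista.com|web/results?itag=q=kgs=kls=';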

Thanks,
Brian



Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Juan Casero \(FL FLC\)
Because I plan to develop a rather large (for us anyway) data warehouse
with PostgreSQL.  I am looking for the right hardware that can handle
queries on a database that might grow to over 100 gigabytes.  Right
now our decision support system based on postgresql 8.1.3 stores retail
sales information for about four years back *but* only as weekly
summaries.  I want to build the system so it can handle daily sales
transactions also.  You can imagine how many more records this will
involve so I am looking for hardware that can give me the performance I
need to make this project usable.  In other words, parsing and loading
the daily transaction logs for our stores is likely to take huge amounts
of effort.  I need a machine that can complete the task in a reasonable
amount of time.  As people start to query the database to find sales
related reports and information I need to make sure the queries will run
reasonably fast for them.  I have already hand optimized all of my
queries on the current system.  But currently I only have weekly sales
summaries.  Other divisions in our company have done a similar project
using MS SQL Server on SMP hardware far outclassing the database server
I currently use and they report heavy loads on the server with less than
ideal query run times.  I am sure I can do my part to optimize the
queries once I start this project but there is only so much you can do.
At some point you just need more powerful hardware.  This is where I am
at right now.  Apart from that since I will only get this one chance to
buy a new server for data processing I need to make sure that I buy
something that can grow over time as our needs change.  I don't want to
buy a server only to find out later that it cannot meet our needs with
future database projects.  I have to balance a limited budget, room for
future performance growth, and current system requirements.  Trust me it
isn't easy.  


Juan

-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 06, 2006 2:57 AM
To: pgsql-performance@postgresql.org
Cc: Juan Casero (FL FLC); Luke Lonergan
Subject: Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

Juan,

 Ok that is beginning to become clear to me.  Now I need to determine 
 if this server is worth the investment for us.  Maybe it is not a 
 speed demon but to be honest the licensing costs of an SMP-aware 
 RDBMS are outside our budget.

You still haven't explained why you want multi-threaded queries.  This
is sounding like keeping up with the Joneses.

--
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Ragnar
On fim, 2006-04-06 at 19:27 +1000, Brian Herlihy wrote:
 --- Ragnar [EMAIL PROTECTED] wrote:
 
  On fim, 2006-04-06 at 12:35 +1000, Brian Herlihy wrote:
 
...
 PRIMARY KEY (p1, p2, p3)
...
   
   I have also created an index on (p2, p3), as some of my lookups are on
   these only.
...
   db=# explain select * from t where p2 = 'fairly_common' and p3 =
   'fairly_common';
  
  please show us an actual EXPLAIN ANALYZE
 
   I would like the query planner to use the primary key for all of these
  lookups.
  
  have you tested to see if it would  actualy be better?
  

 Yes, the primary key is far better.  I gave it the ultimate test - I dropped
 the (p2, p3) index.  It's blindingly fast when using the PK, 

I have problems understanding exactly how an index on
(p1,p2,p3) can be faster than an index on (p2,p3) for
a query not involving p1.
Can you demonstrate this with actual EXPLAIN ANALYZEs?
Something like:
EXPLAIN ANALYZE select * from t where p2 = ? and p3 = ?;
BEGIN;
DROP INDEX p2p3;
EXPLAIN ANALYZE select * from t where p2 = ? and p3 = ?;
ROLLBACK;

maybe your p2p3 index needs REINDEX ?


 My options seem to be
   - Fudge the analysis results so that the selectivity estimate changes.  I
 have tested reducing n_distinct, but this doesn't seem to help.
   - Combine the columns into one column, allowing postgres to calculate the
 combined selectivity.
   - Drop the (p2, p3) index.  But I need this for other queries.
 
 None of these are good solutions.  So I am hoping that there is a better way 
 to
 go about this!

I think we must determine exactly what the problem is
before devising complex solutions.

gnari





Re: [PERFORM] Intel C/C++ Compiler Tests (fwd)

2006-04-06 Thread Bruce Momjian
  On 3/22/06 5:56 AM, Spiegelberg, Greg [EMAIL PROTECTED] wrote:
 
   Has anyone tested PostgreSQL 8.1.x compiled with Intel's Linux C/C++
   compiler?
 
  We used to compile 8.0 with icc and 7.x before that.  We found very good
  performance gains for Intel P4 architecture processors and some gains for
  AMD Athlon.
 
  Lately, the gcc compilers have caught up with icc on pipelining
  optimizations and they generate better code for Opteron than icc, so we
  found that icc was significantly slower than gcc on Opteron and no
  different
  on P4/Xeon.
 
  Maybe things have changed in newer versions of icc, the last tests I did
  were about 1 year ago.

EnterpriseDB is seeing the same thing, that gcc4 now has the same
performance as icc, and is more flexible.

-- 
  Bruce Momjian   http://candle.pha.pa.us

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] freebsd/softupdates for data dir

2006-04-06 Thread Vivek Khera


On Apr 5, 2006, at 6:07 PM, Jim Nasby wrote:



More importantly, it allows the system to come up and do fsck in  
the background. If you've got a large database that's a pretty big  
benefit.


That's a UFS2 feature, not a soft-updates feature.




Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Matthew Nuzum
On 4/6/06, Juan Casero (FL FLC) [EMAIL PROTECTED] wrote:
 Because I plan to develop a rather large (for us anyway) data warehouse
 with PostgreSQL.  I am looking for the right hardware that can handle
 queries on a database that might grow to over 100 gigabytes.

You need to look for a server that has fast I/O. 100 GB of data will
take a long time to scan through and won't fit in RAM.

 Right
 now our decision support system based on postgresql 8.1.3 stores retail
 sales information for about four years back *but* only as weekly
 summaries.  I want to build the system so it can handle daily sales
 transactions also.  You can imagine how many more records this will
 involve so I am looking for hardware that can give me the performance I
 need to make this project usable.

Sounds like you'll be doing a few heavy queries when you do this,
not tons of small queries. That likely means you need fewer, very
fast CPUs.

 In other words parsing and loading
 the daily transaction logs for our stores is likely to take huge amounts
 of effort.  I need a machine that can complete the task in a reasonable
 amount of time.

See my previous comment

 As people start to query the database to find sales
 related reports and information I need to make sure the queries will run
 reasonably fast for them.

Get more than one CPU core and make sure you have a lot of drive
spindles. You will definitely want to be able to ensure a long-running
query doesn't hog your I/O system. I have a server with a single disk,
and when we do a long query the server load will jump from about .2 to
10 until the long query finishes. More CPUs won't help this because
the bottleneck is the disk.

  I have already hand optimized all of my
 queries on the current system.  But currently I only have weekly sales
 summaries.  Other divisions in our company have done a similar project
 using MS SQL Server on SMP hardware far outclassing the database server
 I currently use and they report heavy loads on the server with less than
 ideal query run times.  I am sure I can do my part to optimize the
 queries once I start this project but there is only so much you can do.
 At some point you just need more powerful hardware.  This is where I am
 at right now.

You say this is where I am at right __now__, but where will you be in
9 months? Sounds like you will be I/O bound by the time you get above
10GB.

 Apart from that since I will only get this one chance to
 buy a new server for data processing I need to make sure that I buy
 something that can grow over time as our needs change.  I don't want to
 buy a server only to find out later that it cannot meet our needs with
 future database projects.  I have to balance a limited budget, room for
 future performance growth, and current system requirements.  Trust me it
 isn't easy.

Isn't it about time we had our annual "what kind of server can I get
for $8k" thread?

--
Matthew Nuzum
www.bearfruit.org



Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Brian Herlihy

--- Ragnar [EMAIL PROTECTED] wrote:

 On fim, 2006-04-06 at 19:27 +1000, Brian Herlihy wrote:
 
  Yes, the primary key is far better.  I gave it the ultimate test - I
 dropped
  the (p2, p3) index.  It's blindingly fast when using the PK, 
 
 I have problems understanding exactly how an index on 
 (p1,p2,p3) can be faster than and index on (p2,p3) for
 a query not involving p1.
 can you demonstrate this with actual EXPLAIN ANALYZES ?
 something like:
 EXPLAIN ANALYZE select * from t where p2 = ? and p3 = ?;
 BEGIN;
 DROP INDEX p2p3;
 EXPLAIN ANALYZE select * from t where p2 = ? and p3 = ?;
 ROLLBACK;
 
 maybe your p2p3 index needs REINDEX ?
 

Here's the output.  The timings after caching are repeatable (varying only by
10% or so).  

Query before caching:

db# explain analyze select * from t WHERE p1 = 'a' and p2 = 'uk.altavista.com'
AND p3 = 'web/results?itag=q=kgs=kls=';
  QUERY PLAN   

---
 Index Scan using p2_p3_idx on t  (cost=0.00..6.02 rows=1 width=102) (actual
time=2793.247..2793.247 rows=0 loops=1)
   Index Cond: (((p2)::text = 'uk.altavista.com'::text) AND ((p3)::text =
'web/results?itag=q=kgs=kls='::text))
   Filter: ((p1)::text = 'a'::text)
 Total runtime: 2793.303 ms
(4 rows)

Query after caching:

db# explain analyze select * from t WHERE p1 = 'a' and p2 = 'uk.altavista.com'
AND p3 = 'web/results?itag=q=kgs=kls=';
  QUERY PLAN   

---
 Index Scan using p2_p3_idx on t  (cost=0.00..6.02 rows=1 width=102) (actual
time=0.617..0.617 rows=0 loops=1)
   Index Cond: (((p2)::text = 'uk.altavista.com'::text) AND ((p3)::text =
'web/results?itag=q=kgs=kls='::text))
   Filter: ((p1)::text = 'a'::text)
 Total runtime: 0.665 ms
(4 rows)

=== At this point I did DROP INDEX p2_p3_idx

Query after dropping index:

db# explain analyze select * from t WHERE p1 = 'a' and p2 = 'uk.altavista.com'
AND p3 = 'web/results?itag=q=kgs=kls=';
 
QUERY PLAN 
  
--
 Index Scan using t_pkey on t  (cost=0.00..6.02 rows=1 width=102) (actual
time=95.188..95.188 rows=0 loops=1)
   Index Cond: (((p1)::text = 'a'::text) AND ((p2)::text =
'uk.altavista.com'::text) AND ((p3)::text =
'web/results?itag=q=kgs=kls='::text))
 Total runtime: 95.239 ms
(3 rows)

Query after dropping index, fully cached:

db# explain analyze select * from t WHERE p1 = 'a' and p2 = 'uk.altavista.com'
AND p3 = 'web/results?itag=q=kgs=kls=';
 
QUERY PLAN 
  
--
 Index Scan using t_pkey on t  (cost=0.00..6.02 rows=1 width=102) (actual
time=0.030..0.030 rows=0 loops=1)
   Index Cond: (((p1)::text = 'a'::text) AND ((p2)::text =
'uk.altavista.com'::text) AND ((p3)::text =
'web/results?itag=q=kgs=kls='::text))
 Total runtime: 0.077 ms
(3 rows)



And one where the query planner chooses the primary key instead.  Both p2 and
p3 are present as Most Common Values in pg_statistics:

Query before fully cached:

db# explain analyze SELECT * FROM t WHERE p1 = 'b' AND p2 = 'www.google.com'
AND p3 = 'search?hl=lr=q=';

   
QUERY PLAN 

--
 Index Scan using t_pkey on t  (cost=0.00..6.02 rows=1 width=102) (actual
time=212.092..212.100 rows=1 loops=1)
   Index Cond: (((p1)::text = 'b'::text) AND ((p2)::text =
'www.google.com'::text) AND ((p3)::text = 'search?hl=lr=q='::text))
 Total runtime: 212.159 ms
(3 rows)

Query after fully cached:

db# explain analyze SELECT * FROM t WHERE p1 = 'b' AND p2 = 'www.google.com'
AND p3 = 'search?hl=lr=q=';
   
QUERY PLAN 

--
 Index Scan using t_pkey on t 

[PERFORM] CURSOR OR OFFSET/LIMIT

2006-04-06 Thread Kaloyan Iliev

Hi,

I am working on a web-based application using Perl and Apache.
I have to show users some query results in pages.
Sometimes the result can be over 1000 rows (and can be much more).
The question is how best to do this.

One way is to use OFFSET and LIMIT. That's OK, but every time the
whole query must be parsed and executed.
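
(For illustration, paging with OFFSET and LIMIT looks something like
this - a sketch; the table and column names are hypothetical:)

SELECT id, title
FROM results
ORDER BY id
LIMIT 50 OFFSET 200;  -- page 5 at 50 rows per page
-- The server still computes and discards the 200 skipped rows, so the
-- cost grows with the page number, and the ORDER BY is needed for
-- pages to be stable between requests.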


If I use cursors it's better, but my problem is that cursors live only in
the current transaction.

So when the web server finishes, I've lost the transaction and the cursor.

There is some software written by my colleagues that on every server
request opens a transaction and a cursor, moves to the requested
page and shows the result (after that the script finishes, and so does
the transaction). So my question is:
should I rewrite this to use OFFSET/LIMIT, or is it better to create the
cursor every time and use it to get the rows?
Is there a way to keep a cursor alive between separate browser requests
(and to give it a time to live)? Or, after all, OFFSET and LIMIT?


Thanks in advance.

Kaloyan Iliev




Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Vivek Khera


On Apr 5, 2006, at 5:58 PM, August Zajonc wrote:


Most involve some AMD Opertons, lots of spindles with a good raid
controller preferred to one or two large disks and a good helping of
ram. Be interesting to get some numbers on the sunfire machine.


I can highly recommend the SunFire X4100, however the only dual-channel
RAID card that fits in the box is the Adaptec 2230SLP.  It is not quite
as fast as the LSI 320-2x when running FreeBSD, but is sufficient for my
purpose.






Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Vivek Khera


On Apr 5, 2006, at 9:11 PM, Marcelo Tada wrote:


What are you think about the Sun Fire X64 X4200 Server?


I use the X4100 and like it a lot.  I'm about to buy another.  I see  
no advantage to the X4200 unless you want the extra internal disks.   
I use an external array.





Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Vivek Khera


On Apr 6, 2006, at 12:47 AM, Leigh Dyer wrote:

I'm sure those little SAS drives would be great for web servers and  
other non-IO-intensive tasks though -- I'd love to get some X4100s  
in to replace our Poweredge 1750s for that. It's a smart move  
overall IMHO,


For this purpose, bang for the buck would lead me to getting a Dell
1850 with hardware RAID and U320 drives.  The quick-n-dirty tests
I've seen on the FreeBSD mailing list show the disks are much faster
on the 1850 than on the X4100 with its SAS disks.





Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Luke Lonergan
Juan,

On 4/6/06 7:01 AM, Matthew Nuzum [EMAIL PROTECTED] wrote:

 Apart from that since I will only get this one chance to
 buy a new server for data processing I need to make sure that I buy
 something that can grow over time as our needs change.  I don't want to
 buy a server only to find out later that it cannot meet our needs with
 future database projects.  I have to balance a limited budget, room for
 future performance growth, and current system requirements.  Trust me it
 isn't easy.
 
 Isn't it about time we had our annual "what kind of server can I get
 for $8k" thread?

Based on Juan's description, here's a config that will *definitely* be the
fastest possible in an $8K budget:

Buy a dual opteron server with 8 x 400GB SATA II disks on a 3Ware 9550SX
RAID controller with 16GB of RAM pre-installed with Centos 4.3 for $6,000
here:
  http://www.asacomputers.com/

Download the *free* open source Bizgres here:
  http://bgn.greenplum.com/

Use bitmap indexes for columns with less than 10,000 unique values, and your
system will fly through 100GB.
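
(Assuming Bizgres exposes its on-disk bitmap indexes as an index access
method, creating one might look like this - the syntax and table are
illustrative, not taken from the Bizgres docs:)

CREATE INDEX sales_store_bmidx ON sales USING bitmap (store_id);
-- Bitmap indexes pay off on low-cardinality columns like the one above
-- and stay far smaller than a btree on the same column.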

This is the fastest OSS business intelligence kit for the money, guaranteed.

- Luke





Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Ragnar
On fös, 2006-04-07 at 00:01 +1000, Brian Herlihy wrote:
 --- Ragnar [EMAIL PROTECTED] wrote:
 
  On fim, 2006-04-06 at 19:27 +1000, Brian Herlihy wrote:
  
   Yes, the primary key is far better.  I gave it the ultimate test - I
  dropped
   the (p2, p3) index.  It's blindingly fast when using the PK, 
  
  I have problems understanding exactly how an index on 
  (p1,p2,p3) can be faster than and index on (p2,p3) for
  a query not involving p1.

 db# explain analyze select * from t WHERE p1 = 'a' and p2 = 'uk.altavista.com'
 AND p3 = 'web/results?itag=q=kgs=kls=';

This is different from what you said earlier: in your
original post you showed a problem query without any
reference to p1 in the WHERE clause. This confused me.

  Index Scan using p2_p3_idx on t  (cost=0.00..6.02 rows=1 width=102) (actual
 time=2793.247..2793.247 rows=0 loops=1)
Index Cond: (((p2)::text = 'uk.altavista.com'::text) AND ((p3)::text =
 'web/results?itag=q=kgs=kls='::text))
Filter: ((p1)::text = 'a'::text)
  Total runtime: 2793.303 ms
 (4 rows)

try to add an ORDER BY clause:

explain analyze 
  select * from t 
  WHERE p1 = 'a'
and p2 = 'uk.altavista.com'
AND p3 = 'web/results?itag=q=kgs=kls='
  ORDER BY p1,p2,p3;

This might push the planner into using the primary key.

gnari







Re: [PERFORM] CURSOR OR OFFSET/LIMIT

2006-04-06 Thread John DeSoi


On Apr 6, 2006, at 10:48 AM, Kaloyan Iliev wrote:

If I use cursors it's better but my problem is that cursors live  
only in the current transaction.
So when the Web Server finish I've lost the transaction and the  
cursor.



Cursors can live outside the transaction if you declare them WITH HOLD.
But that still may not help you in a web environment if you want to
break the results into pages served on separate requests (and possibly
different connections).


http://www.postgresql.org/docs/8.1/interactive/sql-declare.html
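
(A minimal sketch of a WITH HOLD cursor - the query is hypothetical:)

BEGIN;
DECLARE page_cur CURSOR WITH HOLD FOR
    SELECT id, title FROM results ORDER BY id;
COMMIT;
-- The result set is materialized at COMMIT, so the cursor survives the
-- transaction.  Later, on the *same* connection:
FETCH FORWARD 50 FROM page_cur;  -- one page
FETCH FORWARD 50 FROM page_cur;  -- the next page
CLOSE page_cur;  -- held cursors persist until closed or the session ends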

Is there a way to save the cursor between separe Browser request  
(and to give it time to live)?


Sure, but you need to add a lot of connection management to do this.  
You would need to keep track of the cursors and make sure a  
subsequent request uses the right connection.







John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL




Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Jignesh K. Shah

Hi Leigh

inline comments


Leigh Dyer wrote:


Luke Lonergan wrote:


Juan,


We've got a Sun Fire V40z and it's quite a nice machine -- 6x 15krpm 
drives, 4GB RAM, and a pair of Opteron 850s. This gives us more than 
enough power now for what we need, but it's nice to know that we can 
shoehorn a lot more RAM, and up it to eight CPU cores if needed.


The newer Sun Opteron systems look nice too, but unless you're using 
external storage, their little 2.5 hard drives may not be ideal.




That's because the Sun Fire V40z had the disk write cache turned on while
the 4200/4100 has the write cache turned off. There is a religious belief
around the write cache on the disk in Sun :-)  To really compare the
performance, you have to turn on the write cache (I believe it was
format -e and the cache option, but that could have changed - I need to
verify that again).  The same goes for the T2000 SAS disks: the write
cache is turned off on them too, so be careful before you compare
benchmarks on internal drives :-)


-Jignesh




Thanks
Leigh



- Luke





Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

2006-04-06 Thread Jignesh K. Shah
For a DSS-type workload with PostgreSQL, where you end up with single
long-running queries on about 100GB, you'd better use something like a
Sun Fire V40z with those fast Ultra320 internal drives.
This might be the perfect low-cost complete database in a box.


The Sun Fire T2000 is great for OLTP, where you can end up with hundreds
of users doing quick and small lookups, and the T2000 can crank through
simple thread executions far better than others. However, when it comes
to long-running queries, you end up using 1/32 of the power and it may
not live up to your expectations.  For example, consider your PostgreSQL
talking to an Apache web server, all on the T2000... You can put them in
separate zones if you have different administrators for them. :-)


As for PostgreSQL on Solaris, I already have the best parameters to use
on Solaris based on my tests: the default open_datasync hurts performance
on Solaris, and so does the default checkpoint_segments; the others are
tweaked so that they are set for bigger databases and hence may not show
much difference in performance...


That said, I will still be interested to see your app's performance with
PostgreSQL on the Sun Fire T2000, as there are always ways to persevere
and improve performance :-)



Regards,
Jignesh


Juan Casero (FL FLC) wrote:


Because I plan to develop a rather large (for us anyway) data warehouse
with PostgreSQL.  I am looking for the right hardware that can handle
queries on a database that might grow to over 100 gigabytes.  Right
now our decision support system based on postgresql 8.1.3 stores retail
sales information for about four years back *but* only as weekly
summaries.  I want to build the system so it can handle daily sales
transactions also.  You can imagine how many more records this will
involve so I am looking for hardware that can give me the performance I
need to make this project usable.  In other words, parsing and loading
the daily transaction logs for our stores is likely to take huge amounts
of effort.  I need a machine that can complete the task in a reasonable
amount of time.  As people start to query the database to find sales
related reports and information I need to make sure the queries will run
reasonably fast for them.  I have already hand optimized all of my
queries on the current system.  But currently I only have weekly sales
summaries.  Other divisions in our company have done a similar project
using MS SQL Server on SMP hardware far outclassing the database server
I currently use and they report heavy loads on the server with less than
ideal query run times.  I am sure I can do my part to optimize the
queries once I start this project but there is only so much you can do.
At some point you just need more powerful hardware.  This is where I am
at right now.  Apart from that since I will only get this one chance to
buy a new server for data processing I need to make sure that I buy
something that can grow over time as our needs change.  I don't want to
buy a server only to find out later that it cannot meet our needs with
future database projects.  I have to balance a limited budget, room for
future performance growth, and current system requirements.  Trust me it
isn't easy.  



Juan

-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED] 
Sent: Thursday, April 06, 2006 2:57 AM

To: pgsql-performance@postgresql.org
Cc: Juan Casero (FL FLC); Luke Lonergan
Subject: Re: [PERFORM] Sun Fire T2000 and PostgreSQL 8.1.3

Juan,

 

Ok that is beginning to become clear to me.  Now I need to determine 
if this server is worth the investment for us.  Maybe it is not a 
speed demon but to be honest the licensing costs of an SMP-aware 
RDBMS are outside our budget.
   



You still haven't explained why you want multi-threaded queries.  This
is sounding like keeping up with the Joneses.

--
Josh Berkus
Aglio Database Solutions
San Francisco






Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Brian Herlihy

--- Ragnar [EMAIL PROTECTED] wrote:

 On fös, 2006-04-07 at 00:01 +1000, Brian Herlihy wrote:
   Index Scan using p2_p3_idx on t  (cost=0.00..6.02 rows=1 width=102)
 (actual
  time=2793.247..2793.247 rows=0 loops=1)
 Index Cond: (((p2)::text = 'uk.altavista.com'::text) AND ((p3)::text =
  'web/results?itag=q=kgs=kls='::text))
 Filter: ((p1)::text = 'a'::text)
   Total runtime: 2793.303 ms
  (4 rows)
 
 try to add an ORDER BY clause:
 
 explain analyze 
   select * from t 
   WHERE p1 = 'a'
 and p2 = 'uk.altavista.com'
 AND p3 = 'web/results?itag=q=kgs=kls='
   ORDER BY p1,p2,p3;
 
 this might push the planner into using the primary key
 
 gnari
 

Thank you very much - that works very well for SELECT.  However, I need it
to work for UPDATE as well.  Is there an equivalent way to force use of an
index for updates?

Here are the results for select:

db# explain analyze select * from t WHERE p1 = 'a' and p2 = 'uk.altavista.com'
AND p3 = 'web/results?itag=q=kgs=kls=' order by p1,p2,p3;
 
QUERY PLAN 
  
--
 Index Scan using t_pkey on t  (cost=0.00..6.02 rows=1 width=102) (actual
time=32.519..32.519 rows=0 loops=1)
   Index Cond: (((p1)::text = 'a'::text) AND ((p2)::text =
'uk.altavista.com'::text) AND ((p3)::text =
'web/results?itag=q=kgs=kls='::text))
 Total runtime: 32.569 ms
(3 rows)

db# explain analyze select * from t WHERE p1 = 'a' and p2 = 'uk.altavista.com'
AND p3 = 'web/results?itag=q=kgs=kls=';
  QUERY PLAN   

---
 Index Scan using p2_p3_idx on t  (cost=0.00..6.02 rows=1 width=102) (actual
time=2790.364..2790.364 rows=0 loops=1)
   Index Cond: (((p2)::text = 'uk.altavista.com'::text) AND ((p3)::text =
'web/results?itag=q=kgs=kls='::text))
   Filter: ((p1)::text = 'a'::text)
 Total runtime: 2790.420 ms
(4 rows)


But I cannot add an order by to an update.

The other idea I came up with last night was to change p2_p3_idx so it indexes
a value derived from p2 and p3, rather than p2 and p3 themselves.  This would
hide this index from the optimizer, forcing it to use the primary key.
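
(A sketch of that idea - md5 and the separator are illustrative choices;
any expression the planner cannot match to a plain (p2, p3) condition
would do:)

CREATE INDEX p2_p3_derived_idx ON t ((md5(p2 || '|' || p3)));

-- A plain three-column condition no longer matches that index, so the
-- planner falls back to the (p1, p2, p3) primary key:
UPDATE t SET i1 = 0
 WHERE p1 = 'a' AND p2 = 'uk.altavista.com'
   AND p3 = 'web/results?itag=q=kgs=kls=';

-- Queries that want the old (p2, p3) lookup phrase it via the expression:
SELECT * FROM t
 WHERE md5(p2 || '|' || p3)
     = md5('uk.altavista.com' || '|' || 'web/results?itag=q=kgs=kls=');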

I am really surprised that I have to go through such contortions just to use
the primary key!  This area of Postgres needs improvement.

Thanks,
Brian



Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Dave Dutcher


 -Original Message-
 From: [EMAIL PROTECTED]
[mailto:pgsql-performance-
 [EMAIL PROTECTED] On Behalf Of Brian Herlihy
 Sent: Thursday, April 06, 2006 6:56 PM
 To: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Query planner is using wrong index.
[Snip]
 I am really surprised that I have to go through such contortions just
to
 use
 the primary key!  This area of Postgres needs improvement.
 


Of course, you mentioned that you are using 7.4.7.  You might want to try
upgrading to 8.1.3.  There have been a lot of performance improvements
since 7.4.  I don't know if your specific problem was fixed,
but it's worth a try.

Also you might want to at least upgrade to 7.4.12 for the bug fixes.





Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Brian Herlihy

--- Dave Dutcher [EMAIL PROTECTED] wrote:
  -Original Message-
  To: pgsql-performance@postgresql.org
  Subject: Re: [PERFORM] Query planner is using wrong index.
 [Snip]
  I am really surprised that I have to go through such contortions just
 to
  use
  the primary key!  This area of Postgres needs improvement.
  
 
 
 Of course you mentioned that you are using 7.4.7.  You might want to try
 upgrading to 8.1.3.  There have been a lot of improvements to the
 performance since 7.4. I don't know if your specific problem was fixed,
 but it's worth a try.
 
 Also you might want to at least upgrade to 7.4.12 for the bug fixes.

Thanks for the suggestions.  I've verified the same problem in 8.1.3 as well,
after my initial post.  It was actually in 8.1.3 that I first discovered the
problem.

I noticed this item in the TODO list:

- Allow accurate statistics to be collected on indexes with more than one
column or expression indexes, perhaps using per-index statistics

This is what I need!  But until that is added, I need a way to use the primary
key with the current version :)

Thanks,
Brian



Re: [PERFORM] Query planner is using wrong index.

2006-04-06 Thread Tom Lane
Brian Herlihy [EMAIL PROTECTED] writes:
 My options seem to be
   - Fudge the analysis results so that the selectivity estimate changes.  I
 have tested reducing n_distinct, but this doesn't seem to help.
   - Combine the columns into one column, allowing postgres to calculate the
 combined selectivity.
   - Drop the (p2, p3) index.  But I need this for other queries.

Have you considered reordering the pkey to be (p2,p3,p1) and then
dropping the (p2,p3) index?
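
(Concretely, that would look something like this - a sketch using the
constraint and index names from earlier in the thread; both steps take
exclusive locks and rebuild the index:)

BEGIN;
ALTER TABLE t DROP CONSTRAINT t_pkey;
ALTER TABLE t ADD PRIMARY KEY (p2, p3, p1);
DROP INDEX p2_p3_idx;
COMMIT;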

regards, tom lane
