Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-07 Thread david
On Thu, 7 May 2009, Robert Haas wrote: On Wed, May 6, 2009 at 6:08 PM, Simon Riggs wrote: Agreed. Perhaps I should say then that the syntax needs to express the requirements of the planner/executor behaviour, rather than being the main aspect of the feature, as some have suggested. Agreed.

Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-07 Thread Robert Haas
On Wed, May 6, 2009 at 6:08 PM, Simon Riggs wrote: > Agreed. Perhaps I should say then that the syntax needs to express the > requirements of the planner/executor behaviour, rather than being the > main aspect of the feature, as some have suggested. Agreed. > Hopefully, notions of partitioning w

Re: [PERFORM] Indexes not used in DELETE

2009-05-07 Thread Tom Lane
Viktor Rosenfeld writes: > -> Seq Scan on corpus toplevel (cost=0.00..1.39 > rows=1 width=54) > Filter: (top_level AND (id = 25::numeric)) > Specifically, I'm wondering why the innermost scan on corpus > (toplevel) does not use the

Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-07 Thread David Blewett
On Thu, May 7, 2009 at 6:44 PM, Tom Lane wrote: > Look into pg_stats for the rows concerning the columns used in the > query's WHERE and JOIN/ON clauses. Okay, here you go: http://rafb.net/p/20y8Oh72.html David -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To ma

[PERFORM] Indexes not used in DELETE

2009-05-07 Thread Viktor Rosenfeld
Hi everybody, I'm wondering why a DELETE statement of mine does not make use of defined indexes on the tables. I have the following tables which are linked as such: component -> rank -> node -> corpus; Now I want to delete all entries in component by giving a list of corpus ids. The q

Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-07 Thread Tom Lane
David Blewett writes: > On Thu, May 7, 2009 at 4:31 PM, Tom Lane wrote: >> as few as ten rows out, it'd likely switch to a different plan.  So the >> So the question to ask is why the rowcount estimates are so abysmally bad. >> You mentioned having tried to increase the stats targets, but without

Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-07 Thread David Blewett
On Thu, May 7, 2009 at 4:31 PM, Tom Lane wrote: > as few as ten rows out, it'd likely switch to a different plan.  So the > So the question to ask is why the rowcount estimates are so abysmally bad. > You mentioned having tried to increase the stats targets, but without > seeing the actual stats d

Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-07 Thread Tom Lane
David Blewett writes: > On Thu, May 7, 2009 at 12:53 PM, David Blewett wrote: >> 1. http://dpaste.com/hold/41842/ >> 2. http://explain.depesz.com/s/Wg >> 3. http://explain.depesz.com/s/1s >> 4. http://dpaste.com/hold/41846/ > Forgot to mention that I'm using Postgres 8.3.6 on linux 2.6.24. Well

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Simon Riggs
On Thu, 2009-05-07 at 20:36 +0200, Dimitri wrote: > I've simply restarted a full test with hashjoin OFF. Until 32 > concurrent users things are going well. Then since 32 users response > time is jumping to 20ms, with 64 users it's higher again, and with 256 > users reaching 700ms, so TPS is dropp

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Dimitri
I've simply restarted a full test with hashjoin OFF. Until 32 concurrent users things are going well. Then since 32 users response time is jumping to 20ms, with 64 users it's higher again, and with 256 users reaching 700ms, so TPS is dropping from 5.000 to ~200.. With hashjoin ON it's not happenin

Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-07 Thread Scott Carey
On 5/7/09 1:54 AM, "Simon Riggs" wrote: > > > On Thu, 2009-05-07 at 10:56 +0800, Craig Ringer wrote: >> Tom Lane wrote: >> >> I also expect that in the future there will be demand for striping data >> across multiple partitions in different tablespaces to exploit >> in-parallel scanning (wh

Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-07 Thread David Blewett
On Thu, May 7, 2009 at 12:53 PM, David Blewett wrote: > 1. http://dpaste.com/hold/41842/ > 2. http://explain.depesz.com/s/Wg > 3. http://explain.depesz.com/s/1s > 4. http://dpaste.com/hold/41846/ Forgot to mention that I'm using Postgres 8.3.6 on linux 2.6.24. Shared buffers are set to 1GB, effec

[PERFORM] Bad Plan for Questionnaire-Type Query

2009-05-07 Thread David Blewett
I have a query [1] that Postgres is insisting on using a Nested Loop for some reason when a Hash Join is much faster. It seems like the estimates are way off. I've set default_statistics_target to 250, 500, 1000 and analyzed, but they never seem to improve. If I disable nestloops, the query complet

Re: [PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread Nikolas Everett
On Thu, May 7, 2009 at 11:19 AM, Matthew Wakeling wrote: > On Thu, 7 May 2009, David Brain wrote: > >> Certainly random access like this index scan can be extremely slow. 2-4 >>> MB/s >>> is quite reasonable if you're fetching one 8kB block per disc seek - no >>> more >>> than 200 per second. >>>

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Alvaro Herrera
Dimitri escribió: > As well when you look on profiling technology - all such kind of > solutions are based on the system clock frequency and have their > limits on time resolution. On my system this limit is 0.5ms, and it's > too big comparing to the query execution time :-) > > So, what I've don

Re: [PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread Matthew Wakeling
On Thu, 7 May 2009, David Brain wrote: Certainly random access like this index scan can be extremely slow. 2-4 MB/s is quite reasonable if you're fetching one 8kB block per disc seek - no more than 200 per second. We have read ahead set pretty aggressively high as the SAN seems to 'like' this,

Re: [PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread Nikolas Everett
> > Nested Loop Left Join (cost=0.00..6462463.96 rows=1894 width=110) > -> Append (cost=0.00..6453365.66 rows=1894 width=118) > -> Seq Scan on datatable sum (cost=0.00..10.75 rows=1 width=118) > Filter: ((datapointdate >= '2009-04-01 > 00:00:00'::timestamp without time

Re: [PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread David Brain
Hi, Some answers in-line: > > Has there been a performance *change*, or are you just concerned about a > query which doesn't seem to use "enough" disc bandwidth? Performance has degraded noticeably over the past few days. > Certainly random access like this index scan can be extremely slow. 2-4

Re: [PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread Matthew Wakeling
On Thu, 7 May 2009, David Brain wrote: This has been working reasonably well, however in the last few days I've been seeing extremely slow performance on what are essentially fairly simple 'index hitting' selects on this data. From the host side I see that the postgres query process is mostly in

Re: [PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread David Brain
Hi, Interesting, for one index on one partition: idx_scan: 329 idx_tup_fetch: 8905730 So maybe a reindex would help? David. On Thu, May 7, 2009 at 10:26 AM, Scott Mead wrote: > On Thu, May 7, 2009 at 10:14 AM, David Brain wrote: >> >> Hi, >> >> Some context, we have a _lot_ of data, > 1TB, m

Re: [PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread Scott Mead
On Thu, May 7, 2009 at 10:14 AM, David Brain wrote: > Hi, > > Some context, we have a _lot_ of data, > 1TB, mostly in 1 'table' - > the 'datatable' in the example below although in order to improve > performance this table is partitioned (by date range) into a number of > partition tables. Each

[PERFORM] Slow select performance despite seemingly reasonable query plan

2009-05-07 Thread David Brain
Hi, Some context, we have a _lot_ of data, > 1TB, mostly in 1 'table' - the 'datatable' in the example below although in order to improve performance this table is partitioned (by date range) into a number of partition tables. Each partition contains up to 20GB of data (tens of millons of rows),

Re: [PERFORM] GiST index performance

2009-05-07 Thread Matthew Wakeling
On Thu, 7 May 2009, Oleg Bartunov wrote: Did you try Guttman quadratic split algorithm ? We also found linear split algorithm for Rtree. The existing (bugfixed) seg split algorithm is the Guttman quadratic split algorithm. Guttman did all his work on two-dimensional and above data, dismissing

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Simon Riggs
On Thu, 2009-05-07 at 12:58 +0100, Gregory Stark wrote: > It would be nice if we were in the same ballpark as MySQL but we would only be > interesting in such optimizations if they don't come at the expense of > scalability under more complex workloads. It doesn't appear there is a scalability i

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Gregory Stark
Simon Riggs writes: > We already know that MySQL favors nested loop joins >From what I read I thought that was the *only* type of join MySQL supports. The big picture view here is that whether we run a short query in half a millisecond versus two milliseconds is usually not really important. It

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Merlin Moncure
On Thu, May 7, 2009 at 4:20 AM, Dimitri wrote: > Hi Simon, > > may you explain why REINDEX may help here?.. - database was just > created, data loaded, and then indexes were created + analyzed.. What > may change here after REINDEX?.. > > With hashjoin disabled was a good try! > Running this query

Re: [PERFORM] GiST index performance

2009-05-07 Thread Oleg Bartunov
On Wed, 6 May 2009, Tom Lane wrote: Matthew Wakeling writes: Here is my patch ported over to the seg contrib package, attached. Apply it to seg.c and all should be well. A similar thing needs to be done to cube, but I haven't looked at that. Teodor, Oleg, do you intend to review/apply this p

Re: [PERFORM] Transparent table partitioning in future version of PG?

2009-05-07 Thread Simon Riggs
On Thu, 2009-05-07 at 10:56 +0800, Craig Ringer wrote: > Tom Lane wrote: > > Alvaro Herrera writes: > >> I think there should be a way to refer to individual partitions as > >> objects. > > > > Yeah, the individual partitions should be nameable tables, otherwise we > > will be reinventing a *who

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Dimitri
The problem with "gprof" - it'll profile all stuff from the beginning to the end, and a lot of function executions unrelated to this query will be profiled... As well when you look on profiling technology - all such kind of solutions are based on the system clock frequency and have their limits on

Re: [PERFORM] Any better plan for this query?..

2009-05-07 Thread Dimitri
Hi Simon, may you explain why REINDEX may help here?.. - database was just created, data loaded, and then indexes were created + analyzed.. What may change here after REINDEX?.. With hashjoin disabled was a good try! Running this query "as it" from 1.50ms we move to 0.84ms now, and the plan is he