Re: [PERFORM] query plan not optimal
On 29/12/2013 19:51, Jeff Janes wrote: On Thursday, December 19, 2013, Marc Cousin wrote:

Yeah, I had forgotten to set it up correctly on this test environment (its value is correctly set in production environments). Putting it to a few gigabytes here gives me this cost:

bacula=# explain select pathid, filename from batch join path using (path);
                              QUERY PLAN
 Nested Loop  (cost=0.56..2083904.10 rows=479020 width=26)
   ->  Seq Scan on batch  (cost=0.00..11727.20 rows=479020 width=85)
   ->  Index Scan using idx_path on path  (cost=0.56..4.32 rows=1 width=16)
         Index Cond: (path = batch.path)
(4 rows)

It still chooses the hash join though, but by a smaller margin.

This is still a tangent from your original point, but if I create an index on path (path, pathid), then I can get an index-only scan. This actually is not much faster when everything is already cached, but the planner thinks it will be about 2x faster because it assumes the vm block accesses are free. So this might be enough to tip it over for you.

Yeah, still a tangent :) Many bacula users don't have index-only scans (the one I was having trouble with, for example), as they are still on a PostgreSQL version older than 9.2.

And it still only accesses a very small part of path (always the same 5000 records) during the query, which isn't accounted for in the cost, if I understand correctly?

I think you are correct: it doesn't take account of ndistinct being 10 to 100 fold less than ntuples on the outer loop, which theoretically could propagate down to the table size used in connection with effective_cache_size.

It seems to me the cost of the hash join is being greatly underestimated, which I think is more important than the nested loop being overestimated. (And in my hands, the merge join is actually the winner both in the planner and in reality, but I suspect this is because all of your fake paths are lexically greater than all of the real paths.)

Yes, probably.

Also, you talked earlier about cheating the planner by lowering random_page_cost. But why is that cheating? If caching means the cost is truly lower...

It feels like cheating, as I feel I'm compensating for what looks like a bad estimate of the cost: the nested loop is very fast, even if nothing is cached at the beginning. We could hardcode low *_page_cost values in bacula's code for this query, but it is not that good to put them in postgresql.conf as we currently do, as some other queries suffer from those very low costs. Anyway, it would be even better if it wasn't needed at all, hence this post :)
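[Editor's note] A minimal sketch of the covering index Jeff describes, plus a quick way to check whether the planner then picks an index-only scan. The index name is made up, and index-only scans assume PostgreSQL 9.2 or later; this is not part of Bacula's actual code.

CREATE INDEX idx_path_covering ON path (path, pathid);
VACUUM ANALYZE path;   -- keeps the visibility map current, so the index-only scan stays cheap
EXPLAIN SELECT pathid, filename FROM batch JOIN path USING (path);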
[PERFORM] query plan not optimal
Hi, I'm having something I feel is a bit of a limitation of the optimizer (or something I don't understand :) ). Sorry, this is a rather long mail. I have a workaround for the problem below, but I don't like cheating the optimizer for no good reason.

First a little bit of context, because the example below may feel dumb if you read it with no explanation: I'm working on the bacula project, trying to further optimize data insertion into the database (we've been working on this subject for a few years now, as backup speed is a core functionality of any backup software). Bacula is a backup software: we store all of the backup's metadata in the database. It means all backed up files…

The two very large tables are file and path:
* file contains metadata about ALL files that have been backed up and are still available in the database. It sometimes contains several billion records.
* path contains the full path of a file. It is stored outside of file to save (lots of) space, as most paths are identical across files. It still usually contains 20 to 50 million records, and can be as big as 20GB.

So we have something like this:

create table file (fileid bigserial, filename text, pathid bigint);
alter table file add primary key (fileid);
create table path (pathid bigserial, path text);
alter table path add primary key (pathid);
create unique index idx_path on path(path);

I removed some columns from the file table: we store stat(), a checksum, etc. They're not needed for this example.

In order to insert data efficiently, backups first store data into temp tables which look like this:

create table batch (path text, filename text);

(once again I removed all metadata columns)

If you want to create a batch table with useful data to replicate what I'm going to show, you can try something like this:

find /home -printf '%h\t%f\n' | psql -c 'COPY batch FROM STDIN' bacula

We analyze the batch table: in the real code, it is a temp table, so it is compulsory:

analyze batch;

Then we insert missing paths. This is one of the plans that fail, but we'll focus on the second one: as we are starting from an empty path table, this query won't be realistic.

insert into path (path) select path from batch where not exists (select 1 from path where path.path=batch.path) group by path;

We analyze:

analyze path;

So now we insert into the file table.

insert into file (pathid,filename) select pathid, filename from batch join path using (path);

Here is the plan:

bacula=# explain select pathid, filename from batch join path using (path);
                              QUERY PLAN
 Hash Join  (cost=822.25..22129.85 rows=479020 width=26)
   Hash Cond: (batch.path = path.path)
   ->  Seq Scan on batch  (cost=0.00..11727.20 rows=479020 width=85)
   ->  Hash  (cost=539.89..539.89 rows=22589 width=81)
         ->  Seq Scan on path  (cost=0.00..539.89 rows=22589 width=81)
(5 rows)

I have this plan almost all the time. Let's add a bunch of useless records in path to be more realistic and make things worse: usually, bacula inserts data in batches of 500,000 rows, and path is very large (millions of records) and bigger than work_mem (it would have to be 20GB in the most extreme cases), so it may thrash disks heavily (many servers can be inserting at the same time).
Let's simulate this (remove indexes before, put them back afterwards :) ) insert into path (path) select generate_series(100,5000); #Create a realistic path table analyze path; Here is the plan: bacula=# explain (analyze,buffers,verbose) select pathid, filename from batch join path using (path); QUERY PLAN Hash Join (cost=1646119.66..1945643.59 rows=479020 width=26) (actual time=27275.240..36745.904 rows=479020 loops=1) Output: path.pathid, batch.filename Hash Cond: (batch.path = path.path) Buffers: shared hit=130760 read=179917 written=1823, temp read=224130 written=223876 - Seq Scan on public.batch (cost=0.00..11727.20 rows=479020 width=85) (actual time=0.259..176.031 rows=479020 loops=1) Output: batch.filename, batch.path Buffers: shared read=6937 written=1823 - Hash (cost=793966.96..793966.96 rows=49022696 width=16) (actual time=27274.725..27274.725 rows=49022590 loops=1) Output: path.pathid, path.path Buckets: 131072 Batches: 128 Memory Usage: 18329kB Buffers: shared hit=130760 read=172980, temp written=218711 - Seq Scan on public.path (cost=0.00..793966.96 rows=49022696 width=16) (actual time=0.231..9650.452 rows=49022590 loops=1) Output: path.pathid,
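[Editor's note] As archived, the generate_series call above only produces a few thousand rows, while the plan shows roughly 49 million rows in path. A hypothetical way to build a path table of that size is sketched below; the path format and the row count are illustrative assumptions, and as the poster notes the indexes should be dropped first and recreated afterwards, otherwise this will be very slow.

INSERT INTO path (path)
SELECT '/some/fake/dir/' || g || '/subdir'   -- made-up path pattern, one distinct path per row
FROM generate_series(1, 49000000) AS g;
ANALYZE path;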
Re: [PERFORM] query plan not optimal
On 19/12/2013 19:33, Jeff Janes wrote:

                              QUERY PLAN
 Nested Loop  (cost=0.56..4001768.10 rows=479020 width=26) (actual time=2.303..15371.237 rows=479020 loops=1)
   Output: path.pathid, batch.filename
   Buffers: shared hit=2403958 read=7539
   ->  Seq Scan on public.batch  (cost=0.00..11727.20 rows=479020 width=85) (actual time=0.340..160.142 rows=479020 loops=1)
         Output: batch.path, batch.filename
         Buffers: shared read=6937
   ->  Index Scan using idx_path on public.path  (cost=0.56..8.32 rows=1 width=16) (actual time=0.030..0.031 rows=1 loops=479020)
         Output: path.pathid, path.path
         Index Cond: (path.path = batch.path)
         Buffers: shared hit=2403958 read=602
 Total runtime: 15439.043 ms

As you can see, more than twice as fast, and a very high hit ratio on the path table, even if we start from a cold cache (I did, here, both PostgreSQL and OS). We have an excellent hit ratio because the batch table contains few different paths (several files per directory), and is already quite clustered, as it comes from a backup, which is of course performed directory by directory.

What is your effective_cache_size set to?

Cheers, Jeff

Yeah, I had forgotten to set it up correctly on this test environment (its value is correctly set in production environments). Putting it to a few gigabytes here gives me this cost:

bacula=# explain select pathid, filename from batch join path using (path);
                              QUERY PLAN
 Nested Loop  (cost=0.56..2083904.10 rows=479020 width=26)
   ->  Seq Scan on batch  (cost=0.00..11727.20 rows=479020 width=85)
   ->  Index Scan using idx_path on path  (cost=0.56..4.32 rows=1 width=16)
         Index Cond: (path = batch.path)
(4 rows)

It still chooses the hash join though, but by a smaller margin. And it still only accesses a very small part of path (always the same 5000 records) during the query, which isn't accounted for in the cost, if I understand correctly?
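[Editor's note] A sketch of the check Jeff is asking about: raising effective_cache_size for the current session only and re-planning. The 4GB figure is an arbitrary example, not the poster's actual setting; the parameter only changes cost estimates, not memory use.

SET effective_cache_size = '4GB';   -- illustrative value, session-local
EXPLAIN SELECT pathid, filename FROM batch JOIN path USING (path);
RESET effective_cache_size;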
Re: [PERFORM] query plan not optimal
On 19/12/2013 21:36, Kevin Grittner wrote:

Marc Cousin cousinm...@gmail.com wrote:

Then we insert missing paths. This is one of the plans that fail:

insert into path (path)
select path from batch
where not exists (select 1 from path where path.path=batch.path)
group by path;

I know you said you wanted to focus on a different query, but this one can easily be optimized. Right now it is checking for an existing row in path for each row in batch; and it only needs to check once for each path. One way to write it would be:

insert into path (path)
select path from (select distinct path from batch) b
where not exists (select 1 from path p where p.path = b.path);

Yeah, I know, that's why I said I didn't want to focus on this one… we already do this optimization :) Thanks anyway :)
Re: [PERFORM] overzealous sorting?
On Tue, 27 Sep 2011 12:45:00 +1000, anthony.ship...@symstream.com wrote:

On Monday 26 September 2011 19:39, Marc Cousin wrote:

Because Index Scans are sorted, not Bitmap Index Scans, which build a list of pages to visit, to be then visited by the Bitmap Heap Scan step. Marc.

Where does this bitmap index scan come from? It seems to negate the advantages of b-tree indexes described in the section "Indexes and ORDER BY" of the manual. If I do "set enable_bitmapscan = off;" the query runs a bit faster, although with a larger time range it reverts to a sequential scan.

A Bitmap Index Scan is just another way to use a btree index. It is often used when a bigger part of a table is required: it costs more than a plain index scan to retrieve a few records, but less when a lot of records are needed.

Your tests show that index scans are a bit faster on this query. But that is probably true only when most of the needed data is cached, which is probably your case, as you are doing tests using the same query all the time. The bitmap index scan is probably cheaper when data isn't in cache. You could also see the bitmap index scan as safer, as it won't perform as badly when data is not cached (less random IO) :)

The thing is, the optimizer doesn't know if your data will be in cache when you run your query… if you are sure most of your data is in the cache most of the time, you could try to lower random_page_cost to reflect that data is cached. But if the win is small on this query, it may not be worth it.
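[Editor's note] A sketch of testing the lower random_page_cost suggested above, scoped to one session so the rest of the workload is unaffected. The value, table and column names are placeholders, not the original poster's schema.

SET random_page_cost = 1.5;            -- illustrative value for mostly-cached data
EXPLAIN ANALYZE
SELECT * FROM measurements             -- placeholder table/column names
WHERE ts BETWEEN '2011-09-26 00:00' AND '2011-09-26 01:00'
ORDER BY ts;
RESET random_page_cost;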
Re: [PERFORM] overzealous sorting?
On Tue, 27 Sep 2011 19:05:09 +1000, anthony.ship...@symstream.com wrote:

On Tuesday 27 September 2011 18:54, Marc Cousin wrote:

The thing is, the optimizer doesn't know if your data will be in cache when you run your query… if you are sure most of your data is in the cache most of the time, you could try to lower random_page_cost to reflect that data is cached. But if the win is small on this query, it may not be worth it.

What I really want is to just read a sequence of records in timestamp order between two timestamps. The number of records to be read may be in the millions, totalling more than 1GB of data, so I'm trying to read them a slice at a time, but I can't get PG to do just this.

If I use offset and limit to grab a slice of the records from a large timestamp range then PG will grab all of the records in the range, sort them on disk and return just the slice I want. This is absurdly slow. The query that I've shown is one of a sequence of queries with the timestamp range progressing in steps of 1 hour through the timestamp range. All I want PG to do is find the range in the index, find the matching records in the table and return them. All of the planner's cleverness just seems to get in the way.

Maybe you should try using a cursor, if you don't know where you'll stop. This, combined with a very low cursor_tuple_fraction, will probably give you what you want (a fast-start plan).
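[Editor's note] A sketch of the cursor approach suggested above: one cursor over the whole range, fetched slice by slice, instead of repeated OFFSET/LIMIT queries. Table and column names are placeholders, and the cursor_tuple_fraction value is only an example.

SET cursor_tuple_fraction = 0.001;     -- strongly favour fast-start plans
BEGIN;
DECLARE slice CURSOR FOR
    SELECT * FROM measurements         -- placeholder table/column names
    WHERE ts >= '2011-09-27 00:00'
      AND ts <  '2011-09-28 00:00'
    ORDER BY ts;
FETCH 10000 FROM slice;                -- repeat FETCH until it returns no rows
CLOSE slice;
COMMIT;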
Re: [PERFORM] inheritance: planning time vs children number vs column number
The Tuesday 01 March 2011 16:33:51, Tom Lane wrote : Marc Cousin cousinm...@gmail.com writes: Le mardi 01 mars 2011 07:20:19, Tom Lane a écrit : It's worth pointing out that the only reason this effect is dominating the runtime is that you don't have any statistics for these toy test tables. If you did, the cycles spent using those entries would dwarf the lookup costs, I think. So it's hard to get excited about doing anything based on this test case --- it's likely the bottleneck would be somewhere else entirely if you'd bothered to load up some data. Yes, for the same test case, with a bit of data in every partition and statistics up to date, planning time goes from 20 seconds to 125ms for the 600 children/1000 columns case. Which is of course more than acceptable. [ scratches head ... ] Actually, I was expecting the runtime to go up not down. Maybe there's something else strange going on here. regards, tom lane Then, what can I do to help ?
Re: [PERFORM] inheritance: planning time vs children number vs column number
The Monday 28 February 2011 13:57:45, Heikki Linnakangas wrote : On 28.02.2011 11:38, Marc Cousin wrote: I've been facing a very large (more than 15 seconds) planning time in a partitioned configuration. The amount of partitions wasn't completely crazy, around 500, not in the thousands. The problem was that there were nearly 1000 columns in the parent table (very special use case, there is a reason for this application for having these many columns). The check constraint was extremely simple (for each child, 1 column = 1 constant, always the same column). As I was surprised by this very large planning time, I have been trying to study the variation of planning time against several parameters: - number of columns - number of children tables - constraint exclusion's value (partition or off) What (I think) I measured is that the planning time seems to be O(n^2) for the number of columns, and O(n^2) for the number of children tables. Constraint exclusion had a limited impact on planning time (it added between 20% and 100% planning time when there were many columns). Testing here with a table with 1000 columns and 100 partitions, about 80% of the planning time is looking up the statistics on attribute width, to calculate average tuple width. I don't see O(n^2) behavior, though, it seems linear. It is only based on experimentation, for my part, of course… If you measure the planning time, modifying either the columns or the partitions number, the square root of the planning time is almost perfectly proportional with the parameter you're playing with.
Re: [PERFORM] inheritance: planning time vs children number vs column number
On Monday 28 February 2011 16:35:37, Tom Lane wrote:

Marc Cousin cousinm...@gmail.com writes:

On Monday 28 February 2011 13:57:45, Heikki Linnakangas wrote:

Testing here with a table with 1000 columns and 100 partitions, about 80% of the planning time is looking up the statistics on attribute width, to calculate average tuple width. I don't see O(n^2) behavior, though, it seems linear.

It is only based on experimentation, for my part, of course… If you measure the planning time, modifying either the columns or the partitions number, the square root of the planning time is almost perfectly proportional with the parameter you're playing with.

Could we see a concrete example demonstrating that? I agree with Heikki that it's not obvious what you are testing that would have such behavior. I can think of places that would have O(N^2) behavior in the length of the targetlist, but it seems unlikely that they'd come to dominate runtime at a mere 1000 columns.

regards, tom lane

I feel a little silly not having provided a test case from the start… A script doing a complete test is attached to this email. It's doing a simple

CREATE TABLE test_father (col0 int, col1 int, col2 int, col3 int, col4 int, col5 int, col6 int, col7 int, col8 int, col9 int, col10 int, col11 int, col12 int, col13 int, col14 int, col15 int, col16 int, col17 int, col18 int, col19 int, col20 int, col21 int, col22 int, col23 int, …)

followed by 600

CREATE TABLE test_child_0 (CHECK (col0=0)) INHERITS (test_father);

and a single

SELECT col0 FROM test_father WHERE col0=0;

Here are my results (from the same machine). I've done it with 600 partitions, to have big planning times. If you need a smaller one (this one takes nearly ten minutes to run) tell me.

COLS:100  PARTITIONS:600  Time : 513,764 ms   (sqrt : 22.6)
COLS:200  PARTITIONS:600  Time : 906,214 ms   (sqrt : 30.1)
COLS:300  PARTITIONS:600  Time : 2255,390 ms  (sqrt : 47.48)
COLS:400  PARTITIONS:600  Time : 4816,820 ms  (sqrt : 69.4)
COLS:500  PARTITIONS:600  Time : 5736,602 ms  (sqrt : 75.73)
COLS:600  PARTITIONS:600  Time : 7659,617 ms  (sqrt : 87.51)
COLS:700  PARTITIONS:600  Time : 9313,260 ms  (sqrt : 96.5)
COLS:800  PARTITIONS:600  Time : 13700,353 ms (sqrt : 117.04)
COLS:900  PARTITIONS:600  Time : 13914,765 ms (sqrt : 117.95)
COLS:1000 PARTITIONS:600  Time : 20335,750 ms (sqrt : 142.6)
COLS:1100 PARTITIONS:600  Time : 21048,958 ms (sqrt : 145.08)
COLS:1200 PARTITIONS:600  Time : 27619,559 ms (sqrt : 166.18)
COLS:1300 PARTITIONS:600  Time : 31357,353 ms (sqrt : 177.08)
COLS:1400 PARTITIONS:600  Time : 34435,711 ms (sqrt : 185.57)
COLS:1500 PARTITIONS:600  Time : 38954,676 ms (sqrt : 197.37)

As for my previous results, these ones are on a machine doing a bit of other work, so some values may be a bit off, and it's only one measure each time anyway. The CSV file I sent with the first email is obtained by running the exact same commands, but playing on both columns and partitions, and averaging over 3 measures.

Regards.

[Attachment: script.sql.bz2]
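[Editor's note] The attached script is not reproduced in the archive; the sketch below rebuilds an equivalent test case from the description above. It is an assumption-laden reconstruction, not the original script, and it assumes a server new enough for DO blocks and format() (9.1 or later). Run the final EXPLAIN under psql's \timing to measure planning time.

DO $$
DECLARE
    cols text;
    i    int;
BEGIN
    -- build the 1000-column parent table
    SELECT string_agg('col' || g || ' int', ', ' ORDER BY g) INTO cols
    FROM generate_series(0, 999) AS g;
    EXECUTE 'CREATE TABLE test_father (' || cols || ')';
    -- create the 600 children with their CHECK constraints
    FOR i IN 0..599 LOOP
        EXECUTE format('CREATE TABLE test_child_%s (CHECK (col0 = %s)) INHERITS (test_father)', i, i);
    END LOOP;
END
$$;

EXPLAIN SELECT col0 FROM test_father WHERE col0 = 0;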
Re: [PERFORM] inheritance: planning time vs children number vs column number
On Tuesday 01 March 2011 07:20:19, Tom Lane wrote:

Marc Cousin cousinm...@gmail.com writes:

On Monday 28 February 2011 16:35:37, Tom Lane wrote:

Could we see a concrete example demonstrating that? I agree with Heikki that it's not obvious what you are testing that would have such behavior. I can think of places that would have O(N^2) behavior in the length of the targetlist, but it seems unlikely that they'd come to dominate runtime at a mere 1000 columns.

I feel a little silly not having provided a test case from the start… A script doing a complete test is attached to this email.

I did some oprofile analysis of this test case. It's spending essentially all its time in SearchCatCache, on failed searches of pg_statistic. The cache accumulates negative entries for each probed column, and then the searches take time proportional to the number of entries, so indeed there is an O(N^2) behavior --- but N is the number of columns times number of tables in your test case, not just the number of columns.

The cache is a hash table, so ideally the search time would be more or less constant as the table grows, but to make that happen we'd need to reallocate with more buckets as the table grows, and catcache.c doesn't do that yet. We've seen a few cases that make that look worth doing, but they tend to be pretty extreme, like this one.

It's worth pointing out that the only reason this effect is dominating the runtime is that you don't have any statistics for these toy test tables. If you did, the cycles spent using those entries would dwarf the lookup costs, I think. So it's hard to get excited about doing anything based on this test case --- it's likely the bottleneck would be somewhere else entirely if you'd bothered to load up some data.

regards, tom lane

Yes, for the same test case, with a bit of data in every partition and statistics up to date, planning time goes from 20 seconds to 125ms for the 600 children/1000 columns case. Which is of course more than acceptable.

Now I've got to check it's the same problem on the real environment. I think it has quite a few empty partitions, so no statistics for them…

Thanks a lot.

Marc
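[Editor's note] A hypothetical helper illustrating the "a bit of data in every partition and statistics up to date" step that brought planning time down; it is not part of the original script, and again assumes format() (9.1+). ANALYZE is allowed inside a transaction, so it can be driven from a DO block.

DO $$
DECLARE
    i int;
BEGIN
    FOR i IN 0..599 LOOP
        -- one matching row per child, then refresh its statistics
        EXECUTE format('INSERT INTO test_child_%s (col0) VALUES (%s)', i, i);
        EXECUTE format('ANALYZE test_child_%s', i);
    END LOOP;
END
$$;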
[PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
Hi. I hope I'm not going to expose an already known problem, but I couldn't find it in the mailing list archives (I only found http://archives.postgresql.org/pgsql-hackers/2009-12/msg01543.php).

On one of my (non production) machines, I've just seen a very big performance regression (I was doing a very simple insert test). I had an 'old' 8.4 postgresql compiled a few months ago, performing very well, and my 'bleeding edge' 9.0, doing the same insert very slowly.

I managed to find the cause of the regression: with Linux 2.6.33, O_DSYNC is now available. With glibc 2.12, O_DSYNC is available in userspace. Having both (they are both very new, 2.12 isn't even official on the glibc website), my new build defaulted to open_datasync. The problem is that it is much slower. I tested it on 2 small machines (no big raid, just basic machines, with SATA or software RAID).

Here is the trivial test. The configuration is the default configuration, just after initdb:

CREATE TABLE test (a int);
CREATE INDEX idxtest on test (a);

with wal_sync_method = open_datasync (new default):

marc=# INSERT INTO test SELECT generate_series(1,10);
INSERT 0 10
Time: 16083,912 ms

with wal_sync_method = fdatasync (old default):

marc=# INSERT INTO test SELECT generate_series(1,10);
INSERT 0 10
Time: 954,000 ms

Doing synthetic benchmarks with test_fsync (open_datasync performance, glibc 2.12, 2.6.34, 1 SATA drive):

Simple 8k write timing:
write                          0.037511

Compare file sync methods using one 8k write:
open_datasync write           56.998797
open_sync write              168.653995
write, fdatasync              55.359279
write, fsync                 166.854911

Compare file sync methods using two 8k writes:
open_datasync write, write   113.342738
open_sync write, write       339.066883
write, write, fdatasync       57.336820
write, write, fsync          166.847923

Compare open_sync sizes:
16k open_sync write          169.423723
2 8k open_sync writes        336.457119

Compare fsync times on write() and new file descriptors (if the times are similar, fsync() can sync data written on a different descriptor):
write, fsync, close          166.264048
write, close, fsync          168.702035

This is it, I just wanted to raise an alert on this: the degradation was 16-fold with this test. We won't see linux 2.6.33 + glibc 2.12 in production for months (I hope), but shouldn't PostgreSQL use fdatasync by default with Linux, seeing the result?

By the way, I re-did my tests with 2.6.33, 2.6.34 and 2.6.35-rc1 and got the exact same result (O_DSYNC there, obviously, but also the performance degradation).

Cheers

Marc
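[Editor's note] A minimal sketch of pinning the old default rather than relying on the build-time detection, for readers hitting the same regression. The first snippet goes in postgresql.conf; the two statements after it are run from psql to apply and verify the change (wal_sync_method cannot be changed per session).

# postgresql.conf
wal_sync_method = fdatasync    # pin the pre-2.6.33 default explicitly

-- then, from psql:
SELECT pg_reload_conf();
SHOW wal_sync_method;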
Re: [PERFORM] performance regression with Linux 2.6.33 and glibc 2.12
On Friday 04 June 2010 15:59:05, Tom Lane wrote:

Marc Cousin cousinm...@gmail.com writes:

I hope I'm not going to expose an already known problem, but I couldn't find it in the mailing list archives (I only found http://archives.postgresql.org/pgsql-hackers/2009-12/msg01543.php).

You sure this isn't the well-known "ext4 actually implements fsync where ext3 didn't" issue?

regards, tom lane

Everything is ext4. So I should have fsync working with write barriers on all the tests. I don't think this problem is of the same kind: I think it is really because of O_DSYNC appearing in 2.6.33, and PostgreSQL using it by default now. If my filesystem were lying to me about barriers, I should take no more performance hit with open_datasync than with fdatasync, should I?
Re: [PERFORM] Why is PostgreSQL so slow on Windows ( Postgres 8.3.7) version
The few 'obvious' things I see : ID and POLLID aren't of the same type (numeric vs bigint) TTIME isn't indexed. And as a general matter, you should stick to native datatypes if you don't need numeric. But as said in the other answer, maybe you should redo this schema and use more consistent datatypes Anyway, from what I remenber, it's not advised to set up shared buffers that high for windows (I don't do so much windows myself, so maybe someone will be better informed). Anyway you can start by correcting the schema… On Friday 31 July 2009 07:45:55 pari krishnan wrote: Dear All, We are using Postgres 8.3.7 in our java application. We are doing performances tuning and load testing in our setup. we have noticed that ,some of our queries to the database taking long time to return the results.Please find our setup details belows. We observed that postgres is running in windows is slower than the linux . Machine Database Details : Windows configuration: 4 GB RAM 4*1.6 GHZ windows 2008 server standard edition Postgresql configuration: shared_buffers: 1 GB Effective_cache_size: 2GB fsync: off (even we tested this parameter is on ,we observed the same slowness ) Database Details : Postgres Database : PostgreSQL 8.3.7.1 Driver Version: PostgreSQL 8.3 JDBC4 with SSL (build 604) We are using 40 database connections. We have few tables which will be having more amount data.While running our application STATSDATA table will be created daily with table name with date. like as STATSDATA8_21_2009 Schema for STATSDATA table create table STATSDATA8_21_2009( POLLID Numeric(19), INSTANCE varchar(100), TTIME Numeric(19), VAL Numeric(13)) ;CREATE INDEX POLLID%_ndx on STATSDATA%(POLLID) Schema for PolledData create table PolledData( NAME varchar(50) NOT NULL , ID BIGINT NOT NULL , AGENT varchar(50) NOT NULL , COMMUNITY varchar(100) NOT NULL , PERIOD INTEGER NOT NULL, ACTIVE varchar(10), OID varchar(200) NOT NULL, LOGDIRECTLY varchar(10), LOGFILE varchar(100), SSAVE varchar(10), THRESHOLD varchar(10), ISMULTIPLEPOLLEDDATA varchar(10), PREVIOUSSEVERITY INTEGER, NUMERICTYPE INTEGER, SAVEABSOLUTES varchar(10), TIMEAVG varchar(10), PORT INTEGER, WEBNMS varchar(100), GROUPNAME varchar(100), LASTCOUNTERVALUE BIGINT , LASTTIMEVALUE BIGINT , TIMEVAL BIGINT NOT NULL , POLICYNAME varchar(100), THRESHOLDLIST varchar(200), DNSNAME varchar(100), SUFFIX varchar(20), STATSDATATABLENAME varchar(100), POLLERNAME varchar(200), FAILURECOUNT INTEGER, FAILURETHRESHOLD INTEGER, PARENTOBJ varchar(100), PROTOCOL varchar(50), SAVEPOLLCOUNT INTEGER, CURRENTSAVECOUNT INTEGER, SAVEONTHRESHOLD varchar(10), SNMPVERSION varchar(10), USERNAME varchar(30), CONTEXTNAME varchar(30), PRIMARY KEY (ID,NAME,AGENT,OID), index PolledData0_ndx ( NAME), index PolledData1_ndx ( AGENT), index PolledData2_ndx ( OID), index PolledData3_ndx ( ID), index PolledData4_ndx ( PARENTOBJ), ) We have 300k row's in PolledData Table.In each STATSDATA table ,we have almost 12 to 13 million rows. Every one minute interval ,we insert data into to STATSDATA table. In our application ,we use insert and select query to STATSDATA table at regular interval. Please let us know why the below query takes more time to return the results. is there any thing we need to do to tune the postgres database ? Please find explain analyze output. 
First Query:

postgres=# explain analyze SELECT COUNT(*) FROM ( SELECT ID, PolledData.AGENT, NAME, INSTANCE, TTIME, VAL FROM PolledData, STATSDATA8_21_2009 WHERE ( ( PolledData.ID=STATSDATA8_21_2009.POLLID) AND ( ( TTIME >= 1250838027454) AND ( TTIME <= 1250838079654) ) ) ) t1;

                              QUERY PLAN
 Aggregate  (cost=773897.12..773897.13 rows=1 width=0) (actual time=17818.410..17818.412 rows=1 loops=1)
   ->  Merge Join  (cost=717526.23..767505.06 rows=2556821 width=0) (actual time=17560.469..17801.790 rows=13721 loops=1)
         Merge Cond: (statsdata8_21_2009.pollid = ((polleddata.id)::numeric))
         ->  Sort  (cost=69708.44..69742.49 rows=13619 width=8) (actual time=2392.659..2416.093 rows=13721 loops=1)
               Sort Key: statsdata8_21_2009.pollid
               Sort Method: quicksort  Memory: 792kB
               ->  Seq Scan on statsdata8_21_2009  (cost=0.00..68773.27 rows=13619 width=8) (actual time=0.077..2333.132 rows=13721 loops=1)
                     Filter: ((ttime >= 1250838027454::numeric) AND (ttime <= 1250838079654::numeric))
         ->  Materialize  (cost=647817.78..688331.92 rows=3241131 width=8) (actual time=15167.767..15282.232 rows=21582 loops=1)
               ->  Sort  (cost=647817.78..655920.61 rows=3241131 width=8) (actual time=15167.756..15218.645 rows=21574 loops=1)
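[Editor's note] A sketch of the two schema fixes suggested at the top of the reply (make POLLID match PolledData.ID's bigint type, and index TTIME). The index name is made up, and the type change rewrites the table and takes an exclusive lock, so it should be run during a quiet window.

ALTER TABLE STATSDATA8_21_2009 ALTER COLUMN POLLID TYPE bigint;
ALTER TABLE STATSDATA8_21_2009 ALTER COLUMN TTIME  TYPE bigint;
CREATE INDEX statsdata8_21_2009_ttime_ndx ON STATSDATA8_21_2009 (TTIME);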
Re: [PERFORM] Very big insert/join performance problem (bacula)
It really has very little impact. It only affects index scans, and even then only if effective_cache_size is less than the size of the table. Essentially, when this kicks in, it models the effect that if you are index scanning a table much larger than the size of your cache, you might have to reread some blocks that you previously read in during *that same index scan*.

Ok, thanks for clearing that up for me. Still, I think the doc could be improved on this point (sorry to be a bit obsessed with that, but I'm one of the French translators, so I like the doc to be perfect :) )
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Thursday 16 July 2009 07:20:18 Marc Cousin wrote:

On Thursday 16 July 2009 01:56:37, Devin Ben-Hur wrote:

Marc Cousin wrote:

This mail contains the asked plans:

Plan 1: around 1 million records to insert, seq_page_cost 1, random_page_cost 4

->  Hash  (cost=425486.72..425486.72 rows=16746972 width=92) (actual time=23184.196..23184.196 rows=16732049 loops=1)
      ->  Seq Scan on path  (cost=0.00..425486.72 rows=16746972 width=92) (actual time=0.004..7318.850 rows=16732049 loops=1)
->  Hash  (cost=1436976.15..1436976.15 rows=79104615 width=35) (actual time=210831.840..210831.840 rows=79094418 loops=1)
      ->  Seq Scan on filename  (cost=0.00..1436976.15 rows=79104615 width=35) (actual time=46.324..148887.662 rows=79094418 loops=1)

This doesn't address the cost driving plan question, but I think it's a bit puzzling that a seq scan of 17M 92-byte rows completes in 7 secs, while a seq scan of 79M 35-byte rows takes 149 secs. It's about a 4:1 row ratio, less than a 2:1 byte ratio, but a 20:1 time ratio. Perhaps there's some terrible bloat on filename that's not present on path? If that seq scan time on filename were proportionate to path, this plan would complete about two minutes faster (making it only 6 times slower instead of 9 :).

Much simpler than that, I think: there is a bigger percentage of path that is used all the time than of filename. The database used is the production database, so there were other insert queries running a few minutes before I got this plan. But I'll give it a look today and come back with bloat and cache information on these 2 tables.

Here are the stats for filename:

SELECT * from pgstattuple('public.filename');
 table_len  | tuple_count | tuple_len  | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
------------+-------------+------------+---------------+------------------+----------------+--------------------+------------+--------------
 5308719104 |    79338344 | 4717466438 |         88.86 |                0 |              0 |                  0 |   11883396 |         0.22

So I guess it's not bloated. I checked the cache: the times displayed before were with path in the cache. filename couldn't stay in the cache, as it's too big.
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Thursday 16 July 2009 22:07:25, Kevin Grittner wrote:

Marc Cousin cousinm...@gmail.com wrote:

the hot parts of these 2 tables are extremely likely to be in the database or linux cache (buffer hit rate was 97% in the example provided). Moreover, the first two queries of the insert procedure fill the cache for us...

This would be why the optimizer does the best job estimating the relative costs of various plans when you set the random_page_cost and seq_page_cost very low.

-Kevin

Ok, so to sum it up, should I keep these values (I hate doing this :) )? Would there be a way to approximately evaluate them with regard to the expected buffer hit ratio of the query?
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Thursday 16 July 2009 23:54:54, Kevin Grittner wrote:

Marc Cousin cousinm...@gmail.com wrote:

to sum it up, should I keep these values (I hate doing this :) )?

Many people need to set the random_page_cost and/or seq_page_cost to reflect the overall effect of caching on the active portion of the data. We set our fully-cached databases to 0.1 for both. Databases with less caching usually wind up at 2 and 1. We have one database which does best at 0.5 and 0.3. My advice is to experiment and try to find a pair of settings which works well for most or all of your queries. If you have a few which need a different setting, you can set a special value right before running the query, but I've always been able to avoid that (thankfully).

Would there be a way to approximately evaluate them with regard to the expected buffer hit ratio of the query?

Nothing query-specific except setting them on the connection right before the query (and setting them back or discarding the connection afterward). Well, that and making sure that effective_cache_size reflects reality.

-Kevin

OK, thanks a lot. A last thing: as mentioned in another mail from the thread (from Richard Huxton), I felt this message in the documentation a bit misleading:

effective_cache_size (integer)
Sets the planner's assumption about the effective size of the disk cache that is available to a single query

I don't really know what 'a single query' means. I interpreted it as 'divide it by the amount of queries typically running in parallel on the database'. Maybe it should be rephrased? (I may not be the one misunderstanding it.)
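[Editor's note] A sketch of the connection-scoped approach Kevin describes: set the page costs on the bulk-insert connection just before the heavy statement, then restore them. The values are the examples quoted above, not a recommendation.

SET seq_page_cost    = 0.3;   -- illustrative values from the discussion above
SET random_page_cost = 0.5;
-- run the big despooling INSERT ... SELECT on this connection here
RESET seq_page_cost;
RESET random_page_cost;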
Re: [PERFORM] Very big insert/join performance problem (bacula)
This mail contains the asked plans : I've done them with the different configurations, as I had done the effort of setting up the whole thing :) Stats were updated between all runs. Each time is the first run of the query (that's what we have in production with bacula) And I added the executor stats, in case ... By the way, I think I must mention it, the whole thing runs over DRBD, but with 2 gigabyte links between the master and the slave. And I tried deactivating replication when things got really slow (despooling in 24 hours), it changed nothing (sorts were a bit faster, around 20%). Server is 12 GB ram, 1 quad core xeon E5335. PostgreSQL starts to hash filename a bit later than what I said in the first mail, because it's become bigger (it was around 30-40 million last time I did the tests). This is the query (temp_mc is the table I've created to do my tests ...): explain ANALYZE SELECT batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId, batch.LStat, batch.MD5 FROM temp_mc AS batch JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name = Filename.Name); Plan 1 around 1 million records to insert, seq_page_cost 1, random_page_cost 4 LOG: EXECUTOR STATISTICS DETAIL: ! system usage stats: ! 380.143452 elapsed 79.000938 user 44.386774 system sec ! [415.785985 user 155.733732 sys total] ! 15848728/12934936 [24352752/50913184] filesystem blocks in/out ! 0/44188 [86/987512] page faults/reclaims, 0 [0] swaps ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent ! 93812/40706 [405069/184511] voluntary/involuntary context switches ! buffer usage stats: ! Shared blocks: 877336 read, 0 written, buffer hit rate = 6.75% ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00% ! Direct blocks: 0 read, 0 written Hash Join (cost=3923929.71..5131377.91 rows=1286440 width=91) (actual time=234021.194..380018.709 rows=1286440 loops=1) Hash Cond: (batch.name = filename.name) - Hash Join (cost=880140.87..1286265.62 rows=1286440 width=102) (actual time=23184.959..102400.782 rows=1286440 loops=1) Hash Cond: (batch.path = path.path) - Seq Scan on temp_mc batch (cost=0.00..49550.40 rows=1286440 width=189) (actual time=0.007..342.396 rows=1286440 loops=1) - Hash (cost=425486.72..425486.72 rows=16746972 width=92) (actual time=23184.196..23184.196 rows=16732049 loops=1) - Seq Scan on path (cost=0.00..425486.72 rows=16746972 width=92) (actual time=0.004..7318.850 rows=16732049 loops=1) - Hash (cost=1436976.15..1436976.15 rows=79104615 width=35) (actual time=210831.840..210831.840 rows=79094418 loops=1) - Seq Scan on filename (cost=0.00..1436976.15 rows=79104615 width=35) (actual time=46.324..148887.662 rows=79094418 loops=1) Total runtime: 380136.601 ms Plan 2 the same insert, with seq_page_cost to 0.01 and random_page_cost to 0.02 DETAIL: ! system usage stats: ! 42.378039 elapsed 28.277767 user 12.192762 system sec ! [471.865489 user 180.499280 sys total] ! 0/4072368 [24792848/59059032] filesystem blocks in/out ! 0/0 [86/989858] page faults/reclaims, 0 [0] swaps ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent ! 1061/9131 [429738/200320] voluntary/involuntary context switches ! buffer usage stats: ! Shared blocks: 251574 read, 0 written, buffer hit rate = 96.27% ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00% ! 
Direct blocks:          0 read,          0 written
LOG:  duration: 42378.373 ms  statement:

                              QUERY PLAN
 Hash Join  (cost=381840.21..1012047.92 rows=1286440 width=91) (actual time=20284.387..42242.955 rows=1286440 loops=1)
   Hash Cond: (batch.path = path.path)
   ->  Nested Loop  (cost=0.00..583683.91 rows=1286440 width=178) (actual time=0.026..10333.636 rows=1286440 loops=1)
         ->  Seq Scan on temp_mc batch  (cost=0.00..13231.26 rows=1286440 width=189) (actual time=0.008..380.361 rows=1286440 loops=1)
         ->  Index Scan using filename_name_idx on filename  (cost=0.00..0.43 rows=1 width=35) (actual time=0.006..0.007 rows=1 loops=1286440)
               Index Cond: (filename.name = batch.name)
   ->  Hash  (cost=170049.89..170049.89 rows=16746972 width=92) (actual time=20280.729..20280.729 rows=16732049 loops=1)
         ->  Seq Scan on path  (cost=0.00..170049.89 rows=16746972 width=92) (actual time=0.005..4560.872 rows=16732049 loops=1)
 Total runtime: 42371.362 ms

The thing is that this query is ten times faster.
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Wednesday 15 July 2009 15:45:01, Alvaro Herrera wrote:

Marc Cousin wrote:

There are other things I am thinking of: maybe it would be better to have sort space on another (and not DRBD'ed) raid set? We have a quite cheap setup right now for the database, and I think maybe this would help it scale better. I can get a filesystem in another volume group, which is not used that much for now.

You know, that's the first thing that came to me when I read you're using DRBD. Have you tried setting temp_tablespaces to a non-replicated disk?

I wish I could easily. I'm not entitled to tune the database, only to give directives. I've given this one, but I don't know when it will be done. I'll keep you informed on this one, but I don't have my hopes too high.

As mentioned before, I tried to deactivate DRBD (still using the DRBD device, but not connected to the other node, so it has almost no effect). It didn't change much (performance was around 20% better).

Anyway, the thing is that:
- big sorts kill my machine when there are more than 5 of them. I think it is a system problem (raid, filesystem, linux tuning, I don't really know; I'll have to dig into this, but it will be complicated, for human reasons :) )
- the plan through nested loops is faster anyway, and I think it's because there is only a small fraction of filename and path that is used (most files backed up have the same name or path, as we save 600 machines with mostly 2 OSes, linux and windows), so the hot parts of these 2 tables are extremely likely to be in the database or linux cache (buffer hit rate was 97% in the example provided). Moreover, the first two queries of the insert procedure fill the cache for us...
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Thursday 16 July 2009 01:56:37, Devin Ben-Hur wrote:

Marc Cousin wrote:

This mail contains the asked plans:

Plan 1: around 1 million records to insert, seq_page_cost 1, random_page_cost 4

->  Hash  (cost=425486.72..425486.72 rows=16746972 width=92) (actual time=23184.196..23184.196 rows=16732049 loops=1)
      ->  Seq Scan on path  (cost=0.00..425486.72 rows=16746972 width=92) (actual time=0.004..7318.850 rows=16732049 loops=1)
->  Hash  (cost=1436976.15..1436976.15 rows=79104615 width=35) (actual time=210831.840..210831.840 rows=79094418 loops=1)
      ->  Seq Scan on filename  (cost=0.00..1436976.15 rows=79104615 width=35) (actual time=46.324..148887.662 rows=79094418 loops=1)

This doesn't address the cost driving plan question, but I think it's a bit puzzling that a seq scan of 17M 92-byte rows completes in 7 secs, while a seq scan of 79M 35-byte rows takes 149 secs. It's about a 4:1 row ratio, less than a 2:1 byte ratio, but a 20:1 time ratio. Perhaps there's some terrible bloat on filename that's not present on path? If that seq scan time on filename were proportionate to path, this plan would complete about two minutes faster (making it only 6 times slower instead of 9 :).

Much simpler than that, I think: there is a bigger percentage of path that is used all the time than of filename. The database used is the production database, so there were other insert queries running a few minutes before I got this plan. But I'll give it a look today and come back with bloat and cache information on these 2 tables.
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Tuesday 14 July 2009 10:15:21, you wrote:

Marc Cousin wrote:

Your effective_cache_size is really small for the system you seem to have - it's the size of IO caching your OS is doing and uses no resources itself. And 800MB of that on a system with that amount of data seems a bit unlikely ;-) Using `free` you can see the amount of IO caching your OS is doing atm. in the 'cached' column. That possibly might tip some plans in a direction you prefer. What kind of machine are you running this on?

I played with this parameter too, and it didn't influence the plan. Anyway, the doc says it's the OS cache available for one query,

No they don't. I'm guessing you're getting mixed up with work_mem.

I'm not (from the docs):

effective_cache_size (integer)
Sets the planner's assumption about the effective size of the disk cache that is available to a single query

I trust you, of course, but then I think maybe this should be rephrased in the doc, because I understand it like I said... I always had a doubt about this sentence, and that's why I tried both 800MB and 8GB for this parameter.

and there may be a lot of insert queries at the same time, so I chose to be conservative with this value. I tried it with 8GB too, the plans were the same. The OS cache is around 8-10GB by the way.

That's what you need to set effective_cache_size to then.

Ok, but that doesn't change a thing for this query (I had a doubt on this parameter and tried with both 800MB and 8GB).
Re: [PERFORM] Very big insert/join performance problem (bacula)
On Tuesday 14 July 2009 10:23:25, Richard Huxton wrote:

Marc Cousin wrote:

Temporarily I moved the problem to a bit higher batch sizes by changing random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel like an apprentice sorcerer with this, as I told PostgreSQL that fetching rows from disk is much cheaper than it is.

These values are, I think, completely abnormal. They certainly don't have anything to do with reality. Try putting them back to (say) seq_page_cost=1 and random_page_cost=2.

That's the first thing I tried (it seemed more sensible), and it didn't work. I can't put them back to these values for more than one test query; the server really died before I changed the settings.

So, finally, to my questions:

- Is it normal that PostgreSQL is this far off base on these queries (sorry I don't have the plans; if they are required I'll do my best to get some, but they really are the two obvious plans for this kind of query)? What could make it choose the hash join for too-small batch tables?

No point in speculating without plans.

Ok, I'll try to have them tomorrow.

- Is changing the 2 costs the way to go?

Not the way you have.

That's what I thought, and the reason I posted :)

- Is there a way to tell PostgreSQL that it's more costly to sort than it thinks? (instead of telling it that fetching data from disk doesn't cost anything)

That's what the configuration settings do. But if you put a couple way off from reality it'll be pure chance if it gets any estimates right.

Here are the other non-default values from my configuration:

shared_buffers = 2GB
work_mem = 64MB

Set this *much* higher when you are running your bulk imports. You can do it per-connection. Try 256MB, 512MB, 1GB (but keep an eye on total memory used).

I'll try that. But anyhow, I've got much better performance when not doing the hash join. I'll get back with the plans as soon as possible.

maintenance_work_mem = 256MB
max_fsm_pages = 1500 # There are quite big deletes with bacula ...
effective_cache_size = 800MB

See other emails on this one.

default_statistics_target = 1000

Probably don't need this for all columns, but it won't cause problems with these queries.
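[Editor's note] A sketch of the per-connection work_mem change Richard suggests, applied only to the bulk-import session so the global setting stays conservative. The 512MB figure is one of the sizes mentioned above; watch total memory if several despools run in parallel.

SET work_mem = '512MB';
-- run the despooling INSERT ... SELECT / sort-heavy statements on this connection here
RESET work_mem;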
[PERFORM] Very big insert/join performance problem (bacula)
to underestimate the cost of sorting here (the row estimates were good, no problem with that). PostgreSQL seems to think that at around 1 million records in file it should go with a hash join on filename and path, so we go on hashing the 17 million records of path, the 80 million of filename, then joining and inserting into file (we're talking about sorting around 15 GB for each of these despools in parallel).

Temporarily I moved the problem to a bit higher batch sizes by changing random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel like an apprentice sorcerer with this, as I told PostgreSQL that fetching rows from disk is much cheaper than it is. These values are, I think, completely abnormal. Doing this, I got the change of plan at around 8 million. And I had 2 of them at 9 million at the same time this weekend, and both of them took 24 hours, while the nested loops before the join (for inserts in path and filename) did their work in minutes...

So, finally, to my questions:
- Is it normal that PostgreSQL is this far off base on these queries (sorry I don't have the plans; if they are required I'll do my best to get some, but they really are the two obvious plans for this kind of query)? What could make it choose the hash join for too-small batch tables?
- Is changing the 2 costs the way to go?
- Is there a way to tell PostgreSQL that it's more costly to sort than it thinks? (instead of telling it that fetching data from disk doesn't cost anything)

Here are the other non-default values from my configuration:

shared_buffers = 2GB
work_mem = 64MB
maintenance_work_mem = 256MB
max_fsm_pages = 1500 # There are quite big deletes with bacula ...
effective_cache_size = 800MB
default_statistics_target = 1000

PostgreSQL is 8.3.5 on Debian Lenny.

I'm sorry for this very long email, I tried to be as precise as I could, but don't hesitate to ask me more. Thanks for helping.

Marc Cousin
Re: [PERFORM] Very big insert/join performance problem (bacula)
We regularly do all of dbcheck. This is our real configuration; there really are lots of servers and lots of files (500 million files backed up every month). But thanks for mentioning that. The thing is we're trying to improve bacula with postgresql in order to make it able to bear this kind of volume. So we are looking for things to improve in bacula and in postgresql tuning to make it cope with the queries mentioned (or rewrite the queries or the way to do inserts, that may not be a problem either).

On Monday 13 July 2009 16:37:06 SystemManagement wrote:

Hi, just a remark, as the number of entries seems to be very high: did you ever activate bacula's dbcheck program, Option 16?

Regards

Reiner
Re: [PERFORM] Very big insert/join performance problem (bacula)
While this is not your questions, I still noticed you seem to be on 8.3 - it might be a bit faster to use GROUP BY instead of DISTINCT. It didn't do a big difference, I already tried that before for this query. Anyway, as you said, it's not the query having problems :) Your effective_cache_size is really small for the system you seem to have - its the size of IO caching your os is doing and uses no resources itself. And 800MB of that on a system with that amount of data seems a bit unlikely ;-) Using `free` you can see the amount of io caching your OS is doing atm. in the 'cached' column. That possibly might tip some plans in a direction you prefer. What kind of machine are you running this on? I played with this parameter too, and it didn't influence the plan. Anyway, the doc says it's the OS cache available for one query, and there may be a lot of insert queries at the same time, so I chose to be conservative with this value. I tried it with 8GB too, the plans were the same. The OS cache is around 8-10GB by the way. The machine is a dell PE2900, with 6 disks dedicated to this database (raid 10 config) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Scalability in postgres
It's not that trivial with Oracle either. I guess you had to use shared servers to get to that amount of sessions. They're most of the time not activated by default (dispatchers is at 0). Granted, they are part of the 'main' product, so you just have to set up dispatchers, shared servers, circuits, etc ... but there is still setup to do : dispatchers are (if I recall correctly) a completely manual parameter (and every dispatcher can only drive a certain amount of sessions, dependant on the operating system), where shared servers is a bit more dynamic, but still uses processes (so you may have to tweak max processes also). What I mean to say is that Oracle does something quite alike PostgreSQL + a connection pooler, even if it's more advanced (it's a shared memory structure that is used to send messages between dispatchers and shared servers). Or did you mean that you had thousands of sessions in dedicated mode ? On Wednesday 03 June 2009 20:13:39 Dimitri wrote: Just to say you don't need a mega server to keep thousands connections with Oracle, it's just trivial, nor CPU affinity and other stuff you may or may not need with Sybase :-) Regarding PostgreSQL, I think it'll only benefit to have an integrated connection pooler as it'll make happy all populations anyway: - those who don't like the idea may always disable it :-) - those who have a lot but mostly inactive sessions will be happy to simplify session pooling - those who really seeking for the most optimal workload on their servers will be happy twice: if there are any PG scalability limits, integrated pooler will be in most cases more performant than external; if there are no PG scalability limits - it'll still help to size PG most optimally according a HW or OS capacities.. Rgds, -Dimitri On 6/3/09, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Dimitri dimitrik...@gmail.com wrote: Few weeks ago tested a customer application on 16 cores with Oracle: - 20,000 sessions in total - 70,000 queries/sec without any problem on a mid-range Sun box + Solaris 10.. I'm not sure what point you are trying to make. Could you elaborate? (If it's that Oracle doesn't need an external connection pool, then are you advocating that PostgreSQL include that in the base product?) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Query optimization
On Sunday 30 November 2008 19:45:11, tmp wrote:

I am struggling with the following query, which fetches a random subset of 200 questions that match certain tags within certain languages. However, the query takes forever to evaluate, even though I have a "limit 200" appended. Any ideas on how to optimize it?

QUERY:

SELECT distinct q.question_id
FROM question_tags qt, questions q
WHERE q.question_id = qt.question_id
  AND q.STATUS = 1
  AND not q.author_id = 105
  AND ((qt.language_id = 5 and qt.tag_id in (1101,917,734,550,367,183))
    or (qt.language_id = 4 and qt.tag_id in (856,428))
    or (qt.language_id = 3 and qt.tag_id in (1156,1028,899,771,642,514,385,257,128))
    or (qt.language_id = 2 and qt.tag_id in (1193,1101,1009,917,826,734,642,550,458,367,275,183,91)))
  and q.question_id not in (413)
LIMIT 200

EXPLAIN ANALYZE:

 Limit  (cost=1.50..1267.27 rows=200 width=4) (actual time=278.169..880.934 rows=200 loops=1)
   ->  Unique  (cost=1.50..317614.50 rows=50185 width=4) (actual time=278.165..880.843 rows=200 loops=1)
         ->  Merge Join  (cost=1.50..317489.04 rows=50185 width=4) (actual time=278.162..880.579 rows=441 loops=1)
               Merge Cond: (qt.question_id = q.question_id)
               ->  Index Scan using question_tags_question_id on question_tags qt  (cost=0.00..301256.96 rows=82051 width=4) (actual time=24.171..146.811 rows=6067 loops=1)
                     Filter: (((language_id = 5) AND (tag_id = ANY ('{1101,917,734,550,367,183}'::integer[]))) OR ((language_id = 4) AND (tag_id = ANY ('{856,428}'::integer[]))) OR ((language_id = 3) AND (tag_id = ANY ('{1156,1028,899,771,642,514,385,257,128}'::integer[]))) OR ((language_id = 2) AND (tag_id = ANY ('{1193,1101,1009,917,826,734,642,550,458,367,275,183,91}'::integer[]))))
               ->  Index Scan using questions_pkey on questions q  (cost=0.00..15464.12 rows=83488 width=4) (actual time=222.956..731.737 rows=1000 loops=1)
                     Filter: ((q.author_id <> 105) AND (q.question_id <> 413) AND (q.status = 1))
 Total runtime: 881.152 ms
(9 rows)

First, because of the distinct, the limit 200 won't reduce the work to be done a lot: it will still have to sort for the unique, and only when everything is sorted will it take just the first 200 records. Anyhow, it seems there are only 441 rows before doing the distinct, so, at least for this query, it won't change the times much.

Then it seems to me that you could try to create composite indexes, because there is a lot of filtering after the index scans (that is, if you want the query to be faster). Maybe (q.author_id, q.status). For qt you may try (qt.language_id, qt.tag_id)...

Hope it helps

Cheers
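[Editor's note] A sketch of the composite indexes suggested above; the index names are made up, and whether they actually help depends on how selective the filtered columns are.

CREATE INDEX questions_author_status_idx  ON questions (author_id, status);
CREATE INDEX question_tags_lang_tag_idx   ON question_tags (language_id, tag_id);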
Re: [PERFORM] Weird row estimate
On Wednesday 11 July 2007 22:35:31, Tom Lane wrote: Marc Cousin [EMAIL PROTECTED] writes: Nevertheless, shouldn't the third estimate be smaller than or equal to the sum of the two others ? The planner's estimation for subplan conditions is pretty primitive compared to joinable conditions. When you add the OR, it's no longer possible to treat the IN like a join, and everything gets an order of magnitude dumber :-( It might be worth trying this as a UNION of the two simple queries. Yes, it's much better on this query with a UNION. The problem is that this is only a small part of the full query, and there are several nested INs combined with OR conditions... But at least now I understand where it comes from. Thanks a lot. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Weird row estimate
Hi, I'm having a weird problem on a query : I've simplified it to get the significant part (see end of message). The point is I've got a simple SELECT field FROM table WHERE 'condition1' Estimated returned rows : 5453 Then SELECT field FROM table WHERE 'condition2' Estimated returned rows : 705 Then SELECT field FROM table WHERE 'condition1' OR 'condition2' Estimated returned rows : 143998 Condition2 is a bit complicated (it's a subquery). Nevertheless, shouldn't the third estimate be smaller or equal to the sum of the two others ? Postgresql is 8.2.4 on Linux, stats are up to date, show default_statistics_target; default_statistics_target --- 1000 Any ideas ? explain analyze SELECT stc.CMD_ID FROM STOL_STC stc WHERE (stc.STC_DATE='2007-07-05' AND stc.STC_DATEPLAN='2007-07-05'); QUERY PLAN -- Seq Scan on stol_stc stc (cost=0.00..24265.15 rows=5453 width=8) (actual time=17.186..100.941 rows=721 loops=1) Filter: ((stc_date = '2007-07-05'::date) AND (stc_dateplan = '2007-07-05'::date)) Total runtime: 101.656 ms (3 rows) explain analyze SELECT stc.CMD_ID FROM STOL_STC stc WHERE stc.STC_ID IN (SELECT STC_ID FROM STOL_TRJ WHERE TRJ_DATEARRT='2007-07-05' AND TRJ_DATEDEPT=TRJ_DATEARRT AND (TRJ_DATEDEPT='2007-07-05' OR TRJ_DATECREAT='2007-07-05') ); QUERY PLAN Nested Loop (cost=4649.62..10079.52 rows=705 width=8) (actual time=6.266..13.037 rows=640 loops=1) - HashAggregate (cost=4649.62..4657.13 rows=751 width=8) (actual time=6.242..6.975 rows=648 loops=1) - Index Scan using stol_trj_fk5 on stol_trj (cost=0.00..4647.61 rows=803 width=8) (actual time=0.055..4.901 rows=688 loops=1) Index Cond: (trj_datearrt = '2007-07-05'::date) Filter: ((trj_datedept = trj_datearrt) AND ((trj_datedept = '2007-07-05'::date) OR (trj_datecreat = '2007-07-05'::date))) - Index Scan using stol_stc_pk on stol_stc stc (cost=0.00..7.21 rows=1 width=16) (actual time=0.004..0.005 rows=1 loops=648) Index Cond: (stc.stc_id = stol_trj.stc_id) Total runtime: 13.765 ms (8 rows) explain analyze SELECT stc.CMD_ID FROM STOL_STC stc WHERE (stc.STC_DATE='2007-07-05' AND stc.STC_DATEPLAN='2007-07-05') OR (stc.STC_ID IN (SELECT STC_ID FROM STOL_TRJ WHERE TRJ_DATEARRT='2007-07-05' AND TRJ_DATEDEPT=TRJ_DATEARRT AND (TRJ_DATEDEPT='2007-07-05' OR TRJ_DATECREAT='2007-07-05') )); QUERY PLAN Seq Scan on stol_stc stc (cost=4649.62..29621.12 rows=143998 width=8) (actual time=21.564..146.365 rows=1048 loops=1) Filter: (((stc_date = '2007-07-05'::date) AND (stc_dateplan = '2007-07-05'::date)) OR (hashed subplan)) SubPlan - Index Scan using stol_trj_fk5 on stol_trj (cost=0.00..4647.61 rows=803 width=8) (actual time=0.054..4.941 rows=688 loops=1) Index Cond: (trj_datearrt = '2007-07-05'::date) Filter: ((trj_datedept = trj_datearrt) AND ((trj_datedept = '2007-07-05'::date) OR (trj_datecreat = '2007-07-05'::date))) Total runtime: 147.407 ms SELECT count(*) from stol_stc ; count 140960 (1 row)
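For reference, a sketch of the UNION rewrite suggested above, built from the two simple queries (a plain UNION removes duplicate CMD_ID values; if duplicates matter, use UNION ALL and carry the primary key in the select list instead):

SELECT stc.CMD_ID FROM STOL_STC stc
WHERE stc.STC_DATE='2007-07-05' AND stc.STC_DATEPLAN='2007-07-05'
UNION
SELECT stc.CMD_ID FROM STOL_STC stc
WHERE stc.STC_ID IN (SELECT STC_ID FROM STOL_TRJ
    WHERE TRJ_DATEARRT='2007-07-05'
      AND TRJ_DATEDEPT=TRJ_DATEARRT
      AND (TRJ_DATEDEPT='2007-07-05' OR TRJ_DATECREAT='2007-07-05'));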
Re: [PERFORM] Optimicing Postgres for SunSolaris10 on V240
Hi... Bacula doesn't use transactions right now, so every insert is done separately with autocommit. Moreover, the insert loop for the main table is done by several individual queries to insert data in several tables (filename, dir, then file), so this is slow. There's work underway to speed that up, using a big COPY to a temp table, then queries to dispatch the records in the right places as fast as possible. The patch has been made, but as it is a noticeable change in the core, it will take some time to be integrated... See the thread about that in the bacula devel list a few weeks ago... Anyhow, our benchmark for now shows a 10-20 times speedup with postgresql, fsync stays on, and it becomes faster than mysql, and scales with the number of cpus... I cannot tell when/if it will be included, but there's work on this. For now, the only thing you can do is fsync=off, knowing you're taking a chance with the data (but it's not that big a problem, as it's only bacula's database, and can be rebuilt from the tapes or from a dump...) or a writeback disk controller. On Friday 17 November 2006 19:44, Josh Berkus wrote: Berner, First, I've corrected your e-mail so that it goes to the list, and not to me directly. I use my PostgreSQL 8.0.4 as catalogue database for Bacula. Bacula is backup software. Yes. The lead contributor to Bacula is an active PostgreSQL project participant; I'll see if he'll look into your issue. When I back up a system (lots of very small files), PostgreSQL seems to be the bottleneck, inserting catalogue information for every single file. The system on which Postgres runs is a Sun Solaris 10 server on a Sun Fire V240 with 1GB RAM, 1 CPU (SUNW,UltraSPARC-IIIi at 1.3GHz), 2 Ultra SCSI-3 disks 73GB at 10k RPM which are in RAID1 (Solaris softraid). Can someone give me a hint for compiling PostgreSQL or configuring the database? fsync is already disabled.. This is a bad idea if you care about your database. So, PostgreSQL 8.1 is now officially supported by Sun and ships with Solaris 10 update 2 or later. It is recommended that you use that rather than an out-of-date version. Second, see www.powerpostgresql.com/PerfList --Josh Berkus ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
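To make the idea concrete, a rough sketch of the COPY-into-a-temp-table approach described above; the table and column names here are simplified placeholders, not the actual Bacula patch (the path table would be handled the same way as filename):

CREATE TEMP TABLE batch (path text, name text, lstat text, md5 text);
-- one round trip to load the whole job's metadata
COPY batch FROM STDIN;
-- then dispatch into the real catalog tables with set-based queries
INSERT INTO filename (name)
    SELECT DISTINCT name FROM batch b
    WHERE NOT EXISTS (SELECT 1 FROM filename f WHERE f.name = b.name);
INSERT INTO file (pathid, filenameid, lstat, md5)
    SELECT p.pathid, f.filenameid, b.lstat, b.md5
    FROM batch b
    JOIN path p ON p.path = b.path
    JOIN filename f ON f.name = b.name;

The point is that each statement touches the whole batch at once, so the per-row commit fsync and round-trip overhead of autocommitted single-row inserts disappears.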
Re: [PERFORM] Inserts optimization?
I hope I'm not going to say stupid things, but here's what I know (or think I know :) ) about bacula + postgresql. If I remember correctly (I already discussed this with Kern Sibbald a while ago), bacula does each insert in its own transaction: that's how the program is written, and of course it works ok with mysql and MyISAM tables, as mysql doesn't have transactions with MyISAM... So you'll probably end up being slowed down by WAL fsyncs ... and you won't have a lot of solutions. Maybe you should start by trying to set fsync=no as a test to confirm that (you should be seeing a lot of iowait right now if you haven't disabled fsync). For now, I could only get good performance with bacula and postgresql when disabling fsync... On Thursday 13 April 2006 20:45, Francisco Reyes wrote: Chris writes: If you can, use copy instead: http://www.postgresql.org/docs/8.1/interactive/sql-copy.html I am familiar with copy. Can't use it in this scenario. The data is coming from a program called Bacula (Backup server). It is not static data. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
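A minimal sketch of that test, assuming you can edit the server's postgresql.conf (this trades crash safety for speed, so it is only meant to confirm the diagnosis, or for a catalog you can rebuild as noted in the previous message):

# postgresql.conf, for the test only
fsync = off

Then reload (or restart) the server and rerun the backup; if the insert rate jumps and iowait drops, WAL fsyncs were indeed the bottleneck.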
[PERFORM] partitioning
Hi, I've been working on trying to partition a big table (I've never partitioned a table in any other database till now). Everything went ok, except one query that didn't work afterwards. I've put the partition description, indexes, etc ..., and the explain plan attached. The query is extremely fast without partition (index scan backards on the primary key) The query is : select * from logs order by id desc limit 100; id is the primary key. It is indexed on all partitions. But the explain plan does full table scan on all partitions. While I think I understand why it is doing this plan right now, is there something that could be done to optimize this case ? Or put a warning in the docs about this kind of behaviour. I guess normally someone would partition to get faster queries :) Anyway, I thought I should mention this, as it has been quite a surprise. CREATE TABLE logs_150 (CHECK ( id_machine = 150)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_163 (CHECK ( id_machine = 163)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_289 (CHECK ( id_machine = 289)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_319 (CHECK ( id_machine = 319)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_238 (CHECK ( id_machine = 238)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_148 (CHECK ( id_machine = 148)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_176 (CHECK ( id_machine = 176)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_164 (CHECK ( id_machine = 164)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_316 (CHECK ( id_machine = 316)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_313 (CHECK ( id_machine = 313)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_217 (CHECK ( id_machine = 217)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_167 (CHECK ( id_machine = 167)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_287 (CHECK ( id_machine = 287)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_279 (CHECK ( id_machine = 279)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_248 (CHECK ( id_machine = 248)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_280 (CHECK ( id_machine = 280)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_165 (CHECK ( id_machine = 165)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_330 (CHECK ( id_machine = 330)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_149 (CHECK ( id_machine = 149)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_157 (CHECK ( id_machine = 157)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_272 (CHECK ( id_machine = 272)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_311 (CHECK ( id_machine = 311)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_309 (CHECK ( id_machine = 309)) INHERITS (logs) TABLESPACE data_logs; CREATE TABLE logs_318 (CHECK ( id_machine = 318)) INHERITS (logs) TABLESPACE data_logs; CREATE UNIQUE INDEX logs_150_pkey ON logs_150 (id) TABLESPACE index_logs;CREATE INDEX date_150 ON logs_150 (date) TABLESPACE index_logs ;CREATE INDEX event_150 ON logs_150 (evenement) TABLESPACE index_logs; . 
logs= explain select * from logs order by id desc limit 100; QUERY PLAN - Limit (cost=16524647.29..16524647.54 rows=100 width=295) - Sort (cost=16524647.29..16568367.11 rows=17487927 width=295) Sort Key: public.logs.id - Result (cost=0.00..827622.27 rows=17487927 width=295) - Append (cost=0.00..827622.27 rows=17487927 width=295) - Seq Scan on logs (cost=0.00..826232.78 rows=17451978 width=165) - Seq Scan on logs_150 logs (cost=0.00..199.04 rows=6104 width=144) - Seq Scan on logs_163 logs (cost=0.00..261.79 rows=7079 width=169) - Seq Scan on logs_289 logs (cost=0.00..428.93 rows=10693 width=200) - Seq Scan on logs_319 logs (cost=0.00..31.92 rows=992 width=129) - Seq Scan on logs_238 logs (cost=0.00..28.01 rows=701 width=199) - Seq Scan on logs_148 logs (cost=0.00..80.15 rows=2015 width=195) - Seq Scan on logs_176 logs (cost=0.00..12.40 rows=240 width=295) - Seq Scan on logs_164 logs (cost=0.00..12.40 rows=240 width=295) - Seq Scan on logs_316 logs (cost=0.00..12.40 rows=240 width=295) - Seq Scan on logs_313 logs (cost=0.00..12.40 rows=240 width=295) - Seq Scan on logs_217 logs (cost=0.00..12.40 rows=240 width=295) - Seq Scan on logs_167 logs (cost=0.00..57.36 rows=1536 width=170) - Seq Scan on logs_287 logs (cost=0.00..12.40
Re: [PERFORM] partitioning
Yes, that's how I solved it... and I totally agree that it's hard for the planner to guess what to do on the partitions. But maybe there should be something in the docs explaining the limitations ... I'm only asking for the biggest 100 ids from the table, so I thought maybe the planner would take the 100 biggest from each partition, or something like that, and return me the 100 biggest from those results. It didn't, and that's quite logical. What I meant is that I understand why the planner chooses this plan, but maybe it should be written somewhere in the docs that some plans will be worse after partitioning. On Tuesday 13 December 2005 12:50, you wrote: I just saw that there is no where clause in the query that you fed to explain. You need to include a where clause based on the id_machine column to see the effect. On 12/13/05, Pandurangan R S [EMAIL PROTECTED] wrote: Did you set constraint_exclusion = true in the postgresql.conf file? On 12/13/05, Marc Cousin [EMAIL PROTECTED] wrote: Hi, I've been working on trying to partition a big table (I've never partitioned a table in any other database till now). Everything went ok, except one query that didn't work afterwards. I've put the partition description, indexes, etc ..., and the explain plan attached. The query is extremely fast without partitioning (index scan backwards on the primary key). The query is : select * from logs order by id desc limit 100; id is the primary key. It is indexed on all partitions. But the explain plan does a full table scan on all partitions. While I think I understand why it is doing this plan right now, is there something that could be done to optimize this case ? Or put a warning in the docs about this kind of behaviour. I guess normally someone would partition to get faster queries :) Anyway, I thought I should mention this, as it has been quite a surprise. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Regards Pandu ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
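For completeness, one hand-written workaround for this particular query (not necessarily what was done here) is to take the top 100 rows from each partition, then the top 100 of the union, so that every branch can use its own index on id. Only two partitions are spelled out below, which is exactly why this is painful to maintain by hand:

(SELECT * FROM ONLY logs ORDER BY id DESC LIMIT 100)
UNION ALL
(SELECT * FROM logs_150 ORDER BY id DESC LIMIT 100)
UNION ALL
(SELECT * FROM logs_163 ORDER BY id DESC LIMIT 100)
-- ... one branch per partition ...
ORDER BY id DESC LIMIT 100;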
Re: [PERFORM] insert performance for win32
On Wednesday 02 November 2005 14:54, Magnus Hagander wrote: I've done the tests with rc1. This is still as slow on windows ... about 6-10 times slower than linux (via IP socket), depending on using prepared queries, etc... By the way, we've tried to insert into the windows database from a linux psql client, via the network. In this configuration, inserting is only about 2 times slower than inserting locally (the linux client had a slower CPU, 1700 MHz against 3000). Could it be related to a problem in the windows psql client ? [OK, I'm bringing this back on-list, and bringing it to QingQing's attention, who I secretly hope is the right person to be looking at this problem :)] Just to recap, Marc and I have been looking at the performance disparity between windows and linux for a single-transaction, statement-by-statement insert on a very narrow table with no keys, from a remote client. Marc's observations showed (and I verified) that windows is much slower in this case than it should be. I gprof'ed both the psql client and the server during the insert and didn't see anything seriously out of order... unfortunately QQ's latest win32 performance tweaks haven't helped. Marc's observation that switching to a linux client drops the time drastically is really interesting! Could this be a case of the network being slow, as we've seen a couple of times before? And if you run psql on the local box, you get it doubled. Do you get a speed difference between the local windows box and a remote windows box? //Magnus The Windows-Windows test is local (via loopback interface). The Linux (client) - Windows (server) test is via the network (100Mbits). I can't test with 2 windows boxes ... I haven't got that many (all machines are linux, except the test one...) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] insert performance for win32
On Tuesday 06 September 2005 19:11, Merlin Moncure wrote: This makes me wonder if we are looking in the wrong place. Maybe the problem is coming from psql? More results to follow. The problem is not coming from psql. One thing I did notice is that in a 250k insert transaction the insert time grows with the number of records inserted. Time to insert the first 50k recs is about 27 sec and the last 50k recs is 77 sec. I also confirmed that the size of the table is not playing a role here. Marc, can you do select timeofday() every 50k recs from linux? Also a gprof trace from linux would be helpful. Here's the timeofday ... I'll do the gprof as soon as I can. Every 50,000 rows... Wed Sep 07 13:58:13.860378 2005 CEST Wed Sep 07 13:58:20.926983 2005 CEST Wed Sep 07 13:58:27.928385 2005 CEST Wed Sep 07 13:58:35.472813 2005 CEST Wed Sep 07 13:58:42.825709 2005 CEST Wed Sep 07 13:58:50.789486 2005 CEST Wed Sep 07 13:58:57.553869 2005 CEST Wed Sep 07 13:59:04.298136 2005 CEST Wed Sep 07 13:59:11.066059 2005 CEST Wed Sep 07 13:59:19.368694 2005 CEST Merlin ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
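For reference, the timestamps above were presumably produced by interspersing timeofday() calls in the generated script, roughly like this sketch (same throwaway test table as in the original post further down, one SELECT timeofday() per 50,000 inserts):

BEGIN;
SELECT timeofday();
INSERT INTO test (col2) VALUES ('test');
-- ... 49,999 more identical inserts ...
SELECT timeofday();
-- ... and so on, one timestamp per 50,000-row slice ...
COMMIT;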
[PERFORM] insert performance for win32
Hi, I usually use PostgreSQL coupled with Linux, but I have to use Windows for a particular project. So I wanted to do some tests to know if the performance will be acceptable (I don't need PostgreSQL to be as fast with windows as with linux, but it has to be usable...). I started with trying to do lots of inserts, and I'm quite astonished by the catastrophic results ... The test : The computer was the same (my workstation, a PIV Dell with SATA disk), dual boot. The windows OS is XP. Both OSes run PostgreSQL 8.0.3. Both PostgreSQL clusters (windows and linux) have the same tuning (shared_buffers=2, wal_buffers=128, checkpoint_segments=10). Before each test, the clusters are vacuum analyzed, and the test database is recreated. The script is quite dumb : BEGIN; CREATE TABLE test (col1 serial, col2 text); INSERT INTO test (col2) values ('test'); INSERT INTO test (col2) values ('test'); INSERT INTO test (col2) values ('test'); INSERT INTO test (col2) values ('test'); INSERT INTO test (col2) values ('test'); .. 500,000 times Then COMMIT. I know it isn't realistic, but I needed to start with something :) The results are as follows : Linux : 1'9'' Windows : 9'38'' What I've tried, and didn't work : - Deactivate the antivirus on windows - fsync=no - raise the checkpoint_segments value (32) - remove hyperthreading (who knows...) I don't know what could cause this (I'm not a windows admin... at all). All I see is a very high kernel load during the execution of this script, but I can't determine where it comes from. I'd like to know if this is a known problem, if there is something I can do, etc... Thanks a lot. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
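One of the variations mentioned earlier in the thread is prepared queries; a sketch of what that looks like for this test, using the same table (PREPARE/EXECUTE work on 8.0):

PREPARE ins (text) AS INSERT INTO test (col2) VALUES ($1);
EXECUTE ins('test');
EXECUTE ins('test');
-- ... repeated up to 500,000 times, inside the same transaction ...
DEALLOCATE ins;

This mainly saves per-statement parse and plan time; it does not remove the per-statement round trip, which is part of what the rest of the thread is chasing.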
Re: [PERFORM] insert performance for win32
In my experience win32 is on par with linux generally, with a few gotchas on either side. Are your times with fsync=no? It's much harder to give an apples-to-apples comparison with fsync=on, for various reasons. It is with fsync=off on windows, fsync=on on linux. Are you running stats_command_string=on? Try disabling and compare results. Deactivated on windows, activated on linux. Is your loading app running locally or on the server? Yes. I am very interested in discovering sources of high cpu load problems on win32. If you are still having problems could you get a gprof profile together? There is a recent thread on win32-hackers discussing how to do this. I'll give it a look. Merlin ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
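For the stats_command_string comparison suggested above, a minimal sketch (the current value can be checked from psql, while changing it is done in postgresql.conf and, depending on the version, needs a reload or a restart):

SHOW stats_command_string;

Then set stats_command_string = off in postgresql.conf, reload, and rerun the exact same insert script on both platforms so the comparison stays apples-to-apples.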
[PERFORM] Problem with large query
Hi. I hope I'm not asking a too trivial question here... I'm having trouble with a (quite big) query, and can't find a way to make it faster. Here is the information : Tables : sces_vte - 2753539 rows sces_art - 602327 sces_fsf - 8126 sces_frc - 7763 sces_tps - 38 sces_gtr - 35 Query : === SELECT sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib, sces_tps.tps_annee_mois, TO_NUMBER('200401','99'), TO_NUMBER('200405','99'), sces_tps.tps_libc, sum(sces_vte.vte_mnt), sum(sces_vte.vte_qte), sum(sces_vte.vte_ton), sces_famille.fsf_codfam, sces_famille.fsf_lib, sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib FROM sces_vte, sces_art, sces_fsf sces_famille, sces_fsf sces_s_famille, sces_frc sces_frc_art, sces_tps, sces_gtr sces_gtr_art WHERE ( sces_famille.fsf_codfam=sces_s_famille.fsf_codfam ) AND ( sces_famille.fsf_codseg= 0 and sces_famille.fsf_codsfm = 0 ) AND ( sces_vte.tps_annee_mois=sces_tps.tps_annee_mois ) AND ( sces_vte.art_cod=sces_art.art_cod and sces_vte.dos_cod=sces_art.dos_cod ) AND ( sces_gtr_art.gtr_cod=sces_frc_art.gtr_cod ) AND ( sces_frc_art.gtr_cod=sces_art.gtr_cod and sces_frc_art.fou_cod=sces_art.fou_cod ) AND ( sces_s_famille.fsf_codfam=sces_art.fsf_codfam and sces_s_famille.fsf_codsfm=sces_art.fsf_codsfm ) AND ( sces_s_famille.fsf_codseg = 0 ) AND ( ( ( ( sces_tps.tps_annee_mois ) = ( TO_NUMBER('200401','99') ) and ( sces_tps.tps_annee_mois ) = ( TO_NUMBER('200405','99') ) ) OR ( ( sces_tps.tps_annee_mois ) = ( TO_NUMBER('200401','99') )-100 and ( sces_tps.tps_annee_mois ) = ( TO_NUMBER('200405','99') )-100 ) ) AND ( sces_gtr_art.gtr_cod in (2)) ) GROUP BY sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib, sces_tps.tps_annee_mois, TO_NUMBER('200401','99'), TO_NUMBER('200405','99'), sces_tps.tps_libc, sces_famille.fsf_codfam, sces_famille.fsf_lib, sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib Explain Analyze Plan : GroupAggregate (cost=27161.91..27938.72 rows=16354 width=280) (actual time=484509.210..544436.148 rows=4115 loops=1) - Sort (cost=27161.91..27202.79 rows=16354 width=280) (actual time=484496.188..485334.151 rows=799758 loops=1) Sort Key: sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib, sces_tps.tps_annee_mois, 200401::numeric, 200405::numeric, sces_tps.tps_libc, sces_famille.fsf_codfam, sces_famille.fsf_lib, sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib - Merge Join (cost=25727.79..26017.34 rows=16354 width=280) (actual time=58945.821..69321.146 rows=799758 loops=1) Merge Cond: ((outer.fsf_codfam = inner.fsf_codfam) AND (outer.fsf_codsfm = inner.fsf_codsfm)) - Sort (cost=301.36..304.60 rows=1298 width=83) (actual time=27.926..28.256 rows=332 loops=1) Sort Key: sces_s_famille.fsf_codfam, sces_s_famille.fsf_codsfm - Seq Scan on sces_fsf sces_s_famille (cost=0.00..234.24 rows=1298 width=83) (actual time=0.042..19.124 rows=1341 loops=1) Filter: (fsf_codseg = 0::numeric) - Sort (cost=25426.43..25448.05 rows=8646 width=225) (actual time=58917.106..59693.810 rows=799758 loops=1) Sort Key: sces_art.fsf_codfam, sces_art.fsf_codsfm - Merge Join (cost=24726.32..24861.08 rows=8646 width=225) (actual time=19036.709..29404.943 rows=799758 loops=1) Merge Cond: (outer.tps_annee_mois = inner.tps_annee_mois) - Sort (cost=2.49..2.53 rows=17 width=23) (actual time=0.401..0.428 rows=20 loops=1) Sort Key: sces_tps.tps_annee_mois - Seq Scan on sces_tps (cost=0.00..2.14 rows=17 width=23) (actual time=0.068..0.333 rows=20 loops=1) Filter: (((tps_annee_mois = 
200301::numeric) OR (tps_annee_mois = 200401::numeric)) AND ((tps_annee_mois = 200305::numeric) OR (tps_annee_mois = 200401::numeric)) AND ((tps_annee_mois = 200301::numeric) OR (tps_annee_mois = 200405::numeric)) AND ((tps_annee_mois = 200305::numeric) OR (tps_annee_mois = 200405::numeric))) - Sort (cost=24723.83..24747.97 rows=9656 width=214) (actual time=19036.223..19917.214 rows=799757 loops=1) Sort Key: sces_vte.tps_annee_mois - Nested Loop (cost=21825.09..24084.74 rows=9656 width=214) (actual time=417.603..8644.294 rows=399879 loops=1) - Nested Loop (cost=21825.09..21837.50 rows=373 width=195) (actual time=417.444..672.741 rows=14158 loops=1) - Seq Scan on sces_gtr sces_gtr_art (cost=0.00..1.44 rows=1 width=40) (actual
Re: [PERFORM] Problem with large query
The query has been generated by Business Objects ... I'll try to suggest to the developers to remove this constant (if they can)... The fields used by the sort are of type numeric(6,0) or (10,0) ... Would it be better if the fields were integer or something else ? On Wednesday 08 September 2004 16:40, you wrote: Marc Cousin [EMAIL PROTECTED] writes: I'm having trouble with a (quite big) query, and can't find a way to make it faster. Seems like it might help if the thing could use a HashAggregate instead of sort/group. Numeric is not hashable, so having those TO_NUMBER constants in GROUP BY destroys this option instantly ... but why in the world are you grouping by constants anyway? You didn't say what the datatypes of the other columns were... regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Problem with large query
On Wednesday 08 September 2004 16:56, you wrote: Marc Cousin [EMAIL PROTECTED] writes: The query has been generated by Business Objects ... I'll try to suggest to the developers to remove this constant (if they can)... The fields used by the sort are of type numeric(6,0) or (10,0) ... Would it be better if the fields were integer or something else ? integer or bigint would be a WHOLE lot faster. I'd venture that comparing two numerics is on the order of a hundred times slower than comparing two integers. Even if you don't want to change the fields on-disk, you might think about casting them all to int/bigint in the query. Another thing that might or might not be easy is to change the order of the GROUP BY items so that the fields with the largest number of distinct values are listed first. If two rows are distinct at the first column, the sorting comparison doesn't even have to look at the remaining columns ... regards, tom lane Thanks. I've just had confirmation that they can remove the two constants (we already gained 100 seconds thanks to that). I've tried the cast, and got down to 72 seconds. So now we're going to try to convert the fields to int or bigint. Thanks a lot for your help and time. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
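A sketch of the casting approach, cut down to two of the grouped columns for brevity (the real query needs the same treatment on every numeric join/group column, plus the removal of the TO_NUMBER constants from the GROUP BY as discussed):

SELECT sces_art.fsf_codfam::bigint AS fsf_codfam,
       sces_art.fsf_codsfm::bigint AS fsf_codsfm,
       sum(sces_vte.vte_mnt)
FROM sces_vte, sces_art
WHERE sces_vte.art_cod = sces_art.art_cod
  AND sces_vte.dos_cod = sces_art.dos_cod
GROUP BY sces_art.fsf_codfam::bigint, sces_art.fsf_codsfm::bigint;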