Re: [PERFORM] I/O on select count(*)
Hi,

Luke Lonergan wrote:
> BTW – we've removed HINT bit checking in Greenplum DB and improved the visibility caching which was enough to provide performance at the same level as with the HINT bit optimization, but avoids this whole "write the data, write it to the log also, then write it again just for good measure" behavior.

Can you go a bit deeper into how you implemented this, or is it some IP of Greenplum you cannot reveal?

Btw, is there something with your eyes: STYLE='font-size:14pt'> ? :-))

Cheers
Tino
Re: [PERFORM] I/O on select count(*)
On Thu, 15 May 2008 10:52:01 +0800 Luke Lonergan <[EMAIL PROTECTED]> wrote:

> BTW we've removed HINT bit checking in Greenplum DB and improved the visibility caching which was enough to provide performance at the same level as with the HINT bit optimization, but avoids this whole "write the data, write it to the log also, then write it again just for good measure" behavior.
>
> For people doing data warehousing work like the poster, this Postgres behavior is miserable. It should be fixed for 8.4 for sure (volunteers?)

Donations? You have the code, Luke :)

Sincerely,
Joshua D. Drake

P.S. Sorry for the almost bad Star Wars pun.

--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
Re: [PERFORM] I/O on select count(*)
On Thu, 15 May 2008, Pavan Deolasee wrote:

> I had suggested in the past that whenever we set hint bits for a tuple, we should check all other tuples in the page and set their hint bits too to avoid multiple writes of the same page. I guess the idea got rejected because of lack of benchmarks to prove the benefit.

From glancing at http://www.postgresql.org/docs/faqs.TODO.html I got the impression the idea was to have the background writer get involved to help with this particular situation. The way things are set up right now, I would guess it's impractical for an individual client to be forced to wait for all the tuples in a block to be checked just because it ran into one tuple that needed its hint bits refreshed. If the pages that had any hint bit updates since they were read/created were made easy to identify (maybe they already are), the writer could do the kind of scan you suggest anytime it was about to evict that page. That wouldn't be in the client's critical path and it would maximize the possible improvement here.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] I/O on select count(*)
BTW we've removed HINT bit checking in Greenplum DB and improved the visibility caching which was enough to provide performance at the same level as with the HINT bit optimization, but avoids this whole "write the data, write it to the log also, then write it again just for good measure" behavior.

For people doing data warehousing work like the poster, this Postgres behavior is miserable. It should be fixed for 8.4 for sure (volunteers?)

BTW, for the poster's benefit: you should implement partitioning by date, then load each partition and VACUUM ANALYZE after each load. You probably won't need the date index anymore, so your load times will vastly improve (no indexes), you'll store less data (no indexes), and you'll be able to do simpler data management with the partitions.

You may also want to partition AND index if you do a lot of short-range selective date predicates. An example would be: partition by day, index on the date field; queries selective on date ranges by hour will then select out only the day needed, then index scan to get the hourly values. Typically time-oriented data is nearly time-sorted anyway, so you'll also get the benefit of a clustered index.

- Luke

On 5/15/08 10:40 AM, "Pavan Deolasee" <[EMAIL PROTECTED]> wrote:

> On Thu, May 15, 2008 at 7:51 AM, Greg Smith <[EMAIL PROTECTED]> wrote:
>> So is vacuum helpful here because it will force all that to happen in one batch? To put that another way: if I've run a manual vacuum, is it true that it will have updated all the hint bits to XMIN_COMMITTED for all the tuples that were all done when the vacuum started?
>
> Yes. For that matter, even a plain SELECT or count(*) on the entire table is good enough. That will check every tuple for visibility and set its hint bits.
>
> Another point to note is that the hint bits are checked and set on a per-tuple basis. So especially during an index scan, the same heap page may get rewritten many times. I had suggested in the past that whenever we set hint bits for a tuple, we should check all other tuples in the page and set their hint bits too to avoid multiple writes of the same page. I guess the idea got rejected because of lack of benchmarks to prove the benefit.
>
> Thanks,
> Pavan
>
> --
> Pavan Deolasee
> EnterpriseDB http://www.enterprisedb.com
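Luke's partition-and-load recipe can be sketched with the inheritance-based partitioning available in the 8.x series. The table and column names below are invented for illustration, not taken from the thread:

```sql
-- Hypothetical parent table; children hold one day each.
CREATE TABLE events (
    transact_time timestamptz NOT NULL,
    payload       text
);

-- The CHECK constraint lets the planner exclude non-matching children.
CREATE TABLE events_2008_05_12 (
    CHECK (transact_time >= DATE '2008-05-12'
       AND transact_time <  DATE '2008-05-13')
) INHERITS (events);

-- Index only if you also need sub-day (e.g. hourly) selectivity.
CREATE INDEX events_2008_05_12_ts ON events_2008_05_12 (transact_time);

-- Load straight into the child, then vacuum/analyze just that partition.
COPY events_2008_05_12 FROM '/tmp/2008-05-12.csv' WITH CSV;
VACUUM ANALYZE events_2008_05_12;

-- With constraint exclusion on, only the matching child is scanned.
SET constraint_exclusion = on;
SELECT count(*) FROM events
WHERE transact_time >= DATE '2008-05-12'
  AND transact_time <  DATE '2008-05-13';
```

Note that inserts through the parent need a trigger or rule to route rows to the right child; loading each child directly, as above, sidesteps that.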
Re: [PERFORM] I/O on select count(*)
On Thu, May 15, 2008 at 7:51 AM, Greg Smith <[EMAIL PROTECTED]> wrote:
> So is vacuum helpful here because it will force all that to happen in one batch? To put that another way: if I've run a manual vacuum, is it true that it will have updated all the hint bits to XMIN_COMMITTED for all the tuples that were all done when the vacuum started?

Yes. For that matter, even a plain SELECT or count(*) on the entire table is good enough. That will check every tuple for visibility and set its hint bits.

Another point to note is that the hint bits are checked and set on a per-tuple basis. So especially during an index scan, the same heap page may get rewritten many times. I had suggested in the past that whenever we set hint bits for a tuple, we should check all other tuples in the page and set their hint bits too to avoid multiple writes of the same page. I guess the idea got rejected because of lack of benchmarks to prove the benefit.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] I/O on select count(*)
On 5/14/08, Greg Smith <[EMAIL PROTECTED]> wrote:
> On Wed, 14 May 2008, Alvaro Herrera wrote:
>> If neither of the bits is set, then the transaction is either in progress (which you can check by examining the list of running transactions in shared memory) or your process is the first one to check (in which case, you need to consult pg_clog to know the status, and you can update the hint bits if you find out a permanent state).
>
> So is vacuum helpful here because it will force all that to happen in one batch? To put that another way: if I've run a manual vacuum, is it true that it will have updated all the hint bits to XMIN_COMMITTED for all the tuples that were all done when the vacuum started?

From my benchmarking experience: Yes, vacuum helps. See also below.

>> Regarding FAQs, I'm having trouble imagining putting this in the user FAQ; I think it belongs into the developer's FAQ. However, a benchmarker is not going to look there. Maybe we should start "a benchmarker's FAQ"?
>
> On the wiki I've started adding a series of things that are performance-related FAQs. There's three of them mixed in the bottom of http://wiki.postgresql.org/wiki/Frequently_Asked_Questions right now, about slow count(*) and dealing with slow queries.
>
> Here the FAQ would be "Why am I seeing all these writes when I'm just doing selects on my table?", and if it's mixed in with a lot of other performance related notes people should be able to find it. The answer and suggestions should be simple enough to be useful to a user who just noticed this behavior, while perhaps going into developer land for those who want to know more about the internals.

Obviously, this issue is tied to the slow count(*) one, as I found out the hard way. Consider the following scenario:

* Insert row
* Update that row a couple of times
* Rinse and repeat many times

Now somewhere during that cycle, do a select count(*) just to see where you are. You will be appalled by how slow that is, due to not only the usual 'slow count(*)' reasons. This whole hint bit business makes it even worse, as demonstrated by the fact that running a vacuum before the count(*) makes the latter noticeably faster.

jan
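Jan's churn scenario is easy to reproduce. This is a minimal sketch with an invented table, showing why a VACUUM before the count makes it noticeably faster:

```sql
-- Invented table: update churn leaves dead tuples and unset hint bits.
CREATE TABLE churn (id int PRIMARY KEY, n int DEFAULT 0);
INSERT INTO churn (id) SELECT g FROM generate_series(1, 1000000) g;
UPDATE churn SET n = n + 1;  -- each UPDATE creates a new tuple version...
UPDATE churn SET n = n + 1;  -- ...until vacuum reclaims the dead ones

-- First count after the churn: scans the dead tuples AND dirties pages
-- while setting hint bits, so the reads are accompanied by writes.
SELECT count(*) FROM churn;

-- Vacuum reclaims dead rows and sets hint bits in one pass; the same
-- count afterwards is read-only and much faster.
VACUUM ANALYZE churn;
SELECT count(*) FROM churn;
```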
Re: [PERFORM] I/O on select count(*)
On Wed, 14 May 2008, Alvaro Herrera wrote:

> If neither of the bits is set, then the transaction is either in progress (which you can check by examining the list of running transactions in shared memory) or your process is the first one to check (in which case, you need to consult pg_clog to know the status, and you can update the hint bits if you find out a permanent state).

So is vacuum helpful here because it will force all that to happen in one batch? To put that another way: if I've run a manual vacuum, is it true that it will have updated all the hint bits to XMIN_COMMITTED for all the tuples that were all done when the vacuum started?

> Regarding FAQs, I'm having trouble imagining putting this in the user FAQ; I think it belongs into the developer's FAQ. However, a benchmarker is not going to look there. Maybe we should start "a benchmarker's FAQ"?

On the wiki I've started adding a series of things that are performance-related FAQs. There's three of them mixed in the bottom of http://wiki.postgresql.org/wiki/Frequently_Asked_Questions right now, about slow count(*) and dealing with slow queries.

Here the FAQ would be "Why am I seeing all these writes when I'm just doing selects on my table?", and if it's mixed in with a lot of other performance related notes people should be able to find it. The answer and suggestions should be simple enough to be useful to a user who just noticed this behavior, while perhaps going into developer land for those who want to know more about the internals.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PERFORM] I/O on select count(*)
Greg Smith wrote:
> On Wed, 14 May 2008, Kevin Grittner wrote:
>> If this is the first time that the rows are being read since they were inserted (or since the database was loaded, including from backup), it may be rewriting the rows to set hint bits, which can make subsequent access faster.
>
> This is the second time this has come up recently, and I know it used to puzzle me too. This is a particularly relevant area to document better for people doing benchmarking. The closest thing I've found to a useful commentary on this subject is the thread at http://archives.postgresql.org/pgsql-patches/2005-07/msg00390.php
>
> I still don't completely understand this myself though; if I did I'd add a FAQ on it. Anyone want to lecture for a minute on the birth and care of hint bits? I'll make sure any comments here get onto the wiki.

Hint bits are used to mark tuples as created and/or deleted by transactions that are known committed or aborted. To determine the visibility of a tuple without such bits set, you need to consult pg_clog and possibly pg_subtrans, so it is an expensive check. On the other hand, if the tuple has the bits set, then its state is known (or, at worst, it can be calculated easily from your current snapshot, without looking at pg_clog.)

There are four hint bits:

XMIN_COMMITTED -- creating transaction is known committed
XMIN_ABORTED -- creating transaction is known aborted
XMAX_COMMITTED -- same, for the deleting transaction
XMAX_ABORTED -- ditto

If neither of the bits is set, then the transaction is either in progress (which you can check by examining the list of running transactions in shared memory) or your process is the first one to check (in which case, you need to consult pg_clog to know the status, and you can update the hint bits if you find out a permanent state).

Regarding FAQs, I'm having trouble imagining putting this in the user FAQ; I think it belongs in the developer's FAQ. However, a benchmarker is not going to look there. Maybe we should start "a benchmarker's FAQ"?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
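The four bits Alvaro lists can actually be observed with the contrib pageinspect module (new in 8.3). The infomask masks below are the HEAP_XMIN/XMAX flag values from the server source (src/include/access/htup.h), and 'some_table' is a placeholder:

```sql
-- Decode the hint bits of every tuple on page 0 of a table.
-- Masks (from htup.h):
--   0x0100 HEAP_XMIN_COMMITTED    0x0200 HEAP_XMIN_INVALID (aborted)
--   0x0400 HEAP_XMAX_COMMITTED    0x0800 HEAP_XMAX_INVALID (aborted)
SELECT lp,
       (t_infomask & x'0100'::int) <> 0 AS xmin_committed,
       (t_infomask & x'0200'::int) <> 0 AS xmin_aborted,
       (t_infomask & x'0400'::int) <> 0 AS xmax_committed,
       (t_infomask & x'0800'::int) <> 0 AS xmax_aborted
FROM heap_page_items(get_raw_page('some_table', 0));
```

Running this before and after a SELECT or VACUUM of a freshly loaded table shows the xmin_committed column flipping to true, which is exactly the page rewrite the thread is about.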
Re: [PERFORM] I/O on select count(*)
On Wed, 14 May 2008, Kevin Grittner wrote:

> If this is the first time that the rows are being read since they were inserted (or since the database was loaded, including from backup), it may be rewriting the rows to set hint bits, which can make subsequent access faster.

This is the second time this has come up recently, and I know it used to puzzle me too. This is a particularly relevant area to document better for people doing benchmarking. The closest thing I've found to a useful commentary on this subject is the thread at http://archives.postgresql.org/pgsql-patches/2005-07/msg00390.php

I still don't completely understand this myself though; if I did I'd add a FAQ on it. Anyone want to lecture for a minute on the birth and care of hint bits? I'll make sure any comments here get onto the wiki.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
[PERFORM] Update performance degrades over time
Hi All,

We are doing some load tests with our application running postgres 8.2.4. At times we see updates on a table taking longer (around 11-16 secs) than the expected sub-second response time. The table in question is getting updated constantly through the load tests. In checking the table size including indexes, they seem to be bloated; we got it confirmed after recreating the table (stats below). We have autovacuum enabled with default parameters. I thought autovacuum would avoid bloating issues, but it looks like it's not aggressive enough. Wondering if table/index bloating is causing the update slowness over a period of time. Any ideas how to troubleshoot this further? No IO waits are seen during load tests, and cpu usage on the server seems to be 85% idle. This is a v445 sol10 with 4 cpu box attached to SAN storage. Here are the update statement and table/index/instance stats.

shared_buffers = 4000MB
max_fsm_pages = 2048000
maintenance_work_mem = 512MB
checkpoint_segments = 128
effective_cache_size = 4000MB

update tablexy set col2=$1, col9=$2, col10=$3, col11=$4, col3=$5 WHERE ID=$6;

Bloated:

        relname        | relowner | relpages | reltuples
-----------------------+----------+----------+-----------
 tablexy               |       10 |   207423 |    502627
 ix_tablexy_col1_col2  |       10 |    38043 |    502627
 ix_tablexy_col3       |       10 |    13944 |    502627
 ix_tablexy_col4       |       10 |    17841 |    502627
 ix_tablexy_col5       |       10 |    19669 |    502627
 ix_tablexy_col6       |       10 |     3865 |    502627
 ix_tablexy_col7       |       10 |    12359 |    502627
 ix_tablexy_col8_col7  |       10 |    26965 |    502627
 ct_tablexy_id_u1      |       10 |     6090 |    502627

After recreating tablexy (compact):

        relname        | relowner | relpages | reltuples
-----------------------+----------+----------+-----------
 tablexy               |       10 |    41777 |    501233
 ix_tablexy_col3       |       10 |     2137 |    501233
 ix_tablexy_col8_col7  |       10 |     4157 |    501233
 ix_tablexy_col6       |       10 |     1932 |    501233
 ix_tablexy_col7       |       10 |     1935 |    501233
 ix_tablexy_col1_col2  |       10 |     1933 |    501233
 ix_tablexy_col5       |       10 |     2415 |    501233
 ix_tablexy_col6       |       10 |     1377 |    501233
 ct_tablexy_id_u1      |       10 |     3046 |    501233

Thanks, Stalin
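If the default thresholds are letting tablexy bloat, 8.2 allows per-table overrides through the pg_autovacuum system catalog (per-table storage options only arrived in later releases). The numbers below are illustrative, and the exact column list should be checked against the 8.2 documentation before use:

```sql
-- Make autovacuum roughly 4x more aggressive for tablexy: vacuum at
-- ~5% dead rows instead of the default 20%. -1 means "use the default".
INSERT INTO pg_autovacuum
    (vacrelid, enabled,
     vac_base_thresh, vac_scale_factor,
     anl_base_thresh, anl_scale_factor,
     vac_cost_delay, vac_cost_limit,
     freeze_min_age, freeze_max_age)
VALUES
    ('tablexy'::regclass, true,
     200, 0.05,
     100, 0.02,
     -1, -1,
     -1, -1);
```

To confirm whether this is keeping bloat down, re-run the pg_class query from above during the load test and watch whether relpages stabilizes near the compact figures.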
[PERFORM] poor row estimates with multi-column joins
The following query produces some fairly off estimates for the number of rows that should be returned (this is based on a much more complex query, but whittling it down to this seems to get at the heart of the problem):

peii=# explain analyze select * from adv.peii_fast_lookup pfl1 join adv.lsteml_m le1 on (pfl1.ctm_nbr = le1.ctm_nbr and pfl1.emal_id = le1.emal_id);
                                   QUERY PLAN
---
 Hash Join  (cost=386721.95..1848154.67 rows=7 width=100) (actual time=11407.555..103368.646 rows=18348993 loops=1)
   Hash Cond: (((le1.ctm_nbr)::text = (pfl1.ctm_nbr)::text) AND ((le1.emal_id)::text = (pfl1.emal_id)::text))
   ->  Seq Scan on lsteml_m le1  (cost=0.00..435026.44 rows=18712844 width=67) (actual time=0.027..7057.486 rows=18703401 loops=1)
   ->  Hash  (cost=172924.18..172924.18 rows=9371918 width=33) (actual time=11387.413..11387.413 rows=9368565 loops=1)
         ->  Seq Scan on peii_fast_lookup pfl1  (cost=0.00..172924.18 rows=9371918 width=33) (actual time=0.006..2933.512 rows=9368565 loops=1)
 Total runtime: 108132.205 ms

default_statistics_target is 100, both tables are freshly analyzed, all join columns on both sides are varchar(12), and we're on 8.3.1.

I notice that it seems to give a better number of rows when doing single-column joins (explain only, didn't want to wait for it to actually run this):

peii=# explain select * from adv.peii_fast_lookup pfl1 join adv.lsteml_m le1 on (pfl1.ctm_nbr = le1.ctm_nbr);
                                   QUERY PLAN

 Merge Join  (cost=7243997.70..8266364.43 rows=65065332 width=100)
   Merge Cond: ((pfl1.ctm_nbr)::text = (le1.ctm_nbr)::text)
   ->  Sort  (cost=1917159.20..1940589.00 rows=9371918 width=33)
         Sort Key: pfl1.ctm_nbr
         ->  Seq Scan on peii_fast_lookup pfl1  (cost=0.00..172924.18 rows=9371918 width=33)
   ->  Materialize  (cost=5326833.82..5560745.31 rows=18712919 width=67)
         ->  Sort  (cost=5326833.82..5373616.12 rows=18712919 width=67)
               Sort Key: le1.ctm_nbr
               ->  Seq Scan on lsteml_m le1  (cost=0.00..435028.19 rows=18712919 width=67)
(9 rows)

peii=# explain select * from adv.peii_fast_lookup pfl1 join adv.lsteml_m le1 on (pfl1.emal_id = le1.emal_id);
                                   QUERY PLAN

 Hash Join  (cost=363292.16..1754557.17 rows=18712919 width=100)
   Hash Cond: ((le1.emal_id)::text = (pfl1.emal_id)::text)
   ->  Seq Scan on lsteml_m le1  (cost=0.00..435028.19 rows=18712919 width=67)
   ->  Hash  (cost=172924.18..172924.18 rows=9371918 width=33)
         ->  Seq Scan on peii_fast_lookup pfl1  (cost=0.00..172924.18 rows=9371918 width=33)
(5 rows)

For kicks, I upped the stats target and reran everything...

peii=# set default_statistics_target = 400;
SET
peii=# analyze verbose adv.peii_fast_lookup;
INFO: analyzing "adv.peii_fast_lookup"
INFO: "peii_fast_lookup": scanned 79205 of 79205 pages, containing 9368569 live rows and 316 dead rows; 12 rows in sample, 9368569 estimated total rows
ANALYZE
peii=# analyze verbose adv.lsteml_m;
INFO: analyzing "adv.lsteml_m"
INFO: "lsteml_m": scanned 12 of 247899 pages, containing 9050726 live rows and 110882 dead rows; 12 rows in sample, 18697216 estimated total rows
ANALYZE
peii=# explain analyze select * from adv.peii_fast_lookup pfl1 join adv.lsteml_m le1 on (pfl1.ctm_nbr = le1.ctm_nbr and pfl1.emal_id = le1.emal_id);
                                   QUERY PLAN
---
 Hash Join  (cost=386611.22..1847063.87 rows=4 width=100) (actual time=11169.338..95460.560 rows=18348993 loops=1)
   Hash Cond: (((le1.ctm_nbr)::text = (pfl1.ctm_nbr)::text) AND ((le1.emal_id)::text = (pfl1.emal_id)::text))
   ->  Seq Scan on lsteml_m le1  (cost=0.00..434871.16 rows=18697216 width=67) (actual time=0.008..7012.533 rows=18703401 loops=1)
   ->  Hash  (cost=172890.69..172890.69 rows=9368569 width=33) (actual time=11160.329..11160.329 rows=9368569 loops=1)
         ->  Seq Scan on peii_fast_lookup pfl1  (cost=0.00..172890.69 rows=9368569 width=33) (actual time=0.005..2898.336 rows=9368569 loops=1)
 Total runtime: 100223.220 ms
(6 rows)

peii=# set enable_hashjoin = false;
SET
peii=# explain analyze select * from adv.peii_fast_lookup pfl1 join adv.lsteml_m le1 on (pfl1.ctm_nbr = le1.ctm_nbr and pfl1.emal_id = le1.emal_id);
QUERY PLAN
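The rows=7 estimate arises because the planner multiplies the selectivities of the two join clauses as if ctm_nbr and emal_id were independent, when they are in fact strongly correlated. Lacking cross-column statistics, one workaround (a sketch, with an arbitrary delimiter that must not occur in the data) is to join on a single combined expression so only one selectivity estimate is involved:

```sql
-- Single combined join key: one equality clause, one estimate.
-- The '|' delimiter is an assumption; pick one absent from the data.
SELECT *
FROM adv.peii_fast_lookup pfl1
JOIN adv.lsteml_m le1
  ON (pfl1.ctm_nbr || '|' || pfl1.emal_id)
   = (le1.ctm_nbr  || '|' || le1.emal_id);
```

Whether the estimate improves depends on what statistics exist for the expression; creating an index on the combined expression and re-analyzing gives the planner real stats to work with.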
Re: [PERFORM] I/O on select count(*)
>>> Doug Eck <[EMAIL PROTECTED]> wrote:
> I am attempting to run a query to determine the number of rows for a given day using something like "select count(*) from tbl1 where ts between '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me that the query will be done using an index scan (as I would expect), and I realize that it is going to take a while. My question concerns some unusual I/O activity on the box (SUSE) when I run the query.
>
> For the first couple of minutes I see reads only. After that vmstat shows mixed reads and writes in a ratio of about 1 block read to 5 blocks written. Any thoughts into what could be going on? Thanks in advance for your help.

Odd as it may seem, a SELECT can cause a page to be rewritten. If this is the first time that the rows are being read since they were inserted (or since the database was loaded, including from backup), it may be rewriting the rows to set hint bits, which can make subsequent access faster. The best solution may be to vacuum more often.

http://archives.postgresql.org/pgsql-performance/2007-12/msg00206.php

-Kevin
Re: [PERFORM] I/O on select count(*)
----- Original Message -----
From: Merlin Moncure <[EMAIL PROTECTED]>
To: Doug Eck <[EMAIL PROTECTED]>
Cc: pgsql-performance@postgresql.org
Sent: Wednesday, May 14, 2008 3:38:23 PM
Subject: Re: [PERFORM] I/O on select count(*)

On Wed, May 14, 2008 at 4:09 PM, Doug Eck <[EMAIL PROTECTED]> wrote:
> I have a large table (~ 2B rows) that contains an indexed timestamp column. I am attempting to run a query to determine the number of rows for a given day using something like "select count(*) from tbl1 where ts between '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me that the query will be done using an index scan (as I would expect), and I realize that it is going to take a while. My question concerns some unusual I/O activity on the box (SUSE) when I run the query.
>
> For the first couple of minutes I see reads only. After that vmstat shows mixed reads and writes in a ratio of about 1 block read to 5 blocks written. We have determined that files in our data and log partitions are being hit, but the file system itself is not growing during this time (it appears to be writing over the same chunk of space over and over again). Memory on the box is not being swapped while all of this is happening. I would have guessed that a "select count(*)" would not require a bunch of writes, and I can't begin to figure out why the number of blocks written are so much higher than the blocks read. If I modify the where clause to only count the rows for a given minute or two, I see the reads but I never see the unusual write behavior.
>
> Any thoughts into what could be going on? Thanks in advance for your help.

can you post the exact output of explain analyze? (or, at least, explain if the query takes too long)

merlin

The query takes a long time to run, so I'll start with the explain output. I can run explain analyze (given enough time) if you believe its output could hold some clues.
db_2008=> explain select count(*) from ot_2008_05 where transact_time between '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999';
                                   QUERY PLAN
---
 Aggregate  (cost=10368613.47..10368613.48 rows=1 width=0)
   ->  Index Scan using ot_2008_05_ak2 on ot_2008_05  (cost=0.00..10011333.27 rows=142912078 width=0)
         Index Cond: ((transact_time >= '2008-05-12 00:00:00-04'::timestamp with time zone) AND (transact_time <= '2008-05-12 23:59:59.999-04'::timestamp with time zone))
(3 rows)

db_2008=>

Doug
Re: [PERFORM] I/O on select count(*)
On Wed, May 14, 2008 at 4:09 PM, Doug Eck <[EMAIL PROTECTED]> wrote:
> I have a large table (~ 2B rows) that contains an indexed timestamp column. I am attempting to run a query to determine the number of rows for a given day using something like "select count(*) from tbl1 where ts between '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me that the query will be done using an index scan (as I would expect), and I realize that it is going to take a while. My question concerns some unusual I/O activity on the box (SUSE) when I run the query.
>
> For the first couple of minutes I see reads only. After that vmstat shows mixed reads and writes in a ratio of about 1 block read to 5 blocks written. We have determined that files in our data and log partitions are being hit, but the file system itself is not growing during this time (it appears to be writing over the same chunk of space over and over again). Memory on the box is not being swapped while all of this is happening. I would have guessed that a "select count(*)" would not require a bunch of writes, and I can't begin to figure out why the number of blocks written are so much higher than the blocks read. If I modify the where clause to only count the rows for a given minute or two, I see the reads but I never see the unusual write behavior.
>
> Any thoughts into what could be going on? Thanks in advance for your help.

can you post the exact output of explain analyze? (or, at least, explain if the query takes too long)

merlin
[PERFORM] I/O on select count(*)
I have a large table (~ 2B rows) that contains an indexed timestamp column. I am attempting to run a query to determine the number of rows for a given day using something like "select count(*) from tbl1 where ts between '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me that the query will be done using an index scan (as I would expect), and I realize that it is going to take a while. My question concerns some unusual I/O activity on the box (SUSE) when I run the query.

For the first couple of minutes I see reads only. After that vmstat shows mixed reads and writes in a ratio of about 1 block read to 5 blocks written. We have determined that files in our data and log partitions are being hit, but the file system itself is not growing during this time (it appears to be writing over the same chunk of space over and over again). Memory on the box is not being swapped while all of this is happening. I would have guessed that a "select count(*)" would not require a bunch of writes, and I can't begin to figure out why the number of blocks written is so much higher than the blocks read. If I modify the where clause to only count the rows for a given minute or two, I see the reads but I never see the unusual write behavior.

Any thoughts into what could be going on? Thanks in advance for your help.

Doug
Re: [PERFORM] Regexps - never completing join.
Returning to this problem this morning, I made some more insight. The regexp cache isn't getting very many hits because the executor is looping through all of the classification rows and then looping through all of the regular expressions, causing each expression to be recompiled every time, since the cache holds only 32 compiled regular expressions. You can think of the behavior like:

foreach classification {
    foreach regexp {
        do match
    }
}

Obviously, to make this perform better without requiring a bigger regexp cache, I'd like it to run like:

foreach regexp {
    foreach classification {
        do match
    }
}

That way the cache wouldn't have to be very big at all, since the last-used regular expression would be at the top of the cache. Various methods of changing the query don't seem to have the desired effect, even with setting join_collapse_limit to 1.

select wc_rule.id from wc_rule cross join classifications on classifications.classification ~* wc_rule.regexp;
                                   QUERY PLAN
-
 Nested Loop  (cost=13.71..891401.71 rows=197843 width=4)
   Join Filter: (classifications.classification ~* wc_rule.regexp)
   ->  Seq Scan on classifications  (cost=0.00..1093.46 rows=56446 width=42)
   ->  Materialize  (cost=13.71..20.72 rows=701 width=22)
         ->  Seq Scan on wc_rule  (cost=0.00..13.01 rows=701 width=22)
(5 rows)

select wc_rule.id from classifications cross join wc_rule on classifications.classification ~* wc_rule.regexp;
                                   QUERY PLAN
-
 Nested Loop  (cost=13.71..891401.71 rows=197843 width=4)
   Join Filter: (classifications.classification ~* wc_rule.regexp)
   ->  Seq Scan on classifications  (cost=0.00..1093.46 rows=56446 width=42)
   ->  Materialize  (cost=13.71..20.72 rows=701 width=22)
         ->  Seq Scan on wc_rule  (cost=0.00..13.01 rows=701 width=22)
(5 rows)

Both of those queries execute in the same looping order; there doesn't seem to be a control to say "use this table as the inner table and this table as the outer table" for the join that I could find.

One way I did find that worked to control the loop (but doesn't yield the same results, because it's a left join):

select wc_rule.id from wc_rule left join classifications on classifications.classification ~* wc_rule.regexp;
                                   QUERY PLAN
--
 Nested Loop Left Join  (cost=1149.91..891457.45 rows=197843 width=4) (actual time=0.627..149051.505 rows=55126 loops=1)
   Join Filter: (classifications.classification ~* wc_rule.regexp)
   ->  Seq Scan on wc_rule  (cost=0.00..13.01 rows=701 width=22) (actual time=0.030..1.272 rows=701 loops=1)
   ->  Materialize  (cost=1149.91..1714.37 rows=56446 width=42) (actual time=0.001..14.244 rows=56446 loops=701)
         ->  Seq Scan on classifications  (cost=0.00..1093.46 rows=56446 width=42) (actual time=0.022..29.913 rows=56446 loops=1)
 Total runtime: 149067.764 ms
(6 rows)

Thanks,

Rusty
--
Rusty Conover
InfoGears Inc.
http://www.infogears.com
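Since the planner offers no knob for which relation drives a nested loop, one brute-force way to get the regexp-outer order with inner-join semantics is to drive the loop from PL/pgSQL (assuming 8.3, where RETURN QUERY is available). This is a sketch using the thread's table names:

```sql
-- Loop over the ~701 rules; for each rule, scan all classifications
-- with the same pattern, so each regexp is compiled once per rule
-- instead of once per (classification, rule) pair.
CREATE OR REPLACE FUNCTION match_rules() RETURNS SETOF integer AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT id, regexp FROM wc_rule LOOP
        RETURN QUERY
            SELECT r.id
            FROM classifications c
            WHERE c.classification ~* r.regexp;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM match_rules();
```

This returns one row per matching (rule, classification) pair, like the original cross join, and keeps the compiled pattern hot regardless of the regexp cache size.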
Re: [PERFORM] postgres overall performance seems to degrade when large SELECT are requested
The problem seem to arise when a SELECT that returns a lot of rows is Does the SELECT return a lot of rows, or does it scan a lot of rows ? (for instance, if you use aggregates, it might scan lots of data but only return few rows). The problem is that when the SELECTs are run the main application starts running out of available connections which means that postgres is not returning the query results fast enough. What I find a little bit starnge is that the report engine's SELECTs operate on a different set of tables than the ones the main application is using. Also the db box is hardly breaking a sweat, CPU and memory utilization are ridiculously low and IOwaits are typically less than 10%. Is it swapping ? (vmstat -> si/so) Is it locking ? (probably not from what you say) Is the network connection between the client and DB server saturated ? (easy with 100 Mbps connections, SELECT with a large result set will happily blast your LAN) Is the reporting tool running on the same machine as the DB client and killing it ? (swapping, etc) If it's a saturated network, solutions are : - install Gb ethernet - run the report on the database server (no bandwidth problems...) - rewrite the reporting tool to use SQL aggregates to transfer less data over the network - or use a cursor to fetch your results in chunks, and wait a little between chunks Has anyone experienced this? Yeah on benchmarks sometimes the LAN gave up before Postgres broke a sweat... Gb ethernet solved that... Are there any settings I can change to improve throughput? Any help will be greatly appreciated. iptraf will tell you all about your network traffic vmstat will tell you if your server or client is io-cpu-swap bound you'd need to post output from those... Thanks, val __ Sent from Yahoo! Mail. 
[PERFORM] postgres overall performance seems to degrade when large SELECT are requested
Hi,

I have an application that maintains 150 open connections to a Postgres DB server. The application works fine without a problem most of the time. The problem seems to arise when a SELECT that returns a lot of rows is executed, or the SELECT is run on a large object. These selects are run from time to time by a separate process whose purpose is to generate reports from the db data.

The problem is that when the SELECTs are run, the main application starts running out of available connections, which means that postgres is not returning the query results fast enough. What I find a little bit strange is that the report engine's SELECTs operate on a different set of tables than the ones the main application is using. Also the db box is hardly breaking a sweat: CPU and memory utilization are ridiculously low and IOwaits are typically less than 10%.

Has anyone experienced this? Are there any settings I can change to improve throughput? Any help will be greatly appreciated.

Thanks,
val
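One way to see where those 150 connections are going while a report runs is to query pg_stat_activity. A sketch, assuming a PostgreSQL 8.x-era server where idle backends report `<IDLE>` as their current query:

```sql
-- How many backends are idle, active, or blocked on a lock right now?
SELECT waiting,
       current_query = '<IDLE>' AS idle,
       count(*)
FROM pg_stat_activity
GROUP BY 1, 2;
```

If most connections show up as active and not waiting, the backends really are busy returning results slowly (pointing at network or client-side bottlenecks); a pile-up of waiting = true would point at locking instead.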
Re: [PERFORM] can I move sort to first outer join ?
On Wed, 14 May 2008 06:40:40 +0200, fernando castano <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> This sql is taking too long for the size of my tiny db. Any tips from
> this alias? I tried moving the sort to the first left outer join
> (between projects and features tables) using a nested subquery, but
> postgres tells me only one column could be returned from a subquery.

Instead of:

SELECT * FROM a LEFT JOIN b LEFT JOIN c WHERE c.column=... ORDER BY c.x LIMIT N

you could write:

SELECT * FROM a LEFT JOIN b LEFT JOIN (SELECT * FROM c WHERE c.column=... ORDER BY c.x LIMIT N) AS cc ORDER BY cc.x LIMIT N

This is only interesting if you use a LIMIT, and it allows you to reduce the number of rows sorted/joined. However, in your case this is not the right thing to do, since you do not use LIMIT, and sorting your 846 rows will only take a very small time. Your problem is those seq scans; you need to optimize that query so it can use indexes.

->  Seq Scan on projects  (cost=0.00..10.90 rows=4 width=1884) (actual time=0.039..0.109 rows=10 loops=1)
      Filter: (visible AND (id = ANY ('{3,4,5,6,10,7,8,9,13,11}'::integer[])))
->  Hash  (cost=10.40..10.40 rows=40 width=1870) (actual time=1.048..1.048 rows=101 loops=1)
      ->  Seq Scan on features  (cost=0.00..10.40 rows=40 width=1870) (actual time=0.026..0.464 rows=101 loops=1)
->  Hash  (cost=10.70..10.70 rows=70 width=1065) (actual time=0.098..0.098 rows=29 loops=1)
      ->  Seq Scan on person_roles  (cost=0.00..10.70 rows=70 width=1065) (actual time=0.014..0.037 rows=29 loops=1)
->  Hash  (cost=15.80..15.80 rows=580 width=106) (actual time=0.105..0.105 rows=32 loops=1)
      ->  Seq Scan on project_tags  (cost=0.00..15.80 rows=580 width=106) (actual time=0.013..0.036 rows=32 loops=1)
Total runtime: 149.622 ms

All those seq scans!!!
Please post, for each of those tables:
- the total number of rows (SELECT count(*) is fine)
- the table definitions with indexes (\d table)

EXPLAIN ANALYZE tells you the number of rows it picked out of a seq scan (that's the "rows=") but not the number of rows scanned... This is important, because a seq scan on a small table isn't a problem, but on a big one, it is.
Re: [PERFORM] Regexps - never completing join.
On May 13, 2008, at 11:45 PM, Rusty Conover wrote:

Hi Guys,

I'm using postgresql 8.3.1 and I'm seeing weird behavior between what I expect and what's happening when the query is executed. I'm trying to match a table that contains regexps against another table that is full of the text to match against, so my query is:

select wc_rule.id from classifications, wc_rule where classifications.classification ~* wc_rule.regexp;

When I run that, the query takes a very, very long time (never ending so far, 20 minutes or so) to execute. But if I loop through all of the rules and run a query for each rule:

select wc_rule.id from classifications, wc_rule where classifications.classification ~* wc_rule.regexp and wc_rule.id = ?

all of the rules, when run individually, can be matched in a little under 3 minutes. I'd assume postgres would be equal to or faster with the single-query execution method.

The table schema:

CREATE TABLE wc_rule (
    id integer NOT NULL,
    regexp text
);

CREATE TABLE classifications (
    id integer NOT NULL,
    classification text NOT NULL
);

gb_render_1_db=# explain select wc_rule.id from classifications, wc_rule where classifications.classification ~* wc_rule.regexp;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop  (cost=13.71..891401.71 rows=197843 width=4)
   Join Filter: (classifications.classification ~* wc_rule.regexp)
   ->  Seq Scan on classifications  (cost=0.00..1093.46 rows=56446 width=42)
   ->  Materialize  (cost=13.71..20.72 rows=701 width=22)
         ->  Seq Scan on wc_rule  (cost=0.00..13.01 rows=701 width=22)
(5 rows)

As a followup I did some digging: by editing src/backend/utils/adt/regexp.c and increasing the cache size for regular expressions to an arbitrarily large number,

#define MAX_CACHED_RES 3200

rather than the default of

#define MAX_CACHED_RES 32

I was able to get the query to complete in a respectable amount of time:

gb_render_1_db=# explain analyze select wc_rule.id from classifications, wc_rule where classifications.classification ~* wc_rule.regexp;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop  (cost=13.71..891401.71 rows=197843 width=4) (actual time=72.714..366899.913 rows=55052 loops=1)
   Join Filter: (classifications.classification ~* wc_rule.regexp)
   ->  Seq Scan on classifications  (cost=0.00..1093.46 rows=56446 width=42) (actual time=28.820..109.895 rows=56446 loops=1)
   ->  Materialize  (cost=13.71..20.72 rows=701 width=22) (actual time=0.000..0.193 rows=701 loops=56446)
         ->  Seq Scan on wc_rule  (cost=0.00..13.01 rows=701 width=22) (actual time=0.030..0.593 rows=701 loops=1)
 Total runtime: 366916.632 ms
(6 rows)

Which is still > 6 minutes, but at least it completed. I'll keep digging into what is causing this bad performance.

Thanks,

Rusty
--
Rusty Conover
InfoGears Inc.
http://www.infogears.com
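The per-rule loop described above can also be run server-side. This is a hypothetical sketch against the wc_rule/classifications schema from the post (the function name is made up): with wc_rule driving the outer loop, each regexp is compiled once per scan instead of all 701 patterns cycling through the 32-entry compiled-pattern cache for every classifications row.

```sql
-- Hypothetical sketch: iterate the rules in the outer loop so each regexp
-- is compiled once, rather than thrashing the MAX_CACHED_RES-sized cache.
CREATE OR REPLACE FUNCTION match_rules() RETURNS SETOF integer AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT id, regexp FROM wc_rule LOOP
        RETURN QUERY SELECT r.id
                     FROM classifications
                     WHERE classification ~* r.regexp;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

-- SELECT * FROM match_rules();
```

RETURN QUERY is available from PostgreSQL 8.3 onward, so this should work on the 8.3.1 server in question; it trades one big join for 701 sequential scans of classifications, which matches the per-rule timing observed.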