Re: [PERFORM] performance with query
Ok, here are the last rows for the vacuum analyze verbose:

INFO:  free space map contains 154679 pages in 39 relations
DETAIL:  A total of 126176 page slots are in use (including overhead).
126176 page slots are required to track all free space.
Current limits are:  16 page slots, 5000 relations, using 1476 kB.

The query was executed successfully, but with no result, in 1332269 ms.

and I attach the complete explain analyze of the complex query. Giving more detail about the tables involved in the query would not be so easy, as there are a lot of them. The joins are made between columns that are primary key in one table and indexed in the other. All the where clauses are on indexed columns (perhaps there are too many indexes...). Thanks a lot.

QUERY PLAN

Merge Right Join  (cost=508603077.17..508603195.59 rows=1 width=227) (actual time=73312.340..1463106.860 rows=32407 loops=1)
  Merge Cond: (ve_edil_rendite.id_domanda = domande.id_domanda)
  ->  GroupAggregate  (cost=0.00..105.51 rows=1031 width=11) (actual time=0.098..110.794 rows=1031 loops=1)
        ->  Index Scan using pk_ve_edil_rendite on ve_edil_rendite  (cost=0.00..86.84 rows=1157 width=11) (actual time=0.063..98.601 rows=1157 loops=1)
  ->  Materialize  (cost=508603077.17..508603077.18 rows=1 width=195) (actual time=73312.188..1462604.387 rows=32407 loops=1)
        ->  Nested Loop  (cost=506932259.90..508603077.17 rows=1 width=195) (actual time=73312.174..1462385.266 rows=32407 loops=1)
              ->  Merge Join  (cost=406932259.90..408603074.89 rows=1 width=188) (actual time=73312.123..1461834.776 rows=32407 loops=1)
                    Merge Cond: (domande.id_domanda = c_elaout_7.id_domanda)
                    ->  Merge Join  (cost=406932259.90..408188339.97 rows=1 width=240) (actual time=72975.426..1458427.886 rows=32407 loops=1)
                          Merge Cond: (c_elaout_5.id_domanda = domande.id_domanda)
                          ->  Merge Join  (cost=3895.15..1259628.81 rows=138561 width=41) (actual time=1721.643..7493.711 rows=99308 loops=1)
                                Merge Cond: (edil_veneto.id_domanda = c_elaout_5.id_domanda)
                                ->  Merge Join  (cost=1123.18..372710.75 rows=98122 width=29) (actual time=569.693..4135.019 rows=99308 loops=1)
                                      Merge Cond: (edil_veneto.id_domanda = c_elaout_6.id_domanda)
                                      ->  Index Scan using IDX_pk_Edil_Veneto on edil_veneto  (cost=0.00..11825.14 rows=232649 width=17) (actual time=0.080..1157.486 rows=232471 loops=1)
                                      ->  Index Scan using IDX_3_c_elaout on c_elaout c_elaout_6  (cost=0.00..359914.34 rows=98122 width=12) (actual time=0.094..1900.373 rows=99308 loops=1)
                                            Index Cond: ((c_elaout_6.node)::text = 'contributo_sociale'::text)
                                ->  Index Scan using IDX_3_c_elaout on c_elaout c_elaout_5  (cost=0.00..887091.20 rows=245306 width=12) (actual time=0.120..2389.615 rows=250746 loops=1)
                                      Index Cond: ((c_elaout_5.node)::text = 'contributo'::text)
                          ->  Materialize  (cost=406928364.74..406928364.75 rows=1 width=199) (actual time=69623.122..1450472.706 rows=32407 loops=1)
                                ->  Nested Loop  (cost=402583154.89..406928364.74 rows=1 width=199) (actual time=69623.107..1450215.911 rows=32407 loops=1)
                                      Join Filter: ((r_enti.codice_ente)::text = (r_luoghi.cod_catastale)::text)
                                      ->  Merge Join  (cost=202583154.89..206928031.60 rows=1 width=198) (actual time=69611.258..115367.182 rows=32407 loops=1)
                                            Merge Cond: (domande.id_domanda = c_elaout_4.id_domanda)
                                            ->  Merge Join  (cost=202583154.89..206425374.54 rows=1 width=186) (actual time=69007.657..113053.726 rows=32407 loops=1)
                                                  Merge Cond: (domande.id_domanda = c_elain_3.id_domanda)
                                                  ->  Merge Join  (cost=201328203.80..205170407.27 rows=41 width=138) (actual time=66160.710..100104.342 rows=32407 loops=1)
                                                        Merge Cond: (domande.id_domanda = c_elain_7.id_domanda)
                                                        ->  Merge Join  (cost=201328203.80..204498966.35 rows=93 width=126) (actual time=56792.251..72298.070 rows=32407 loops=1)
                                                              Merge Cond: (domande.id_domanda = c_elain_9.id_domanda)
                                                              ->  Merge Join  (cost=201322293.83..203828121.81 rows=424 width=114) (actual time=47349.082..55619.999 rows=32407 loops=1)
                                                                    Merge Cond:
Re: [PERFORM] performance with query (OT)
Alberto Dalmaso wrote: [...] in the explanation I'll see that the db use nasted loop. [...] Sorry for the remark off topic, but I *love* the term nasted loop. It should not go to oblivion unnoticed. Yours, Laurenz Albe -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Speeding up a query.
Matthew Hartman wrote:
> To determine the available slots, the algorithm finds the earliest slot that has an available chair and a count of the required concurrent intervals afterwards. So a 60 minute regimen requires 12 concurrent rows. This is accomplished by joining the table on itself. A second query is run for the same range, but with respect to the nurse time and an available nurse. Finally, those two are joined against each other. Effectively, it is:
>
> Select *
> From (Select * From matrix m1, matrix m2 Where m1.x = m2.x) chair,
>      (Select * From matrix m1, matrix m2 Where m1.x = m2.x) nurse
> Where chair.id = nurse.id
>
> With matrix having 3,280 rows. Ugh. I have tried various indexes and clustering approaches with little success. Any ideas?

I don't understand your data model well enough to understand the query, so I can only give you general hints (which you probably already know):

- Frequently the biggest performance gains can be reached by a (painful) redesign. Can you change the table structure in a way that makes this query less expensive?
- You have an index on matrix.x, right?
- Can you reduce the row count of the two subqueries by adding additional conditions that weed out rows that can be excluded right away?
- Maybe you can gain a little by changing the "select *" to "select id" in both subqueries and adding an additional join with matrix that adds the relevant columns in the end. I don't know the executor, so I don't know if that will help, but it would be a simple thing to test in an experiment.

Yours,
Laurenz Albe
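For what it's worth, the "count the concurrent rows" self-join Matthew describes can be sketched on a toy schema. This is an illustration only (invented table and column names, a single chair, and SQLite driven from Python rather than PostgreSQL): the GROUP BY/HAVING over a self-join finds the earliest run of 12 free 5-minute slots.

```python
import sqlite3

# Toy "matrix" of 5-minute slots for one chair: slots 0-39,
# with slots 5-9 already taken and everything else free.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matrix (slot INTEGER PRIMARY KEY, free INTEGER)")
conn.executemany("INSERT INTO matrix VALUES (?, ?)",
                 [(s, 0 if 5 <= s <= 9 else 1) for s in range(40)])

# For each candidate start slot m1, count how many free slots m2
# fall in the 12-slot window [m1, m1 + 11]; keep starts where all 12 are free.
row = conn.execute("""
    SELECT m1.slot
    FROM matrix m1
    JOIN matrix m2 ON m2.slot BETWEEN m1.slot AND m1.slot + 11
    WHERE m1.free = 1 AND m2.free = 1
    GROUP BY m1.slot
    HAVING COUNT(*) = 12      -- all 12 slots in the window are free
    ORDER BY m1.slot
    LIMIT 1
""").fetchone()
print(row[0])   # prints 10: the earliest 60-minute run starts at slot 10
```

The quadratic cost Laurenz and others point at is visible here: the join produces one row per (start, window member) pair, so it grows with rows × window length before the GROUP BY collapses it.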
Re: [PERFORM] 8.4 COPY performance regression on Solaris
Alan Li wrote:
> Hi, it seems that a COPY of 8M rows to a table on 8.4rc1 takes 30% longer than it does on 8.3.7 on Solaris. Here are the steps I've taken to reproduce this problem on two different Solaris boxes (Solaris 10 11/06 s10x_u3wos_10 X86 and Solaris 10 8/07 s10x_u4wos_12b X86). I've tried this on a Linux box, and I do not see the problem there.

I tried that on my box (though I increased the testset size by 10x to get more sensible runtimes) and I can reproduce it on Linux (CentOS 5.3/x86_64, Nehalem Xeon E5530) as well. I get ~45 rows/s on 8.3 and only ~33/s on 8.4.

On 8.4 I get 3m59s/4m01s/3m56s runtime and a profile of:

samples  %        symbol name
636302   19.6577  XLogInsert
415510   12.8366  CopyReadLine
225347    6.9618  DoCopy
131143    4.0515  ParseDateTime
122043    3.7703  DecodeNumber
81730     2.5249  DecodeDate
81045     2.5038  DecodeDateTime
80900     2.4993  pg_verify_mbstr_len
80235     2.4787  pg_next_dst_boundary
67571     2.0875  LWLockAcquire
64548     1.9941  heap_insert
64178     1.9827  LWLockRelease
63609     1.9651  PageAddItem
63402     1.9587  heap_form_tuple
56544     1.7468  timestamp_in
48697     1.5044  heap_fill_tuple
45248     1.3979  pg_atoi
42390     1.3096  IsSystemRelation
41287     1.2755  BufferGetBlockNumber
38936     1.2029  ValidateDate
36619     1.1313  ExecStoreTuple
35367     1.0926  DecodeTime

On 8.3.7 I get 2m58s, 2m54s, 2m55s and a profile of:

samples  %        symbol name
460966   16.2924  XLogInsert
307386   10.8643  CopyReadLine
301745   10.6649  DoCopy
153452    5.4236  pg_next_dst_boundary
119757    4.2327  DecodeNumber
105356    3.7237  heap_formtuple
83456     2.9497  ParseDateTime
83020     2.9343  pg_verify_mbstr_len
72735     2.5708  DecodeDate
70425     2.4891  LWLockAcquire
65820     2.3264  LWLockRelease
61823     2.1851  DecodeDateTime
55895     1.9756  hash_any
51305     1.8133  PageAddItem
47440     1.6767  AllocSetAlloc
47218     1.6689  heap_insert
38912     1.3753  DecodeTime
34871     1.2325  ReadBuffer_common
34519     1.2200  date2j
33093     1.1696  DetermineTimeZoneOffset
31334     1.1075  MemoryContextAllocZero
30951     1.0939  RelationGetBufferForTuple

If I do the same test utilizing WAL bypass the picture changes. 8.3 runtimes: 2m16s, 2m14s, 2m22s and profile:

samples  %        symbol name
445583   16.      CopyReadLine
332772   12.5300  DoCopy
156974    5.9106  pg_next_dst_boundary
131952    4.9684  heap_formtuple
119114    4.4850  DecodeNumber
94340     3.5522  ParseDateTime
81624     3.0734  pg_verify_mbstr_len
75012     2.8245  DecodeDate
74950     2.8221  DecodeDateTime
64467     2.4274  hash_any
62859     2.3669  PageAddItem
62054     2.3365  LWLockAcquire
57209     2.1541  LWLockRelease
45812     1.7250  hash_search_with_hash_value
41530     1.5637  DetermineTimeZoneOffset
40790     1.5359  heap_insert
39694     1.4946  AllocSetAlloc
38855     1.4630  ReadBuffer_common
36056     1.3576  MemoryContextAllocZero
36030     1.3567  DecodeTime
29057     1.0941  UnpinBuffer
28291     1.0653  PinBuffer

8.4 runtimes: 2m1s, 2m, 1m59s and profile:

samples  %        symbol name
404775   17.9532  CopyReadLine
208482    9.2469  DoCopy
148898    6.6042  ParseDateTime
118645    5.2623  DecodeNumber
80972     3.5914  DecodeDate
79005     3.5042  pg_verify_mbstr_len
73645     3.2664  PageAddItem
72167     3.2009  DecodeDateTime
65264     2.8947  heap_form_tuple
52680     2.3365  timestamp_in
46264     2.0520  pg_next_dst_boundary
45819     2.0322  ExecStoreTuple
45745     2.0290  heap_fill_tuple
43690     1.9378  heap_insert
38453     1.7055  InputFunctionCall
37050     1.6433  LWLockAcquire
36853     1.6346  BufferGetBlockNumber
36428     1.6157  heap_compute_data_size
33818     1.5000  DetermineTimeZoneOffset
33468     1.4844  DecodeTime
30896     1.3703  tm2timestamp
30888     1.3700  GetCurrentTransactionId

Stefan
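For context, the "WAL bypass" in the runs above presumably relies on the optimization where a table created or truncated in the same transaction as the COPY is not WAL-logged, provided WAL archiving is off. A typical pattern (table and file names invented here) looks like:

```sql
BEGIN;
TRUNCATE TABLE copy_target;                      -- or CREATE TABLE inside this transaction
COPY copy_target FROM '/tmp/testdata.csv' CSV;   -- load can skip WAL when archiving is off
COMMIT;
```

That is consistent with XLogInsert dominating the first pair of profiles and disappearing entirely from the second pair.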
Re: [PERFORM] Speeding up a query.
On Wed, Jun 17, 2009 at 8:33 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote:
> I don't understand your data model well enough to understand the query, so I can only give you general hints (which you probably already know):

He is effectively joining the same table 4 times in a loop to get the result; this is veeery ineffective. Imagine:

for(x) for(x) for(x) for(x) ...

where x is the number of rows in table matrix. Not scared yet?

-- GJ
Re: [PERFORM] Speeding up a query.
Yes, I have to do that because the data in the table need to be pivoted, so the table is joined many times, each time with a different filter on the column that describes the meaning of the column called numeric_value I'm going to show. That can be very ineffective, also because that table contains something like 2500 rows... There are two tables in this condition (as you can see in the explain) and both are among the tables with the highest number of rows in the database. But I don't see any other choice to obtain that information.

P.S.: I'm trying with all enable_* set to on and pumping from_collapse_limit and join_collapse_limit up to higher values; I've put them to 30. The result is that the query, after an hour of work, goes out of memory (SQL State 53200)...
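An alternative worth measuring, if the pivot really is the bottleneck: instead of joining the big table once per node value, pivot it in a single pass with conditional aggregation. A minimal sketch of the idea (run through SQLite from Python purely for illustration; the table and node names are guessed from the plan posted in this thread):

```python
import sqlite3

# One row per (id_domanda, node) pair, as in the c_elaout table
# from the posted plan; values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE c_elaout (id_domanda INTEGER, node TEXT, numeric_value REAL)")
conn.executemany("INSERT INTO c_elaout VALUES (?, ?, ?)", [
    (1, 'contributo', 100.0), (1, 'contributo_sociale', 40.0),
    (2, 'contributo', 250.0), (2, 'contributo_sociale', 90.0),
])

# One scan of the table produces one output column per node value,
# instead of one extra join (and one extra scan) per node value.
rows = conn.execute("""
    SELECT id_domanda,
           MAX(CASE WHEN node = 'contributo'         THEN numeric_value END) AS contributo,
           MAX(CASE WHEN node = 'contributo_sociale' THEN numeric_value END) AS contributo_sociale
    FROM c_elaout
    GROUP BY id_domanda
    ORDER BY id_domanda
""").fetchall()
print(rows)   # [(1, 100.0, 40.0), (2, 250.0, 90.0)]
```

The same CASE/MAX pattern works in PostgreSQL, and it replaces N index scans of the table (one per pivoted column) with a single scan plus an aggregate.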
Re: [PERFORM] GiST index performance
Tom Lane wrote: Matthew Wakeling matt...@flymine.org writes: I'm guessing my next step is to install a version of libc with debugging symbols? Yeah, if you want to find out what's happening in libc, that's what you need. Getting callgraph information from oprofile would also help. Although it won't directly tell what function in libc is being called, you would see where the calls are coming from, which is usually enough to guess what the libc function is. You can also get the oprofile data, including callgraph, into kcachegrind, which is *very* helpful. Here's a script I use: http://roberts.vorpus.org/~njs/op2calltree.py -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [PERFORM] Speeding up a query.
On Tue, Jun 16, 2009 at 2:35 PM, Hartman, Matthew matthew.hart...@krcc.on.ca wrote:
> Good afternoon.
>
> I have developed an application to efficiently schedule chemotherapy patients at our hospital. The application takes into account several resource constraints (available chairs, available nurses, nurse coverage assignment to chairs) as well as the chair time and nursing time required for a regimen.
>
> The algorithm for packing appointments respects each constraint and typically schedules a day of treatments (30-60) within 9-10 seconds on my workstation, down from 27 seconds initially. I would like to get it below 5 seconds if possible.
>
> I think what's slowing it down is simply the number of rows and joins. The algorithm creates a scheduling matrix with one row per 5 minute timeslot, per unit, per nurse assigned to the unit. That translates to 3,280 rows for the days I have designed in development (each day can change).
>
> To determine the available slots, the algorithm finds the earliest slot that has an available chair and a count of the required concurrent intervals afterwards. So a 60 minute regimen requires 12 concurrent rows. This is accomplished by joining the table on itself. A second query is run for the same range, but with respect to the nurse time and an available nurse. Finally, those two are joined against each other. Effectively, it is:
>
> Select *
> From (Select * From matrix m1, matrix m2 Where m1.x = m2.x) chair,
>      (Select * From matrix m1, matrix m2 Where m1.x = m2.x) nurse
> Where chair.id = nurse.id
>
> With matrix having 3,280 rows. Ugh. I have tried various indexes and clustering approaches with little success. Any ideas?

How far in advance do you schedule? As far as necessary?

How many chairs are there? How many nurses are there?

This is a tricky (read: interesting) problem.

merlin
Re: [PERFORM] Index Scan taking long time
Scott Marlowe scott.marl...@gmail.com writes: Without looking at the explain just yet, it seems to me that you are constraining the order of joins to insist that the left joins be done first, then the regular joins second, because of your mix of explicit and implicit join syntax. The query planner is constrained to run explicit joins first, then implicit if I remember correctly. That isn't true as of recent releases (8.2 and up, I think). It is true that there are semantic constraints that prevent certain combinations of inner and outer joins from being rearranged ... but if that applies here, it would also prevent manual rearrangement, unless the OP decides that this query doesn't express quite what he meant. regards, tom lane
Re: [PERFORM] performance with query
Alberto Dalmaso dalm...@clesius.it wrote:
> Ok, here are the last rows for the vacuum analyze verbose
> INFO: free space map contains 154679 pages in 39 relations
> DETAIL: A total of 126176 page slots are in use (including overhead). 126176 page slots are required to track all free space.
> Current limits are: 16 page slots, 5000 relations, using 1476 kB.

No indication of bloat there. You could afford to free some RAM by reducing the max_fsm_relations setting. (You have 39 relations but are reserving RAM to keep track of free space in 5000 relations.)

> and I attach the complete explain analyze of the complex query.

I'll see what I can glean from that when I get some time.

> All the where clauses are on indexed columns (perhaps there are too many indexes...)

That's not usually a problem. The other thing I was hoping to see, which I don't think you've sent, is an EXPLAIN ANALYZE of the same query with the settings which you have found which cause it to pick a faster plan. As I understand it, that runs pretty fast, so hopefully that's a quick one for you to produce.

-Kevin
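For reference, both settings Kevin mentions live in postgresql.conf (8.3 and earlier; the free space map moved out of shared memory in 8.4, where these settings disappear). The values below are purely an illustration sized from the VACUUM VERBOSE output quoted above, not a recommendation:

```
# postgresql.conf (pre-8.4) -- illustrative values only
max_fsm_pages = 160000     # must cover the ~126176 page slots VACUUM VERBOSE reported
max_fsm_relations = 100    # only 39 relations in use; the current 5000 reserves unused RAM
```

Both are allocated in shared memory at server start, so a change requires a restart.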
Re: [PERFORM] Speeding up a query.
Alberto Dalmaso dalm...@clesius.it wrote:
> P.S.: i'm trying with all enable_* to on and pumping to higher values from_collapse_limit and join_collapse_limit that I've put to 30.

Tom suggested that you set those numbers higher than the number of tables joined in the query. I don't think 30 will do that.

> The result is that the query, after an hour of work, goes out of memory (SQL State 53200)...

Ouch! Can you provide more details? All information from the PostgreSQL log about that event would be good. If there's anything which might be related in the OS logs from around that time, please include that, too.

Also, with those settings at a high value, try running just an EXPLAIN (no ANALYZE) of the query, to see how long that takes, and whether you have a memory issue during the planning phase. (You can use \timing in psql to get a report of the run time of the EXPLAIN.)

-Kevin
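In psql that test might look like the following session sketch (the view name is taken from elsewhere in this thread):

```
\timing on
EXPLAIN SELECT * FROM v_fsa_2007_estrazione;
-- The "Time:" line reported by \timing approximates planning time here,
-- since plain EXPLAIN plans the query without executing it.
```

If this alone runs away in time or memory at high collapse limits, the problem is in the planner's join-order search rather than in execution.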
Re: [PERFORM] Speeding up a query.
Thanks for the replies everyone. I'll try to answer them all in this one email. I will send another email immediately after this with additional details about the query.

> - Frequently the biggest performance gains can be reached by a (painful) redesign. Can you change the table structure in a way that makes this query less expensive?

I have considered redesigning the algorithm to accommodate this. As I've said, there's one row per five minute time slot. Instead, I could represent an interval of time with a row. For example, a start_time of 08:00 with an end_time of 12:00, or perhaps an interval duration of 4 hours. The difficulty lies in managing separate time requirements (nurse vs unit) for each time slot, and in inserting/updating new rows as pieces of those time slots or intervals are used up. Having a row per five minute interval avoids those complications so far. Still, I'd start with 32 rows and increase the number, never reaching 3,280.. :)

> - You have an index on matrix.x, right?

I have tried indexes on each common join criterion. Usually it's (time, unit), (time, nurse), or (time, unit_scheduled), (time, nurse_scheduled) (the latter two being booleans). In the first two cases it's made a difference of less than a second. In the last two, the time actually increases if I add analyze statements in after updates are made.

> - Can you reduce the row count of the two subqueries by adding additional conditions that weed out rows that can be excluded right away?

I use some additional conditions. I'll paste the meat of the query below.

> - Maybe you can gain a little by changing the select * to select id in both subqueries and adding an additional join with matrix that adds the relevant columns in the end. I don't know the executor, so I don't know if that will help, but it would be a simple thing to test in an experiment.

I wrote the select * as simplified, but really, it returns the primary key for that row.

> how far in advance do you schedule? As far as necessary?
It's done on a per-day basis, each day taking 8-12 seconds or so on my workstation. We typically schedule patients as much as three to six months in advance. The query already pulls data to a temporary table to avoid having to manage a massive number of rows.

> How many chairs are there? How many nurses are there? This is a tricky (read: interesting) problem.

In my current template there are 17 chairs and 7 nurses. Chairs are grouped into pods of 2-4 chairs. Nurses cover one to many pods, allowing for a primary nurse per pod as well as floater nurses that cover multiple pods.

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549- x4294

-----Original Message-----
From: Merlin Moncure [mailto:mmonc...@gmail.com]
Sent: Wednesday, June 17, 2009 9:09 AM
To: Hartman, Matthew
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Speeding up a query.

On Tue, Jun 16, 2009 at 2:35 PM, Hartman, Matthew matthew.hart...@krcc.on.ca wrote:
> Good afternoon.
>
> I have developed an application to efficiently schedule chemotherapy patients at our hospital. The application takes into account several resource constraints (available chairs, available nurses, nurse coverage assignment to chairs) as well as the chair time and nursing time required for a regimen.
>
> The algorithm for packing appointments respects each constraint and typically schedules a day of treatments (30-60) within 9-10 seconds on my workstation, down from 27 seconds initially. I would like to get it below 5 seconds if possible.
>
> I think what's slowing it down is simply the number of rows and joins. The algorithm creates a scheduling matrix with one row per 5 minute timeslot, per unit, per nurse assigned to the unit. That translates to 3,280 rows for the days I have designed in development (each day can change).
>
> To determine the available slots, the algorithm finds the earliest slot that has an available chair and a count of the required concurrent intervals afterwards.
> So a 60 minute regimen requires 12 concurrent rows. This is accomplished by joining the table on itself. A second query is run for the same range, but with respect to the nurse time and an available nurse. Finally, those two are joined against each other. Effectively, it is:
>
> Select *
> From (Select * From matrix m1, matrix m2 Where m1.x = m2.x) chair,
>      (Select * From matrix m1, matrix m2 Where m1.x = m2.x) nurse
> Where chair.id = nurse.id
>
> With matrix having 3,280 rows. Ugh. I have tried various indexes and clustering approaches with little success. Any ideas?

how far in advance do you schedule? As far as necessary? How many chairs are there? How many nurses are there? This is a tricky (read: interesting) problem.

merlin
Re: [PERFORM] Speeding up a query.
I promised to provide more details of the query (or the function as it is). Here goes.

Scenario: A chemotherapy regimen requires chair time and nursing time. A patient might sit in the chair for three hours but the nurse only has to be with them for the first hour. Therefore, nurses can manage multiple chairs at a time. Each regimen has a different time requirement. To efficiently manage our chair and nursing resources, we want to schedule against these constraints. Our room currently has 17 chairs and around 8 nurses per day. We administer several hundred different regimens, and the time for each regimen varies based on the day of the regimen as well as the course. All of these variables are entered and maintained through a web application I wrote.

Scheduling algorithm: Written in PostgreSQL (naturally), the algorithm is a single function call. It gathers the data for a day into a temporary table and cycles through each appointment. Appointments are scheduled in the following order: locked appointments (previously scheduled and assigned to a nurse and chair), reserved appointments (a desired time slot has been selected), open appointments (ordered by the required chair time descending and the required nurse time descending).

Here's the busy part that loops through each appointment. The table definition follows. Anything beginning with an underscore is a declared variable.

-- Reserved and unscheduled appointments.
FOR _APPOINTMENT IN
    SELECT *
    FROM MATRIX_UNSCHEDULED
    WHERE APPT_STATUS <> 'L'   -- locked appointments were handled already
    ORDER BY ROW_NUM
LOOP
    -- Initialize the variables for this record.
    RAISE NOTICE 'Status ''%'' - %', _APPOINTMENT.APPT_STATUS, _APPOINTMENT;
    _AVAILABLE := null;

    select into _UNIT_INTERVALS, _NURSE_INTERVALS, _UNIT_REQUIRED, _NURSE_REQUIRED
           _APPOINTMENT.total_unit_time / 5,
           _APPOINTMENT.total_nurse_time / 5,
           (_APPOINTMENT.total_unit_time || ' minutes')::INTERVAL,
           (_APPOINTMENT.total_nurse_time || ' minutes')::INTERVAL;

    -- Find the first available row for the required unit and nurse time.
    select into _AVAILABLE unit.row_num
    from (
        select m1.row_num
        from matrix m1, matrix m2
        where m1.unit_id = m2.unit_id
          and m1.nurse_id = m2.nurse_id
          and m1.unit_scheduled = false
          and m2.unit_scheduled = false
          and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)
          and m2.timeslot between m1.timeslot and (m1.timeslot + _UNIT_REQUIRED)
        group by m1.row_num
        having count(m2.row_num) = _UNIT_INTERVALS + 1
    ) unit, (
        select m1.row_num
        from matrix m1, matrix m2
        where m1.unit_id = m2.unit_id
          and m1.nurse_id = m2.nurse_id
          and m1.nurse_scheduled = false
          and m2.nurse_scheduled = false
          and (_APPOINTMENT.reserved_time is null or m1.timeslot = _APPOINTMENT.reserved_time)
          and m2.timeslot between m1.timeslot and (m1.timeslot + _NURSE_REQUIRED)
        group by m1.row_num
        having count(m1.row_num) = _NURSE_INTERVALS + 1
    ) nurse
    where nurse.row_num = unit.row_num
    order by unit.row_num
    limit 1;

    -- Assign the time, unit, and nurse to the unscheduled appointment.
    update matrix_unscheduled
    set appt_time = matrix.timeslot,
        unit_id = matrix.unit_id,
        nurse_id = matrix.nurse_id,
        appt_status = 'S'
    from matrix
    where schedule_appt_id = _APPOINTMENT.schedule_appt_id
      and matrix.row_num = _AVAILABLE;

    -- Mark the unit as scheduled for that time.
    update matrix
    set unit_scheduled = true
    from (select timeslot, unit_id from matrix where row_num = _AVAILABLE) m2
Re: [PERFORM] Speeding up a query.
Alberto Dalmaso dalm...@clesius.it writes:
> P.S.: i'm trying with all enable_* to on and pumping to higher values from_collapse_limit and join_collapse_limit that I've put to 30. The result is that the query, after an hour of work, goes out of memory (SQL State 53200)...

Hmm, is that happening during planning (ie, do you get the same error if you just try to EXPLAIN the query with those settings)? If not, please show the EXPLAIN output.

regards, tom lane
Re: [PERFORM] performance with query
What I sent is the quick execution; with other parameters this query simply doesn't come to an end. It is the little query that, changing the settings (using the defaults with all the query analyzer options on), becomes really quick, while with these settings (with some analyzer options switched off) it becomes very slow. I can't believe it: using these settings

set enable_hashjoin = 'on';
set enable_nestloop = 'on';
set enable_seqscan = 'on';
set enable_sort = 'on';
set from_collapse_limit = 8;
set join_collapse_limit = 3;

select * from v_fsa_2007_estrazione;

finally ends in acceptable time (156 sec). What does it mean using join_collapse_limit = 3 (that is really a lot fewer objects than what I'm using in that query)? I'm executing an explain analyze in this new situation...

Is it possible that such a configuration can create performance problems on other queries? (join_collapse_limit is set to a really low value.) I'll make some tests in this direction.
Re: [PERFORM] Speeding up a query.
Sorry, I missed this response. I'm entirely new to PostgreSQL and have yet to figure out how to use EXPLAIN ANALYZE on a function. I think I realize where the problem is though (the loop), I simply do not know how to fix it ;). Glpk and cbc, thanks, I'll look into those. You're right, the very nature of using a loop suggests that another tool might be more appropriate.

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital
(613) 549- x4294

-----Original Message-----
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Anthony Presley
Sent: Tuesday, June 16, 2009 3:37 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Speeding up a query.

On the DB side of things, you will want to make sure that you're caching as much as possible - putting a front-end like memcached could help. I assume you have indexes on the appropriate tables? What does the EXPLAIN ANALYZE on that query look like?

Not necessarily a postgres solution, but I'd think this type of solution would work really, really well inside of, say, a mixed integer or integer solver... something like glpk or cbc. You'd need to reformulate the problem, but we've built applications using these tools which can crunch through multiple billions of combinations in under 1 or 2 seconds. (Of course, you still need to store the results, and feed the input, using a database of some kind.)

-- Anthony Presley

On Tue, 2009-06-16 at 14:35 -0400, Hartman, Matthew wrote:
> Good afternoon.
>
> I have developed an application to efficiently schedule chemotherapy patients at our hospital. The application takes into account several resource constraints (available chairs, available nurses, nurse coverage assignment to chairs) as well as the chair time and nursing time required for a regimen.
> The algorithm for packing appointments respects each constraint and typically schedules a day of treatments (30-60) within 9-10 seconds on my workstation, down from 27 seconds initially. I would like to get it below 5 seconds if possible.
>
> I think what's slowing it down is simply the number of rows and joins. The algorithm creates a scheduling matrix with one row per 5 minute timeslot, per unit, per nurse assigned to the unit. That translates to 3,280 rows for the days I have designed in development (each day can change).
>
> To determine the available slots, the algorithm finds the earliest slot that has an available chair and a count of the required concurrent intervals afterwards. So a 60 minute regimen requires 12 concurrent rows. This is accomplished by joining the table on itself. A second query is run for the same range, but with respect to the nurse time and an available nurse. Finally, those two are joined against each other. Effectively, it is:
>
> Select *
> From (Select * From matrix m1, matrix m2 Where m1.x = m2.x) chair,
>      (Select * From matrix m1, matrix m2 Where m1.x = m2.x) nurse
> Where chair.id = nurse.id
>
> With matrix having 3,280 rows. Ugh. I have tried various indexes and clustering approaches with little success. Any ideas?
>
> Thanks,
>
> Matthew Hartman
> Programmer/Analyst
> Information Management, ICP
> Kingston General Hospital
> (613) 549- x4294
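Besides the solver route, SQL itself offers a cheaper way to find runs of consecutive free slots than the quadratic self-join: window functions (new in the 8.4 being discussed in this thread) enable the classic "gaps and islands" trick, where slot number minus ROW_NUMBER() is constant within each unbroken run. A toy illustration with invented names, run through SQLite (3.25+) from Python rather than PostgreSQL:

```python
import sqlite3

# Free 5-minute slots 0-4 and 10-39; slots 5-9 are taken.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matrix (slot INTEGER PRIMARY KEY, free INTEGER)")
conn.executemany("INSERT INTO matrix VALUES (?, ?)",
                 [(s, 0 if 5 <= s <= 9 else 1) for s in range(40)])

# slot - ROW_NUMBER() is constant within each unbroken run of free slots,
# so grouping by that difference yields one group per run.
row = conn.execute("""
    SELECT MIN(slot) AS start, COUNT(*) AS run_length
    FROM (SELECT slot, slot - ROW_NUMBER() OVER (ORDER BY slot) AS grp
          FROM matrix WHERE free = 1)
    GROUP BY grp
    HAVING COUNT(*) >= 12       -- long enough for a 60-minute regimen
    ORDER BY start
    LIMIT 1
""").fetchone()
print(row)   # (10, 30): the run starting at slot 10 is 30 slots long
```

Each base row is read once and sorted once, instead of once per window position, so this scales roughly as n log n rather than n times the window length.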
Re: [PERFORM] performance with query
Alberto Dalmaso dalm...@clesius.it wrote: what does it mean using join_collapse_limit = 3 http://www.postgresql.org/docs/8.3/interactive/runtime-config-query.html#RUNTIME-CONFIG-QUERY-OTHER -Kevin
[PERFORM] Performance issue - 2 linux machines, identical configs, different performance
Hi, sorry about the blank post yesterday - let's try again.

We have two machines. Both running Linux Redhat, both running postgres 8.2.5. Both have nearly identical 125 GB databases. In fact we use PITR recovery to replicate from one to the other. The machine we replicate to runs a query with about 10 inner and left joins about 5 times slower than the original machine. I ran an explain on both. Machine1 (original) planner favors hash joins about 3 to 1 over nested loop joins. Machine2 (replicated) uses only nested loop joins - no hash at all. A few details - I can always provide more.

MACHINE1 - original:
TOTAL RAW MEMORY - 30 GB
TOTAL SHARED MEMORY (shmmax value) - 4 GB

Database configs:
SHARED_BUFFERS - 1525 MB
MAX_PREPARED_TRANSACTIONS - 5
WORK_MEM - 300 MB
MAINTENANCE_WORK_MEM - 512 MB
MAX_FSM_PAGES - 3,000,000
CHECKPOINT_SEGMENTS - 64
WAL_BUFFERS - 768
EFFECTIVE_CACHE_SIZE - 2 GB
Planner method configs all turned on by default, including enable_hashjoin.

MACHINE2 - we run 2 postgres instances.
Port 5433 runs continuous PITR recoveries. Port 5432 receives the 'latest and greatest' database when port 5433 finishes a recovery.
TOTAL RAW MEMORY - 16 GB (this is a VMWARE setup on a netapp)
TOTAL SHARED MEMORY (shmmax value) - 4 GB

Database configs - port 5432 instance:
SHARED_BUFFERS - 1500 MB
MAX_PREPARED_TRANSACTIONS - 1 (we don't run prepared transactions here)
WORK_MEM - 300 MB
MAINTENANCE_WORK_MEM - 100 MB (don't think this comes into play in this conversation)
MAX_FSM_PAGES - 1,000,000
CHECKPOINT_SEGMENTS - 32
WAL_BUFFERS - 768
EFFECTIVE_CACHE_SIZE - 2 GB
Planner method configs all turned on by default, including enable_hashjoin.

Database configs - port 5433 instance:
SHARED_BUFFERS - 1500 MB
MAX_PREPARED_TRANSACTIONS - 1 (we don't run prepared transactions here)
WORK_MEM - 250 MB
MAINTENANCE_WORK_MEM - 100 MB (don't think this comes into play in this conversation)
MAX_FSM_PAGES - 1,000,000
CHECKPOINT_SEGMENTS - 32
WAL_BUFFERS - 768
EFFECTIVE_CACHE_SIZE - 2 GB
Planner method configs all turned on by default, including enable_hashjoin.

Now some size details about the 11 tables involved in the join. All join fields are indexed unless otherwise noted and are of type integer unless otherwise noted.
TABLE1 - 398 pages
TABLE2 - 5,014 pages - INNER JOIN on TABLE1
TABLE3 - 34,729 pages - INNER JOIN on TABLE2
TABLE4 - 1,828,000 pages - INNER JOIN on TABLE2
TABLE5 - 1,838,000 pages - INNER JOIN on TABLE4
TABLE6 - 122,500 pages - INNER JOIN on TABLE4
TABLE7 - 621 pages - INNER JOIN on TABLE6
TABLE8 - 4 pages - INNER JOIN on TABLE7 (TABLE7 column not indexed)
TABLE9 - 2 pages - INNER JOIN on TABLE8 (TABLE8 column not indexed)
TABLE10 - 13 pages - LEFT JOIN on TABLE6 (columns on both tables text, neither column indexed)
TABLE11 - 1,976,430 pages - LEFT JOIN on TABLE5 AND explicit join on TABLE6

The WHERE clause filters out primary key values from TABLE1 to 1 value and a 1-month range of indexed dates from TABLE4.
So, my guess is the disparity of performance (40 seconds vs 180 seconds) has to do with MACHINE2 not availing itself of hash joins, which by my understanding are much faster. Any help / insight appreciated. Thank you Mark Steben│Database Administrator│ @utoRevenue-R- Join the Revenue-tion 95 Ashley Ave. West Springfield, MA., 01089 413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax) @utoRevenue is a registered trademark and a division of Dominion Enterprises
Re: [PERFORM] Performance issue - 2 linux machines, identical configs, different performance
2009/6/17 Mark Steben mste...@autorevenue.com: A few details – I can always provide more Could you send: 1. Exact text of query. 2. EXPLAIN ANALYZE output on each machine. 3. VACUUM VERBOSE output on each machine, or at least the last 10 lines. ...Robert -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance issue - 2 linux machines, identical configs, different performance
We have two machines. Both running Linux Redhat, both running postgres 8.2.5. Both have nearly identical 125 GB databases. In fact we use PITR Recovery to Replicate from one to the other. I have to ask the obvious question. Do you regularly analyze the machine you replicate to? Dave -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance issue - 2 linux machines, identical configs, different performance
Yes I analyze after each replication. Mark Steben│Database Administrator│ @utoRevenue-R- Join the Revenue-tion 95 Ashley Ave. West Springfield, MA., 01089 413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax) @utoRevenue is a registered trademark and a division of Dominion Enterprises -Original Message- From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Dave Dutcher Sent: Wednesday, June 17, 2009 1:39 PM To: 'Mark Steben'; pgsql-performance@postgresql.org Cc: 'Rich Garabedian' Subject: Re: [PERFORM] Performance issue - 2 linux machines, identical configs, different performance I have to ask the obvious question. Do you regularly analyze the machine you replicate to? Dave -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] very slow selects on a small table
There are 4 threads (4 postgres processes) loading all rows from a table with 50,018 rows. The table has a int8 PK that is incremented by 1 for each new row and the PK is used by the threads to partition the rows so that each loads distinct rows. As you can see below, these 4 SELECTs have been running since 6:30am (it's now 11:30am) -- sluggish at best -- and each of the postgres processes is using 100% CPU. The table schema is long (~160 cols), so I'm omitting it but will provide it if deemed necessary. Any ideas about the cause of this are appreciated. Thanks, Brian cemdb=# select procpid, xact_start, current_query from pg_stat_activity; procpid | xact_start | current_query -+---+ - 27825 | | IDLE 27826 | | IDLE 27824 | | IDLE 27828 | | IDLE 27827 | | IDLE 27829 | | IDLE 27830 | | IDLE 27831 | | IDLE 27832 | | IDLE 27833 | | IDLE 14031 | 2009-06-17 05:48:02.931503-07 | autovacuum: VACUUM ANALYZE public.ts_stats_transet_user_weekly 16044 | | IDLE 32169 | 2009-06-17 08:17:39.034142-07 | autovacuum: VACUUM ANALYZE public.ts_stats_transetgroup_user_weekly 7165 | | IDLE 16043 | | IDLE 22130 | 2009-06-17 11:22:05.339582-07 | select procpid, xact_start, current_query from pg_stat_activity; 7169 | 2009-06-17 06:31:26.997641-07 | select a.ts_id as id1_0_, a.version_info as versionI2_1_0_, a.ts_user_id as userId1_0_, null as tranSetId1_0_, null as tranUnitId1_0_, null as userGro10_1_0_, a.ts_transet_group_id as tranSetG7_1_0_, a.ts_last_aggregated_row as lastAgg12_1_0_, null as tranSetI5_1_0_, a.ts_user_incarnation_id as userInca9_1_0_, a.ts_interval_start_time as interva11_1_0_, a.ts_interval_wire_time as interva13_1_0_, a.ts_total_transactions as totalTr14_1_0_, a.ts_bad_transactions as badTran15_1_0_, a.ts_opportunities as opportu16_1_0_, a.ts_defects as defects1_0_, a.ts_data_type as dataType1_0_, a.ts_tth_type as tthType1_0_, a.ts_tth_lower_spec as tthLowe20_1_0_, a.ts_tth_upper_spec as tthUppe21_1_0_, a.ts_tth_b0 as tthB22_1_0_, a.ts_tth_b1 as tthB23_1_0_, 
a.ts_tth_b2 as tthB24_1_0_, a.ts_tth_b3 as tthB25_1_0_, a.ts_tth_b4 as tthB26_1_0_, a.ts_tth_b5 as tthB27_1_0_, a.ts_tth_b6 as tthB28_1_0_, a.ts_tth_b7 as tthB29_1_0_, a.ts_tth_b8 as tthB30_1_0_, a.ts_tth_b9 as tthB31_1_0_, a.ts_tth_b10 as tthB32_1_0_, a.ts_tth_b11 as tthB33_1 7171 | | IDLE 7172 | | IDLE 28106 | | IDLE 7392 | 2009-06-17 06:31:26.997985-07 | select a.ts_id as id1_0_, a.version_info as versionI2_1_0_, a.ts_user_id as userId1_0_, null as tranSetId1_0_, null as tranUnitId1_0_, null as userGro10_1_0_, a.ts_transet_group_id as tranSetG7_1_0_, a.ts_last_aggregated_row as lastAgg12_1_0_, null as tranSetI5_1_0_, a.ts_user_incarnation_id as userInca9_1_0_, a.ts_interval_start_time as interva11_1_0_, a.ts_interval_wire_time as interva13_1_0_, a.ts_total_transactions as totalTr14_1_0_, a.ts_bad_transactions as badTran15_1_0_, a.ts_opportunities as opportu16_1_0_, a.ts_defects as defects1_0_, a.ts_data_type as dataType1_0_, a.ts_tth_type as tthType1_0_, a.ts_tth_lower_spec as tthLowe20_1_0_, a.ts_tth_upper_spec as tthUppe21_1_0_, a.ts_tth_b0 as tthB22_1_0_, a.ts_tth_b1 as tthB23_1_0_, a.ts_tth_b2 as tthB24_1_0_, a.ts_tth_b3 as tthB25_1_0_, a.ts_tth_b4 as tthB26_1_0_, a.ts_tth_b5 as tthB27_1_0_, a.ts_tth_b6 as tthB28_1_0_, a.ts_tth_b7 as tthB29_1_0_, a.ts_tth_b8 as tthB30_1_0_,
Re: [PERFORM] Index Scan taking long time
The nested loops (which are due to the joins) don't seem to be part of the problem at all. The main time that is taken (actual time that is) is in this part: Index Scan using event_20090526_domain_idx on event_20090526 e (cost=0.00..10694.13 rows=3606 width=1276) (actual time=50.233..14305.211 rows=3453 loops=1) Index Cond: (e.domain_id = d.id) Which is the leaf node in the query plan, the total time for the query being: Total runtime: 14380.000 ms And as I said once that query is run once it then does the same query plan and has this output for the same leaf node above: Index Scan using event_20090526_domain_idx on event_20090526 e (cost=0.00..10694.13 rows=3606 width=1276) (actual time=0.027..7.510 rows=3453 loops=1) Index Cond: (e.domain_id = d.id) So it seems to me that once the index is in memory everything is fine with the world, but the loading of the index into memory is horrendous. Tom Lane wrote: Scott Marlowe scott.marl...@gmail.com writes: Without looking at the explain just yet, it seems to me that you are constraining the order of joins to insist that the left joins be done first, then the regular joins second, because of your mix of explicit and implicit join syntax. The query planner is constrained to run explicit joins first, then implicit if I remember correctly. That isn't true as of recent releases (8.2 and up, I think). It is true that there are semantic constraints that prevent certain combinations of inner and outer joins from being rearranged ... but if that applies here, it would also prevent manual rearrangement, unless the OP decides that this query doesn't express quite what he meant. regards, tom lane -- *Bryce Ewing *| Platform Architect *DDI:* +64 9 950 2195 *Fax:* +64 9 302 0518 *Mobile:* +64 21 432 293 *Freephone:* 0800 SMX SMX (769 769) Level 11, 290 Queen Street, Auckland, New Zealand | SMX Ltd | smx.co.nz http://smx.co.nz SMX | Business Email Specialists The information contained in this email and any attachments is confidential. 
If you are not the intended recipient then you must not use, disseminate, distribute or copy any information contained in this email or any attachments. If you have received this email in error or you are not the originally intended recipient please contact SMX immediately and destroy this email. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] enum for performance?
I have a column which only has six states or values. Is there a size advantage to using an enum for this data type? Currently I have it defined as a character(1). This table has about 600 million rows, so it could wind up making a difference in total size. Thanks, Whit -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] enum for performance?
Whit Armstrong armstrong.w...@gmail.com writes: I have a column which only has six states or values. Is there a size advantage to using an enum for this data type? Currently I have it defined as a character(1). Nope. enums are always 4 bytes. char(1) is going to take 2 bytes (assuming those six values are simple ASCII characters), at least as of PG 8.3 or later. Depending on what the adjacent columns are, the enum might not actually cost you anything --- the difference might well disappear into alignment padding anyway. But it's not going to save. Another possibility is to look at the "char" (with the quotes --- not char) type, which also stores single ASCII-only characters. That's just one byte. But again, it might well not save you anything, depending on alignment considerations. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
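[Editor's note: the alignment-padding point can be illustrated outside Postgres with Python's struct module. This is only a rough analogy to C struct layout on a typical 64-bit platform, not Postgres's actual tuple format: a 1-byte field before an 8-byte field occupies the same total space as a 4-byte field before it, because padding absorbs the difference.]

```python
import struct

# Native alignment ('@') on a typical 64-bit platform:
# a 4-byte int (enum-like) followed by an 8-byte double is padded out...
enum_like = struct.calcsize('@id')
# ...and a 1-byte char ("char"-like) followed by the same double is padded
# to the identical size, so the 3-byte saving vanishes.
char_like = struct.calcsize('@bd')
print(enum_like, char_like)
```

Whether the saving survives in a real table depends on what columns sit next to it, which is exactly Tom's caveat.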
Re: [PERFORM] very slow selects on a small table
Brian Cox brian@ca.com writes: [r...@rdl64xeoserv01 log]# strace -p 7397 Process 7397 attached - interrupt to quit munmap(0x95393000, 1052672) = 0 munmap(0x95494000, 528384) = 0 munmap(0x95515000, 266240) = 0 brk(0x8603000) = 0x8603000 brk(0x85fb000) = 0x85fb000 _llseek(144, 0, [292618240], SEEK_END) = 0 brk(0x85eb000) = 0x85eb000 _llseek(65, 897179648, [897179648], SEEK_SET) = 0 read(65, \276\2\0\0\320\337\275\315\1\0\0\0|\3`\22\360\37\4 \0\0..., 8192) = 8192 _llseek(65, 471457792, [471457792], SEEK_SET) = 0 read(65, \276\2\0\0\320\337\275\315\1\0\0\0t\6\200\6\360\37\4 \0..., 8192) = 8192 read(65, \276\2\0\0\354\271\355\312\1\0\0\0\374\5`\10\360\37\4 ..., 8192) = 8192 read(65, \0\0\0\0\0\0\0\0\1\0\0\0\324\5\0\t\360\37\4 \0\0\0\0\0..., 8192) = 8192 brk(0x8613000) = 0x8613000 mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000 mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000 mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000 munmap(0x95393000, 1052672) = 0 munmap(0x95494000, 528384) = 0 munmap(0x95515000, 266240) = 0 [ lather, rinse, repeat ] That is a pretty odd trace for a Postgres backend; apparently it's repeatedly acquiring and releasing a meg or two worth of memory, which is not very normal within a single query. Can you tell us more about the query it's running? An EXPLAIN plan would be particularly interesting. Also, could you try to determine which files 144 and 65 refer to (see lsof)? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Index Scan taking long time
Bryce Ewing br...@smx.co.nz writes: So it seems to me that once the index is in memory everything is fine with the world, but the loading of the index into memory is horrendous. So it would seem. What's the disk hardware on this machine? It's possible that part of the problem is table bloat, leading to the indexscan having to fetch many more pages than it would if the table were more compact. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote: That is a pretty odd trace for a Postgres backend; apparently it's repeatedly acquiring and releasing a meg or two worth of memory, which is not very normal within a single query. Can you tell us more about the query it's running? An EXPLAIN plan would be particularly interesting. Also, could you try to determine which files 144 and 65 refer to (see lsof)? Here's the explain and a current strace and lsof. The strace shows even less I/O activity. Thanks, Brian cemdb=# explain select * from ts_stats_transetgroup_user_daily a where a.ts_id in (select b.ts_id from ts_stats_transetgroup_user_daily b,ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2009-6-16 01:00' and c.ts_interval_start_time < '2009-6-16 02:00'); QUERY PLAN -- Hash IN Join (cost=128162.63..211221.43 rows=231127 width=779) Hash Cond: (a.ts_id = b.ts_id) - Seq Scan on ts_stats_transetgroup_user_daily a (cost=0.00..80511.26 rows=247451 width=779) - Hash (cost=126718.09..126718.09 rows=231127 width=8) - Hash Join (cost=82370.45..126718.09 rows=231127 width=8) Hash Cond: ((m.ts_transet_group_id = b.ts_transet_group_id) AND (c.ts_user_incarnation_id = b.ts_user_incarnation_id)) - Hash Join (cost=3.32..27316.61 rows=211716 width=16) Hash Cond: (c.ts_transet_incarnation_id = m.ts_transet_incarnation_id) - Index Scan using ts_stats_transet_user_interval_starttime on ts_stats_transet_user_interval c (cost=0.00..25857.75 rows=211716 width=16) Index Cond: ((ts_interval_start_time >= '2009-06-16 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2009-06-16 02:00:00-07'::timestamp with time zone)) - Hash (cost=2.58..2.58 rows=117 width=16) - Seq Scan on ts_transetgroup_transets_map m (cost=0.00..2.58 rows=117 width=16) - Hash (cost=80511.26..80511.26
rows=247451 width=24) - Seq Scan on ts_stats_transetgroup_user_daily b (cost=0.00..80511.26 rows=247451 width=24) (14 rows) [r...@rdl64xeoserv01 log]# strace -p 7397 Process 7397 attached - interrupt to quit mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000 mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000 mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000 munmap(0x95393000, 1052672) = 0 munmap(0x95494000, 528384) = 0 munmap(0x95515000, 266240) = 0 brk(0x8603000) = 0x8603000 brk(0x85fb000) = 0x85fb000 _llseek(164, 0, [201940992], SEEK_END) = 0 brk(0x85eb000) = 0x85eb000 brk(0x8613000) = 0x8613000 mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000 mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000 mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000 munmap(0x95393000, 1052672) = 0 munmap(0x95494000, 528384) = 0 munmap(0x95515000, 266240) = 0 brk(0x8603000) = 0x8603000 brk(0x85fb000) = 0x85fb000 _llseek(164, 0, [201940992], SEEK_END) = 0 brk(0x85eb000) = 0x85eb000 brk(0x8613000) = 0x8613000 mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000 mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000 mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95393000 munmap(0x95393000, 1052672) = 0 munmap(0x95494000, 528384) = 0 munmap(0x95515000, 266240) = 0 brk(0x8603000) = 0x8603000 brk(0x85fb000) = 0x85fb000 _llseek(164, 0, [201940992], SEEK_END) = 0 brk(0x85eb000) = 0x85eb000 brk(0x8613000) = 0x8613000 mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95515000 mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x95494000 mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 
0) = 0x95393000 munmap(0x95393000, 1052672) = 0 munmap(0x95494000,
Re: [PERFORM] very slow selects on a small table
Brian Cox brian@ca.com writes: Here's the explain and a current strace and lsof. The strace shows even less I/O activity. cemdb=# explain select * from ts_stats_transetgroup_user_daily a where a.ts_id in (select b.ts_id from ts_stats_transetgroup_user_daily b,ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2009-6-16 01:00' and c.ts_interval_start_time < '2009-6-16 02:00'); Um, are you sure that is the query that PID 7397 is running? It doesn't match your previous pg_stat_activity printout, nor do I see anything about partitioning by PKs. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote: Um, are you sure that is the query that PID 7397 is running? It doesn't match your previous pg_stat_activity printout, nor do I see anything about partitioning by PKs. Umm, indeed. I had to construct the query by hand and left out the partition part. Here's the full query. Also, I took the liberty of reducing the completely expanded column list (shown in part in the pg_stat_activity printout) in the actual query to *. Thanks, Brian cemdb=# explain select * from ts_stats_transetgroup_user_daily a where a.ts_id in (select b.ts_id from ts_stats_transetgroup_user_daily b,ts_stats_transet_user_interval c, ts_transetgroup_transets_map m where b.ts_transet_group_id = m.ts_transet_group_id and m.ts_transet_incarnation_id = c.ts_transet_incarnation_id and c.ts_user_incarnation_id = b.ts_user_incarnation_id and c.ts_interval_start_time >= '2009-6-16 01:00' and c.ts_interval_start_time < '2009-6-16 02:00') and a.ts_id > 0 and a.ts_id < 10 order by a.ts_id; QUERY PLAN Nested Loop IN Join (cost=82370.45..128489.59 rows=1 width=779) Join Filter: (b.ts_id = a.ts_id) - Index Scan using ts_stats_transetgroup_user_daily_pkey on ts_stats_transetgroup_user_daily a (cost=0.00..8.22 rows=1 width=779) Index Cond: ((ts_id > 0) AND (ts_id < 10)) - Hash Join (cost=82370.45..127026.87 rows=232721 width=8) Hash Cond: ((m.ts_transet_group_id = b.ts_transet_group_id) AND (c.ts_user_incarnation_id = b.ts_user_incarnation_id)) - Hash Join (cost=3.32..27507.92 rows=213176 width=16) Hash Cond: (c.ts_transet_incarnation_id = m.ts_transet_incarnation_id) - Index Scan using ts_stats_transet_user_interval_starttime on ts_stats_transet_user_interval c (cost=0.00..26039.02 rows=213176 width=16) Index Cond: ((ts_interval_start_time >= '2009-06-16 01:00:00-07'::timestamp with time zone) AND (ts_interval_start_time < '2009-06-16 02:00:00-07'::timestamp with time zone)) - Hash (cost=2.58..2.58 rows=117 width=16) - Seq Scan on ts_transetgroup_transets_map m (cost=0.00..2.58 rows=117
width=16) - Hash (cost=80511.26..80511.26 rows=247451 width=24) - Seq Scan on ts_stats_transetgroup_user_daily b (cost=0.00..80511.26 rows=247451 width=24) (14 rows) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] very slow selects on a small table
Brian Cox brian@ca.com writes: Tom Lane [...@sss.pgh.pa.us] wrote: Um, are you sure that is the query that PID 7397 is running? It doesn't match your previous pg_stat_activity printout, nor do I see anything about partitioning by PKs. Umm, indeed. I had to construct the query by hand and left out the partition part. Here's the full query. Also, I took the liberty of reducing the completely expanded column list (shown in part in the pg_stat_activity printout) in the actual query to *. Okay ... I think the problem is right here: Nested Loop IN Join (cost=82370.45..128489.59 rows=1 width=779) Join Filter: (b.ts_id = a.ts_id) - Index Scan using ts_stats_transetgroup_user_daily_pkey on ts_stats_transetgroup_user_daily a (cost=0.00..8.22 rows=1 width=779) Index Cond: ((ts_id > 0) AND (ts_id < 10)) - Hash Join (cost=82370.45..127026.87 rows=232721 width=8) It's choosing this plan shape because it thinks that the indexscan on ts_stats_transetgroup_user_daily will return only one row, which I bet is off by something close to 10x. The memory usage pulsation corresponds to re-executing the inner hash join, from scratch (including rebuilding its hash table) for each outer row. Ouch. This seems like kind of a stupid plan anyway (which PG version was this exactly?) but certainly the big issue is the catastrophically bad rowcount estimate for the indexscan. Do you have ANALYZE stats for ts_stats_transetgroup_user_daily at all (look in pg_stats)? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
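[Editor's note: the "memory usage pulsation" Tom describes can be mimicked with a toy model. The relation contents and row counts below are made up for illustration, not the real tables; the point is that the inner hash table is rebuilt once per outer row, so cost scales with the actual outer rowcount rather than the planner's estimate of 1.]

```python
# Toy model of the plan shape: a nested-loop IN join whose inner hash join
# is re-executed -- hash table rebuilt from scratch -- for every outer row.
inner = [(i % 50, i) for i in range(1_000)]  # stand-in for the inner relation
builds = 0

def probe(outer_key):
    """Rebuild the hash table, then probe it (the repeated mmap/munmap)."""
    global builds
    builds += 1
    ht = {}
    for k, v in inner:
        ht.setdefault(k, []).append(v)
    return ht.get(outer_key, [])

# The planner costed for 1 outer row; suppose the index scan really
# returns 200 of them.
for outer_key in range(200):
    probe(outer_key % 50)

print(builds)  # 200 hash builds where the plan was costed for just 1
```

With the real tables (hundreds of thousands of outer rows and a multi-hundred-thousand-row inner side) this is why the queries run for hours.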
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote: This seems like kind of a stupid plan anyway (which PG version was this exactly?) but certainly the big issue is the catastrophically bad rowcount estimate for the indexscan. Do you have ANALYZE stats for ts_stats_transetgroup_user_daily at all (look in pg_stats)? postgres 8.3.5. Yes, here's a count(*) from pg_stats: cemdb=# select count(*) from pg_stats where tablename='ts_stats_transetgroup_user_daily'; count --- 186 (1 row) Thanks, Brian -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] very slow selects on a small table
Brian Cox brian@ca.com writes: Tom Lane [...@sss.pgh.pa.us] wrote: ... Do you have ANALYZE stats for ts_stats_transetgroup_user_daily at all (look in pg_stats)? postgres 8.3.5. Yes, here's a count(*) from pg_stats: 186 OK, so what's the entry for column ts_id? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] very slow selects on a small table
Tom Lane [...@sss.pgh.pa.us] wrote: OK, so what's the entry for column ts_id? Is this what you requested? Brian cemdb=# select * from pg_stats where tablename='ts_stats_transetgroup_user_daily' and attname = 'ts_id'; schemaname |tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation +--+-+---+---++--+---+ -- +- public | ts_stats_transetgroup_user_daily | ts_id | 0 | 8 | -1 | | | {61,602537,605139,607918,610330,613206,615829,618440,621018,623430,625887,628165,630571,633290,636434,638845,641276,643702,645978,648385,650648,653220,655602,658138,660613,663114,665750,668440,670859,673162,675597,678199,681054,683455,686049,688753,691231,693942,696229,698598,6000101190,6000103723,6000105917,6000108273,6000110687,6000113114,6000115528,6000118024,6000121085,6000123876,6000126548,6000128749,6 000131260,6000133668,6000135988,6000138755,6000141251,6000143855,6000146302,6000148963,6000151424,6000153772,6000156222,6000159005,6000161293,6000163783,6000166624,6000168913,6000171220,6000173349,6000175584,6000177882,6000180605,6000183207,6000185420,6000187949,6000190128,6000192738,6000195452,6000197843,6000200173,6000202838,6000205245,6000207579,6000210566,6000212935,6000215382,6000218095,6000220940,6000223634,6000226196,6000228596,6000230733,6000232988,6000235066,6000237064,6000239736,6000242470,6000244915,6000247102,6000250068} | 0.94954 (1 row) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] very slow selects on a small table
Brian Cox brian@ca.com writes: Tom Lane [...@sss.pgh.pa.us] wrote: OK, so what's the entry for column ts_id? Is this what you requested? Brian Yup. So according to those stats, all ts_id values fall in the range 61 .. 6000250068. It's no wonder it's not expecting to find anything between 0 and 10. I think maybe you forgot to re-analyze after loading data ... although this being 8.3, I'd have expected autovacuum to update the stats at some point ... Recommendation: re-ANALYZE, check that the plan changes to something with a higher estimate for the number of rows for this table, and then abort and restart those processes. Lord knows how long you'll be waiting for them to finish with their current plans :-( regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance