Re: [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Charles Duffy [EMAIL PROTECTED] writes: The patch puts a CHECK_FOR_INTERRUPTS in qsort_comparetup. Just to close out this thread: this is now done for 8.2, per discussion here: http://archives.postgresql.org/pgsql-hackers/2006-10/msg00144.php regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
On 7/15/06, Tom Lane [EMAIL PROTECTED] wrote: Anyway, Qingqing's question still needs to be answered: how can a sort of under 30k items take so long? It happens because (as previously suggested by Tom) the dataset for the 'short' (~10k rows, .3 sec) sort has no rows whose leftmost fields evaluate to 'equal' when passed to the qsort compare function. The 'long' sort, (~30k rows, 78 sec) has plenty of rows whose first 6 columns all evaluate as 'equal' when the rows are compared. For the 'long' data, the compare moves on rightward until it encounters 'flato', which is a TEXT column with an average length of 7.5k characters (with some rows up to 400k). The first 6 columns are mostly INTEGER, so compares on them are relatively inexpensive. All the expensive compares on 'flato' account for the disproportionate difference in sort times, relative to the number of rows in each set. As for the potential for memory leaks - thinking about it. Thanks, Charles Duffy. Peter Eisentraut [EMAIL PROTECTED] writes: The merge sort is here: http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/msort.c?rev=1.21content-type=text/x-cvsweb-markupcvsroot=glibc It uses alloca, so we're good here. Uh ... but it also uses malloc, and potentially a honkin' big malloc at that (up to a quarter of physical RAM). So I'm worried again. Anyway, Qingqing's question still needs to be answered: how can a sort of under 30k items take so long? regards, tom lane Column | Type | Modifiers ---+-+--- record| integer | commr1| integer | envr1 | oid | docin | integer | creat | integer | flati | text| flato | text| doc | text| docst | integer | vlord | integer | vl0 | integer | vl1 | date| vl2 | text| vl3 | text| vl4 | text| vl5 | text| vl6 | text| vl7 | date| vl8 | text| vl9 | integer | ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Charles Duffy [EMAIL PROTECTED] writes: ... For the 'long' data, the compare moves on rightward until it encounters 'flato', which is a TEXT column with an average length of 7.5k characters (with some rows up to 400k). The first 6 columns are mostly INTEGER, so compares on them are relatively inexpensive. All the expensive compares on 'flato' account for the disproportionate difference in sort times, relative to the number of rows in each set. Yeah, and it's not just that it's text either. At those sizes, all the values will be toasted, which means each compare is paying the price of fetching multiple rows from the toast table. And decompressing them too, no doubt. These costs are most likely swamping the actual strcoll() (not that that's not bad enough compared to int4cmp). We could probably tweak the sorting code to forcibly detoast sort keys before beginning the sort, but I'm not entirely convinced that would be a win: reading and writing enormous sort keys won't be cheap either. Meanwhile, for a cheap solution: do you really need to sort on flato at all? Maybe sorting on substr(flato,1,100) would be good enough? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Qingqing, On 7/12/06, Qingqing Zhou [EMAIL PROTECTED] wrote: How long is that unacceptably long time? 30 seconds. the problem here is that 29247 doesn't look like a big number so I can't see why your patch solved the problem, unless the qsort_comparetup() function of the data type eats too many circles or the cpu is too slow. Well...when exhibiting the problem, execution stays inside qsort() for the entire 'delay period' - I don't think its off in some other recess which is also lacking in interrupt flag checking. As for the CPU - the machine is a 4 way Opteron, and otherwise performs well. It does seem odd that the in-memory sort takes as long as it does - the plan may suggest a reason. And - the patched version doesn't exhibit the problem. I just did a test to invoke a qsort on an integer field of a table with 5 million rows, and sent a SIGINT, the delay is 7 or 8 seconds. I suspect there are some other places doesn't check interrupts -- what's your query plan? Plan attached. Thanks, Charles Duffy db1=# explain analyze SELECT DISTINCT ON (NG.vl1, creat, NG.flati) NG.record, NG.commr1, NG.docst, NG.docin, NG.doc, NG.flato, NG.flati AS flati, NG.creat AS creat, reportxm.id, reportxm.hide, reportxm.read, reportxm.cc, reportxm.pending, reportxm.comment, reportxm.envr1, reportxm.time AS time FROM record_view AS ft, reportxm, record_view AS ft0 WHERE reportxm.regcompany=261333178 AND reportxm.hide=FALSE AND reportxm.envr1=NG.envr1 AND ft0.flati=NG.flati AND ((FALSE) OR ( ft0.vl0 IN (SELECT subst0.id FROM mlist1a AS subst0 WHERE (((subst0.lastname ILIKE '%monthly%')) OR ((subst0.surname ILIKE '%monthly%')) OR ((subst0.company ILIKE '%monthly%')) OR ((subst0.company_short ILIKE '%monthly%' AND ft0.vl0 IN (SELECT subst0.id FROM mlist1a AS subst0 WHERE (((subst0.lastname ILIKE '%report%')) OR ((subst0.surname ILIKE '%report%')) OR ((subst0.company ILIKE '%report%')) OR ((subst0.company_short ILIKE '%report%') OR (((ft0.vl2 ILIKE '%monthly%' AND ft0.vl2 ILIKE '%report%' AND NG.docst=1 ORDER BY NG.vl1 DESC, creat DESC LIMIT 1000 OFFSET 0; QUERY PLAN --- Limit (cost=133779.07..135568.96 rows=71 width=171) (actual time=80224.317..80224.744 rows=19 loops=1) - Unique (cost=133779.07..135568.96 rows=71 width=171) (actual time=80224.307..80224.676 rows=19 loops=1) - Sort (cost=133779.07..134226.54 rows=178989 width=171) (actual time=80224.300..80224.430 rows=91 loops=1) Sort Key: NG.vl1, NG.creat, NG.flati - Hash Join (cost=4938.63..118162.72 rows=178989 width=171) (actual time=79708.318..80222.796 rows=91 loops=1) Hash Cond: ((outer.envr1)::oid = inner.envr1) - Bitmap Heap Scan on reportxm (cost=530.88..110615.56 rows=28411 width=51) (actual time=581.525..1086.185 rows=105001 loops=1) Recheck Cond: (regcompany = 261333178) Filter: (NOT hide) - Bitmap Index Scan on reportxm_reg_index (cost=0.00..530.88 rows=56822 width=0) (actual time=103.483..103.483 rows=176807 loops=1) Index Cond: (regcompany = 261333178) - Hash (cost=4404.60..4404.60 rows=1260 width=124) (actual time=78972.309..78972.309 rows=20 loops=1) - Hash Join (cost=3344.68..4404.60 rows=1260 width=124) (actual time=78765.710..78972.200 rows=20 loops=1) Hash Cond: (outer.flati = inner.flati) - Subquery Scan ft0 (cost=2936.65..3966.22 rows=3550 width=32) (actual time=78211.051..78417.354 rows=20 loops=1) Filter: (((hashed subplan) AND (hashed subplan)) OR ((vl2 ~~* '%monthly%'::text) AND (vl2 ~~* '%report%'::text))) - Unique (cost=2909.44..3654.99 rows=14201 width=320) (actual time=78196.706..78364.502 rows=29247 loops=1) - Sort (cost=2909.44..2944.94 rows=14201 width=320) (actual time=78196.698..78239.799 rows=29247 loops=1) Sort Key: record, commr1, envr1, docin, creat, flati, flato, doc, docst, vlord, vl0, vl1, vl2, vl3, vl4, vl5, vl6, vl7, vl8, vl9 - Append (cost=0.00..1930.02 rows=14201 width=320) (actual time=0.052..396.577 rows=29247 loops=1) - Subquery Scan *SELECT* 1 (cost=0.00..872.30 rows=3965 width=320) (actual time=0.047..164.191 rows=16382
Re: [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Charles Duffy [EMAIL PROTECTED] writes: On 7/12/06, Qingqing Zhou [EMAIL PROTECTED] wrote: the problem here is that 29247 doesn't look like a big number so I can't see why your patch solved the problem, unless the qsort_comparetup() function of the data type eats too many circles or the cpu is too slow. Well...when exhibiting the problem, execution stays inside qsort() for the entire 'delay period' - I don't think its off in some other recess which is also lacking in interrupt flag checking. As for the CPU - the machine is a 4 way Opteron, and otherwise performs well. It does seem odd that the in-memory sort takes as long as it does - the plan may suggest a reason. Well, there's something awfully fishy here. Compare the two lower-level sorts in your EXPLAIN output: - Sort (cost=2909.44..2944.94 rows=14201 width=320) (actual time=78196.698..78239.799 rows=29247 loops=1) Sort Key: record, commr1, envr1, docin, creat, flati, flato, doc, docst, vlord, vl0, vl1, vl2, vl3, vl4, vl5, vl6, vl7, vl8, vl9 - Append (cost=0.00..1930.02 rows=14201 width=320) (actual time=0.052..396.577 rows=29247 loops=1) - Sort (cost=403.42..403.59 rows=71 width=320) (actual time=318.727..339.305 rows=10932 loops=1) Sort Key: record, commr1, envr1, docin, creat, flati, flato, doc, docst, vlord, vl0, vl1, vl2, vl3, vl4, vl5, vl6, vl7, vl8, vl9 - Append (cost=78.88..401.23 rows=71 width=320) (actual time=5.197..192.868 rows=10932 loops=1) The first one took 77843.222 msec to sort 29247 rows, the second took 146.437 msec to sort 10932 rows with what appears to be the same sort key. One of these things is not like the other ... What are the data types of the sort columns? Is there anything much different in the statistics of the two subqueries --- for example, maybe one produces a lot of output that only differs in the rightmost sort columns, where the other has keys that always differ in a leading column? I forget if you are running 8.1 or not, but if you are, turning on trace_sort might produce useful information. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Charles Duffy [EMAIL PROTECTED] writes: [ CHECK_FOR_INTERRUPTS inside qsort comparison routine ] It occurs to me that there's a nonzero risk of problems here, because if the interrupt occurs qsort() will lose control. I'm wondering whether there are any implementations of qsort() that allocate memory and then release it before returning. If so, interrupting the sort would lead to memory leaked permanently (till backend exit anyway). We might have to just tolerate this, but if it occurs on a lot of platforms I'd have second thoughts about applying the patch. Anyone familiar with the internals of glibc's qsort, in particular? regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Tom Lane wrote: We might have to just tolerate this, but if it occurs on a lot of platforms I'd have second thoughts about applying the patch. Anyone familiar with the internals of glibc's qsort, in particular? Doesn't look like it's allocating any nonlocal memory: http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/qsort.c?rev=1.12content-type=text/x-cvsweb-markupcvsroot=glibc -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Peter Eisentraut [EMAIL PROTECTED] writes: Tom Lane wrote: We might have to just tolerate this, but if it occurs on a lot of platforms I'd have second thoughts about applying the patch. Anyone familiar with the internals of glibc's qsort, in particular? Doesn't look like it's allocating any nonlocal memory: http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/qsort.c?rev=1.12content-type=text/x-cvsweb-markupcvsroot=glibc But this file defines _quicksort() not qsort(). I was under the impression that the latter is actually a mergesort in glibc ... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Tom Lane wrote: Doesn't look like it's allocating any nonlocal memory: http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/qsort.c?rev=1. 12content-type=text/x-cvsweb-markupcvsroot=glibc But this file defines _quicksort() not qsort(). I was under the impression that the latter is actually a mergesort in glibc ... The merge sort is here: http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/msort.c?rev=1.21content-type=text/x-cvsweb-markupcvsroot=glibc It uses alloca, so we're good here. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] putting CHECK_FOR_INTERRUPTS in qsort_comparetup()
Charles Duffy [EMAIL PROTECTED] wrote We came up with this patch in response to a problem reported to us by a client. They had a query which took an unacceptably long time to respond to a cancel request (SIGINT). The client uses 8.1.4, so the patch is against that. How long is that unacceptably long time? Their work_mem setting was rather large (100). We determined that when it received SIGINT, the backend was always inside qsort(), so it wouldn't call ProcessInterrupts() again until it finished this large in-memory sort. Upon entering tuplesort_performsort(), state-memtupcount was 29247. I agree that we may need to consider to let qsort() check interrupts, but the problem here is that 29247 doesn't look like a big number so I can't see why your patch solved the problem, unless the qsort_comparetup() function of the data type eats too many circles or the cpu is too slow. I just did a test to invoke a qsort on an integer field of a table with 5 million rows, and sent a SIGINT, the delay is 7 or 8 seconds. I suspect there are some other places doesn't check interrupts -- what's your query plan? Regards, Qingqing ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq