On 7/15/06, Tom Lane <[EMAIL PROTECTED]> wrote:
> Anyway, Qingqing's question still needs to be answered: how can a sort of under 30k items take so long?
It happens because (as previously suggested by Tom) the dataset for the 'short' sort (~10k rows, 0.3 sec) has no rows whose leftmost fields evaluate as 'equal' when passed to the qsort compare function. The 'long' sort (~30k rows, 78 sec) has plenty of rows whose first 6 columns all evaluate as 'equal' when the rows are compared.

For the 'long' data, the compare moves rightward until it encounters 'flato', which is a TEXT column with an average length of 7.5k characters (some rows up to 400k). The first 6 columns are mostly INTEGER, so compares on them are relatively inexpensive. The expensive compares on 'flato' account for the disproportionate difference in sort times relative to the number of rows in each set.

As for the potential for memory leaks - still thinking about it.

Thanks,

Charles Duffy.
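The effect is easy to see in a minimal sketch of a multi-key comparator (this is not PostgreSQL's actual code; the struct and names are hypothetical): whenever the cheap leading keys tie, the comparator has to fall through to a full scan of the long text value, and qsort invokes the comparator O(n log n) times.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical row layout: six leading sort columns (mostly INTEGER in
 * the reported schema) followed by the long TEXT column 'flato'. */
struct row {
    int keys[6];        /* cheap leading keys */
    const char *flato;  /* avg ~7.5k chars, up to 400k, in the report */
};

static int compare_rows(const void *a, const void *b)
{
    const struct row *ra = a, *rb = b;

    /* Cheap integer compares first. */
    for (int i = 0; i < 6; i++) {
        if (ra->keys[i] != rb->keys[i])
            return ra->keys[i] < rb->keys[i] ? -1 : 1;
    }

    /* All six leading keys tied: only now does the comparator touch
     * the long text value, potentially scanning kilobytes per call. */
    return strcmp(ra->flato, rb->flato);
}
```

With the 'short' dataset, no two rows tie on the leading keys, so `strcmp` is never reached; with the 'long' dataset, many rows tie and every one of those compares pays the text-scan cost, which matches the 0.3 sec vs 78 sec gap far better than the 10k vs 30k row counts do.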
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> The merge sort is here:
> http://sourceware.org/cgi-bin/cvsweb.cgi/libc/stdlib/msort.c?rev=1.21&content-type=text/x-cvsweb-markup&cvsroot=glibc
> It uses alloca, so we're good here.

Uh ... but it also uses malloc, and potentially a honkin' big malloc at that (up to a quarter of physical RAM). So I'm worried again.

Anyway, Qingqing's question still needs to be answered: how can a sort of under 30k items take so long?

			regards, tom lane
  Column | Type    | Modifiers
---------+---------+-----------
 record  | integer |
 commr1  | integer |
 envr1   | oid     |
 docin   | integer |
 creat   | integer |
 flati   | text    |
 flato   | text    |
 doc     | text    |
 docst   | integer |
 vlord   | integer |
 vl0     | integer |
 vl1     | date    |
 vl2     | text    |
 vl3     | text    |
 vl4     | text    |
 vl5     | text    |
 vl6     | text    |
 vl7     | date    |
 vl8     | text    |
 vl9     | integer |