Re: qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Ron
At 08:21 PM 2/15/2006, Tom Lane wrote: Ron <[EMAIL PROTECTED]> writes: > How are we choosing our pivots? See qsort.c: it looks like median of nine equally spaced inputs (ie, the 1/8th points of the initial input array, plus the end points), implemented as two rounds of median-of-three choices.

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Neil Conway
On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote: > It seems clear that our qsort.c is doing a pretty awful job of picking > qsort pivots, while glibc is mostly managing not to make that mistake. > I haven't looked at the glibc code yet to see what they are doing > differently. glibc qsort is act

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 19:59 -0500, Tom Lane wrote: > I get > amazingly stable runtimes now --- I didn't have the patience to run 100 > trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec. > So this code path is definitely not very sensitive to this data > distribution. "The

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 20:00 +, Gary Doades wrote: > I have put together a test case that demonstrates the problem (see > below). I create a simple table, as close in structure to one of my > problem tables and populate an integer column with 100,000 zeros follow > by 100,000 random integers

Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Christopher Kings-Lynne
Ouch! That confirms my problem. I generated the random test case because it was easier than including the dump of my tables, but you can appreciate that tables 20 times the size are basically crippled when it comes to creating an index on them. I have to say that I restored a few gigabyte dum

Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Ron <[EMAIL PROTECTED]> writes: > How are we choosing our pivots? See qsort.c: it looks like median of nine equally spaced inputs (ie, the 1/8th points of the initial input array, plus the end points), implemented as two rounds of median-of-three choices. With half of the data inputs zero, it's n

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
I wrote: > Gary Doades <[EMAIL PROTECTED]> writes: >> Ouch! That confirms my problem. I generated the random test case because >> it was easier than including the dump of my tables, but you can >> appreciate that tables 20 times the size are basically crippled when it >> comes to creating an ind

Re: qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Ron
This behavior is consistent with the pivot choosing algorithm assuming certain distribution(s) for the data. For instance, median-of-three partitioning is known to be pessimal when the data is geometrically or hyper-geometrically distributed. Also, care must be taken that sometimes is not whe

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes: > Ouch! That confirms my problem. I generated the random test case because > it was easier than including the dump of my tables, but you can > appreciate that tables 20 times the size are basically crippled when it > comes to creating an index on them. Ac

Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes: > Is this likely to hit me in a random fashion during normal operation, > joins, sorts, order by for example? Yup, anytime you're passing data with that kind of distribution through a sort. > So the options are: > 1) Fix the included qsort.c code and use t

Re: qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Gary Doades
Tom Lane wrote: For some reason I hadn't immediately twigged to the fact that your test script is just N repetitions of the exact same structure with random data. So it's not so surprising that you get random variations in behavior with different test data sets. > It seems clear that our qsort

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 16:51 -0500, Tom Lane wrote: > Gary Doades <[EMAIL PROTECTED]> writes: > > Interestingly, if I don't delete the table after a run, but just drop > > and re-create the index repeatedly it stays a pretty consistent time, > > either repeatedly good or repeatedly bad! > > This

qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes: > If I run the script again, it is not always the first case that is slow, > it varies from run to run, which is why I repeated it quite a few times > for the test. For some reason I hadn't immediately twigged to the fact that your test script is just N re

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades
Tom Lane wrote: So it sure looks like this script does expose a problem on BSD-derived qsorts. Curiously, the case that's much the worst for me is the third in the script, while the shortest time is the first case, which was slow for Gary. So I'd venture that the *BSD code has been tweaked som

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes: > Interestingly, if I don't delete the table after a run, but just drop > and re-create the index repeatedly it stays a pretty consistent time, > either repeatedly good or repeatedly bad! This is consistent with the theory of a data-dependent performance p

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Josh Rovero
Jeremy Haile wrote: Thanks for everyone's feedback. I will definitely take the hardware comments into consideration when purchasing future hardware. I am located in Atlanta, GA. If Dell has such a bad reputation with this list, does anyone have good vendor recommendations? I can recomme

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes: > Please enable trace_sort=on and then repeat tests and post the > accompanying log file. I did this on my Fedora machine with port/qsort.c, and got the results attached. Curiously, this run has the spikes in completely different places than the prior one d

Re: [PERFORM] Postgres slower than MS ACCESS

2006-02-15 Thread Tom Lane
"Jay Greenfield" <[EMAIL PROTECTED]> writes: > I did a run with checkpoint_segments @ 30 (from 3 in 4.35 min run) and > posted a time of 6.78 minutes. Any idea why this would increase the time? The first time through might take longer while the machine creates empty xlog segment files (though I'd

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades
Tom Lane wrote: > I tried forcing PG to use src/port/qsort.c on the Fedora machine, and lo and behold: new Fedora 4 machine: shortest 434 msec, longest 8530 msec So it sure looks like this script does expose a problem on BSD-derived qsorts. Curiously, the case that's much the worst for

Re: [PERFORM] Postgres slower than MS ACCESS

2006-02-15 Thread Jay Greenfield
I've been vacuuming between each test run. Not vacuuming results in times all the way up to 121 minutes. For a direct comparison with Access, the vacuuming time with Postgres should really be included as this is not required with Access. By removing all of the indexes I have been able to get

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
I wrote: > Interesting. I tried your test script and got fairly close times > for all the cases on two different machines: > old HPUX machine: shortest 5800 msec, longest 7960 msec > new Fedora 4 machine: shortest 461 msec, longest 608 msec > So what this looks like to me is a corner

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 20:00 +, Gary Doades wrote: > I have put together a test case Please enable trace_sort=on and then repeat tests and post the accompanying log file. I think this is simply the sort taking longer depending upon the data distribution, but I'd like to know for sure. Thank

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades
Tom Lane wrote: shared_buffers is unlikely to impact index build time noticeably in recent PG releases. maintenance_work_mem would affect it a lot, though. What setting were you using for that? Also, i tried upping maintenance_work_mem to 65536 and it didn't make much difference (maybe 10% f

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades
Tom Lane wrote: Interesting. I tried your test script and got fairly close times for all the cases on two different machines: old HPUX machine: shortest 5800 msec, longest 7960 msec new Fedora 4 machine: shortest 461 msec, longest 608 msec (the HPUX machine was doing other stuff

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Jeremy Haile
Thanks for everyone's feedback. I will definitely take the hardware comments into consideration when purchasing future hardware. I am located in Atlanta, GA. If Dell has such a bad reputation with this list, does anyone have good vendor recommendations? Although most of the responses were ha

Re: [PERFORM] Strange Create Index behaviour

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes: > Platform: FreeBSD 6.0, Postgresql 8.1.2 compiled from the ports collection. > If the column that is having the index created has a certain > distribution of values then create index takes a very long time. If the > data values (integer in this case) a fa

[PERFORM] Strange Create Index behaviour

2006-02-15 Thread Gary Doades
Platform: FreeBSD 6.0, Postgresql 8.1.2 compiled from the ports collection. Not sure if this belongs in performance or bugs.. A pg_restore of my 2.5GB database was taking up to 2 hours to complete instead of the expected 10-15 minutes. Checking the server it was mostly CPU bound. Testing furth

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Ron
At 11:21 AM 2/15/2006, Jeremy Haile wrote: We are a small company looking to put together the most cost effective solution for our production database environment. Currently in production Postgres 8.1 is running on this machine: Dell 2850 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache 4 GB DDR2 400 Mhz

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Scott Marlowe
On Wed, 2006-02-15 at 12:44, Christopher Browne wrote: > After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] ("Joshua D. > Drake") belched out: > > You should probably review the archives for PostgreSQL user > > experience with Dell's before you purchase one. > > Hear, hear! We found De

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Christopher Browne
After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] ("Joshua D. Drake") belched out: > Jeremy Haile wrote: >> We are a small company looking to put together the most cost effective >> solution for our production database environment. Currently in >> production Postgres 8.1 is running on t

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Greg Stark
"Joshua D. Drake" <[EMAIL PROTECTED]> writes: > Jeremy Haile wrote: > > We are a small company looking to put together the most cost effective > > solution for our production database environment. Currently in > > production Postgres 8.1 is running on this machine: > > > > Dell 2850 > > 2 x 3.0

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Joshua D. Drake
Jeremy Haile wrote: We are a small company looking to put together the most cost effective solution for our production database environment. Currently in production Postgres 8.1 is running on this machine: Dell 2850 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache 4 GB DDR2 400 Mhz 2 x 73 GB 10K SCSI RAID

Re: [PERFORM] out of memory

2006-02-15 Thread Scott Marlowe
On Wed, 2006-02-15 at 11:18, [EMAIL PROTECTED] wrote: > Here the result with hashAgg to false : > Nested Loop (cost=2487858.08..2490896.58 rows=1001 width=34) (actual > time=1028044.781..1030251.260 rows=1000 loops=1) >-> Subquery Scan "day" (cost=2487858.08..2487870.58 rows=1000 width=16)

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Mark Lewis
Machine 1: $2000 Machine 2: $2000 Machine 3: $2000 Knowing how to rig them together and maintain them in a fully fault- tolerant way: priceless. (Sorry for the off-topic post, I couldn't resist). -- Mark Lewis On Wed, 2006-02-15 at 09:19 -0800, Craig A. James wrote: > Jeremy Haile wrote: > > W

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Craig A. James
Jeremy Haile wrote: We are a small company looking to put together the most cost effective solution for our production database environment. Currently in production Postgres 8.1 is running on this machine: Dell 2850 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache 4 GB DDR2 400 Mhz 2 x 73 GB 10K SCSI RAID

Re: [PERFORM] out of memory

2006-02-15 Thread martial . bizel
Here the result with hashAgg to false : Nested Loop (cost=2487858.08..2490896.58 rows=1001 width=34) (actual time=1028044.781..1030251.260 rows=1000 loops=1) -> Subquery Scan "day" (cost=2487858.08..2487870.58 rows=1000 width=16) (actual time=1027996.748..1028000.969 rows=1000 loops=1)

Re: [PERFORM] out of memory

2006-02-15 Thread Scott Marlowe
On Wed, 2006-02-15 at 09:55, [EMAIL PROTECTED] wrote: > Good morning, > > > > > I've increased sort_mem until 2Go !! > and the error "out of memory" appears again. > > Here the request I try to pass with her explain plan, > > Nested Loop (cost=2451676.23..2454714.73 rows=1001 width=34) >

Re: [PERFORM] out of memory

2006-02-15 Thread martial . bizel
You're right, release is 7.4.7. there's twenty millions records "query" > On Tue, 2006-02-14 at 11:36, Tom Lane wrote: > > [EMAIL PROTECTED] writes: > > > Yes, I've launched ANALYZE command before sending request. > > > I precise that's postgres version is 7.3.4 > > > > Can't possibly be 7.3.4, t

[PERFORM] Reliability recommendations

2006-02-15 Thread Jeremy Haile
We are a small company looking to put together the most cost effective solution for our production database environment. Currently in production Postgres 8.1 is running on this machine: Dell 2850 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache 4 GB DDR2 400 Mhz 2 x 73 GB 10K SCSI RAID 1 (for xlog and OS) 4

Re: [PERFORM] explain hashAggregate

2006-02-15 Thread martial . bizel
Good morning, I try to understand how optimizer uses HashAggregate instead of GroupAggregate and I want to know what is exactly this two functionnality (benefits /inconvenients) In my case, I've this explain plan. --- Nested Loop (cost=2451676.23..2454714.73 rows=1001 width=

[PERFORM] out of memory

2006-02-15 Thread martial . bizel
Good morning, I've increased sort_mem until 2Go !! and the error "out of memory" appears again. Here the request I try to pass with her explain plan, Nested Loop (cost=2451676.23..2454714.73 rows=1001 width=34) -> Subquery Scan "day" (cost=2451676.23..2451688.73 rows=1000 width=16)

[PERFORM] Stored proc and optimizer question

2006-02-15 Thread Antal Attila
Hi! I have a question about the query optimizer and the function scan. See the next case: CREATE TABLE a (id SERIAL PRIMARY KEY, userid INT4, col TEXT); CREATE TABLE b (id SERIAL PRIMARY KEY, userid INT4, a_id INT4 REFERENCES a (id), col TEXT); CREATE INDEX idx_a_uid ON a(userid); CREATE INDEX

Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Tom Lane
"FERREIRA, William (VALTECH)" <[EMAIL PROTECTED]> writes: > with PostgreSQL 8.1.3, here are my parameters (it's the default configuration) > wal_sync_method = fsync > wal_buffers = 8 > checkpoint_segments = 3 > bgwriter_lru_percent = 1.0 > bgwriter_lru_maxpages = 5 > bgwriter_all_percent = 0.333 >

Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Jignesh K. Shah
Actually fsync is not the default on solaris (verify using "show all;) (If you look closely in postgresql.conf it is commented out and mentioned as default but show all tells a different story) In all my cases I saw the default as wal_sync_method | open_datasync Also I had se

Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread FERREIRA, William (VALTECH)
with PostgreSQL 8.1.3, here are my parameters (it's the default configuration) wal_sync_method = fsync wal_buffers = 8 checkpoint_segments = 3 bgwriter_lru_percent = 1.0 bgwriter_lru_maxpages = 5 bgwriter_all_percent = 0.333 bgwriter_all_maxpages = 5 and you think times can be improved again ?

Re: [PERFORM] Postgres slower than MS ACCESS

2006-02-15 Thread Jeff Trout
On Feb 14, 2006, at 3:56 PM, Jay Greenfield wrote: How do you get 4,000+ lines of explain analyze for one update query in a database with only one table? Something a bit fishy there. Perhaps you mean explain verbose, though I don't really see how that'd be so long either, but it'd be clos

Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Jignesh K. Shah
What's your postgresql.conf parameter for the equivalent ones that I suggested? I believe your wal_buffers and checkpoint_segments could be bigger. If that's the case then yep you are fine. As for the background writer I am seeing mixed results yet so not sure about that. But thanks for the

Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread FERREIRA, William (VALTECH)
i tested the last version version of PostgreSQL and for the same test : before : 40mn and now : 12mn :) faster than Oracle (exactly what i wanted :p ) thanks to everybody Will -Message d'origine- De : [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] la part de Albert Cervera Areny E

Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread Markus Schaber
Hi, Ferreira, FERREIRA, William (VALTECH) wrote: > i load data from files using copy method. > Files contain between 2 and 7 millions of rows, spread on 5 tables. > > For loading all the data, it takes 40mn, and the same processing takes 17mn > with Oracle. > I think that this time can be impro

Re: [PERFORM] could not send data to client: Broken pipe

2006-02-15 Thread Richard Huxton
Pradeep Parmar wrote: Hi, I'm using Postgres 7.4. I have a web application built with php4 using postgres7.4 I was going through /var/log/messages of my linux box ( SLES 9). I encountered the following messages quite a few times. postgres[20199]: [4-1] ERROR: could not send data to client: B

Re: [PERFORM] copy and postgresql.conf

2006-02-15 Thread FERREIRA, William (VALTECH)
i'm using Solaris8 i tried changing only postgresql parameters and time has increased of 10mn i keep in mind your idea, we will soon upgraded to solaris 10 regards Will -Message d'origine- De : [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Envoyé : mardi 14 février 2006 22:47 À :