At 08:21 PM 2/15/2006, Tom Lane wrote:
Ron <[EMAIL PROTECTED]> writes:
> How are we choosing our pivots?
See qsort.c: it looks like median of nine equally spaced inputs (ie,
the 1/8th points of the initial input array, plus the end points),
implemented as two rounds of median-of-three choices.
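For readers without the source at hand, here is a minimal sketch of that
two-round scheme, with illustrative names (med3, choose_pivot) and plain
ints rather than qsort.c's void pointers and caller-supplied comparator:

    #include <stddef.h>

    /* Median of three values, by pairwise comparisons. */
    static int med3(int a, int b, int c)
    {
        if (a < b)
            return (b < c) ? b : ((a < c) ? c : a);
        return (b > c) ? b : ((a > c) ? c : a);
    }

    /* Assumes n is large enough that the nine sample points exist;
     * the real code falls back to simpler choices for small arrays. */
    static int choose_pivot(const int *a, size_t n)
    {
        size_t mid = n / 2;
        size_t d = n / 8;       /* spacing between sample points */

        /* Round 1: median-of-three near each end and around the middle. */
        int lo = med3(a[0], a[d], a[2 * d]);
        int m  = med3(a[mid - d], a[mid], a[mid + d]);
        int hi = med3(a[n - 1 - 2 * d], a[n - 1 - d], a[n - 1]);

        /* Round 2: median of the three medians. */
        return med3(lo, m, hi);
    }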
On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote:
> It seems clear that our qsort.c is doing a pretty awful job of picking
> qsort pivots, while glibc is mostly managing not to make that mistake.
> I haven't looked at the glibc code yet to see what they are doing
> differently.
glibc qsort is actually mergesort.
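That difference matters here: mergesort always splits by position, never by
a data-derived pivot, so its O(n log n) bound holds for any value
distribution. A minimal top-down sketch (illustrative only, not glibc's
actual implementation; the caller supplies a scratch buffer of at least n
elements):

    #include <stddef.h>
    #include <string.h>

    static void msort(int *a, int *tmp, size_t n)
    {
        size_t half = n / 2;
        size_t i = 0, j = half, k = 0;

        if (n < 2)
            return;
        msort(a, tmp, half);               /* sort left half */
        msort(a + half, tmp, n - half);    /* sort right half */

        while (i < half && j < n)          /* merge the two sorted runs */
            tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < half)
            tmp[k++] = a[i++];
        while (j < n)
            tmp[k++] = a[j++];
        memcpy(a, tmp, n * sizeof(int));
    }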
On Wed, 2006-02-15 at 19:59 -0500, Tom Lane wrote:
> I get
> amazingly stable runtimes now --- I didn't have the patience to run 100
> trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec.
> So this code path is definitely not very sensitive to this data
> distribution.
"The
On Wed, 2006-02-15 at 20:00 +, Gary Doades wrote:
> I have put together a test case that demonstrates the problem (see
> below). I create a simple table, as close in structure to one of my
> problem tables, and populate an integer column with 100,000 zeros followed
> by 100,000 random integers
Ouch! That confirms my problem. I generated the random test case because
it was easier than including the dump of my tables, but you can
appreciate that tables 20 times the size are basically crippled when it
comes to creating an index on them.
I have to say that I restored a few gigabyte dump
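The distribution Gary describes (100,000 zeros followed by 100,000 random
integers) can also be fed straight to a C library qsort() to separate the
sort's behavior from the rest of the server. A hypothetical standalone
harness, with sizes chosen to mirror the test case above:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 200000    /* 100,000 zeros then 100,000 random ints */

    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *) a;
        int y = *(const int *) b;
        return (x > y) - (x < y);
    }

    int main(void)
    {
        static int data[N];             /* zero-initialized */
        clock_t start;
        int i;

        srand((unsigned) time(NULL));
        for (i = N / 2; i < N; i++)     /* first half stays zero */
            data[i] = rand();

        start = clock();
        qsort(data, N, sizeof(int), cmp_int);
        printf("sort took %.0f msec\n",
               1000.0 * (clock() - start) / CLOCKS_PER_SEC);
        return 0;
    }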
Ron <[EMAIL PROTECTED]> writes:
> How are we choosing our pivots?
See qsort.c: it looks like median of nine equally spaced inputs (ie,
the 1/8th points of the initial input array, plus the end points),
implemented as two rounds of median-of-three choices. With half of the
data inputs zero, it's n
I wrote:
> Gary Doades <[EMAIL PROTECTED]> writes:
>> Ouch! That confirms my problem. I generated the random test case because
>> it was easier than including the dump of my tables, but you can
>> appreciate that tables 20 times the size are basically crippled when it
>> comes to creating an index on them.
This behavior is consistent with the pivot-choosing algorithm
assuming certain distribution(s) for the data. For instance,
median-of-three partitioning is known to be pessimal when the data is
geometrically or hyper-geometrically distributed. Also, care must be
taken that sometimes is not whe
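To experiment with that claim, the random fill in a test harness can be
replaced with a geometrically distributed generator: each successive value
is half as likely as the previous one, giving long runs of duplicates at
the small end. A hypothetical sketch:

    #include <stdlib.h>

    /* Geometrically distributed test values: about half the outputs
     * are 1, a quarter are 2, an eighth are 4, and so on. */
    static int geometric_value(void)
    {
        int k = 0;

        while (k < 30 && (rand() & 1))
            k++;
        return 1 << k;
    }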
Gary Doades <[EMAIL PROTECTED]> writes:
> Ouch! That confirms my problem. I generated the random test case because
> it was easier than including the dump of my tables, but you can
> appreciate that tables 20 times the size are basically crippled when it
> comes to creating an index on them.
Ac
Gary Doades <[EMAIL PROTECTED]> writes:
> Is this likely to hit me in a random fashion during normal operation,
> joins, sorts, order by for example?
Yup, anytime you're passing data with that kind of distribution
through a sort.
> So the options are:
> 1) Fix the included qsort.c code and use t
Tom Lane wrote:
For some reason I hadn't immediately twigged to the fact that your test
script is just N repetitions of the exact same structure with random data.
So it's not so surprising that you get random variations in behavior
with different test data sets.
> It seems clear that our qsort.c is doing a pretty awful job of picking
> qsort pivots, while glibc is mostly managing not to make that mistake.
On Wed, 2006-02-15 at 16:51 -0500, Tom Lane wrote:
> Gary Doades <[EMAIL PROTECTED]> writes:
> > Interestingly, if I don't delete the table after a run, but just drop
> > and re-create the index repeatedly it stays a pretty consistent time,
> > either repeatedly good or repeatedly bad!
>
> This is consistent with the theory of a data-dependent performance
> problem
Gary Doades <[EMAIL PROTECTED]> writes:
> If I run the script again, it is not always the first case that is slow,
> it varies from run to run, which is why I repeated it quite a few times
> for the test.
For some reason I hadn't immediately twigged to the fact that your test
script is just N repetitions of the exact same structure with random data.
Tom Lane wrote:
So it sure looks like this script does expose a problem on BSD-derived
qsorts. Curiously, the case that's much the worst for me is the third
in the script, while the shortest time is the first case, which was slow
for Gary. So I'd venture that the *BSD code has been tweaked somewhere
along the line
Gary Doades <[EMAIL PROTECTED]> writes:
> Interestingly, if I don't delete the table after a run, but just drop
> and re-create the index repeatedly it stays a pretty consistent time,
> either repeatedly good or repeatedly bad!
This is consistent with the theory of a data-dependent performance
problem.
Jeremy Haile wrote:
Thanks for everyone's feedback. I will definitely take the hardware
comments into consideration when purchasing future hardware. I am
located in Atlanta, GA. If Dell has such a bad reputation with this
list, does anyone have good vendor recommendations?
I can recommend
Simon Riggs <[EMAIL PROTECTED]> writes:
> Please enable trace_sort=on and then repeat tests and post the
> accompanying log file.
I did this on my Fedora machine with port/qsort.c, and got the results
attached. Curiously, this run has the spikes in completely different
places than the prior one did.
"Jay Greenfield" <[EMAIL PROTECTED]> writes:
> I did a run with checkpoint_segments @ 30 (from 3 in 4.35 min run) and
> posted a time of 6.78 minutes. Any idea why this would increase the time?
The first time through might take longer while the machine creates empty
xlog segment files (though I'd
Tom Lane wrote:
> I tried forcing PG to use src/port/qsort.c on the Fedora machine,
and lo and behold:
new Fedora 4 machine: shortest 434 msec, longest 8530 msec
So it sure looks like this script does expose a problem on BSD-derived
qsorts. Curiously, the case that's much the worst for me is the third in
the script, while the shortest time is the first case, which was slow for
Gary.
I've been vacuuming between each test run.
Not vacuuming results in times all the way up to 121 minutes. For a direct
comparison with Access, the vacuuming time with Postgres should really be
included as this is not required with Access.
By removing all of the indexes I have been able to get
I wrote:
> Interesting. I tried your test script and got fairly close times
> for all the cases on two different machines:
> old HPUX machine: shortest 5800 msec, longest 7960 msec
> new Fedora 4 machine: shortest 461 msec, longest 608 msec
> So what this looks like to me is a corner case
On Wed, 2006-02-15 at 20:00 +, Gary Doades wrote:
> I have put together a test case
Please enable trace_sort=on and then repeat tests and post the
accompanying log file.
I think this is simply the sort taking longer depending upon the data
distribution, but I'd like to know for sure.
Thanks
Tom Lane wrote:
shared_buffers is unlikely to impact index build time noticeably in
recent PG releases. maintenance_work_mem would affect it a lot, though.
What setting were you using for that?
Also, I tried upping maintenance_work_mem to 65536 and it didn't make
much difference (maybe 10% faster)
Tom Lane wrote:
Interesting. I tried your test script and got fairly close times
for all the cases on two different machines:
old HPUX machine: shortest 5800 msec, longest 7960 msec
new Fedora 4 machine: shortest 461 msec, longest 608 msec
(the HPUX machine was doing other stuff
Thanks for everyone's feedback. I will definitely take the hardware
comments into consideration when purchasing future hardware. I am
located in Atlanta, GA. If Dell has such a bad reputation with this
list, does anyone have good vendor recommendations?
Although most of the responses were ha
Gary Doades <[EMAIL PROTECTED]> writes:
> Platform: FreeBSD 6.0, Postgresql 8.1.2 compiled from the ports collection.
> If the column that is having the index created has a certain
> distribution of values then create index takes a very long time. If the
> data values (integer in this case) a fa
Platform: FreeBSD 6.0, Postgresql 8.1.2 compiled from the ports collection.
Not sure if this belongs in performance or bugs..
A pg_restore of my 2.5GB database was taking up to 2 hours to complete
instead of the expected 10-15 minutes. Checking the server it was mostly
CPU bound. Testing further
At 11:21 AM 2/15/2006, Jeremy Haile wrote:
We are a small company looking to put together the most cost effective
solution for our production database environment. Currently in
production Postgres 8.1 is running on this machine:
Dell 2850
2 x 3.0 GHz Xeon 800 MHz FSB 2 MB Cache
4 GB DDR2 400 MHz
On Wed, 2006-02-15 at 12:44, Christopher Browne wrote:
> After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] ("Joshua D.
> Drake") belched out:
> > You should probably review the archives for PostgreSQL user
> > experience with Dell's before you purchase one.
>
> Hear, hear! We found Dell
After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] ("Joshua D. Drake")
belched out:
> Jeremy Haile wrote:
>> We are a small company looking to put together the most cost effective
>> solution for our production database environment. Currently in
>> production Postgres 8.1 is running on this machine:
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> Jeremy Haile wrote:
> > We are a small company looking to put together the most cost effective
> > solution for our production database environment. Currently in
> > production Postgres 8.1 is running on this machine:
> >
> > Dell 2850
> > 2 x 3.0 GHz Xeon 800 MHz FSB 2 MB Cache
Jeremy Haile wrote:
We are a small company looking to put together the most cost effective
solution for our production database environment. Currently in
production Postgres 8.1 is running on this machine:
Dell 2850
2 x 3.0 GHz Xeon 800 MHz FSB 2 MB Cache
4 GB DDR2 400 MHz
2 x 73 GB 10K SCSI RAID
On Wed, 2006-02-15 at 11:18, [EMAIL PROTECTED] wrote:
> Here is the result with hashAgg set to false:
> Nested Loop (cost=2487858.08..2490896.58 rows=1001 width=34) (actual
> time=1028044.781..1030251.260 rows=1000 loops=1)
>   ->  Subquery Scan "day" (cost=2487858.08..2487870.58 rows=1000 width=16)
Machine 1: $2000
Machine 2: $2000
Machine 3: $2000
Knowing how to rig them together and maintain them in a fully fault-
tolerant way: priceless.
(Sorry for the off-topic post, I couldn't resist).
-- Mark Lewis
On Wed, 2006-02-15 at 09:19 -0800, Craig A. James wrote:
> Jeremy Haile wrote:
> > We are a small company looking to put together the most cost effective
Jeremy Haile wrote:
We are a small company looking to put together the most cost effective
solution for our production database environment. Currently in
production Postgres 8.1 is running on this machine:
Dell 2850
2 x 3.0 GHz Xeon 800 MHz FSB 2 MB Cache
4 GB DDR2 400 MHz
2 x 73 GB 10K SCSI RAID
Here is the result with hashAgg set to false:
Nested Loop (cost=2487858.08..2490896.58 rows=1001 width=34) (actual
time=1028044.781..1030251.260 rows=1000 loops=1)
-> Subquery Scan "day" (cost=2487858.08..2487870.58 rows=1000 width=16)
(actual time=1027996.748..1028000.969 rows=1000 loops=1)
On Wed, 2006-02-15 at 09:55, [EMAIL PROTECTED] wrote:
> Good morning,
>
>
> I've increased sort_mem up to 2 GB and the "out of memory" error
> appears again.
>
> Here is the query I am trying to run, with its explain plan:
>
> Nested Loop (cost=2451676.23..2454714.73 rows=1001 width=34)
>
You're right, the release is 7.4.7.
There are twenty million records in "query".
> On Tue, 2006-02-14 at 11:36, Tom Lane wrote:
> > [EMAIL PROTECTED] writes:
> > > Yes, I ran the ANALYZE command before sending the request.
> > > To be precise, the postgres version is 7.3.4
> >
> > Can't possibly be 7.3.4, t
We are a small company looking to put together the most cost effective
solution for our production database environment. Currently in
production Postgres 8.1 is running on this machine:
Dell 2850
2 x 3.0 GHz Xeon 800 MHz FSB 2 MB Cache
4 GB DDR2 400 MHz
2 x 73 GB 10K SCSI RAID 1 (for xlog and OS)
4
Good morning,
I am trying to understand when the optimizer uses HashAggregate instead of
GroupAggregate, and I want to know exactly what these two strategies are
(benefits/drawbacks).
In my case, I get this explain plan.
---
Nested Loop (cost=2451676.23..2454714.73 rows=1001 width=34)
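In short: HashAggregate makes one pass over unsorted input, building an
in-memory hash table keyed by the group columns (fast, no sort needed, but
the table must fit in memory, so a bad row estimate can exhaust it), while
GroupAggregate needs input sorted by the group columns and then streams
through it in constant extra memory, with a sort that can spill to disk. A
toy sketch of the sort-then-scan strategy (illustrative only, not
PostgreSQL executor code):

    #include <stdio.h>
    #include <stdlib.h>

    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *) a;
        int y = *(const int *) b;
        return (x > y) - (x < y);
    }

    /* GroupAggregate-style COUNT(*) per key: sort, then emit one
     * result row each time the key changes. */
    static void group_count(int *keys, size_t n)
    {
        size_t i, run;

        qsort(keys, n, sizeof(int), cmp_int);
        for (i = 0; i < n; i += run) {
            for (run = 1; i + run < n && keys[i + run] == keys[i]; run++)
                ;
            printf("key=%d count=%zu\n", keys[i], run);
        }
    }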
Good morning,
I've increased sort_mem up to 2 GB and the "out of memory" error appears
again.
Here is the query I am trying to run, with its explain plan:
Nested Loop (cost=2451676.23..2454714.73 rows=1001 width=34)
-> Subquery Scan "day" (cost=2451676.23..2451688.73 rows=1000 width=16)
Hi!
I have a question about the query optimizer and function scans. See
the following case:
CREATE TABLE a (id SERIAL PRIMARY KEY, userid INT4, col TEXT);
CREATE TABLE b (id SERIAL PRIMARY KEY, userid INT4, a_id INT4 REFERENCES
a (id), col TEXT);
CREATE INDEX idx_a_uid ON a(userid);
CREATE INDEX
"FERREIRA, William (VALTECH)" <[EMAIL PROTECTED]> writes:
> with PostgreSQL 8.1.3, here are my parameters (it's the default configuration)
> wal_sync_method = fsync
> wal_buffers = 8
> checkpoint_segments = 3
> bgwriter_lru_percent = 1.0
> bgwriter_lru_maxpages = 5
> bgwriter_all_percent = 0.333
>
Actually fsync is not the default on Solaris (verify using "show all;").
(If you look closely in postgresql.conf it is commented out and
mentioned as default but show all tells a different story)
In all my cases I saw the default as
wal_sync_method | open_datasync
Also I had se
with PostgreSQL 8.1.3, here are my parameters (it's the default configuration)
wal_sync_method = fsync
wal_buffers = 8
checkpoint_segments = 3
bgwriter_lru_percent = 1.0
bgwriter_lru_maxpages = 5
bgwriter_all_percent = 0.333
bgwriter_all_maxpages = 5
And you think the times can be improved further?
On Feb 14, 2006, at 3:56 PM, Jay Greenfield wrote:
How do you get 4,000+ lines of explain analyze for one update query in a
database with only one table? Something a bit fishy there. Perhaps you
mean explain verbose, though I don't really see how that'd be so long
either, but it'd be closer
What are your postgresql.conf settings for the parameters I suggested?
I believe your wal_buffers and checkpoint_segments could be bigger. If
that's the case then yep, you are fine.
As for the background writer, I am seeing mixed results so far, so I'm
not sure about that.
But thanks for the
I tested the latest version of PostgreSQL,
and for the same test:
before: 40 min
and now: 12 min :)
faster than Oracle (exactly what I wanted :p )
thanks to everybody
Will
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of Albert
Cervera Areny
E
Hi, Ferreira,
FERREIRA, William (VALTECH) wrote:
> I load data from files using the COPY method.
> Files contain between 2 and 7 million rows, spread across 5 tables.
>
> Loading all the data takes 40 min, and the same processing takes 17 min
> with Oracle.
I think that this time can be improved
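For reference, COPY can also be driven programmatically. A hypothetical
minimal libpq loader (the table name "items" and the connection string are
made up, and error handling is trimmed; compile with -lpq):

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn   *conn = PQconnectdb("dbname=test");
        PGresult *res;

        if (PQstatus(conn) != CONNECTION_OK)
            return 1;

        res = PQexec(conn, "COPY items FROM STDIN");
        if (PQresultStatus(res) != PGRES_COPY_IN)
            return 1;
        PQclear(res);

        /* Rows are tab-separated and newline-terminated, as in COPY
         * text format. */
        PQputCopyData(conn, "1\tfirst row\n", 12);
        PQputCopyData(conn, "2\tsecond row\n", 13);
        PQputCopyEnd(conn, NULL);

        res = PQgetResult(conn);
        fprintf(stderr, "COPY status: %s\n",
                PQresStatus(PQresultStatus(res)));
        PQclear(res);
        PQfinish(conn);
        return 0;
    }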
Pradeep Parmar wrote:
Hi,
I'm using Postgres 7.4. I have a web application built with PHP4 using
Postgres 7.4.
I was going through /var/log/messages of my Linux box (SLES 9). I
encountered the following messages quite a few times.
postgres[20199]: [4-1] ERROR: could not send data to client: Broken pipe
I'm using Solaris 8.
I tried changing only the PostgreSQL parameters,
and the time increased by 10 min.
I'll keep your idea in mind; we will soon upgrade to Solaris 10.
regards
Will
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 14, 2006 22:47
To: