Re: [PERFORM] Large Database Performance suggestions

2004-10-21 Thread Scott Marlowe
On Thu, 2004-10-21 at 21:14, Joshua Marsh wrote: > Hello everyone, > > I am currently working on a data project that uses PostgreSQL > extensively to store, manage and maintain the data. We haven't had > any problems regarding database size until recently. The three major > tables we use never g

Re: [PERFORM] mmap (was First set of OSDL Shared Mem scalability results, some wierdness ...

2004-10-21 Thread Tom Lane
Sean Chittenden <[EMAIL PROTECTED]> writes: > When a backend wishes to write a page, the following steps are taken: > ... > 2) Backend mmap(2)'s a second copy of the page(s) being written to, > this time with the MAP_PRIVATE flag set. > ... > 5) Once the WAL logging is complete and it has hit the

Re: [PERFORM] Large Database Performance suggestions

2004-10-21 Thread Tom Lane
Joshua Marsh <[EMAIL PROTECTED]> writes: > ... We did some original testing and with a server with 8GB or RAM and > found we can do operations on data file up to 50 million fairly well, > but performance drop dramatically after that. What you have to ask is *why* does it drop dramatically? There

Re: [PERFORM] Large Database Performance suggestions

2004-10-21 Thread Gavin Sherry
On Thu, 21 Oct 2004, Joshua Marsh wrote: > Recently, we have found customers who are wanting to use our service > with data files between 100 million and 300 million records. At that > size, each of the three major tables will hold between 150 million and > 700 million records. At this size, I c

[PERFORM] Large Database Performance suggestions

2004-10-21 Thread Joshua Marsh
Hello everyone, I am currently working on a data project that uses PostgreSQL extensively to store, manage and maintain the data. We haven't had any problems regarding database size until recently. The three major tables we use never get bigger than 10 million records. With this size, we can do

Re: [PERFORM] Performance Anomalies in 7.4.5

2004-10-21 Thread Gary Doades
On 21 Oct 2004 at 15:50, Thomas F.O'Connell wrote: > If not, should I be REINDEXing manually, as well as VACUUMing manually > after large data imports (whether via COPY or INSERT)? Or will a VACUUM > FULL ANALYZE be enough? > It's not the vacuuming that's important here, just the analyze. If

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Jim C. Nasby
Note that most people are now moving away from raw devices for databases in most applicaitons. The relatively small performance gain isn't worth the hassles. On Thu, Oct 21, 2004 at 12:27:27PM +0200, Steinar H. Gunderson wrote: > On Thu, Oct 21, 2004 at 08:58:01AM +0100, Matt Clark wrote: > > I su

Re: [PERFORM] OS desicion

2004-10-21 Thread Jim C. Nasby
On Wed, Oct 20, 2004 at 09:38:51AM -0700, Josh Berkus wrote: > Tom, > > > You are asking the wrong question. The best OS is the OS you (and/or > > the customer) knows and can administer competently. > > I'll have to 2nd this. I'll 3rd but add one tidbit: FreeBSD will schedule disk I/O based

Re: [PERFORM] Does PostgreSQL run with Oracle?

2004-10-21 Thread Jim C. Nasby
On Fri, Oct 15, 2004 at 10:19:48AM -0700, Steve Atkins wrote: > On Fri, Oct 15, 2004 at 11:54:44AM -0500, [EMAIL PROTECTED] wrote: > > My basic question to the community is "is PostgreSQL approximately as fast > > as Oracle?" > > > I'm currently running single processor UltraSPARC workstations, an

Re: [PERFORM] Performance Anomalies in 7.4.5

2004-10-21 Thread Tom Lane
"Thomas F.O'Connell" <[EMAIL PROTECTED]> writes: > -> Nested Loop (cost=0.00..0.01 rows=1 width=8) (actual > time=1.771..298305.531 rows=2452 loops=1) > Join Filter: ("inner".id = "outer".id) > -> Seq Scan on userdata u (cost=0.00..0.00 rows=1 width=8) > (actual time

Re: [PERFORM] Performance Anomalies in 7.4.5

2004-10-21 Thread Thomas F.O'Connell
The irony is that I had just disabled pg_autovacuum the previous day during analysis of a wider issue affecting imports of data into the system. -tfo -- Thomas F. O'Connell Co-Founder, Information Architect Sitening, LLC http://www.sitening.com/ 110 30th Avenue North, Suite 6 Nashville, TN 37203

Re: [PERFORM] Performance Anomalies in 7.4.5

2004-10-21 Thread Dennis Bjorklund
On Thu, 21 Oct 2004, Thomas F.O'Connell wrote: > Aggregate (cost=0.02..0.02 rows=1 width=8) (actual > time=298321.421..298321.422 rows=1 loops=1) > -> Nested Loop (cost=0.00..0.01 rows=1 width=8) (actual > time=1.771..298305.531 rows=2452 loops=1) > Join Filter: ("inner".id

Re: [PERFORM] Performance Anomalies in 7.4.5

2004-10-21 Thread Thomas F . O'Connell
I know, I know: I should've done this before I posted. REINDEXing and VACUUMing mostly fixed this problem. Which gets me back to where I was yesterday, reviewing an import process (that existed previously) that populates tables in this system that seems to allow small data sets to cause sim

[PERFORM] Performance Anomalies in 7.4.5

2004-10-21 Thread Thomas F.O'Connell
I'm seeing some weird behavior on a repurposed server that was wiped clean and set up to run as a database and application server with postgres and Apache, as well as some command-line PHP scripts. The box itself is a quad processor (2.4 GHz Intel Xeons) Debian woody GNU/Linux (2.6.2) system

Re: [PERFORM] mmap (was First set of OSDL Shared Mem scalability results, some wierdness ...

2004-10-21 Thread Sean Chittenden
However the really major difficulty with using mmap is that it breaks the scheme we are currently using for WAL, because you don't have any way to restrict how soon a change in an mmap'd page will go to disk. (No, I don't believe that mlock guarantees this. It says that the page will not be remove

[PERFORM] Links to OSDL test results up

2004-10-21 Thread Josh Berkus
Simon, Folks, I've put links to all of my OSDL-STP test results up on the TestPerf project: http://pgfoundry.org/forum/forum.php?thread_id=164&forum_id=160 SHare&Enjoy! -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Sean Chittenden
As someone else noted, this doesn't belong in the filesystem (rather the kernel's block I/O layer/buffer cache). But I agree, an API by which we can tell the kernel what kind of I/O behavior to expect would be good. [snip] The closest API to what you're describing that I'm aware of is posix_fad

Re: [PERFORM] Simple machine-killing query!

2004-10-21 Thread Josh Berkus
Victor, > [explain] select * from BIGMA where string not in (select * from DIRTY); >                                QUERY PLAN > >  Seq Scan on bigma  (cost=0.00..24582291.25 rows=500 width=145) >    Filter: (NOT (subplan)) >

Re: [PERFORM] Simple machine-killing query!

2004-10-21 Thread Victor Ciurus
Well guys, Your replies have been more than helpful to me, showing me both the learning stuff I still have to get in my mind about real SQL and the wonder called PostgreSQL and a very good solution from Tom Lane (thanks a lot sir!)! Indeed, changing mem_sort and other server parmeters along with

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Steinar H. Gunderson
On Thu, Oct 21, 2004 at 10:20:55AM -0400, Tom Lane wrote: >> ... I have no idea how much you can improve over the "best" >> filesystems out there, but having two layers of journalling (both WAL _and_ >> FS journalling) on top of each other don't make all that much sense to me. > Which is why settin

Re: [PERFORM] Simple machine-killing query!

2004-10-21 Thread Tom Lane
Victor Ciurus <[EMAIL PROTECTED]> writes: > What I am requested to do is to keep all records from 'BIGMA' that do > not apear in 'DIRTY' > So far I have tried solving this by going for: > [explain] select * from BIGMA where string not in (select * from DIRTY); >QUER

Re: [PERFORM] Simple machine-killing query!

2004-10-21 Thread Aaron Werman
Sounds like you need some way to match a subset of the data first, rather than try indices that are bigger than the data. Can you add operation indices, perhaps on the first 10 bytes of the keys in both tables or on a integer hash of all of the strings? If so you could join on the exact set differe

Re: [PERFORM] Simple machine-killing query!

2004-10-21 Thread Stephan Szabo
On Thu, 21 Oct 2004, Victor Ciurus wrote: > Hi all, > > I'm writing this because I've reached the limit of my imagination and > patience! So here is it... > > 2 tables: > 1 containing 27 million variable lenght, alpha-numeric records > (strings) in 1 (one) field. (10 - 145 char lenght per record)

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Jan Dittmer
Neil Conway wrote: > Also, I would imagine Win32 provides some means to inform the kernel > about your expected I/O pattern, but I haven't checked. Does anyone know > of any other relevant APIs? See CreateFile, Parameter dwFlagsAndAttributes http://msdn.microsoft.com/library/default.asp?url=/li

[PERFORM] Simple machine-killing query!

2004-10-21 Thread Victor Ciurus
Hi all, I'm writing this because I've reached the limit of my imagination and patience! So here is it... 2 tables: 1 containing 27 million variable lenght, alpha-numeric records (strings) in 1 (one) field. (10 - 145 char lenght per record) 1 containing 2.5 million variable lenght, alpha-numeric r

Re: [PERFORM] create index with substr function

2004-10-21 Thread george young
As previously suggested by Stephan Szabo, you need to create a helper function, e.g.: create or replace function after9(text)returns text language plpgsql immutable as ' begin return substr($1, 10); end; '; You may need the "immutable" specification is to allow the function's use in an in

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Tom Lane
"Steinar H. Gunderson" <[EMAIL PROTECTED]> writes: > ... I have no idea how much you can improve over the "best" > filesystems out there, but having two layers of journalling (both WAL _and_ > FS journalling) on top of each other don't make all that much sense to me. Which is why setting the FS to

Re: [PERFORM] futex results with dbt-3

2004-10-21 Thread Mark Wong
On Thu, Oct 21, 2004 at 07:45:53AM +0200, Manfred Spraul wrote: > Mark Wong wrote: > > >Here are some other details, per Manfred's request: > > > >Linux 2.6.8.1 (on a gentoo distro) > > > > > How complicated are Tom's test scripts? His immediate reply was that I > should retest with Fedora, to

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Steinar H. Gunderson
On Thu, Oct 21, 2004 at 12:44:10PM +0200, Leeuw van der, Tim wrote: > Hacking PG internally to handle raw devices will meet with strong > resistance from large portions of the development team. I don't expect > (m)any core devs of PG will be excited about rewriting the entire I/O > architecture of

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Neil Conway
Matt Clark wrote: I'm thinking along the lines of an FS that's aware of PG's strategies and requirements and therefore optimised to make those activities as efiicient as possible - possibly even being aware of PG's disk layout and treating files differently on that basis. As someone else noted, thi

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Aaron Werman
The intuitive thing would be to put pg into a file system. /Aaron On Thu, 21 Oct 2004 12:44:10 +0200, Leeuw van der, Tim <[EMAIL PROTECTED]> wrote: > Hi, > > I guess the difference is in 'severe hacking inside PG' vs. 'some unknown amount of > hacking that doesn't touch PG code'. > > Hacking

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Leeuw van der, Tim
Hi, I guess the difference is in 'severe hacking inside PG' vs. 'some unknown amount of hacking that doesn't touch PG code'. Hacking PG internally to handle raw devices will meet with strong resistance from large portions of the development team. I don't expect (m)any core devs of PG will be e

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Steinar H. Gunderson
On Thu, Oct 21, 2004 at 08:58:01AM +0100, Matt Clark wrote: > I suppose I'm just idly wondering really. Clearly it's against PG > philosophy to build an FS or direct IO management into PG, but now it's so > relatively easy to plug filesystems into the main open-source Oses, It > struck me that the

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Matt Clark
> Looking at that list, I got the feeling that you'd want to > push that PG-awareness down into the block-io layer as well, > then, so as to be able to optimise for (perhaps) conflicting > goals depending on what the app does; for the IO system to be > able to read the apps mind it needs to hav

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Pierre-Frédéric Caillaud
Reiser4 ? On Thu, 21 Oct 2004 08:58:01 +0100, Matt Clark <[EMAIL PROTECTED]> wrote: I suppose I'm just idly wondering really. Clearly it's against PG philosophy to build an FS or direct IO management into PG, but now it's so relatively easy to plug filesystems into the main open-source Oses,

Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Leeuw van der, Tim
Hiya, Looking at that list, I got the feeling that you'd want to push that PG-awareness down into the block-io layer as well, then, so as to be able to optimise for (perhaps) conflicting goals depending on what the app does; for the IO system to be able to read the apps mind it needs to have so

Re: [PERFORM] Free PostgreSQL Training, Philadelphia, Oct 30

2004-10-21 Thread Alban Medici (NetCentrex)
Nobody got a plane to came from europe :-) ??? As a poor frenchie I will not come ... Have a good time Alban -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Aaron Mulder Sent: mercredi 20 octobre 2004 15:11 To: [EMAIL PROTECTED] Subject: Re: [PERFORM]

[PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Matt Clark
I suppose I'm just idly wondering really. Clearly it's against PG philosophy to build an FS or direct IO management into PG, but now it's so relatively easy to plug filesystems into the main open-source Oses, It struck me that there might be some useful changes to, say, XFS or ext3, that could be