Re: [PERFORM] Projecting currentdb to more users
>From AMD's suit against Intel. Perhaps relevant to some PG/AMD issues. "...125. Intel has designed its compiler purposely to degrade performance when a program is run on an AMD platform. To achieve this, Intel designed the compiler to compile code along several alternate code paths. Some paths are executed when the program runs on an Intel platform and others are executed when the program is operated on a computer with an AMD microprocessor. (The choice of code path is determined when the program is started, using a feature known as "CPUID" which identifies the computer's microprocessor.) By design, the code paths were not created equally. If the program detects a "Genuine Intel" microprocessor, it executes a fully optimized code path and operates with the maximum efficiency. However, if the program detects an "Authentic AMD" microprocessor, it executes a different code path that will degrade the program's performance or cause it to crash..." ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Trying to figure out pgbench
I had a similar experience. regardless of scaling, etc, I got same results. almost like flags are not active. did pgbench -I template1 and pgbench -c 10 -t 50 -v -d 1 and played around from there This is on IBM pSeries, AIX5.3, PG8.0.2 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of William Yu Sent: Tuesday, June 21, 2005 12:05 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] Trying to figure out pgbench My Dual Core Opteron server came in last week. I tried to do some benchmarks with pgbench to get some numbers on the difference between 1x1 -> 2x1 -> 2x2 but no matter what I did, I kept getting the same TPS on all systems. Any hints on what the pgbench parameters I should be using? In terms of production use, it definitely can handle more load. Previously, Apache/Perl had to run on a separate server to avoid a ~50% penalty. Now, the numbers are +15% performance even with Apache/Perl running on the same box as PostgreSQL. How much more load of course is what I'd like to quantify. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Whence the Opterons?
Richard, thanks for info. "...the RH supplied Postgres binary has issues..." Would you have the time to provide a bit more info? Version of PG? Nature of issues? Methods that resolved? Thanks again, -- Ross -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard Rowell Sent: Friday, June 10, 2005 8:34 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Whence the Opterons? I will second the nod to Penguin computing. We have a bit of Penguin hardware here (though the majority is Dell). We did have issues with one machine a couple of years ago, but Penguin was very pro-active in addressing that. We recently picked up a Dual Opteron system from them and have been very pleased with it so far. I would be careful of the RHES that it ships with though. We had machine lockups immediately after the suggested kernel update (had to down grade manually). Also, the RH supplied Postgres binary has issues, so you would need to compile Postgres yourself until the next RH update. On Fri, 2005-05-06 at 14:39 -0700, Mischa Sandberg wrote: > After reading the comparisons between Opteron and Xeon processors for > Linux, I'd like to add an Opteron box to our stable of Dells and > Sparcs, for comparison. > > IBM, Sun and HP have their fairly pricey Opteron systems. > The IT people are not swell about unsupported purchases off ebay. > Anyone care to suggest any other vendors/distributors? Looking for > names with national support, so that we can recommend as much to our > customers. > > Many thanks in advance. -- -- Richard Rowell [EMAIL PROTECTED] Bowman Systems (318) 213-8780 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] test - pls delete and ignore
You just couldn't help yourself, could you? :-) ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] [PORTS] Which library has these symbols? -- Eureka
Title: RE: [PORTS] Which library has these symbols? -- Eureka 64-bit PG 8.0.2. is up and running on AIX5.3/power5 YES! ! ! The major thing: setting some quirky LDFLAGS. Anyone interested in details, please ping. Thanks to Nick Addington, Vincent Vanwynsberghe, my SA, Sergey, and Tom Lane (for good-natured nudging) My Next Task: Finding a Stress Test Harness to Load, and Query Data. Anyone have ideas? I am eagerly awaiting the DESTRUCTION of Oracle around here, and "yes" I am an oracle DBA and think it's very good technology. Smiling, Ross Mohan -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mohan, Ross Sent: Wednesday, May 25, 2005 1:11 PM To: [EMAIL PROTECTED] Subject: Re: [PORTS] Which library has these symbols? Tom, they're all over the place, repeated in different libraries, kind of a pain. Didn't realize that. I'll just give linker a bunch of LIBPATH and LIBNAME directives and have it run around. # ar -t ./postgresql-8.0.2/src/interfaces/ecpg/ecpglib/libecpg.a | egrep 'dirmod|path|pgstr|pgsleep' path.o # ar -t ./postgresql-8.0.2/src/interfaces/ecpg/pgtypeslib/libpgtypes.a | egrep 'dirmod|path|pgstr|pgsleep' pgstrcasecmp.o # ar -t ./postgresql-8.0.2/src/interfaces/libpq/libpq.a | egrep 'dirmod|path|pgstr|pgsleep' pgstrcasecmp.o # ar -t ./postgresql-8.0.2/src/port/libpgport.a | egrep 'dirmod|path|pgstr|pgsleep' dirmod.o path.o pgsleep.o pgstrcasecmp.o # ar -t ./postgresql-8.0.2/src/port/libpgport_srv.a | egrep 'dirmod|path|pgstr|pgsleep' dirmod_srv.o path.o pgsleep.o pgstrcasecmp.o I **really** want this in 64bit..funny this problem only shows up in 64, not 32 mode. Thanks for commenting --- That's ALWAYS welcome! -- Ross -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED]] Sent: Tuesday, May 24, 2005 10:53 PM To: Mohan, Ross Cc: [EMAIL PROTECTED] Subject: Re: [PORTS] Which library has these symbols? "Mohan, Ross" <[EMAIL PROTECTED]> writes: > So Close, Yet So Far! The specific symbols being complained of should be in libpgport_srv (see src/port). Dunno why your platform is ignoring that library. When you find out, let us know ;-) regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Prefetch - OffTopic
Yes, that would be a sufficient (although not necessary) condition for being well and fine with kdB. Last time I used APL was.pre-Gregorian, so yea, that's scary to me, too. ( Of course, one can use C/ODBC or Java/JDBC to reach kdB; once there, one uses SQL92, or proprietary kSQL. ) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Chris Browne Sent: Tuesday, May 10, 2005 4:14 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Prefetch - OffTopic [EMAIL PROTECTED] ("Mohan, Ross") writes: > for time-series and "insane fast", nothing beats kdB, I believe > > www.kx.com ... Which is well and fine if you're prepared to require that all of the staff that interact with data are skilled APL hackers. Skilled enough that they're all ready to leap into Whitney's ASCII-based variant, K. -- (format nil "[EMAIL PROTECTED]" "cbbrowne" "acm.org") http://www.ntlug.org/~cbbrowne/functional.html Rules of the Evil Overlord #78. "I will not tell my Legions of Terror "And he must be taken alive!" The command will be: ``And try to take him alive if it is reasonably practical.''" <http://www.eviloverlord.com/> ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Prefetch - OffTopic
for time-series and "insane fast", nothing beats kdB, I believe www.kx.com Not trying to Quisling-out PG here, just hoping to respond to Mr. Olson -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Tuesday, May 10, 2005 2:54 PM To: Greg Stark Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Prefetch Greg Stark <[EMAIL PROTECTED]> writes: > Actually forcing things to use indexes is the wrong direction to go if > you're trying to process lots of data and want to stream it off disk > as rapidly as possible. I would think about whether you can structure > your data such that you can use sequential scans. Agreed. > In your application that might be hard. It sounds like you would need > more or less one table per stock ticker which would really be hard to > manage. Actually, in a previous lifetime I used to do pretty much the same stuff Matt is working on. The reason I suggested parallelizing is that what you want is usually not so much the 200day moving average of FOO, as the 200day moving averages of a whole bunch of things. If your input table contains time-ordered data for all those things, then a seqscan works out pretty well. > One thing you might look into is using the CLUSTER command. But > postgres doesn't maintain the cluster ordering so it would require > periodically rerunning it. If the desired sort order is time-based, it falls out pretty much for free in this application, because historical data doesn't change -- you are only interested in appending at the right. In said previous lifetime, we used Postgres for tracking our actual transactions, but we built a custom file format for storing the individual tick data. That's not stuff you need transactional semantics for; the historical data is what it is. Besides, you need to compress it as much as you can because there's always too much of it. Machines are faster and disk space cheaper than they were at the time, but I'd still question the wisdom of using a Postgres row for each daily bar, let alone finer-grain data. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Table Partitioning: Will it be supported in Future?
Maybe he needs to spend $7K on performance improvements? ;-) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Josh Berkus Sent: Tuesday, April 26, 2005 8:00 PM To: Richard Huxton Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Table Partitioning: Will it be supported in Future? Richard, > I believe these are being worked on at the moment. You might want to > search the archives of the hackers mailing list to see if the plans > will suit your needs. Actually, this is being discussed through the Bizgres project: www.bizgres.org. However, I agree that a 1GB table is not in need of partitioning. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
Joel, thanks. A couple of things jump out there for me, not a problem for a routine ODBC connection, but perhaps in the "lotsa stuff" context of your current explorations, it might be relevant? I am completely shooting from the hip, here, but...if it were my goose to cook, I'd be investigating Session("StringConn") = "DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=; PWD=;ReadOnly=0;Protocol=6.4; || Protocol? Is this related to version? is the driver wy old? FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0; ShowSystemTables=0;ConnSettings=;Fetch=100; || Fetch great for OLTP, lousy for batch? Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=8190; || what ARE the datatypes and sizes in your particular case? Debug=0; || a run with debug=1 probably would spit up something interesting CommLog=0;Optimizer=1; || Optimizer? that's a new one on me Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0" || that's about all I can see, prima facie. I'll be very curious to know if ODBC is any part of your performance equation. HTH, Ross -Original Message- From: Joel Fradkin [mailto:[EMAIL PROTECTED] Sent: Thursday, April 21, 2005 10:54 AM To: Mohan, Ross Cc: [EMAIL PROTECTED]; PostgreSQL Perform Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon Here is the connect string I am using. It could be horrid as I cut it from ODBC program. Session("StringConn") = "DRIVER={PostgreSQL};DATABASE=wazagua;SERVER=192.168.123.252;PORT=5432;UID=; PWD=;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0; ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVar charSize=254;MaxLongVarcharSize=8190;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;Us eDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Pa rse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableC ursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseS erverSidePrepare=0" Joel Fradkin -Original Message- From: Mohan, Ross [mailto:[EMAIL PROTECTED] Sent: Thursday, April 21, 2005 9:42 AM To: [EMAIL PROTECTED] Subject: RE: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. Maybe one of the ODBC cognoscenti here can chime in more concretely ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon
FWIW, ODBC has variables to tweak, as well. fetch/buffer sizes, and the like. Maybe one of the ODBC cognoscenti here can chime in more concretely -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joel Fradkin Sent: Thursday, April 21, 2005 10:36 AM To: 'Tom Lane'; 'John A Meinel' Cc: 'Postgresql Performance' Subject: Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon I suspect he's using pgadmin. Yup I was, but I did try running on the linux box in psql, but it was running to the screen and took forever because of that. The real issue is returning to my app using ODBC is very slow (Have not tested the ODBC for MYSQL, MSSQL is ok (the two proc dell is running out of steam but been good until this year when we about doubled our demand by adding sears as a client). Using odbc to postgres on some of the views (Josh from Command is having me do some very specific testing) is timing out with a 10 minute time limit. These are pages that still respond using MSSQL (this is wehere production is using the duel proc and the test is using the 4 proc). I have a tool that hooks to all three databases so I can try it with that and see if I get different responses. Joel ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] How to improve db performance with $7K?
Alex et al., I wonder if thats something to think about adding to Postgresql? A setting for multiblock read count like Oracle (Although || I would think so, yea. GMTA: I was just having this micro-chat with Mr. Jim Nasby. having said that I believe that Oracle natively caches pages much more aggressively that postgresql, which allows the OS to do the file caching). || Yea...and it can rely on what is likely a lot more robust and nuanced caching algorithm, but...i don't know enough (read: anything) about PG's to back that comment up. Alex Turner netEconomist P.S. Oracle changed this with 9i, you can change the Database block size on a tablespace by tablespace bassis making it smaller for OLTP tablespaces and larger for Warehousing tablespaces (at least I think it's on a tablespace, might be on a whole DB). ||Yes, it's tspace level. On 4/19/05, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > On Mon, Apr 18, 2005 at 06:41:37PM -, Mohan, Ross wrote: > > Don't you think "optimal stripe width" would be > > a good question to research the binaries for? I'd > > think that drives the answer, largely. (uh oh, pun alert) > > > > EG, oracle issues IO requests (this may have changed _just_ > > recently) in 64KB chunks, regardless of what you ask for. So when I > > did my striping (many moons ago, when the Earth was young...) I did > > it in 128KB widths, and set the oracle "multiblock read count" > > according. For oracle, any stripe size under 64KB=stupid, anything > > much over 128K/258K=wasteful. > > > > I am eager to find out how PG handles all this. > > AFAIK PostgreSQL requests data one database page at a time (normally > 8k). Of course the OS might do something different. > -- > Jim C. Nasby, Database Consultant [EMAIL PROTECTED] > Give your computer some brain candy! www.distributed.net Team #1828 > > Windows: "Where do you want to go today?" > Linux: "Where do you want to go tomorrow?" > FreeBSD: "Are you guys coming, or what?" > > ---(end of > broadcast)--- > TIP 4: Don't 'kill -9' the postmaster > ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] How to improve db performance with $7K?
right, the oracle system uses a second "low latency" bus to manage locking information (at the block level) via a distributed lock manager. (but this is slightly different albeit related to a clustered file system and OS-managed locking, eg) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dawid Kuroczko Sent: Wednesday, April 20, 2005 4:56 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? On 4/19/05, Mohan, Ross <[EMAIL PROTECTED]> wrote: > Clustered file systems is the first/best example that > comes to mind. Host A and Host B can both request from diskfarm, eg. Something like a Global File System? http://www.redhat.com/software/rha/gfs/ (I believe some other company did develop it some time in the past; hmm, probably the guys doing LVM stuff?). Anyway the idea is that two machines have same filesystem mounted and they share it. The locking I believe is handled by communication between computers using "host to host" SCSI commands. I never used it, I've only heard about it from a friend who used to work with it in CERN. Regards, Dawid ---(end of broadcast)--- TIP 8: explain analyze is your friend ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] How to improve db performance with $7K?
kewl. Well, 8k request out of PG kernel might turn into an "X"Kb request at disk/OS level, but duly noted. Did you scan the code for this, or are you pulling this recollection from the cognitive archives? :-) -Original Message- From: Jim C. Nasby [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 19, 2005 8:12 PM To: Mohan, Ross Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? On Mon, Apr 18, 2005 at 06:41:37PM -0000, Mohan, Ross wrote: > Don't you think "optimal stripe width" would be > a good question to research the binaries for? I'd > think that drives the answer, largely. (uh oh, pun alert) > > EG, oracle issues IO requests (this may have changed _just_ > recently) in 64KB chunks, regardless of what you ask for. > So when I did my striping (many moons ago, when the Earth > was young...) I did it in 128KB widths, and set the oracle > "multiblock read count" according. For oracle, any stripe size > under 64KB=stupid, anything much over 128K/258K=wasteful. > > I am eager to find out how PG handles all this. AFAIK PostgreSQL requests data one database page at a time (normally 8k). Of course the OS might do something different. -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?" ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] How to improve db performance with $7K?
Well, more like they both are allowed to issue disk requests and the magical "clustered file system" manages locking, etc. In reality, any disk is only reading/writing to one part of the disk at any given time, of course, but that in the multiple initiator deal, multiple streams of requests from multiple hosts can be queued. -Original Message- From: Bruce Momjian [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 19, 2005 12:16 PM To: Mohan, Ross Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Mohan, Ross wrote: > Clustered file systems is the first/best example that > comes to mind. Host A and Host B can both request from diskfarm, eg. So one host writes to part of the disk and another host writes to a different part? --- > -Original Message- > From: Bruce Momjian [mailto:[EMAIL PROTECTED] > Sent: Tuesday, April 19, 2005 12:10 PM > To: Mohan, Ross > Cc: pgsql-performance@postgresql.org > Subject: Re: [PERFORM] How to improve db performance with $7K? > > > Mohan, Ross wrote: > > The only part I am pretty sure about is that real-world experience > > shows SCSI is better for a mixed I/O environment. Not sure why, > > exactly, but the command queueing obviously helps, and I am not sure > > what else does. > > > > || TCQ is the secret sauce, no doubt. I think NCQ (the SATA version > > || of per se drive request reordering) > >should go a looong way (but not all the way) toward making SATA > > 'enterprise acceptable'. Multiple > >initiators (e.g. more than one host being able to talk to a drive) is a > > biggie, too. AFAIK only SCSI > >drives/controllers do that for now. > > What is 'multiple initiators' used for in the real world? > > -- > Bruce Momjian| http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup.| Newtown Square, Pennsylvania 19073 > > ---(end of > broadcast)--- > TIP 6: Have you searched our list archives? > >http://archives.postgresql.org > -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] How to improve db performance with $7K?
Clustered file systems is the first/best example that comes to mind. Host A and Host B can both request from diskfarm, eg. -Original Message- From: Bruce Momjian [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 19, 2005 12:10 PM To: Mohan, Ross Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Mohan, Ross wrote: > The only part I am pretty sure about is that real-world experience > shows SCSI is better for a mixed I/O environment. Not sure why, > exactly, but the command queueing obviously helps, and I am not sure > what else does. > > || TCQ is the secret sauce, no doubt. I think NCQ (the SATA version > || of per se drive request reordering) >should go a looong way (but not all the way) toward making SATA > 'enterprise acceptable'. Multiple >initiators (e.g. more than one host being able to talk to a drive) is a > biggie, too. AFAIK only SCSI >drives/controllers do that for now. What is 'multiple initiators' used for in the real world? -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] How to improve db performance with $7K?
Good question. If the SCSI system was moving the head from track 1 to 10, and a request then came in for track 5, could the system make the head stop at track 5 on its way to track 10? That is something that only the controller could do. However, I have no idea if SCSI does that. || SCSI, AFAIK, does NOT do this. What SCSI can do is allow "next" request insertion into head of request queue (queue-jumping), and/or defer request ordering to done by drive per se (queue re-ordering). I have looked, in vain, for evidence that SCSI somehow magically "stops in the middle of request to pick up data" (my words, not yours) The only part I am pretty sure about is that real-world experience shows SCSI is better for a mixed I/O environment. Not sure why, exactly, but the command queueing obviously helps, and I am not sure what else does. || TCQ is the secret sauce, no doubt. I think NCQ (the SATA version of per se drive request reordering) should go a looong way (but not all the way) toward making SATA 'enterprise acceptable'. Multiple initiators (e.g. more than one host being able to talk to a drive) is a biggie, too. AFAIK only SCSI drives/controllers do that for now. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] How to improve db performance with $7K?
Don't you think "optimal stripe width" would be a good question to research the binaries for? I'd think that drives the answer, largely. (uh oh, pun alert) EG, oracle issues IO requests (this may have changed _just_ recently) in 64KB chunks, regardless of what you ask for. So when I did my striping (many moons ago, when the Earth was young...) I did it in 128KB widths, and set the oracle "multiblock read count" according. For oracle, any stripe size under 64KB=stupid, anything much over 128K/258K=wasteful. I am eager to find out how PG handles all this. - Ross p.s. 'You want a database record? I gotcher record right here' http://en.wikipedia.org/wiki/Akashic_Records -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex Turner Sent: Monday, April 18, 2005 2:21 PM To: Jacques Caron Cc: Greg Stark; William Yu; pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? So I wonder if one could take this stripe size thing further and say that a larger stripe size is more likely to result in requests getting served parallized across disks which would lead to increased performance? Again, thanks to all people on this list, I know that I have learnt a _hell_ of alot since subscribing. Alex Turner netEconomist On 4/18/05, Alex Turner <[EMAIL PROTECTED]> wrote: > Ok - well - I am partially wrong... > > If you're stripe size is 64Kb, and you are reading 256k worth of data, > it will be spread across four drives, so you will need to read from > four devices to get your 256k of data (RAID 0 or 5 or 10), but if you > are only reading 64kb of data, I guess you would only need to read > from one disk. > > So my assertion that adding more drives doesn't help is pretty > wrong... particularly with OLTP because it's always dealing with > blocks that are smaller that the stripe size. > > Alex Turner > netEconomist > > On 4/18/05, Jacques Caron <[EMAIL PROTECTED]> wrote: > > Hi, > > > > At 18:56 18/04/2005, Alex Turner wrote: > > >All drives are required to fill every request in all RAID levels > > > > No, this is definitely wrong. In many cases, most drives don't > > actually have the data requested, how could they handle the request? > > > > When reading one random sector, only *one* drive out of N is ever > > used to service any given request, be it RAID 0, 1, 0+1, 1+0 or 5. > > > > When writing: > > - in RAID 0, 1 drive > > - in RAID 1, RAID 0+1 or 1+0, 2 drives > > - in RAID 5, you need to read on all drives and write on 2. > > > > Otherwise, what would be the point of RAID 0, 0+1 or 1+0? > > > > Jacques. > > > > > ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] How to improve db performance with $7K?
Greg, et al. I never found any evidence of a "stop and get an intermediate request" functionality in the TCQ protocol. IIRC, what is there is 1) Ordered 2) Head First 3) Simple implemented as choices. *VERY* roughly, that'd be like (1) disk subsystem satisfies requests as submitted, (2) let's the "this" request be put at the very head of the per se disk queue after the currently-running disk request is complete, and (3) is "let the per se disk and it's software reorder the requests on-hand as per it's onboard software". (N.B. in the last, it's the DISK not the controller making those decisions). (N.B. too, that this last is essentially what NCQ (cf. TCQ) is doing ) I know we've been batting around a hypothetical case of SCSI where it "stops and gets smth. on the way", but I can find no proof (yet) that this is done, pro forma, by SCSI drives. In other words, SCSI is a necessary, but not sufficient cause for intermediate reading. FWIW - Ross -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark Sent: Friday, April 15, 2005 2:02 PM To: Tom Lane Cc: Kevin Brown; pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Tom Lane <[EMAIL PROTECTED]> writes: > Yes, you can probably assume that blocks with far-apart numbers are > going to require a big seek, and you might even be right in supposing > that a block with an intermediate number should be read on the way. > But you have no hope at all of making the right decisions at a more > local level --- say, reading various sectors within the same cylinder > in an optimal fashion. You don't know where the track boundaries are, > so you can't schedule in a way that minimizes rotational latency. > You're best off to throw all the requests at the drive together and > let the drive sort it out. Consider for example three reads, one at the beginning of the disk, one at the very end, and one in the middle. If the three are performed in the logical order (assuming the head starts at the beginning), then the drive has to seek, say, 4ms to get to the middle and 4ms to get to the end. But if the middle block requires a full rotation to reach it from when the head arrives that adds another 8ms of rotational delay (assuming a 7200RPM drive). Whereas the drive could have seeked over to the last block, then seeked back in 8ms and gotten there just in time to perform the read for free. I'm not entirely convinced this explains all of the SCSI drives' superior performance though. The above is about a worst-case scenario. should really only have a small effect, and it's not like the drive firmware can really schedule things perfectly either. I think most of the difference is that the drive manufacturers just don't package their high end drives with ATA interfaces. So there are no 10k RPM ATA drives and no 15k RPM ATA drives. I think WD is making fast SATA drives but most of the manufacturers aren't even doing that. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Spend 7K *WHERE*? WAS Intel SRCS16 SATA raid? and How to Improve w/7K$?
Sorry to blend threads, but in my kinda longish, somewhat thankless, essentially anonymous, and quite average career as a dba, I have found that the 7K would be best spent on a definitive end-to-end "application critical path" test (pretty easy to instrument apps and lash on test harnesses these days). If it's "the disk subsystem", then by all means, spend the 7K there. If the "7K$" is for "hardware only", then disk is always a good choice. For a really small shop, maybe it's an upgrade to a dual CPU opteron MOBO, eg. dunno. If, however, in the far-more-likely case that the application code or system/business process is the throttle point, it'd be a great use of money to have a test report showing that to the "higher ups". That's where the best scalability bang-for-buck can be made. - Ross p.s. having said this, and as already been noted "7K" ain't going to buy that muchmaybe the ability to go RAID 10? p.p.s Why don't we start a PGSQL-7K listserv, to handle this EPIC thread? :-) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Friday, April 15, 2005 11:20 AM To: Alex Turner Cc: Dave Held; pgsql-performance@postgresql.org; [EMAIL PROTECTED] Subject: Re: [PERFORM] Intel SRCS16 SATA raid? This is a different thread that the $7k server thread. Greg Stark started it and wrote: "I'm also wondering about whether I'm better off with one of these SATA raid controllers or just going with SCSI drives." Rick [EMAIL PROTECTED] wrote on 04/15/2005 10:01:56 AM: > The original thread was how much can I get for $7k > > You can't fit a 15k RPM SCSI solution into $7K ;) Some of us are ona budget! > > 10k RPM SATA drives give acceptable performance at a good price, thats > really the point here. > > I have never really argued that SATA is going to match SCSI > performance on multidrive arrays for IO/sec. But it's all about the > benjamins baby. If I told my boss we need $25k for a database > machine, he'd tell me that was impossible, and I have $5k to do it. If > I tell him $7k - he will swallow that. We don't _need_ the amazing > performance of a 15k RPM drive config. Our biggest hit is reads, so > we can buy 3xSATA machines and load balance. It's all about the > application, and buying what is appropriate. I don't buy a Corvette > if all I need is a malibu. > > Alex Turner > netEconomist > > On 4/15/05, Dave Held <[EMAIL PROTECTED]> wrote: > > > -Original Message- > > > From: Alex Turner [mailto:[EMAIL PROTECTED] > > > Sent: Thursday, April 14, 2005 6:15 PM > > > To: Dave Held > > > Cc: pgsql-performance@postgresql.org > > > Subject: Re: [PERFORM] Intel SRCS16 SATA raid? > > > > > > Looking at the numbers, the raptor with TCQ enabled was close or > > > beat the Atlas III 10k drive on most benchmarks. > > > > And I would be willing to bet that the Atlas 10k is not using the > > same generation of technology as the Raptors. > > > > > Naturaly a 15k drive is going to be faster in many areas, but it > > > is also much more expensive. It was only 44% better on the server > > > tests than the raptor with TCQ, but it costs nearly 300% more > > > ($538 cdw.com, $180 newegg.com). > > > > State that in terms of cars. Would you be willing to pay 300% more > > for a car that is 44% faster than your competitor's? Of course you > > would, because we all recognize that the cost of speed/performance > > does not scale linearly. Naturally, you buy the best speed that you > > can afford, but when it comes to hard drives, the only major feature > > whose price tends to scale anywhere close to linearly is capacity. > > > > > Note also that the 15k drive was the only drive that kept up with > > > the raptor on raw transfer speed, which is going to matter for > > > WAL. > > > > So get a Raptor for your WAL partition. ;) > > > > > [...] > > > The Raptor drives can be had for as little as $180/ea, which is > > > quite a good price point considering they can keep up with their > > > SCSI 10k RPM counterparts on almost all tests with NCQ enabled > > > (Note that 3ware controllers _don't_ support NCQ, although they > > > claim their HBA based queueing is 95% as good as NCQ on the > > > drive). > > > > Just keep in mind the points made by the Seagate article. You're > > buying much more than just performance for that $500+. You're also > > buying vibrational tolerance, high MTBF, better internal > > environmental controls, and a pretty significant margin on seek > > time, which is probably your most important feature for disks > > storing tables. An interesting test would be to stick several drives > > in a cabi
Re: [PERFORM] How to improve db performance with $7K?
I've been doing some reading up on this, trying to keep up here, and have found out that (experts, just yawn and cover your ears) 1) some SATA drives (just type II, I think?) have a "Phase Zero" implementation of Tagged Command Queueing (the special sauce for SCSI). 2) This SATA "TCQ" is called NCQ and I believe it basically allows the disk software itself to do the reordering (this is called "simple" in TCQ terminology) It does not yet allow the TCQ "head of queue" command, allowing the current tagged request to go to head of queue, which is a simple way of manifesting a "high priority" request. 3) SATA drives are not yet multi-initiator? Largely b/c of 2 and 3, multi-initiator SCSI RAID'ed drives are likely to whomp SATA II drives for a while yet (read: a year or two) in multiuser PostGres applications. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark Sent: Thursday, April 14, 2005 2:04 PM To: Kevin Brown Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Kevin Brown <[EMAIL PROTECTED]> writes: > Greg Stark wrote: > > > > I think you're being misled by analyzing the write case. > > > > Consider the read case. When a user process requests a block and > > that read makes its way down to the driver level, the driver can't > > just put it aside and wait until it's convenient. It has to go ahead > > and issue the read right away. > > Well, strictly speaking it doesn't *have* to. It could delay for a > couple of milliseconds to see if other requests come in, and then > issue the read if none do. If there are already other requests being > fulfilled, then it'll schedule the request in question just like the > rest. But then the cure is worse than the disease. You're basically describing exactly what does happen anyways, only you're delaying more requests than necessary. That intervening time isn't really idle, it's filled with all the requests that were delayed during the previous large seek... > Once the first request has been fulfilled, the driver can now schedule > the rest of the queued-up requests in disk-layout order. > > I really don't see how this is any different between a system that has > tagged queueing to the disks and one that doesn't. The only > difference is where the queueing happens. And *when* it happens. Instead of being able to issue requests while a large seek is happening and having some of them satisfied they have to wait until that seek is finished and get acted on during the next large seek. If my theory is correct then I would expect bandwidth to be essentially equivalent but the latency on SATA drives to be increased by about 50% of the average seek time. Ie, while a busy SCSI drive can satisfy most requests in about 10ms a busy SATA drive would satisfy most requests in 15ms. (add to that that 10k RPM and 15kRPM SCSI drives have even lower seek times and no such IDE/SATA drives exist...) In reality higher latency feeds into a system feedback loop causing your application to run slower causing bandwidth demands to be lower as well. It's often hard to distinguish root causes from symptoms when optimizing complex systems. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] Intel SRCS16 SATA raid?
sorry, don't remember whether it's SCSI or SATA II, but IIRC the Areca controllers are just stellar for things. If you do get SATA for db stuff..especially multiuser...i still haven't seen anything to indicate an across-the-board primacy for SATA over SCSI. I'd go w/SCSI, or if SATA for $$$ reasons, I'd be sure to have many spindles and RAID 10. my 0.02. I'm surely not an expert of any kind. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Greg Stark Sent: Thursday, April 14, 2005 10:55 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] Intel SRCS16 SATA raid? Our vendor is trying to sell us on an Intel SRCS16 SATA raid controller instead of the 3ware one. Poking around it seems this does come with Linux drivers and there is a battery backup option. So it doesn't seem to be completely insane. Anyone have any experience with these controllers? I'm also wondering about whether I'm better off with one of these SATA raid controllers or just going with SCSI drives. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] How to improve db performance with $7K?
Imagine a system in "furious activity" with two (2) process regularly occuring Process One: Long read (or write). Takes 20ms to do seek, latency, and stream off. Runs over and over. Process Two: Single block read ( or write ). Typical database row access. Optimally, could be submillisecond. happens more or less randomly. Let's say process one starts, and then process two. Assume, for sake of this discussion, that P2's block lies w/in P1's swath. (But doesn't have to...) Now, everytime process two has to wait at LEAST 20ms to complete. In a queue-reordering system, it could be a lot faster. And me, looking for disk service times on P2, keep wondering "why does a single diskblock read keep taking >20ms?" Sit doesn't need to be "a read" or "a write". It doesn't need to be "furious activity" (two processes is not furious, even for a single user desktop.) This is not a "corner case", and while it doesn't take into account kernel/drivecache/UBC buffering issues, I think it shines a light on why command re-ordering might be useful. YMMV. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kevin Brown Sent: Thursday, April 14, 2005 4:36 AM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] How to improve db performance with $7K? Greg Stark wrote: > I think you're being misled by analyzing the write case. > > Consider the read case. When a user process requests a block and that > read makes its way down to the driver level, the driver can't just put > it aside and wait until it's convenient. It has to go ahead and issue > the read right away. Well, strictly speaking it doesn't *have* to. It could delay for a couple of milliseconds to see if other requests come in, and then issue the read if none do. If there are already other requests being fulfilled, then it'll schedule the request in question just like the rest. > In the 10ms or so that it takes to seek to perform that read > *nothing* gets done. If the driver receives more read or write > requests it just has to sit on them and wait. 10ms is a lifetime for a > computer. In that time dozens of other processes could have been > scheduled and issued reads of their own. This is true, but now you're talking about a situation where the system goes from an essentially idle state to one of furious activity. In other words, it's a corner case that I strongly suspect isn't typical in situations where SCSI has historically made a big difference. Once the first request has been fulfilled, the driver can now schedule the rest of the queued-up requests in disk-layout order. I really don't see how this is any different between a system that has tagged queueing to the disks and one that doesn't. The only difference is where the queueing happens. In the case of SCSI, the queueing happens on the disks (or at least on the controller). In the case of SATA, the queueing happens in the kernel. I suppose the tagged queueing setup could begin the head movement and, if another request comes in that requests a block on a cylinder between where the head currently is and where it's going, go ahead and read the block in question. But is that *really* what happens in a tagged queueing system? It's the only major advantage I can see it having. > The same thing would happen if you had lots of processes issuing lots > of small fsynced writes all over the place. Postgres doesn't really do > that though. It sort of does with the WAL logs, but that shouldn't > cause a lot of seeking. Perhaps it would mean that having your WAL > share a spindle with other parts of the OS would have a bigger penalty > on IDE drives than on SCSI drives though? Perhaps. But I rather doubt that has to be a huge penalty, if any. When a process issues an fsync (or even a sync), the kernel doesn't *have* to drop everything it's doing and get to work on it immediately. It could easily gather a few more requests, bundle them up, and then issue them. If there's a lot of disk activity, it's probably smart to do just that. All fsync and sync require is that the caller block until the data hits the disk (from the point of view of the kernel). The specification doesn't require that the kernel act on the calls immediately or write only the blocks referred to by the call in question. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Building postmaster with Profiling Support WAS "Tweaking a C Function I wrote"
Adam - Is compiling postmaster with profiling support just a flag in the build/make? Or is there something more involved? I'd like to be able to do this in the future and so am curious about means/methods. If this is a RTFM, just let me know that (am currently Reading The F Manual), but if you have any "special sauce" here, that'd be of great interest. Thanks -Ross -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam Palmblad Sent: Wednesday, April 06, 2005 7:23 PM To: pgsql-performance@postgresql.org Subject: [PERFORM] Tweaking a C Function I wrote I wanted to see if I could squeeze any more performance out of a C set returning function I wrote. As such, I looked to a profiler. Is it possible to get profile information on the function I wrote? I've got postmaster and my function compiled with profiling support, and can find the gmon.out files... can I actually look at the call tree that occurs when my function is being executed or will I be limited to viewing calls to functions in the postmaster binary? -Adam ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: RE : RE: [PERFORM] Postgresql vs SQLserver for thisapplication ?
31Million tuples were loaded in approx 279 seconds, or approx 112k rows per second. > I'd love to see PG get into this range..i am a big fan of PG (just a > rank newbie) but I gotta think the underlying code to do this has to > be not-too-complex. I'd say we're there. ||Yes! PG is there, assuredly! So VERY cool! I made a newbie error of conflating COPY with INSERT. I don't know if I could get oracle to do much more than about 500-1500 rows/sec...PG is quite impressive. Makes one wonder why corporations positively insist on giving oracle yearly. -Original Message- From: Rod Taylor [mailto:[EMAIL PROTECTED] Sent: Wednesday, April 06, 2005 12:41 PM To: Mohan, Ross Cc: pgsql-performance@postgresql.org Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for thisapplication ? On Wed, 2005-04-06 at 16:12 +, Mohan, Ross wrote: > I wish I had a Dell system and run case to show you Alex, but I > don't... however...using Oracle's "direct path" feature, it's pretty > straightforward. > > We've done 110,000 rows per second into index-less tables on a big > system (IBM Power5 chips, Hitachi SAN). ( Yes, I am sure: over 100K a > second. Sustained for almost 9 minutes. ) Just for kicks I did a local test on a desktop machine (single CPU, single IDE drive) using COPY from STDIN for a set of integers in via a single transaction, no indexes. 1572864 tuples were loaded in 13715.613ms, which is approx 115k rows per second. Okay, no checkpoints and I didn't cross an index boundary, but I also haven't tuned the config file beyond bumping up the buffers. Lets try again with more data this time. 31Million tuples were loaded in approx 279 seconds, or approx 112k rows per second. > I'd love to see PG get into this range..i am a big fan of PG (just a > rank newbie) but I gotta think the underlying code to do this has to > be not-too-complex. I'd say we're there. > -Original Message- > From: Alex Turner [mailto:[EMAIL PROTECTED] > Sent: Wednesday, April 06, 2005 11:38 AM > To: [EMAIL PROTECTED] > Cc: pgsql-performance@postgresql.org; Mohan, Ross > Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ? > > > I think everyone was scared off by the 5000 inserts per second number. > > I've never seen even Oracle do this on a top end Dell system with > copious SCSI attached storage. > > Alex Turner > netEconomist > > On Apr 6, 2005 3:17 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > > > Unfortunately. > > > > But we are in the the process to choose Postgresql with pgcluster. > > I'm > > currently running some tests (performance, stability...) Save the > > money on the license fees, you get it for your hardware ;-) > > > > I still welcome any advices or comments and I'll let you know how > > the > > project is going on. > > > > Benjamin. > > > > > > > > "Mohan, Ross" <[EMAIL PROTECTED]> > > > > 05/04/2005 20:48 > > > > Pour :<[EMAIL PROTECTED]> > > cc : > > Objet :RE: [PERFORM] Postgresql vs SQLserver for this > > application ? > > > > > > You never got answers on this? Apologies, I don't have one, but'd be > > curious to hear about any you did get > > > > thx > > > > Ross > > > > -Original Message- > > From: [EMAIL PROTECTED] > > [mailto:[EMAIL PROTECTED] On Behalf Of > > [EMAIL PROTECTED] > > Sent: Monday, April 04, 2005 4:02 AM > > To: pgsql-performance@postgresql.org > > Subject: [PERFORM] Postgresql vs SQLserver for this application ? > > > > > > hi all. > > > > We are designing a quite big application that requires a > > high-performance database backend. The rates we need to obtain are at > > least 5000 inserts per second and 15 selects per second for one > > connection. There should only be 3 or 4 simultaneous connections. > > I think our main concern is to deal with the constant flow of data coming > > from the inserts that must be available for selection as fast as possible. > > (kind of real time access ...) > > > > As a consequence, the database should rapidly increase up to more > > than one hundred gigs. We still have to determine how and when we > > shoud backup old data to prevent the application from a performance > > drop. We intend to develop some kind of real-time partionning on our > > main table keep the flows up. &g
Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?
How close to this is PG's COPY? I get surprisingly good results using COPY with jdbc on smallish systems (now if that patch would make into the mainstream PG jdbc support!) I think COPY has a bit more overhead than what a Bulkload feature may have, but I suspect it's not that much more. || Steve, I do not know. But am reading the docs now, and should figure it out. Ask me later if you remember. Oracle's "direct path" is a way of just slamming blocks filled with rows into the table, above the high water mark. It sidesteps freelist management and all manner of intrablock issues. There is a "payback", but the benefits far far outweigh the costs. > Now...if you ask me "can this work without Power5 and Hitachi SAN?" my > answer is..you give me a top end Dell and SCSI III on 15K disks and > I'll likely easily match it, yea. > > I'd love to see PG get into this range..i am a big fan of PG (just a > rank newbie) but I gotta think the underlying code to do this has to > be not-too-complex. It may not be that far off if you can use COPY instead of INSERT. But comparing Bulkload to INSERT is a bit apples<->orangish. || Oh! I see! I had no idea I was doing that! Thanks for pointing it out clearly to me. Yea, I would say a full transactional INSERT of 5K rows/sec into an indexed-table is a near-mythology without significant caveats (parallelized, deferred buffering, etc.) -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ?
I wish I had a Dell system and run case to show you Alex, but I don't... however...using Oracle's "direct path" feature, it's pretty straightforward. We've done 110,000 rows per second into index-less tables on a big system (IBM Power5 chips, Hitachi SAN). ( Yes, I am sure: over 100K a second. Sustained for almost 9 minutes. ) Yes, this is an exception, but oracle directpath/InsertAppend/BulkLoad feature enabled us to migrate a 4 TB database...really quickly. Now...if you ask me "can this work without Power5 and Hitachi SAN?" my answer is..you give me a top end Dell and SCSI III on 15K disks and I'll likely easily match it, yea. I'd love to see PG get into this range..i am a big fan of PG (just a rank newbie) but I gotta think the underlying code to do this has to be not-too-complex. Best, Ross -Original Message- From: Alex Turner [mailto:[EMAIL PROTECTED] Sent: Wednesday, April 06, 2005 11:38 AM To: [EMAIL PROTECTED] Cc: pgsql-performance@postgresql.org; Mohan, Ross Subject: Re: RE : RE: [PERFORM] Postgresql vs SQLserver for this application ? I think everyone was scared off by the 5000 inserts per second number. I've never seen even Oracle do this on a top end Dell system with copious SCSI attached storage. Alex Turner netEconomist On Apr 6, 2005 3:17 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > > Unfortunately. > > But we are in the the process to choose Postgresql with pgcluster. I'm > currently running some tests (performance, stability...) Save the > money on the license fees, you get it for your hardware ;-) > > I still welcome any advices or comments and I'll let you know how the > project is going on. > > Benjamin. > > > > "Mohan, Ross" <[EMAIL PROTECTED]> > > 05/04/2005 20:48 > > Pour :<[EMAIL PROTECTED]> > cc : > Objet :RE: [PERFORM] Postgresql vs SQLserver for this > application ? > > > You never got answers on this? Apologies, I don't have one, but'd be > curious to hear about any you did get > > thx > > Ross > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf > Of [EMAIL PROTECTED] > Sent: Monday, April 04, 2005 4:02 AM > To: pgsql-performance@postgresql.org > Subject: [PERFORM] Postgresql vs SQLserver for this application ? > > > hi all. > > We are designing a quite big application that requires a > high-performance database backend. The rates we need to obtain are at > least 5000 inserts per second and 15 selects per second for one > connection. There should only be 3 or 4 simultaneous connections. > I think our main concern is to deal with the constant flow of data coming > from the inserts that must be available for selection as fast as possible. > (kind of real time access ...) > > As a consequence, the database should rapidly increase up to more > than one hundred gigs. We still have to determine how and when we > shoud backup old data to prevent the application from a performance > drop. We intend to develop some kind of real-time partionning on our > main table keep the flows up. > > At first, we were planning to use SQL Server as it has features that > in my opinion could help us a lot : > - replication > - clustering > > Recently we started to study Postgresql as a solution for our project : > - it also has replication > - Postgis module can handle geographic datatypes (which would > facilitate our developments) > - We do have a strong knowledge on Postgresql administration > (we use it for production processes) > - it is free (!) and we could save money for hardware > purchase. > > Is SQL server clustering a real asset ? How reliable are Postgresql > replication tools ? Should I trust Postgresql performance for this > kind of needs ? > > My question is a bit fuzzy but any advices are most welcome... > hardware,tuning or design tips as well :)) > > Thanks a lot. > > Benjamin. > > > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Reading recommendations
I can see that PG'ers have a wicked sense of humor. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Steve Wampler Sent: Wednesday, March 30, 2005 3:58 PM To: [EMAIL PROTECTED] Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Reading recommendations [EMAIL PROTECTED] wrote: >>Mohan, Ross wrote: >> >>>VOIP over BitTorrent? >> >>Now *that* I want to see. Aught to be at least as interesting as the >>"TCP/IP over carrier pigeon" experiment - and more challenging to >>boot! >> > > > It was very challenging. I worked on the credit window sizing and > retransmission timer estimation algorithms. We took into account > weather patterns, size and age of the bird, feeding times, and the > average number of times a bird circles before determining magnetic > north. Interestingly, packet size had little effect in the final > algorithms. > > I would love to share them with all of you, but they're classified. Ah, but VOIPOBT requires many people all saying the same thing at the same time. The synchronization alone (since you need to distribute these people adequately to avoid overloading a trunk line...) is probably sufficiently hard to make it interesting. Then there are the problems of different accents, dilects, and languages ;) -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PERFORM] Reading recommendations
Yea, the upside is that you get better than the 1 byte/hour rate for pigeon-net. Downside is that simply because you use BiTorrent, the RIAA accuses you of everything from CD piracy to shipping pr*n to cyberterrorism, and you spend the next four years in Gitmo, comparing notes with your cellmates in Camp X-Ray, and watching pigeons fly overhead. -Original Message- From: Steve Wampler [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 30, 2005 11:52 AM To: Mohan, Ross Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Reading recommendations Mohan, Ross wrote: > VOIP over BitTorrent? Now *that* I want to see. Aught to be at least as interesting as the "TCP/IP over carrier pigeon" experiment - and more challenging to boot! -- Steve Wampler -- [EMAIL PROTECTED] The gods that smiled on your birth are now laughing out loud. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Reading recommendations
VOIP over BitTorrent? ;-) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED] Sent: Wednesday, March 30, 2005 11:27 AM To: Michael Fuhr Cc: Marc Burgauer; pgsql-performance@postgresql.org; [EMAIL PROTECTED] Subject: Re: [PERFORM] Reading recommendations [EMAIL PROTECTED] wrote on 03/30/2005 10:58:21 AM: > > Allow telecommute from across the pond and I might be interested :-) Please post phone bills to this list. > > -- > Michael Fuhr ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])