Re: [PERFORM] wal_buffers
On Oct 5, 2005, at 8:23 AM, Ian Westmacott wrote: Can anyone tell me what precisely a WAL buffer contains, so that I can compute an appropriate setting for wal_buffers (in 8.0.3)? I know the documentation suggests there is little evidence that supports increasing wal_buffers, but we are inserting a large amount of data that, I believe, easily exceeds the default 64K in a single transaction. We are also very sensitive to write latency. As background, we are doing a sustained insert of 2.2 billion rows in 1.3 million transactions per day. That's about 1700 rows per transaction, at (roughly) 50 bytes per row. Ian, The WAL Configuration chapter (25.2) has a pretty good discussion of how wal_buffers is used: http://www.postgresql.org/docs/8.0/static/wal-configuration.html You might also take a look at Josh Berkus' recent testing on this setting: http://www.powerpostgresql.com/ -- Thomas F. O'Connell Co-Founder, Information Architect Sitening, LLC Strategic Open Source: Open Your i™ http://www.sitening.com/ 110 30th Avenue North, Suite 6 Nashville, TN 37203-6320 615-469-5150 615-469-5151 (fax) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
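For what it's worth, the arithmetic from Ian's numbers can be sketched directly; the figures below are taken from his description, and the wal_buffers value shown is only an illustrative guess, not a recommendation:

-- Rough WAL-buffer math for the workload described above:
--   1700 rows/transaction * ~50 bytes/row ~= 85KB of row data per transaction,
--   which already exceeds the default wal_buffers = 8 (8 x 8KB pages = 64KB).
-- In postgresql.conf (8.0 reads this only at server start):
--   wal_buffers = 32        # 32 x 8KB = 256KB, an illustrative value
SHOW wal_buffers;            -- confirm the active setting after the restart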
Re: [HACKERS] [PERFORM] A Better External Sort?
On Wed, Oct 05, 2005 at 04:55:51PM -0700, Luke Lonergan wrote: > In COPY, we found lots of libc functions like strlen() being called > ridiculous numbers of times, in one case it was called on every > timestamp/date attribute to get the length of TZ, which is constant. That > one function call was in the system category, and was responsible for > several percent of the time. What? strlen is definitely not in the kernel, and thus won't count as system time. /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] A Better External Sort?
Michael, On 10/5/05 8:33 AM, "Michael Stone" <[EMAIL PROTECTED]> wrote: > real 0m8.889s > user 0m0.877s > sys 0m8.010s > > it's not in disk wait state (in fact the whole read was cached) but it's > only getting 1MB/s. You've proven my point completely. This process is bottlenecked in the CPU. The only way to improve it would be to optimize the system (libc) functions like "fread" where it is spending most of its time. In COPY, we found lots of libc functions like strlen() being called ridiculous numbers of times, in one case it was called on every timestamp/date attribute to get the length of TZ, which is constant. That one function call was in the system category, and was responsible for several percent of the time. By the way, system routines like fgetc/getc/strlen/atoi etc. don't appear in gprof profiles of dynamically linked objects, nor by default in oprofile results. If the bottleneck is in I/O, you will see the time spent in disk wait, not in system. - Luke ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PERFORM] A Better External Sort?
I'm putting in as much time as I can afford thinking about pg related performance issues. I'm doing it because of a sincere desire to help understand and solve them, not to annoy people. If I didn't believe in pg, I wouldn't be posting thoughts about how to make it better. It's probably worth some review (suggestions marked with a "+"):

+I came to the table with a possibly better way to deal with external sorts (that now has branched into 2 efforts: short term improvements to the existing code, and the original from-the-ground-up idea). That suggestion was based on a great deal of prior thought and research, despite what some others might think. Then we were told that our IO limit was lower than I thought.

+I suggested that as a "Quick Fix" we try making sure we do IO transfers in large enough chunks based on the average access time of the physical device in question so as to achieve the device's ASTR (ie at least 600KB per access for a 50MBps ASTR device with a 12ms average access time) whenever circumstances allowed us. As far as I know, this experiment hasn't been tried yet. I asked some questions about physical layout and format translation overhead being possibly suboptimal that seemed to be agreed to, but specifics as to where we are taking the hit don't seem to have been made explicit yet.

+I made the "from left field" suggestion that perhaps a pg native fs format would be worth consideration. This is a major project, so the suggestion was to at least some extent tongue-in-cheek.

+I then made some suggestions about better code instrumentation so that we can more accurately characterize where the bottlenecks are. We were also told that evidently we are CPU bound far before one would naively expect to be based on the performance specifications of the components involved. Double checking among the pg developer community led to some differing opinions as to what the actual figures were and under what circumstances they were achieved. Further discussion seems to have converged on both accurate values and a better understanding as to the HW and SW needed; _and_ we've gotten some RW confirmation as to what current reasonable expectations are within this problem domain from outside the pg community.

+Others have made some good suggestions in this thread as well. Since I seem to need to defend my tone here, I'm not detailing them here. That should not be construed as a lack of appreciation of them.

Now I've asked for the quickest path to detailed understanding of the pg IO subsystem. The goal being to get more up to speed on its coding details. Certainly not to annoy you or anyone else. At least from my perspective, this for the most part seems to have been a useful and reasonable engineering discussion that has exposed a number of important things. Regards, Ron ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Indexes on ramdisk
> What kind of order of improvement do you need to see?

A lot, since the load on the system is expected to increase by up to 100% over the next 6 months.

> What period are these numbers for? Were they collected over 1 hour, 1 day, 1 month?

I thought I mentioned that in the earlier post, but it was from a 2 hour period. It's a busy system.

> How much Cache do you have on the controller?

64Mbytes, but I don't think that's an issue. As I mentioned in the first post the table that is the bottleneck has indexes on 15 columns and is seeing a lot of inserts, deletes and updates. The indexes are spread out over the 5 mirrors but it's still a couple of writes per mirror for each operation. I'm going to order an SSD which should give us a lot more headroom than trying to rearrange the RAID setup.

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
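One more thing that may be worth checking before the SSD arrives, since the table carries indexes on 15 columns: whether all of those indexes are actually earning their keep. A rough sketch follows (the table name 'bigtable' is a placeholder for the real one):

-- Per-index usage counters; an index that is almost never scanned still
-- costs a write on every insert, update and delete against the table.
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'bigtable'
ORDER BY idx_scan;

Dropping indexes that show near-zero idx_scan would directly reduce the couple-of-writes-per-mirror cost described above.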
Re: [PERFORM] Is There Any Way ....
A blast from the past is forwarded below.

douglas

Begin forwarded message: From: Tom Lane <[EMAIL PROTECTED]> Date: August 23, 2005 3:23:43 PM EDT To: Donald Courtney <[EMAIL PROTECTED]> Cc: pgsql-performance@postgresql.org, Frank Wiles <[EMAIL PROTECTED]>, gokulnathbabu manoharan <[EMAIL PROTECTED]> Subject: Re: [PERFORM] Caching by Postgres

Donald Courtney <[EMAIL PROTECTED]> writes: I am not alone in having the *expectation* that a database should have some cache size parameter and the option to skip the file system. If I use oracle, sybase, mysql and maxdb they all have the ability to size a data cache and move to 64 bits.

And you're not alone in holding that opinion despite having no shred of evidence that it's worthwhile expanding the cache that far. However, since we've gotten tired of hearing this FUD over and over, 8.1 will have the ability to set shared_buffers as high as you want. I expect next we'll be hearing from people complaining that they set shared_buffers to use all of RAM and performance went into the tank ... regards, tom lane

On Oct 4, 2005, at 11:06 PM, Ron Peacetree wrote: Unfortunately, no matter what I say or do, I'm not going to please or convince anyone who has already made their minds up to the extent that they post comments like Mr Trainor's below. His response style pretty much proves my earlier point that this is presently a religious issue within the pg community. The absolute best proof would be to build a version of pg that does what Oracle and DB2 have done and implement its own DB specific memory manager and then compare the performance between the two versions on the same HW, OS, and schema. The second best proof would be to set up either DB2 or Oracle so that they _don't_ use their memory managers and compare their performance to a set up that _does_ use said memory managers on the same HW, OS, and schema. I don't currently have the resources for either experiment. Some might even argue that IBM (where Codd and Date worked) and Oracle just _might_ have had justification for the huge effort they put into developing such infrastructure. Then there's the large library of research on caching strategies in just about every HW and SW domain, including DB theory, that points out that the more context dependent, ie application or domain specific awareness, caching strategies are, the better they are. Maybe after we do all we can about physical IO and sorting performance I'll take on the religious fanatics on this one. One problem set at a time. Ron
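For readers wondering what the knob in Tom's remark actually looks like, a sketch follows; the value is purely illustrative, not a recommendation:

-- postgresql.conf (changing it requires a server restart):
--   shared_buffers = 50000      # counted in 8KB pages, so roughly 390MB here
-- The kernel shared memory limit (SHMMAX) usually has to be raised to match.
SHOW shared_buffers;             -- reports the current setting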
Re: [PERFORM] Text/Varchar performance...
Cristian, > Hello, just a little question, It's preferable to use Text Fields or > varchar(255) fields in a table? Are there any performance differences in > the use of any of them? TEXT, VARCHAR, and CHAR use the same underlying storage mechanism. This means that TEXT is actually the "fastest" since it doesn't check length or space-pad. However, that's unlikely to affect you unless you have millions of records; you should use the type which makes sense given your application. For "large text fields" I always use TEXT. BTW, in PostgreSQL VARCHAR is not limited to 255; I think we support up to 1GB of text or something preposterous. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
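A minimal illustration of Josh's point (the table and column names are invented): all three declarations below use the same underlying storage, and only the length check differs.

CREATE TABLE notes (
    body_text    text,           -- no length check, no padding
    body_varchar varchar(255),   -- same storage, plus a length check on write
    body_any     varchar         -- varchar with no limit behaves like text
);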
Re: [PERFORM] Ultra-cheap NVRAM device
Chris wrote: > [EMAIL PROTECTED] (Dan Harris) writes: > > On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote: > > > >> I thought this might be interesting, not the least due to the > >> extremely low > >> price ($150 + the price of regular DIMMs): > > > > Replying before my other post came through.. It looks like their > > benchmarks are markedly improved since the last article I read on > > this. There may be more interest now.. > > It still needs a few more generations worth of improvement. > > 1. It's still limited to SATA speed > 2. It's not ECC smart 3. Another zero (or two) on the price tag :). While it looks like a fun toy to play with, for it to replace hard drives in server environments they need to put more emphasis and effort into assuring people their drive is reliable. If they really wanted it to be adopted in server environments, it would have been packaged in a 3.5" drive, not a pci card, since that's what we all hot swap (especially since it already uses the SATA interface). They would also have allowed use of 2 and 4gb DIMMs, and put in a small hard drive that the memory paged to when powered off, and completely isolated the power supply... hard to pack all that into $60. That said, we are in the last days of the hard disk. I think it is only a matter of months before we see a sub-$1000 part which has zero latency in the 20-40 GB range. Once that happens economies of scale will kick in and hard drives will become basically a backup device. Merlin ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Text/Varchar performance...
On Wed, Oct 05, 2005 at 12:21:35PM -0600, Cristian Prieto wrote: > Hello, just a little question, It's preferable to use Text Fields or > varchar(255) fields in a table? Are there any performance differences in the > use of any of them? They are essentially the same. Note that you can have varchar without length (well, up to about a gigabyte or so after compression), and you can have varchar with a length well above 255 (say, 100000). /* Steinar */ -- Homepage: http://www.sesse.net/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PERFORM] A Better External Sort?
On Wed, 2005-10-05 at 12:14 -0400, Ron Peacetree wrote: > I've now gotten verification from multiple working DBA's that DB2, Oracle, and > SQL Server can achieve ~250MBps ASTR (with as much as ~500MBps ASTR in > setups akin to Oracle RAC) when attached to a decent (not outrageous, but > decent) HD subsystem... > > I've not yet had any RW DBA verify Jeff Baker's supposition that ~1GBps ASTR > is > attainable. Cache based bursts that high, yes. ASTR, no. I find your tone annoying. That you do not have access to this level of hardware proves nothing, other than pointing out that your repeated emails on this list are based on supposition. If you want 1GB/sec STR you need: 1) 1 or more Itanium CPUs 2) 24 or more disks 3) 2 or more SATA controllers 4) Linux Have fun. -jwb ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] Text/Varchar performance...
Hello, just a little question, It's preferable to use Text Fields or varchar(255) fields in a table? Are there any performance differences in the use of any of them? Thanks a lot for your answer! ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [PERFORM] A Better External Sort?
First I wanted to verify that pg's IO rates were inferior to The Competition. Now there's at least an indication that someone else has solved similar problems. Existence proofs make some things easier ;-) Is there any detailed programmer level architectural doc set for pg? I know "the best doc is the code", but the code in isolation is often the Slow Path to understanding with systems as complex as a DBMS IO layer. Ron -Original Message- From: "Joshua D. Drake" <[EMAIL PROTECTED]> Sent: Oct 5, 2005 1:18 PM Subject: Re: [HACKERS] [PERFORM] A Better External Sort? The source is freely available for your perusal. Please feel free to point us in specific directions in the code where you may see some benefit. I am positive all of us that can, would put resources into fixing the issue had we a specific direction to attack. Sincerely, Joshua D. Drake ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Ultra-cheap NVRAM device
[EMAIL PROTECTED] (Dan Harris) writes: > On Oct 3, 2005, at 5:02 AM, Steinar H. Gunderson wrote: > >> I thought this might be interesting, not the least due to the >> extremely low >> price ($150 + the price of regular DIMMs): > > Replying before my other post came through.. It looks like their > benchmarks are markedly improved since the last article I read on > this. There may be more interest now.. It still needs a few more generations worth of improvement. 1. It's still limited to SATA speed 2. It's not ECC smart What I'd love to see would be something that's much smarter, or, at least, that pushes the limits of SATA speed, and which has both a battery on board and enough CF storage to cope with outages. -- output = reverse("gro.mca" "@" "enworbbc") http://www.ntlug.org/~cbbrowne/linuxxian.html We all live in a yellow subroutine, a yellow subroutine, a yellow subroutine... ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Is There Any Way ....
From: Kevin Grittner <[EMAIL PROTECTED]> Sent: Oct 5, 2005 2:16 AM Subject: Re: [PERFORM] Is There Any Way >First off, Mr. Trainor's response proves nothing about anyone or >anything except Mr. Trainor. > Fair Enough. I apologize for the inappropriately general statement. >I'm going to offer an opinion on the caching topic. I don't have >any benchmarks; I'm offering a general sense of the issue based on >decades of experience, so I'll give a short summary of that. > >I've been earning my living by working with computers since 1972, > ~1978 for me. So to many on this list, I also would be an "old fart". > I've pretty much spent my entire career thinking about and making advances in RW distributed computing and parallel processing as first a programmer and then a systems architect. >Now on to the meat of it. > I agree with your comments just about across the board. I also agree with the poster(s) who noted that the "TLC factor" and the 2x every 18 months pace of increasing HW performance and RAM capacity make this stuff a moving target. OTOH, there are some fundamentals that don't seem to change no matter how far or fast the computing field evolves. As usual, the proper answers involve finding a sometimes nontrivial balance between building on known precedent and not being trapped by doctrine. Ron ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] A Better External Sort?
> We have to fix this. > Ron > The source is freely available for your perusal. Please feel free to point us in specific directions in the code where you may see some benefit. I am positive all of us that can, would put resources into fixing the issue had we a specific direction to attack. Sincerely, Joshua D. Drake -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PERFORM] A Better External Sort?
I've now gotten verification from multiple working DBA's that DB2, Oracle, and SQL Server can achieve ~250MBps ASTR (with as much as ~500MBps ASTR in setups akin to Oracle RAC) when attached to a decent (not outrageous, but decent) HD subsystem... I've not yet had any RW DBA verify Jeff Baker's supposition that ~1GBps ASTR is attainable. Cache based bursts that high, yes. ASTR, no. The DBA's in question run RW installations that include Solaris, M$, and Linux OS's for companies that just about everyone on these lists are likely to recognize. Also, the implication of these pg IO limits is that money spent on even moderately priced 300MBps SATA II based RAID HW is wasted $'s. In total, this situation is a recipe for driving potential pg users to other DBMS. 25MBps in and 15MBps out is =BAD=. Have we instrumented the code in enough detail that we can tell _exactly_ where the performance drainage is? We have to fix this. Ron -Original Message- From: Luke Lonergan <[EMAIL PROTECTED]> Sent: Oct 5, 2005 11:24 AM To: Michael Stone <[EMAIL PROTECTED]>, Martijn van Oosterhout Cc: pgsql-hackers@postgresql.org, pgsql-performance@postgresql.org Subject: Re: [HACKERS] [PERFORM] A Better External Sort? Nope - it would be disk wait. COPY is CPU bound on I/O subsystems faster than 50 MB/s on COPY (in) and about 15 MB/s (out). - Luke -Original Message- From: Michael Stone [mailto:[EMAIL PROTECTED] Sent: Wed Oct 05 09:58:41 2005 To: Martijn van Oosterhout Cc: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org Subject: Re: [HACKERS] [PERFORM] A Better External Sort? On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote: >COPY TO /dev/null WITH binary >13MB/s 55% user 45% system (ergo, CPU bound) [snip] >the most expensive. But it does point out that the whole process is >probably CPU bound more than anything else. Note that 45% of that cpu usage is system--which is where IO overhead would end up being counted. Until you profile where your system time is going it's premature to say it isn't an IO problem. Mike Stone ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PERFORM] A Better External Sort?
On Wed, Oct 05, 2005 at 11:24:07AM -0400, Luke Lonergan wrote:
> Nope - it would be disk wait.

I said I/O overhead; i.e., it could be the overhead of calling the kernel for I/O's. E.g., the following process is having I/O problems:

time dd if=/dev/sdc of=/dev/null bs=1 count=10000000
10000000+0 records in
10000000+0 records out
10000000 bytes transferred in 8.887845 seconds (1125132 bytes/sec)

real    0m8.889s
user    0m0.877s
sys     0m8.010s

it's not in disk wait state (in fact the whole read was cached) but it's only getting 1MB/s.

Mike Stone ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Is There Any Way ....
On Tue, 4 Oct 2005 23:06:54 -0400 (EDT) Ron Peacetree <[EMAIL PROTECTED]> wrote: > Then there's the large library of research on caching strategies > in just about every HW and SW domain, including DB theory, > that points out that the more context dependent, ie application > or domain specific awareness, caching strategies are, the better > they are. Isn't this also a very strong argument for putting your caching into your application and not at the database level? As you say the more "application or domain specific" it is the better. I don't see how PostgreSQL is going to magically determine what is perfect for everyone's differing needs and implement it for you. Even rudimentary controls such as "always keep this table/index/whatever in RAM" aren't as fine grained or specific enough to get full benefit. My suggestion is to use something like memcached to store your data in, based on the particular needs of your application. This puts all of the control in the hands of the programmer where, in my opinion, it belongs. Just to clarify, I'm not entirely against the idea, but I certainly think there are other areas of PostgreSQL we should be focusing our efforts. - Frank Wiles <[EMAIL PROTECTED]> http://www.wiles.org - ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] A Better External Sort?
Nope - it would be disk wait. COPY is CPU bound on I/O subsystems faster than 50 MB/s on COPY (in) and about 15 MB/s (out). - Luke -Original Message- From: Michael Stone [mailto:[EMAIL PROTECTED] Sent: Wed Oct 05 09:58:41 2005 To: Martijn van Oosterhout Cc: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org Subject: Re: [HACKERS] [PERFORM] A Better External Sort? On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote: >COPY TO /dev/null WITH binary >13MB/s 55% user 45% system (ergo, CPU bound) [snip] >the most expensive. But it does point out that the whole process is >probably CPU bound more than anything else. Note that 45% of that cpu usage is system--which is where IO overhead would end up being counted. Until you profile where your system time is going it's premature to say it isn't an IO problem. Mike Stone ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Indexes on ramdisk
What kind of order of improvement do you need to see? What period are these numbers for? Were they collected over 1 hour, 1 day, 1 month? How much Cache do you have on the controller?

You can certainly get more speed by adding more disk and possibly by adding more controller RAM/a second controller. 10 disks isn't really that many for a totally kick-ass DB server. You can achieve more block writes with RAID 10s than with RAID 1s. Whether it's cost effective is dependent on lots of factors like your chassis and drive enclosures etc. vs SSD. SSD will be faster, but last I heard was expensive, and I checked a few websites but couldn't get much price info. Normally when you can't get price info, that's a bad sign ;).

If you are doing large chunks of writes to a small number of tables, then you might be better off with a single large RAID 10 for your tablespace than with separate RAID 1s. If you are writing 5 to 1 more table data than index data, you are hurting yourself by separating onto multiple RAID 1s instead of a single RAID 10, which could write at 2-3x for the 5, and 2-3x for the 1, and only suffer a single seek penalty but get data onto disk two to three times as fast (depending how many RAID 1s you join). Try un-separating the RAID 1s and combining them into a RAID 10 for indexes and tablespaces. The controller will re-sequence your writes/reads to help with efficiency, and dbwriter is there to make things go easier.

You can at least get some idea by doing an iostat and seeing how many IOs and how much throughput is happening. That will rapidly help determine if you are bound by IOs or by MB/sec. Worst case I'm wrong, but IMHO it's worth a try.

Alex Turner NetEconomist

On 10/4/05, Emil Briggs <[EMAIL PROTECTED]> wrote:
> > Talk about your IO system a bit. There might be obvious ways to improve.
> >
> > What System/Motherboard are you using?
> > What Controller Cards are you using?
> > What kind of Disks do you have (SATA, SCSI 7.6k 10k 15k)
> > What denominations (9, 18, 36, 72, 143, 80, 160, 200, 240Gig)?
> > What kind of RAIDs do you have setup (How many drives what stripe sizes, how many used for what).
> > What levels of RAID are you using (0,1,10,5,50)?
>
> It's a quad opteron system. RAID controller is a 4 channel LSILogic Megaraid 320 connected to 10 15k 36.7G SCSI disks. The disks are configured in 5 mirrored partitions. The pg_xlog is on one mirror and the data and indexes are spread over the other 4 using tablespaces. These numbers from pg_stat_user_tables are from about 2 hours earlier today on this one table.
>
> idx_scan       20578690
> idx_tup_fetch  35866104841
> n_tup_ins      1940081
> n_tup_upd      1604041
> n_tup_del      1880424

---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
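As a sketch of how to turn counters like the ones Emil quotes into a rate (the table name 'bigtable' is a placeholder), one can snapshot pg_stat_user_tables, wait a known interval, and take the difference:

-- Snapshot the counters with a timestamp, in the same session...
CREATE TEMP TABLE stats_before AS
    SELECT now() AS ts, n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_user_tables WHERE relname = 'bigtable';

-- ...then, after e.g. an hour of normal load, compute per-second rates:
SELECT (s.n_tup_ins - b.n_tup_ins) / EXTRACT(EPOCH FROM now() - b.ts) AS ins_per_sec,
       (s.n_tup_upd - b.n_tup_upd) / EXTRACT(EPOCH FROM now() - b.ts) AS upd_per_sec,
       (s.n_tup_del - b.n_tup_del) / EXTRACT(EPOCH FROM now() - b.ts) AS del_per_sec
FROM pg_stat_user_tables s, stats_before b
WHERE s.relname = 'bigtable';

Together with iostat output over the same window, that makes it much easier to tell whether the system is bound by IOs or by MB/sec.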
Re: [PERFORM] Indexes on ramdisk
> It's a quad opteron system. RAID controller is a 4 channel LSILogic
> Megaraid 320 connected to 10 15k 36.7G SCSI disks. The disks are configured in 5
> mirrored partitions. The pg_xlog is on one mirror and the data and indexes
> are spread over the other 4 using tablespaces. These numbers from
> pg_stat_user_tables are from about 2 hours earlier today on this one
> table.
>
> idx_scan       20578690
> idx_tup_fetch  35866104841
> n_tup_ins      1940081
> n_tup_upd      1604041
> n_tup_del      1880424

Is your raid controller configured to buffer your writes? How much RAM are you packing? Are you running 64 bit?

Merlin

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Is There Any Way ....
** Low Priority ** Human feedback from testers and users has proven pretty effective at catching errors in the "human assisted" cache configuration. When people setting up the servers have missed the named cache configuration, and all they had was the single general purpose cache, it has been caught because of user complaints on performance. There was an attempt made to simulate database queries -- hitting a client side cache on some of the roughly 100 tables (out of 300 in the well normalized schema) which fit this pattern of usage. It didn't prove very cost effective. It just makes more sense to allow the DBAs to tweak database performance through database configuration changes than to jump through that many hoops in application code to try to achieve it where it becomes an issue. As far as I know, you can't use this technique in Microsoft SQL Server or Oracle. They are using Sybase Adaptive Server Enterprise (ASE). I believe named caches were added in version 12.0, long after Microsoft split off with their separate code stream based on the Sybase effort. -Kevin >>> "Dario" <[EMAIL PROTECTED]> 10/05/05 6:16 AM >>> I'm sure there will be cases when some human assisted caching algorithm will perform better than a mathematical, statistics-based design, but it will also depend on the "human". And it probably will make things worse when the workload changes and the human doesn't realize. It must be considered that, today, hardware cost is not the 90% of the budget that it used to be. Throwing hardware at the system can be as expensive as throwing certified "it stuff". (just think of the coffee budget! :-) ) If you need to improve "user perception", you can do other things. Like caching a table in your client (with a trigger for any change on table X updating a table called "timestamp_table_change" and a small select to this table, you can easily know when you must update your client). If it is an application server, serving http requests, then "user perception" will be tied to bandwidth AND the application server (some of them have a cache for requests). FYI, I don't recall a mechanism in MSSQL to cache a table in buffers. Oracle has some structures to allow that. (you know) It uses its own buffer. Since version 9i, you can set three different data buffers, one (recycled cache) for low usage tables (I mean tables with blocks which don't have too much chance to be queried again, like a very large historical table), one for high usage tables (keep cache), and the regular one (difference is in algorithm). And you must also set a buffer cache size for tablespaces with different block size. But there is no such thing as "create table x keep entirely in buffer". And above all things, the oracle doc always states "first, tune design, then tune queries, then start tuning the engine". greetings. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] wal_buffers
Can anyone tell me what precisely a WAL buffer contains, so that I can compute an appropriate setting for wal_buffers (in 8.0.3)? I know the documentation suggests there is little evidence that supports increasing wal_buffers, but we are inserting a large amount of data that, I believe, easily exceeds the default 64K in a single transaction. We are also very sensitive to write latency. As background, we are doing a sustained insert of 2.2 billion rows in 1.3 million transactions per day. That's about 1700 rows per transaction, at (roughly) 50 bytes per row. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] SELECT LIMIT 1 VIEW Performance Issue
[to K C:] sorry, was out on vacation all last week. I was visualizing the problem incorrectly anyways... Jim wrote: > That function is not immutable, it should be defined as stable. That is 100% correct: however now and then I declare stable functions as immutable in some cases because the planner treats them differently with no side effects...this is a hack of course...see my earlier suggestion to try both immutable and stable versions. I can give a pretty good example of when this can make a big difference. > PostgreSQL doesn't pre-compile functions, at least not until 8.1 (and > I'm not sure how much those are pre-compiled, though they are > syntax-checked at creation). Do you get the same result time when you > run it a second time? What time do you get from running just the > function versus the SQL in the function? plpgsql functions are at least partially compiled (sql functions afaik are not), in that an internal state is generated following the first execution. This is the cause of all those infernal 'invalid table oid' errors. > Also, remember that every layer you add to the cake means more work for > the database. If speed is that highly critical you'll probably want to > not wrap things in functions, and possibly not use views either. The overhead of the function/view is totally inconsequential next to the planner choosing a suboptimal plan. The purpose of the function is to coerce the planner into choosing the correct plan. > Also, keep in mind that getting below 1ms doesn't automatically mean > you'll be able to scale to 1000TPS. Things will definitely change when > you load the system down, so if performance is that critical you should > start testing with the system under load if you're not already. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
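To make the immutable-vs-stable point concrete, here is a contrived sketch (the function and table names are invented, and the SQL-language body is used only for brevity):

-- STABLE is the honest label for a function whose result depends on table
-- contents but is constant within a single statement:
CREATE OR REPLACE FUNCTION last_price(int) RETURNS numeric AS '
    SELECT price FROM prices WHERE item_id = $1
    ORDER BY updated DESC LIMIT 1;
' LANGUAGE sql STABLE;

-- Declaring the same body IMMUTABLE (the hack described above) tells the
-- planner the result can never change for a given input, which lets it
-- fold the call into a constant, at the risk of stale answers:
CREATE OR REPLACE FUNCTION last_price_frozen(int) RETURNS numeric AS '
    SELECT price FROM prices WHERE item_id = $1
    ORDER BY updated DESC LIMIT 1;
' LANGUAGE sql IMMUTABLE;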
Re: [PERFORM] Is There Any Way ....
I'm sure there will be cases when some human assisted caching algorithm will perform better than a mathematical, statistics-based design, but it will also depend on the "human". And it probably will make things worse when the workload changes and the human doesn't realize. It must be considered that, today, hardware cost is not the 90% of the budget that it used to be. Throwing hardware at the system can be as expensive as throwing certified "it stuff". (just think of the coffee budget! :-) ) If you need to improve "user perception", you can do other things. Like caching a table in your client (with a trigger for any change on table X updating a table called "timestamp_table_change" and a small select to this table, you can easily know when you must update your client). If it is an application server, serving http requests, then "user perception" will be tied to bandwidth AND the application server (some of them have a cache for requests). FYI, I don't recall a mechanism in MSSQL to cache a table in buffers. Oracle has some structures to allow that. (you know) It uses its own buffer. Since version 9i, you can set three different data buffers, one (recycled cache) for low usage tables (I mean tables with blocks which don't have too much chance to be queried again, like a very large historical table), one for high usage tables (keep cache), and the regular one (difference is in algorithm). And you must also set a buffer cache size for tablespaces with different block size. But there is no such thing as "create table x keep entirely in buffer". And above all things, the oracle doc always states "first, tune design, then tune queries, then start tuning the engine". greetings. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
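A minimal sketch of the trigger idea above (all names are invented, and it assumes the plpgsql language is installed; the point is just that clients can poll one tiny table instead of re-reading table x):

-- One row per cached table, holding the time of its last modification.
CREATE TABLE timestamp_table_change (
    table_name text PRIMARY KEY,
    changed_at timestamptz NOT NULL
);
INSERT INTO timestamp_table_change VALUES ('x', now());

CREATE OR REPLACE FUNCTION note_table_change() RETURNS trigger AS '
BEGIN
    UPDATE timestamp_table_change
       SET changed_at = now()
     WHERE table_name = TG_RELNAME;
    RETURN NULL;    -- statement-level AFTER trigger: return value is ignored
END;
' LANGUAGE plpgsql;

CREATE TRIGGER x_changed AFTER INSERT OR UPDATE OR DELETE ON x
    FOR EACH STATEMENT EXECUTE PROCEDURE note_table_change();

-- The client only needs this cheap query to decide whether its cached copy
-- of table x is still current:
SELECT changed_at FROM timestamp_table_change WHERE table_name = 'x';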
Re: [PERFORM] Is There Any Way ....
Hey, you can say what you want about my style, but you still haven't pointed to even one article from the vast literature that you claim supports your argument. And I did include a smiley. Your original email that PostgreSQL is wrong and that you are right led me to believe that you, like others making such statements, would not post your references. You remind me of Ted Nelson, who wanted the computing center at the University of Illinois at Chicago to change their systems just for him. BTW, I'm a scientist -- I haven't made my mind up about anything. I really am interested in what you say, if there is any real work backing up your claims such that it would impact average cases. Any app designer can conceive of many ways to game the server to their app's advantage -- I'm not interested in that potboiler. douglas On Oct 4, 2005, at 11:06 PM, Ron Peacetree wrote: Unfortunately, no matter what I say or do, I'm not going to please or convince anyone who has already made their minds up to the extent that they post comments like Mr Trainor's below. His response style pretty much proves my earlier point that this is presently a religious issue within the pg community. The absolute best proof would be to build a version of pg that does what Oracle and DB2 have done and implement its own DB specific memory manager and then compare the performance between the two versions on the same HW, OS, and schema. The second best proof would be to set up either DB2 or Oracle so that they _don't_ use their memory managers and compare their performance to a set up that _does_ use said memory managers on the same HW, OS, and schema. I don't currently have the resources for either experiment. Some might even argue that IBM (where Codd and Date worked) and Oracle just _might_ have had justification for the huge effort they put into developing such infrastructure. Then there's the large library of research on caching strategies in just about every HW and SW domain, including DB theory, that points out that the more context dependent, ie application or domain specific awareness, caching strategies are, the better they are. Maybe after we do all we can about physical IO and sorting performance I'll take on the religious fanatics on this one. One problem set at a time. Ron ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PERFORM] A Better External Sort?
On Tue, Oct 04, 2005 at 12:43:10AM +0300, Hannu Krosing wrote: Just FYI, I ran a count(*) on a 15.6GB table on a lightly loaded db and it ran in 163 sec. (Dual opteron 2.6GHz, 6GB RAM, 6 x 74GB 15k disks in RAID10, reiserfs). A little less than 100MB/sec. And none of that 15G table is in the 6G RAM? Mike Stone ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Query in SQL statement
R, Rajesh (STSD) wrote: Am trying to port a mysql statement to postgres. Please help me in finding the error in this, Can I recommend the reference section of the manuals for this sort of thing? There is an excellent section detailing the valid SQL for the CREATE TABLE command. Also - the pgsql-hackers list is for discussion of database development, and the performance list is for performance problems. This would be better posted on pgsql-general or -sql or -novice. CREATE SEQUENCE ai_id; This line is causing the first error: > ERROR: relation "ai_id" already exists That's because you've already successfully created the sequence, so it already exists. Either drop it and recreate it, or stop trying to recreate it. CREATE TABLE badusers ( id int DEFAULT nextval('ai_id') NOT NULL, UserName varchar(30), Date datetime DEFAULT '0000-00-00 00:00:00' NOT NULL, Well, "Date" is a type-name, "datetime" isn't, and even if it were, "0000-00-00" isn't a valid date, is it? Reason varchar(200), Admin varchar(30) DEFAULT '-', PRIMARY KEY (id), KEY UserName (UserName), KEY Date (Date) The word "KEY" isn't valid here either - are you trying to define an index? If so, see the "CREATE INDEX" section of the SQL reference. http://www.postgresql.org/docs/8.0/static/sql-commands.html If you reply to this message, please remove the pgsql-hackers CC: -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Which one FreeBSD or Linux
ALİ ÇELİK wrote: FreeBSD or Linux, which system has better performance for PostgreSQL Depends on the underlying hardware and your experience. I'd recommend going with whichever you are more familiar, so long as it will support the hardware you need to buy. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PERFORM] A Better External Sort?
On Sat, Oct 01, 2005 at 06:19:41PM +0200, Martijn van Oosterhout wrote: COPY TO /dev/null WITH binary 13MB/s 55% user 45% system (ergo, CPU bound) [snip] the most expensive. But it does point out that the whole process is probably CPU bound more than anything else. Note that 45% of that cpu usage is system--which is where IO overhead would end up being counted. Until you profile where your system time is going it's premature to say it isn't an IO problem. Mike Stone ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
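For anyone who wants to reproduce the measurement under discussion, a sketch of the test (the table name is a placeholder; \timing is just the psql stopwatch):

-- Run from psql as a superuser, since COPY ... TO writes a server-side file:
\timing
COPY bigtable TO '/dev/null' WITH BINARY;
COPY bigtable TO '/dev/null';        -- text format, for comparison

Watching vmstat or iostat in another window during the run shows whether the time lands in user, system, or disk wait.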
Re: [PERFORM] Query in SQL statement
R, Rajesh (STSD) wrote: Thanks. I've already understood that I need to post it in another list. Sorry for wasting your precious time. No time wasted. It was a perfectly reasonable question, just to the wrong lists. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PERFORM] A Better External Sort?
> In my original example, a sequential scan of the 1TB of 2KB > or 4KB records, => 250M or 500M records of data, being sorted > on a binary value key will take ~1000x more time than reading > in the ~1GB Btree I described that used a Key+RID (plus node > pointers) representation of the data. Imho you seem to ignore the final step your algorithm needs of collecting the data rows. After you sorted the keys the collect step will effectively access the tuples in random order (given a sufficiently large key range). This random access is bad. It effectively allows a competing algorithm to read the whole data at least 40 times sequentially, or write the set 20 times sequentially. (Those are the random/sequential ratios of modern discs) Andreas ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
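To put rough, illustrative numbers on Andreas's point (these are back-of-the-envelope figures, not measurements): a disc that streams ~50MB/s but needs ~8ms per random access delivers only about one 8KB page per 8ms when fetching tuples in random order, i.e. roughly 1MB/s, a penalty on the order of 50:1 versus sequential reading. That is the same order of magnitude as the 40-reads/20-writes ratios quoted above, and it is why sorting just the keys and then collecting the rows can easily lose to an algorithm that rescans the data sequentially several times.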
[PERFORM] Query in SQL statement
Am trying to port a mysql statement to postgres. Please help me in finding the error in this,

CREATE SEQUENCE ai_id;

CREATE TABLE badusers (
  id int DEFAULT nextval('ai_id') NOT NULL,
  UserName varchar(30),
  Date datetime DEFAULT '0000-00-00 00:00:00' NOT NULL,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-',
  PRIMARY KEY (id),
  KEY UserName (UserName),
  KEY Date (Date)
);

Am always getting the following errors,

ERROR: relation "ai_id" already exists
ERROR: syntax error at or near "(" at character 240

Thanks, Rajesh R ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PERFORM] A Better External Sort?
On Thu, Sep 29, 2005 at 10:06:52AM -0700, Luke Lonergan wrote: > Josh, > > On 9/29/05 9:54 AM, "Josh Berkus" wrote: > > > Following an index creation, we see that 95% of the time required > > is the external sort, which averages 2mb/s. This is with separate > > drives for the WAL, the pg_tmp, the table and the index. I've > > confirmed that increasing work_mem beyond a small minimum (around > > 128mb) had no benefit on the overall index creation speed. > > Yep! That about sums it up - regardless of taking 1 or 2 passes > through the heap being sorted, 1.5 - 2 MB/s is the wrong number. > This is not necessarily an algorithmic problem, but is an > optimization problem with Postgres that must be fixed before it can > be competitive. > > We read/write to/from disk at 240MB/s and so 2 passes would run at a > net rate of 120MB/s through the sort set if it were that efficient. > > Anyone interested in tackling the real performance issue? (flame > bait, but for a worthy cause :-) I'm not sure that it's flamebait, but what do I know? Apart from the nasty number (1.5-2 MB/s), what other observations do you have to hand? Any ideas about what things are not performing here? Parts of the code that could bear extra scrutiny? Ideas on how to fix same in a cross-platform way? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 510 893 6100 mobile: +1 415 235 3778 Remember to vote! ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] [HACKERS] Query in SQL statement
I think this question may be more appropriate for [EMAIL PROTECTED] At any rate, for the below:

Sounds like you maybe already have a table or sequence called ai_id; try doing a DROP SEQUENCE ai_id; first.

Also, if you plan to use this sequence only for this table it would be better to use serial8, which will automatically create the sequence for you. Then you don't even need that first part. Also you should avoid naming fields things like Date which tend to be keywords in many kinds of databases. Try changing your logic to something like

CREATE TABLE badusers (
  id serial8,
  UserName varchar(30),
  Date timestamp DEFAULT now() NOT NULL,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-',
  PRIMARY KEY (id)
);

CREATE INDEX badusers_username ON badusers USING btree (username);
CREATE INDEX badusers_date ON badusers USING btree (date);

-Original Message- From: R, Rajesh (STSD) [mailto:[EMAIL PROTECTED] Sent: Thursday, September 29, 2005 9:05 AM To: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org Subject: [HACKERS] Query in SQL statement

Am trying to port a mysql statement to postgres. Please help me in finding the error in this,

CREATE SEQUENCE ai_id;

CREATE TABLE badusers (
  id int DEFAULT nextval('ai_id') NOT NULL,
  UserName varchar(30),
  Date datetime DEFAULT '0000-00-00 00:00:00' NOT NULL,
  Reason varchar(200),
  Admin varchar(30) DEFAULT '-',
  PRIMARY KEY (id),
  KEY UserName (UserName),
  KEY Date (Date)
);

Am always getting the following errors,

ERROR: relation "ai_id" already exists
ERROR: syntax error at or near "(" at character 240

Thanks, Rajesh R ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PERFORM] A Better External Sort?
On 9/30/05, Ron Peacetree <[EMAIL PROTECTED]> wrote: > 4= I'm sure we are paying all sorts of nasty overhead for essentially > emulating the pg "filesystem" inside another filesystem. That means > ~2x as much overhead to access a particular piece of data. > > The simplest solution is for us to implement a new VFS compatible > filesystem tuned to exactly our needs: pgfs. > > We may be able to avoid that by some amount of hacking or > modifying of the current FSs we use, but I suspect it would be more > work for less ROI. On this point, Reiser4 fs already implements a number of things which would be desirable for PostgreSQL. For example: write()s to reiser4 filesystems are atomic, so there is no risk of torn pages (this is enabled because reiser4 uses WAFL like logging where data is not overwritten but rather relocated). The filesystem is modular and extensible so it should be easy to add whatever additional semantics are needed. I would imagine that all that would be needed is some more atomicity operations (single writes are already atomic, but I'm sure it would be useful to batch many writes into a transaction), some layout and packing controls, and some flush controls. A step further would perhaps integrate multiversioning directly into the FS (the wandering logging system provides the write side of multiversioning; a little read-side work would be required). More importantly: the file system was intended to be extensible for this sort of application. It might make a good 'summer of code' project for someone next year, ... presumably reiser4 will have made it into the mainline kernel by then. :) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster