[PERFORM] Replication
Looking for replication solutions, I find:

Slony-I: Seems good, but it's single-master only, the master is a single point of failure, and there's no good failover system for electing a new master or having a failed master rejoin the cluster. Slave databases are mostly for safety or for parallelizing queries for performance. Suffers from O(N^2) communications (N = cluster size).

Slony-II: Seems brilliant, with a solid theoretical foundation, at the forefront of computer science. But I can't find the project status: when will it be available? Is it a pipe dream, or a nearly-ready reality?

PGReplication: Appears to be a page that someone forgot to erase from the old GBorg site.

PGCluster: Seems pretty good, but the web site is not current, there are releases in use that are not on the web site, and it also seems to always be a couple of steps behind the current release of Postgres. Two single points of failure: the load balancer and the data replicator.

Is this a good summary of the status of replication? Have I missed any important solutions or mischaracterized anything?

Thanks!
Craig

(Sorry about the premature send of this message earlier, please ignore.)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
Greg Smith wrote:
If you're going to the trouble of building a tool for offering configuration advice, it can be wildly more effective if you look inside the database after it's got data in it, and preferably after it's been running under load for a while, and make your recommendations based on all that information.

There are two completely different problems that are getting mixed together in this discussion. Several people have tried to distinguish them, but let's be explicit:

1. Generating a reasonable starting configuration for neophyte users who have installed Postgres for the first time.

2. Generating an optimal configuration for a complex, running system that's loaded with data.

The first problem is easy: Any improvement would be welcome and would give most users a better initial experience. The second problem is nearly impossible. Forget the second problem (or put it on the "let's find someone doing a PhD project" list), and focus on the first.

From my limited experience, a simple questionnaire could be used to create a pretty good starting configuration file. Furthermore, many of the answers can be discovered automatically:

1. How much memory do you have?
2. How many disks do you have?
   a. Which disk contains the OS?
   b. Which disk(s) have swap space?
   c. Which disks are "off limits" (not to be used by Postgres)?
3. What is the general nature of your database?
   a. Mostly static (few updates, lots of access)
   b. Mostly archival (lots of writes, few reads)
   c. Very dynamic (data are added, updated, and deleted a lot)
4. Do you have a lot of small, fast transactions or a few big, long transactions?
5. How big do you expect your database to be?
6. How many simultaneous users do you expect?
7. What are the users you want configured initially?
8. Do you want local access only, or network access?

With these few questions (and perhaps a couple more), a decent set of startup files could be created that would give good, 'tho not optimal, performance for most people just getting started.

I agree with an opinion posted a couple days ago: The startup configuration is one of the weakest features of Postgres. It's not rocket science, but there are several files, and it's not obvious to the newcomer that the files even exist.

Here's just one example: A coworker installed Postgres and couldn't get it to work at all. He struggled for hours. When he contacted me, I tried his installation and it worked fine. He tried it, and he couldn't connect. I asked him, "Are you using localhost?" He said yes, but what he meant was that he was using the local *network*, 192.168.0.5, whereas I was using "localhost". He didn't have network access enabled. So, four hours wasted.

This is the sort of thing that makes experienced users say, "Well, duh!" But there are any number of these little traps and obscure configuration parameters that make the initial Postgres experience a poor one. It wouldn't take much to make a big difference to new users.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: Filesystem fragmentation (Re: [PERFORM] Fragmentation of WAL files)
Bill Moran wrote:
In response to Heikki Linnakangas <[EMAIL PROTECTED]>:
Can anyone else confirm this? I don't know if this is a windows-only issue, but I don't know of a way to check fragmentation in unix.

I can confirm that it's only a Windows problem. No UNIX filesystem that I'm aware of suffers from fragmentation.

What do you mean by suffering? All filesystems fragment files at some point. When and how differs from filesystem to filesystem. And some filesystems might be smarter than others in placing the fragments.

To clarify my viewpoint: To my knowledge, there is no Unix filesystem that _suffers_ from fragmentation. Specifically, all filesystems have some degree of fragmentation that occurs, but every Unix filesystem that I am aware of has built-in mechanisms to mitigate this and prevent it from becoming a performance issue.

More specifically, this problem was solved on UNIX file systems way back in the 1970's and 1980's. No UNIX file system (including Linux) since then has had significant fragmentation problems, unless the file system gets close to 100% full. If you run below 90% full, fragmentation shouldn't ever be a significant performance problem. The word "fragmentation" would have dropped from the common parlance if it weren't for MS Windoz.

Craig

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Basic Q on superfluous primary keys
Merlin Moncure wrote:
In the context of this debate, I see this argument all the time, with the implied suffix: 'If only we used integer keys we would not have had this problem...'. Either the customer identifies parts with a part number or they don't...and if they do identify parts with a number and recycle the numbers, you have a problem...period.

On the contrary. You create a new record with the same part number. You mark the old part number "obsolete". Everything else (the part's description, and all the relationships that it's in, such as order history, catalog inclusion, revision history, etc.) is unaffected. New orders are placed against the new part number's DB record; for safety the old part number can have a trigger that prevents new orders from being placed.

Since the part number is NOT the primary key, duplicate part numbers are not a problem. If you had instead used the part number as the primary key, you'd be dead in the water.

You can argue that the customer is making a dumb decision by reusing catalog numbers, and I'd agree. But they do it, and as database designers we have to handle it. In my particular system, we aggregate information from several hundred companies, and this exact scenario happens frequently. Since we're only aggregating information, we have no control over the data that these companies provide. If we'd used catalog numbers for primary keys, we'd have big problems.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
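A rough sketch of the design described above; the table, column, and function names here are hypothetical, not from the actual system:

-- The surrogate key is the primary key; the customer's part number is
-- ordinary data and may be recycled over time.
CREATE TABLE part (
    part_id     serial PRIMARY KEY,          -- information-free surrogate key
    part_number text NOT NULL,               -- customer's catalog number
    description text,
    obsolete    boolean NOT NULL DEFAULT false
);

CREATE TABLE part_order (
    order_id   serial PRIMARY KEY,
    part_id    integer NOT NULL REFERENCES part(part_id),
    ordered_on date NOT NULL DEFAULT current_date
);

-- For safety, refuse new orders placed against an obsolete part.
-- (On 8.x you may need "createlang plpgsql yourdb" first.)
CREATE FUNCTION reject_obsolete_part() RETURNS trigger AS $$
BEGIN
    IF (SELECT obsolete FROM part WHERE part_id = NEW.part_id) THEN
        RAISE EXCEPTION 'part % is obsolete; order a current part instead', NEW.part_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER part_order_check
    BEFORE INSERT ON part_order
    FOR EACH ROW EXECUTE PROCEDURE reject_obsolete_part();

When the customer recycles a part number, you insert a new part row and mark the old row obsolete; order history, catalog inclusion, and so on keep pointing at the old part_id and are unaffected.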
Re: [PERFORM] Basic Q on superfluous primary keys
Merlin Moncure wrote: Since the part number is NOT the primary key, duplicate part numbers are not a problem. If you had instead used the part number as the primary key, you'd be dead in the water. You are redefining the primary key to be (part_number, obsoletion_date). Now, if you had not anticipated that in the original design (likely enough), you do have to refactor queries that join on the table...so what? If that's too much work, you can use a view to take care of the problem (which may be a good idea anyways). *you have to refactor the system anyways because you are now allowing duplicate part numbers where previously (from the perspective of the user), they were unique *. The hidden advantage of pushing the full key through the database is it tends to expose holes in the application/business logic. Chances are some query is not properly distinguishing obsoleted parts and now the real problems come...surrogate keys do not remove complexity, they simply sweep it under the rug. This really boils down to an object-oriented perspective. I have an object, a customer's catalog entry. It has properties such as catalog number, description, etc, and whether it's obsolete or not. Management of the object (its relation to other objects, its history, etc.) should NOT depend on the object's specific definition. This is true whether the object is represented in Lisp, C++, Perl, or (in this case) an SQL schema. Good object oriented design abstracts the object and its behavior from management of the object. In C++, Perl, etc., we manage objects via a pointer or object reference. In SQL, we reference objects by an *arbitrary* integer that is effectively a pointer to the object. What you're suggesting is that I should break the object-oriented encapsulation by pulling out specific fields of the object, exposing those internal object details to the applications, and spreading those details across the whole schema. And I argue that this is wrong, because it breaks encapsulation. By exposing the details of the object, if the details change, *all* of your relationships break, and all of your applications have to change. And I've never seen a system where breaking object-oriented encapsulation was a good long-term solution. Systems change, and object-oriented techniques were invented to help manage change. This is one of the reasons the Postgres project was started way back when: To bring object-oriented techniques to the relational-database world. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
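Continuing the hypothetical schema from the sketch above, the two ideas raised here can be expressed directly: a partial unique index keeps part numbers unique among the non-obsolete rows (so sloppy queries can't silently create duplicate "current" parts), and a view hides obsoleted parts from code that only cares about current ones:

-- At most one non-obsolete row per customer part number.
CREATE UNIQUE INDEX part_number_active_uniq
    ON part (part_number)
    WHERE NOT obsolete;

-- Queries that should only ever see current parts can join against this.
CREATE VIEW current_part AS
    SELECT part_id, part_number, description
    FROM part
    WHERE NOT obsolete;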
Re: [PERFORM] Basic Q on superfluous primary keys
Merlin Moncure wrote:
Using surrogate keys is dangerous and can lead to very bad design habits that are unfortunately so prevalent in the software industry they are virtually taught in schools. ... While there is nothing wrong with them in principle (you are exchanging one key for another as a performance optimization), they make it all too easy to create denormalized designs and tables with no real identifying criteria, etc,...

Wow, that's the opposite of everything I've ever been taught, and all my experience in the last few decades.

I can't recall ever seeing a "natural" key that was immutable. In my business (chemistry), we've seen several disastrous situations where companies picked keys they thought were natural and immutable, and years down the road they discovered (for example) that chemical compounds they thought were pure were in fact isotopic mixtures, or simply the wrong molecule (as analytical techniques improved). Or during a corporate takeover, they discovered that two companies using the same "natural" keys had as much as 10% differences in their multi-million-compound databases. These errors led to six-month to year-long delays, as each of the conflicting chemical records had to be examined by hand by a PhD chemist to reclassify it.

In other businesses, almost any natural identifier you pick is subject to simple typographical errors. When you discover the errors in a field you've used as a primary key, it can be quite hard to fix, particularly if you have distributed data across several systems and schemas.

We've always recommended to our customers that all primary keys be completely information free. They should not be based on any information or combination of information from the data records. Every time the customer has not followed this advice, they've later regretted it.

I'm sure there are situations where a natural key is appropriate, but I haven't seen it in my work.

Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
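To make the typographical-error point concrete, a tiny hypothetical example (table, column names, and values are invented): when the "natural" identifier is ordinary data rather than the primary key, correcting a typo is a single UPDATE, and nothing that references the row has to change.

-- compound_id is an information-free surrogate key; the registry number is just data.
CREATE TABLE compound (
    compound_id serial PRIMARY KEY,
    registry_no text,      -- "natural" identifier, subject to typos and reclassification
    name        text
);

-- Fixing a mistyped identifier touches exactly one row; every table that
-- references compound(compound_id) is unaffected.
UPDATE compound SET registry_no = '50-00-0' WHERE compound_id = 42;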
Re: [PERFORM] DELETE with filter on ctid
Spiegelberg, Greg wrote:
We have a query which generates a small set of rows (~1,000) which are to be used in a DELETE on the same table. The problem we have is that we need to join on 5 different columns and it takes far too long.

You may have encountered the same problem I did: You *must* run ANALYZE on a temporary table before you use it in another query. It's surprising that this is true even for very small tables (a few hundred to a few thousand rows), but it is. I had a case where I created a "scratch" table like yours, and the before/after ANALYZE performance was the difference between 30 seconds and a few milliseconds for the same query.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
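A minimal sketch of the pattern being suggested, with invented table and column names (the real query joins on five columns; two are shown here):

-- Build the scratch table of rows to delete.
CREATE TEMP TABLE doomed_rows AS
    SELECT call_id, line_no
    FROM call_detail
    WHERE call_time < '2007-01-01';

-- Without this, the planner guesses badly even for a tiny scratch table.
ANALYZE doomed_rows;

-- DELETE ... USING (available in 8.1 and later) joins against the scratch table.
DELETE FROM call_detail
USING doomed_rows d
WHERE call_detail.call_id = d.call_id
  AND call_detail.line_no = d.line_no;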
[PERFORM] Can't drop tablespace or user after disk gone
I had a 'scratch' database for testing, which I deleted, and then the disk went out. No problem, no precious data. But now I can't drop the tablespace, or the user who had that as the default tablespace.

I thought about removing the tablespace from pg_tablespaces, but it seems wrong to be monkeying with the system tables. I still can't drop the user, and can't drop the tablespace.

What's the right way to clear out Postgres when a disk fails and there's no reason to repair the disk?

Thanks,
Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [HACKERS] EXISTS optimization
Kevin Grittner wrote:
Management has simply given a mandate that the software be independent of OS and database vendor, and to use Java to help with the OS independence. ... we write all of our queries in ANSI SQL in our own query tool, parse it, and generate Java classes to run it.

A better solution, and one I've used for years, is to use OS- or database-specific features, but carefully encapsulate them in a single module, for example, "database_specific.java".

For example, when I started supporting both Oracle and Postgres, I encountered the MAX() problem, which (at the time) was very slow in Postgres, but could be replaced by "select X from MYTABLE order by X desc limit 1". So I created a function, "GetColumnMax()", that encapsulates the database-specific code for this. Similar functions encapsulate a number of other database-specific optimizations.

Another excellent example: I have a function called "TableExists(name)". To the best of my knowledge, there simply is no ANSI SQL for this, so what do you do? Encapsulate it in one place.

The result? When I port to a new system, I know exactly where to find all of the non-ANSI SQL. I started this habit years ago with C/C++ code, which has the same problem: System calls are not consistent across the variants of Unix, Windows, and other OS's. So you put them all in one file called "machine_dependent.c".

Remember the old adage: There is no such thing as portable code, only code that has been ported.

Cheers,
Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
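For what it's worth, the Postgres-specific SQL hiding behind wrappers like these might look roughly as follows; the wrapper names come from the post above, but the queries are one plausible implementation, not the original code:

-- Behind a hypothetical GetColumnMax(): avoids the old slow MAX() on pre-8.1 servers.
SELECT x FROM mytable ORDER BY x DESC LIMIT 1;

-- Behind a hypothetical TableExists('mytable'):
SELECT EXISTS (
    SELECT 1
    FROM information_schema.tables
    WHERE table_schema = 'public'
      AND table_name   = 'mytable'
);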
Re: [PERFORM] Performance of count(*)
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: Steve Atkins wrote: As long as you're ordering by some row in the table then you can do that in straight SQL. select a, b, ts from foo where (stuff) and foo > X order by foo limit 10 Then, record the last value of foo you read, and plug it in as X the next time around. We've been over this before in this forum: It doesn't work as advertised. Look for postings by me regarding the fact that there is no way to tell the optimizer the cost of executing a function. There's one, for example, on Oct 18, 2006. You mean http://archives.postgresql.org/pgsql-performance/2006-10/msg00283.php ? I don't see anything there that bears on Steve's suggestion. (The complaint is obsolete as of CVS HEAD anyway.) Mea culpa, it's October 8, not October 18: http://archives.postgresql.org/pgsql-performance/2006-10/msg00143.php The relevant part is this: "My example, discussed previously in this forum, is a classic. I have a VERY expensive function (it's in the class of NP-complete problems, so there is no faster way to do it). There is no circumstance when my function should be used as a filter, and no circumstance when it should be done before a join. But PG has no way of knowing the cost of a function, and so the optimizer assigns the same cost to every function. Big disaster. "The result? I can't use my function in any WHERE clause that involves any other conditions or joins. Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead. "The interesting thing is that PG works pretty well for me on big tables -- it does the join first, then applies my expensive functions. But with a SMALL (like 50K rows) table, it applies my function first, then does the join. A search that completes in 1 second on a 5,000,000 row database can take a minute or more on a 50,000 row database." Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
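This is not from the original exchange, but one workaround people have used for exactly this situation is to wrap the cheap conditions and joins in a subquery with OFFSET 0, which the 8.x planners treat as an optimization fence, so the expensive function can't be pushed down and used as an early filter. All table and function names here are hypothetical:

-- Do the join and cheap filters first, then apply the expensive function
-- to the (hopefully small) intermediate result.
SELECT *
FROM (
    SELECT t.id, t.smiles
    FROM small_table t
    JOIN other_table o ON o.id = t.id
    WHERE o.flag = true
    OFFSET 0            -- fence: keeps the subquery from being pulled up
) sub
WHERE expensive_match(sub.smiles, 'Brc1ccc2nc(cn2c1)C(=O)O');   -- hypothetical costly function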
Re: [PERFORM] Performance of count(*)
Steve Atkins wrote: As long as you're ordering by some row in the table then you can do that in straight SQL. select a, b, ts from foo where (stuff) and foo > X order by foo limit 10 Then, record the last value of foo you read, and plug it in as X the next time around. We've been over this before in this forum: It doesn't work as advertised. Look for postings by me regarding the fact that there is no way to tell the optimizer the cost of executing a function. There's one, for example, on Oct 18, 2006. I think the problem is more that most web developers aren't very good at using the database, and tend to fall back on simplistic, wrong, approaches to displaying the data. There's a lot of monkey-see, monkey-do in web UI design too, which doesn't help. Thanks, I'm sure your thoughtful comments will help me solve my problem. Somehow. ;-) Craig ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
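Spelled out a little more fully, the pattern Steve describes looks like this (table and column names are illustrative); whether it helps depends on having a stable ordering column, which is part of what Craig objects to:

-- First page:
SELECT a, b, ts
FROM foo
WHERE b = 'widget'                  -- the "(stuff)" in Steve's example
ORDER BY ts
LIMIT 10;

-- Next page: pass in the last ts value you displayed.
SELECT a, b, ts
FROM foo
WHERE b = 'widget'
  AND ts > '2007-03-22 13:30:00'    -- last ts from the previous page
ORDER BY ts
LIMIT 10;

In practice the ordering column needs to be unique (or you add a tie-breaker column to the ORDER BY and the WHERE clause) so that rows with equal ts values aren't skipped or repeated between pages.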
Re: [PERFORM] Performance of count(*)
Tino Wildenhain wrote:
You guys can correct me if I'm wrong, but the key feature that's missing from Postgres's flexible indexing is the ability to maintain state across queries. Something like this: select a, b, my_index_state() from foo where ... offset 100 limit 10 using my_index(prev_my_index_state);

Yes, you are wrong :-) The technique is called "CURSOR" if you maintain a persistent connection per session (e.g. stand-alone application or clever pooling web application).

That's my whole point: If relational databases had a simple mechanism for storing their internal state in an external application, the need for cursors, connection pools, and all those other tricks would be eliminated. As I said earlier, relational technology was invented in an earlier era, and hasn't caught up with the reality of modern web apps.

If it's a naive web application you just store your session in tables where you can easily maintain the scroll state as well.

One thing I've learned in 25 years of software development is that people who use my software have problems I never imagined. I've been the one who was naive when I said similar things about my customers, and was later embarrassed to learn that their problems were more complex than I ever imagined.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
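For completeness, the cursor approach Tino is referring to looks something like this (the query and names are illustrative). WITH HOLD lets the cursor outlive the transaction, although it still ties the web session to a live connection, which is exactly the difficulty being discussed:

BEGIN;
DECLARE page_cur CURSOR WITH HOLD FOR
    SELECT a, b, ts FROM foo ORDER BY ts;
COMMIT;                            -- a WITH HOLD cursor survives the commit

FETCH FORWARD 10 FROM page_cur;    -- rows 1-10
MOVE FORWARD 90 IN page_cur;       -- skip ahead without returning rows
FETCH FORWARD 10 FROM page_cur;    -- rows 101-110
CLOSE page_cur;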
Re: [PERFORM] Performance of count(*)
Tino Wildenhain wrote:
Craig A. James schrieb:
... In our case (for a variety of reasons, but this one is critical), we actually can't use Postgres indexing at all -- we wrote an entirely separate indexing system for our data...

...There is no need to store or maintain this information along with postgres when you can store and maintain it directly in postgres as well.

Whether we store our data inside or outside Postgres misses the point (in fact, most of our data is stored IN Postgres). It's the code that actually performs the index operation that has to be external to Postgres.

On top of that, postgres has a very flexible and extensible index system.

You guys can correct me if I'm wrong, but the key feature that's missing from Postgres's flexible indexing is the ability to maintain state across queries. Something like this:

select a, b, my_index_state() from foo where ... offset 100 limit 10 using my_index(prev_my_index_state);

The my_index_state() function would issue something like a "cookie", an opaque text or binary object that would record information about how it got from row 1 through row 99. When you issue the query above, it could start looking for row 100 WITHOUT reexamining rows 1-99.

This could be tricky in an OLTP environment, where the "cookie" could be invalidated by changes to the database. But in warehouse read-mostly or read-only environments, it could yield vastly improved performance for database web applications. If I'm not mistaken, Postgres can't do this (and neither can Oracle, MySQL, or any other RDBMS). I would love to be corrected.

The problem is that relational databases were invented before the web and its stateless applications. In the "good old days", you could connect to a database and work for hours, and in that environment cursors and such work well -- the RDBMS maintains the internal state of the indexing system. But in a web environment, state information is very difficult to maintain. There are all sorts of systems that try (Enterprise Java Beans, for example), but they're very complex.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance of count(*)
Brian Hurt wrote:
One of our biggest single problems is this very thing. It's not a Postgres problem specifically, but more embedded in the idea of a relational database: There are no "job status" or "rough estimate of results" or "give me part of the answer" features that are critical to many real applications.

For the "give me part of the answer", I'm wondering if cursors wouldn't work (and if not, why not)?

There is no mechanism in Postgres (or any RDB that I know of) to say, "Give me rows 1000 through 1010", that doesn't also execute the query on rows 1-1000. In other words, the RDBMS does the work for 1010 rows, when only 10 are needed -- 100 times more work than is necessary. Limit/Offset will return the correct 10 rows, but at the cost of doing the previous 1000 rows and discarding them.

Web applications are stateless. To use a cursor, you'd have to keep it around for hours or days, and create complex "server affinity" code to direct a user back to the same server of your server farm (where that cursor is being held), all on the chance that the user will come back and ask for rows 1000 through 1010. So a cursor isn't up to the task.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance of count(*)
Michael Stone wrote: On Thu, Mar 22, 2007 at 01:30:35PM +0200, [EMAIL PROTECTED] wrote: approximated count? why? who would need it? where you can use it? Do a google query. Look at the top of the page, where it says "results N to M of about O". For user interfaces (which is where a lot of this count(*) stuff comes from) you quite likely don't care about the exact count... Right on, Michael. One of our biggest single problems is this very thing. It's not a Postgres problem specifically, but more embedded in the idea of a relational database: There are no "job status" or "rough estimate of results" or "give me part of the answer" features that are critical to many real applications. In our case (for a variety of reasons, but this one is critical), we actually can't use Postgres indexing at all -- we wrote an entirely separate indexing system for our data, one that has the following properties: 1. It can give out "pages" of information (i.e. "rows 50-60") without rescanning the skipped pages the way "limit/offset" would. 2. It can give accurate estimates of the total rows that will be returned. 3. It can accurately estimate the time it will take. For our primary business-critical data, Postgres is merely a storage system, not a search system, because we have to do the "heavy lifting" in our own code. (To be fair, there is no relational database that can handle our data.) Many or most web-based search engines face these exact problems. Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Determining server load from client
Dan Harris wrote:
I've found that it would be helpful to be able to tell how busy my dedicated PG server is ... I have seen some other nice back-end things exposed through PG functions (e.g. database size on disk) and wondered if there was anything applicable to this.

I'd write a simple pg-perl function to do this. You can access operating-system calls to find out the system's load. But notice that you need "Untrusted Perl" to do this, so you can only do it on a system where you trust every application that connects to your database. Something like this:

create or replace function get_stats() returns text as '
  # /proc/loadavg holds the 1-, 5-, and 15-minute load averages plus process counts
  open(STAT, "</proc/loadavg");
  my @stats = <STAT>;
  close STAT;
  return join("", @stats);
' language plperlu;

See http://www.postgresql.org/docs/8.1/interactive/plperl-trusted.html

Craig

---(end of broadcast)---
TIP 6: explain analyze is your friend
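Used from SQL it would look something like this; split_part is just one way to pull the 1-minute load average out of the string the function returns:

-- assumes the plperlu function above was created by a superuser
SELECT get_stats();

-- first whitespace-separated field of /proc/loadavg is the 1-minute load average
SELECT split_part(get_stats(), ' ', 1) AS load_1min;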
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Carlos,

Now, yet another thing that you (Craig) seem to be missing: you're simply putting the expense of all this time under the expenses column in exchange for solving the particular problem...

More like I was trying to keep my response short ;-). I think we're all in agreement on pretty much everything:

1. Understand your problem
2. Find potential solutions
3. Find the technical, economic AND situational tradeoffs
4. Choose the best course of action

My original comment was directed at item #3. I was trying to remind everyone that a simple cost analysis may point to solutions that simply aren't possible, given business constraints. I know we also agree that we should constantly fight corporate stupidity and short-sighted budgetary oversight. But that's a second battle, one that goes on forever. Sometimes you just have to get the job done within the current constraints.

'Nuff said.

Craig

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Rodrigo Madera wrote: I would just like to note here that this is an example of inefficient strategy. We could all agree (up to a certain economical point) that Alex saved the most expensive one thousand dollars of his life. I don't know the financial status nor the size of your organization, but I'm sure that you have selected the path that has cost you more. In the future, an investment on memory for a (let's say) rather small database should be your first attempt. Alex may have made the correct, rational choice, given the state of accounting at most corporations. Corporate accounting practices and the budgetary process give different weights to cash and labor. Labor is fixed, and can be grossly wasted without (apparently) affecting the quarterly bottom line. Cash expenditures come directly off profits. It's shortsighted and irrational, but nearly 100% of corporations operate this way. You can waste a week of your time and nobody complains, but spend a thousand dollars, and the company president is breathing down your neck. When we answer a question on this forum, we need to understand that the person who needs help may be under irrational, but real, constraints, and offer appropriate advice. Sure, it's good to fight corporate stupidity, but sometimes you just want to get the system back online. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Estimate the size of the SQL file generated by pg_dump utility
Bill Moran wrote: I'm curious, what problem does the disclaimer cause? I wrote the following TOS for my personal system: https://www.potentialtech.com/cms/node/9 Excerpt of the relevant part: "If you send me email, you are granting me the unrestricted right to use the contents of that email however I see fit, unless otherwise agreed in writing beforehand. You have no rights to the privacy of any email that you send me. If I feel the need, I will forward emails to authorities or make their contents publicly available. By sending me email you consent to this policy and agree that it overrides any disclaimers or policies that may exist elsewhere." I have no idea if that's legally binding or not, but I've talked to a few associates who have some experience in law, and they all argue that email disclaimers probably aren't legally binding anyway -- so the result is undefined. No, it's not legally binding. Agreements are only binding if both parties agree, and someone sending you email has not consented to your statement. If I send you something with a copyright mark, you'd better respect it unless you have a signed agreement granting you rights. Federal law always wins. Disclaimers are bad for two reasons. First, they're powerless. Just because Acme Corp. attaches a disclaimer doesn't mean they've absolved themselves of responsibility for the actions of their employees. Second, they're insulting to the employees. It's a big red flag saying, "We, Acme Corp., hire clowns we don't trust, and THIS person may be one of them!" Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Identical Queries
Stephan Szabo wrote:
I tried posting to the bugs, and they said this is a better question for here. I have two queries. One runs in about 2 seconds. The other takes upwards of 2 minutes. I have a temp table that is created with 2 columns. This table is joined with the larger database of call detail records. However, these 2 queries are handled very differently.

Even for a temporary table, you should run ANALYZE on it after you fill it but before you query or join to it. I found out (the hard way) that a temporary table of just 100 rows will generate dramatically different plans before and after ANALYZE.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] How to debug performance problems
Ray, I'd like to have a toolbox prepared for when performance goes south. I'm clueless. Would someone mind providing some detail about how to measure these four items Craig listed: I hope I didn't give the impression that these were the only thing to look at ... those four items just popped into my head, because they've come up repeatedly in this forum. There are surely more things that could be suspect; perhaps others could add to your list. You can find the answers to each of the four topics I mentioned by looking through the archives of this list. It's a lot of work. It would be really nice if there was some full-time employee somewhere whose job was to monitor this group and pull out common themes that were put into a nice, tidy manual. But this is open-source development, and there is no such person, so you have to dig in and find it yourself. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] How to debug performance problems
Andreas Tille wrote:
My web application was running fine for years without any problem and the performance was satisfying. Some months ago I added a table containing 4500000 data rows ... Since about two weeks the application became *drastically* slower and I urgently have to bring back the old performance. As I said I'm talking about functions accessing tables that did not increase over several years and should behave more or less the same.

Don't assume that the big table you added is the source of the problem. It might be, but more likely it's something else entirely. You indicated that the problem didn't coincide with creating the large table.

There are a number of recurring themes on this discussion group:

* A long-running transaction keeps vacuum from working.
* A table grows just enough to pass a threshold in the planner and a drastically different plan is generated.
* An index has become bloated and/or corrupted, and you need to run the REINDEX command.

And several other common problems.

The first thing is to find out which query is taking a lot of time. I'm no expert, but there have been several explanations on this forum recently how to find your top time-consuming queries. Once you find them, then EXPLAIN ANALYZE should get you started.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
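As a concrete starting point for "find out which query is taking a lot of time", something along these lines works on the 8.x releases; it assumes stats_command_string is turned on so that current_query is populated:

-- Backends that have been working on the same statement for more than a minute.
SELECT procpid, usename, datname,
       now() - query_start AS runtime,
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
  AND now() - query_start > interval '1 minute'
ORDER BY runtime DESC;

Setting log_min_duration_statement in postgresql.conf is another common way to capture the slowest queries in the server log.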
Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
Craig A. James wrote: The "idiom" to replace count() was "select col from tbl order by col desc limit 1". It worked miracles for my app. Sorry, I meant to write, "the idiom to replace MAX()", not count()... MAX() was the function that was killing me, 'tho count() also gave me problems. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
Guy,

The application is fairly straightforward, but as you say, what is working okay with BigDBMS isn't working as well under PG. I'm going to try other configuration suggestions made by others before I attempt logic changes. The core logic is unchangeable; millions of rows of data in a single table will be updated throughout the day. If PG can't handle high volume updates well, this may be a brick wall.

Here are a couple things I learned.

ANALYZE is VERY important, surprisingly so even for small tables. I had a case last week where a temporary "scratch" table with just 100 rows was joined to two more tables of 6 and 12 million rows. You might think that a 100-row table wouldn't need to be analyzed, but it does: Without the ANALYZE, Postgres generated a horrible plan that took many minutes to run; with the ANALYZE, it took milliseconds. Any time a table's contents change dramatically, ANALYZE it, ESPECIALLY if it's a small table. After all, changing 20 rows in a 100-row table has a much larger effect on its statistics than changing 20 rows in a million-row table.

Postgres functions like count() and max() are "plug ins", which has huge architectural advantages. But in pre-8.1 releases, there was a big speed penalty for this: functions like count() were very, very slow, requiring a full table scan. I think this is vastly improved from 8.0x to 8.1 and forward; others might be able to comment whether count() is now as fast in Postgres as Oracle. The "idiom" to replace count() was "select col from tbl order by col desc limit 1". It worked miracles for my app.

Postgres has explicit garbage collection via VACUUM, and you have to design your application with this in mind. In Postgres, update is delete+insert, meaning updates create garbage. If you have very "wide" tables, but only a subset of the columns are updated frequently, put these columns in a separate table with an index to join the two tables. For example, my original design was something like this:

   integer primary key
   very large text column
   ... a bunch of integer columns, float columns, and small text columns

The properties were updated by the application, but the large text column never changed. This led to huge garbage-collection problems as the large text field was repeatedly deleted and reinserted by the updates. By separating these into two tables, one with the large text column, and the other table with the dynamic, but smaller, columns, garbage is massively reduced, and performance increased, both immediately (smaller data set to update) and long term (smaller vacuums). You can use views to recreate your original combined columns, so the changes to your app are limited to where updates occur.

If you have a column that is *frequently* updated (say, for example, a user's last-access timestamp each time s/he hits your web server) then you definitely want this in its own table, not mixed in with the user's name, address, etc.

Partitioning in Postgres is more powerful than in Oracle. Use it if you can.

Partial indexes are VERY nice in Postgres, if your data is poorly distributed (for example, a mostly-NULL column with a small percentage of very important values).

I'm sure there are more things that others can contribute.

Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
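A bare-bones sketch of the "split the wide table" idea described above; all names are invented for illustration, and the view at the end recreates the original combined shape for read-only code:

-- Static part: the big text column lives here and is rarely updated.
CREATE TABLE item_static (
    item_id  serial PRIMARY KEY,
    big_text text
);

-- Dynamic part: small, frequently-updated columns only, so each UPDATE
-- creates a small dead row instead of recopying the big text value.
CREATE TABLE item_dynamic (
    item_id   integer PRIMARY KEY REFERENCES item_static(item_id),
    score     float,
    hit_count integer,
    last_seen timestamp
);

-- Read-only code can keep selecting the original combined shape.
CREATE VIEW item AS
    SELECT s.item_id, s.big_text, d.score, d.hit_count, d.last_seen
    FROM item_static s
    JOIN item_dynamic d USING (item_id);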
Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
Guy Rouillier wrote:
I've got back access to my test system. I ran another test run with the same input data set. This time I put pg_xlog on a different RAID volume (the unused one that I suspect is a software RAID), and I turned fsync=off in postgresql.conf. I left the rest of the configuration alone (all foreign keys removed), etc. Unfortunately, this only dropped elapsed time down to about 28000 seconds (from 30000), still significantly more than BigDBMS. Additional info inline below.

Although tuning is extremely important, you also have to look at the application itself. I discovered (the hard way) that there's simply no substitute for a bit of redesign/rewriting of the schema and/or SQL statements.

Many of us who "grew up" on Oracle assume that their SQL is standard stuff, and that Oracle's optimizer is "the way it's done." But in fact most Oracle applications are tweaked and tuned to take advantage of Oracle's strengths and avoid its weaknesses. If you designed an application from the ground up to use Postgres, then migrated to Oracle, you would probably be equally frustrated by Oracle's poor performance on your Postgres-tuned application.

I don't know if you have access to the application's SQL, or the time to experiment a bit, but unless your schema is trivial and your SQL is boneheaded simple, you're not going to get equal performance from Postgres until you do some analysis of your application under real-world conditions, and optimize the problem areas.

In my case, I found just a few specific SQL constructs that, with a bit of tuning, made massive differences in performance.

Craig

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Trivial function query optimized badly
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: CREATE OR REPLACE FUNCTION cansmiles(text) RETURNS text AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles' LANGUAGE 'C' STRICT IMMUTABLE; Umm ... this is a single-argument function. db=> explain analyze select version_id, 'Brc1ccc2nc(cn2c1)C(=O)O' from version where version.isosmiles = cansmiles('Brc1ccc2nc(cn2c1)C(=O)O', 1); And this query is invoking some other, two-argument function; which apparently hasn't been marked IMMUTABLE, else it'd have been folded to a constant. Good catch, mystery solved. There are two definitions for this function, the first just a "wrapper" for the second with the latter parameter defaulting to "1". The second definition was missing the "IMMUTABLE" keyword. Thanks! Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
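If I'm reading this correctly, the fix amounted to adding IMMUTABLE to the two-argument declaration; the exact signature and entry point below are an assumption based on the description above, not the actual code:

-- hypothetical corrected declaration of the two-argument variant
CREATE OR REPLACE FUNCTION cansmiles(text, integer) RETURNS text
    AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles'
    LANGUAGE 'C' STRICT IMMUTABLE;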
Re: [PERFORM] Trivial function query optimized badly
Adam Rich wrote: Craig, What version of postgres are you using? I just tested this on PG 8.1.2 and was unable to reproduce these results. I wrote a simple function that returns the same text passed to it, after sleeping for 1 second. I use it in a where clause, like your example below, and regardless of the number of rows in the table, it still takes roughly 1 second, indicating to me the function is only called once. Sorry, I forgot that critical piece of info: I'm using 8.1.4. Your results would indicate that 8.1.2 creates a different plan than 8.1.4, or else there's some configuration parameter that's different between your installation and mine that causes a radically different plan to be used. I assume you vacuum/analyzed the table before you ran the query. Is it possible that your function really isn't immutable? Would PG realize this and fall back to treating it as VOLATILE ? Now that you say this, this seems more like a bug with the definition of IMMUTABLE. The function should only be called once if it's given a constant string, right? So the fact that Postgres called it once per row is just wrong. Craig ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Trivial function query optimized badly
Well, once again I'm hosed because there's no way to tell the optimizer the cost for a user-defined function. I know this issue has already been raised (by me!) several times, but I have to remind everyone about this. I frequently must rewrite my SQL to work around this problem.

Here is the function definition:

CREATE OR REPLACE FUNCTION cansmiles(text) RETURNS text
  AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles'
  LANGUAGE 'C' STRICT IMMUTABLE;

Here is the bad optimization:

db=> explain analyze select version_id, 'Brc1ccc2nc(cn2c1)C(=O)O' from version where version.isosmiles = cansmiles('Brc1ccc2nc(cn2c1)C(=O)O', 1);
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on version  (cost=0.00..23.41 rows=1 width=4) (actual time=1434.281..1540.253 rows=1 loops=1)
   Filter: (isosmiles = cansmiles('Brc1ccc2nc(cn2c1)C(=O)O'::text, 1))
 Total runtime: 1540.347 ms
(3 rows)

I've had to break it up into two separate queries. Ironically, for large databases, Postgres does the right thing -- it computes the function, then uses the index on the "isosmiles" column. It's blazingly fast and very satisfactory. But for small databases, it apparently decides to recompute the function once per row, making the query N times slower (N = number of rows) than it should be! In this instance, there are 1000 rows, and a factor of 10^4 is a pretty dramatic slowdown... To make it work, I had to call the function separately then use its result to do the select.

db=> explain analyze select cansmiles('Brc1ccc2nc(cn2c1)C(=O)O', 1);
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.692..1.694 rows=1 loops=1)
 Total runtime: 1.720 ms
(2 rows)

db=> explain analyze select version_id, 'Brc1ccc2nc(cn2c1)C(=O)O' from version where version.isosmiles = 'Brc1ccc2nc(cn2c1)C(=O)O';
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Index Scan using i_version_isosmiles on version  (cost=0.00..5.80 rows=1 width=4) (actual time=0.114..0.117 rows=1 loops=1)
   Index Cond: (isosmiles = 'Brc1ccc2nc(cn2c1)C(=O)O'::text)
 Total runtime: 0.158 ms
(3 rows)

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] glibc double-free error
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: I'm using pg_dump/pg_restore to quickly copy databases between servers. But my server keeps crashing when I run pg_restore: glibc detected *** double free or corruption (!prev): 0x0a00b1a0 What can I do to help diagnose this problem? Either dig into it yourself with gdb, or send me a not-too-large example dump file (off-list)... Hmmm ... after moving to our production server, four hours of work copying a dozen databases, there hasn't been a single glibc problem. The development server is Fedora Core 3, the productions server is Fedora Core 4. Unless it happens on FC4, I'm diagnosing that it's a glibc bug or incompatibility that was already fixed. Thanks, Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] glibc double-free error
I'm using pg_dump/pg_restore to quickly copy databases between servers. But my server keeps crashing when I run pg_restore:

  glibc detected *** double free or corruption (!prev): 0x0a00b1a0

Postgres: 8.1.4
Linux: 2.6.12-1.1381_FC3
glibc: 2.3.6-0.fc3.1
Server: Dell
CPU: Xeon 2.80GHz
Memory: 4 GB

This is pretty repeatable. Any particular pg_dump file that causes the crash will cause it every time it is used, and it happens with a lot of my databases.

What can I do to help diagnose this problem?

Craig

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] New to PostgreSQL, performance considerations
Ron wrote:
We are not going to get valuable contributions nor help people become more valuable to the community by "flaming them into submission". Let's support getting definitive evidence. No matter who brings it to the table ;-)

Thanks, Ron, for a voice of respect and reason. Since I first started using Usenet back in 1984, inexplicable rudeness has been a plague on otherwise civilized people. We're a community, and we're all in this to help one another. Sometimes we give good advice, and sometimes even those "wearing the mantle of authority" can make boneheaded comments. I know I do, and when it happens, I always appreciate it when I'm taken to task with good humor and tolerance.

When someone comes to this forum with an idea you disagree with, no matter how brash or absurd their claims, it's so easy to challenge them with grace and good humor, rather than chastising with harsh words and driving someone from our community. If you're right, you will have taught a valuable lesson to someone. And if on occasion a newcomer shows us something new, then we've all learned. Either way, we have a new friend and contributor to the community.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Bad iostat numbers
Alex Turner wrote: The problem I see with software raid is the issue of a battery backed unit: If the computer loses power, then the 'cache' which is held in system memory, goes away, and fubars your RAID. I'm not sure I see the difference. If data are cached, they're not written whether it is software or hardware RAID. I guess if you're writing RAID 1, the N disks could be out of sync, but the system can synchronize them once the array is restored, so that's no different than a single disk or a hardware RAID. If you're writing RAID 5, then the blocks are inherently error detecting/correcting, so you're still OK if a partial write occurs, right? I'm not familiar with the inner details of software RAID, but the only circumstance I can see where things would get corrupted is if the RAID driver writes a LOT of blocks to one disk of the array before synchronizing the others, but my guess (and it's just a guess) is that the writes to the N disks are tightly coupled. If I'm wrong about this, I'd like to know, because I'm using software RAID 1 and 1+0, and I'm pretty happy with it. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] When to vacuum a table?
Rod Taylor wrote: Just a minor clarification here: INSERT does not create dead rows, only UPDATE and DELETE do. Thus, if you only insert rows, you do not need to vacuum (although you probably need to analyze). Is there no real-time garbage collection at all in Postgres? And if so, is this because nobody has had time to implement garbage collection, or for a more fundamental reason, or because VACUUM is seen as sufficient? I'm just curious ... Vacuum has always seemed to me like an ugly wart on the pretty face of Postgres. (I say this even though I implemented an identical solution on a non-relational chemistry database system a long time ago. I didn't like it then, either.) Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Postgres server crash
You realize that it had to be turned on explicitly on IRIX, right? But don't let facts get in the way of a good rant... On the contrary, with Irix 4 and earlier it was the default, but it caused so many problems that SGI switched the default to OFF in IRIX 5. But because it had been available for so long, many important apps had come to rely on it, so most sites had to immediately re-enable virtual swap on every IRIX 5 server that came in. Admins just got used to doing it, so it became a "default" at most sites, and admins often couldn't be convinced to disable it for database server machines, because "That's our standard for IRIX configuration." I worked at a big molecular modeling/visualization company; our visualization programs *required* virtual swap, and our server programs *prohibited* virtual swap. Imagine how our sales people felt about that, telling customers that they'd have to buy two $30,000 machines just because of one kernel parameter. Of course, they didn't, and our server apps took the heat as being "unreliable." SGI called it "virtual swap" which I always thought was a hoot. You have virtual memory, which is really your swap space, and then virtual swap, which is some kind of dark hole... Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Postgres server crash
Michael Stone wrote:
At one point someone complained about the ability to configure, e.g., IRIX to allow memory overcommit. I worked on some large IRIX installations where full memory accounting would have required on the order of 100s of gigabytes of swap, due to large shared memory allocations.

These were mostly scientific and graphical apps where reliability took a back seat to performance and to program complexity. They would allocate 100's of GB of swap space rather than taking the time to design proper data structures. If the program crashed every week or two, no big deal -- just run it again. Overallocating memory is a valuable technique for such applications.

But overallocating memory has no place in a server environment. When memory overcommitment is allowed, it is impossible to write a reliable application, because no matter how carefully and correctly you craft your code, someone else's program that leaks memory like Elmer Fudd's rowboat after his shotgun goes off, can kill your well-written application. Installing Postgres on such a system makes Postgres unreliable.

Tom Lane wrote:
That might have been right when it was written (note the reference to a 2.2 Linux kernel), but it's 100% wrong now. [Setting /proc/sys/vm/overcommit_memory to] 0 is the default, not-safe setting.

I'm surprised that the Linux kernel people take such an uncritical view of reliability that they set, as *default*, a feature that makes Linux an unreliable platform for servers.

And speaking of SGI, this very issue was among the things that sank the company. As the low-end graphics cards ate into their visualization market, they tried to become an Oracle Server platform. Their servers were *fast*. But they crashed -- a lot. And memory-overcommit was one of the reasons. IRIX admins would brag that their systems only crashed every couple of weeks. I had HP and Sun systems that would run for years.

Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Postgres server crash
Richard Troy wrote:
I did that - spent about two f-ing hours looking for what I wanted. (Guess I entered poor choices for my searches. -frown- ) There are a LOT of articles that TALK ABOUT OOM, but precious few actually tell you what you can do about it.

Trying to save you some time: On linux you can use the sysctl utility to muck with vm.overcommit_memory; you can disable the "feature." Google _that_ for more info!

Here's something I found googling for "memory overcommitment"+linux:

http://archives.neohapsis.com/archives/postfix/2000-04/0512.html

From /usr/src/linux/Documentation/sysctl/vm.txt:

"overcommit_memory: This value contains a flag that enables memory overcommitment. When this flag is 0, the kernel checks before each malloc() to see if there's enough memory left. If the flag is nonzero, the system pretends there's always enough memory."

This flag is settable in /proc/sys/vm. Lo and behold, here it is on my system:

$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio
50

Craig

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Postgres server crash
By the way, in spite of my questions and concerns, I was *very* impressed by the recovery process. I know it might seem like old hat to you guys to watch the WAL in action, and I know on a theoretical level it's supposed to work, but watching it recover 150 separate databases, and find and fix a couple of problems was very impressive. It gives me great confidence that I made the right choice to use Postgres. Richard Huxton wrote: 2. Why didn't the database recover? Why are there two processes that couldn't be killed? I'm guessing it didn't recover *because* there were two processes that couldn't be killed. Responsibility for that falls to the operating-system. I've seen it most often with faulty drivers or hardware that's being communicated with/written to. However, see below. It can't be a coincidence that these were the only two processes in a SELECT operation. Does the server disable signals at critical points? I'd make a wild guess that this is some sort of deadlock problem -- these two servers have disabled signals for a critical section of SELECT, and are waiting for something from the postmaster, but postmaster is dead. This is an ordinary system, no hardware problems, stock RH FC3 kernel, stock PG 8.1.4, with 4 GB memory, and at the moment the database is running on a single SATA disk. I'm worried that a production server can get into a state that requires manual intervention to recover. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Postgres server crash
Russell Smith wrote: For the third time today, our server has crashed... I would guess it's the linux OOM if you are running linux. You need to turn off killing of processes when you run out of memory. Are you getting close to running out of memory? Good suggestion, it was a memory leak in an add-on library that we plug in to the Postgres server. OOM? Can you give me a quick pointer to what this acronym stands for and how I can reconfigure it? It sounds like a "feature" old UNIX systems like SGI IRIX had, where the system would allocate virtual memory that it didn't really have, then kill your process if you tried to use it. I.e. malloc() would never return NULL even if swap space was over allocated. Is this what you're talking about? Having this enabled on a server is deadly for reliability. Thanks, Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Postgres server crash
For the third time today, our server has crashed, or frozen, actually something in between. Normally there are about 30-50 connections because of mod_perl processes that keep connections open. After the crash, there are three processes remaining: # ps -ef | grep postgres postgres 23832 1 0 Nov11 pts/1 00:02:53 /usr/local/pgsql/bin/postmaster -D /postgres/main postgres 1200 23832 20 14:28 pts/1 00:58:14 postgres: pubchem pubchem 66.226.76.106(58882) SELECT postgres 4190 23832 25 14:33 pts/1 01:09:12 postgres: asinex asinex 66.226.76.106(56298) SELECT But they're not doing anything: No CPU time consumed, no I/O going on, no progress. If I try to connect with psql(1), it says: psql: FATAL: the database system is in recovery mode And the server log has: LOG: background writer process (PID 23874) was terminated by signal 9 LOG: terminating any other active server processes LOG: statistics collector process (PID 23875) was terminated by signal 9 WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally ... repeats about 50 times, one per process. Questions: 1. Any idea what happened and how I can avoid this? It's a *big* problem. 2. Why didn't the database recover? Why are there two processes that couldn't be killed? 3. Where did the "signal 9" come from? (Nobody but me ever logs in to the server machine.) Help! Thanks, Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Hundreds of database and FSM
A few months ago a couple guys got "bragging rights" for having the most separate databases. A couple guys claimed several hundred databases and one said he had several thousand databases. The consensus was that Postgres has no problem handling many separate databases. I took that to heart and redesigned our system; we now have about 150 "primary data sources" that are used to build a couple of "warehouses" that our customers actually search. Each database has about 20 tables. The total size (all databases and all tables together) is not huge, about 40 million rows. Eventually the warehouse (customer accessible) databases will be moved to separate servers, configured and indexed specifically for the task. The only problem I've encountered is messages in the log: NOTICE: number of page slots needed (131904) exceeds max_fsm_pages (100000) HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 131904. So I dutifully followed this advice: max_fsm_pages = 320000 max_fsm_relations = 20000 This is based on our current 150 databases times 20 tables, or 3000 tables total. But I wasn't sure if sequences count as "relations", which would double the number. So I set it at 20K relations to allow for growth. Is there anything else I need to worry about? What happens if I go to, say, 500 databases (aside from increasing the FSM numbers even more)? 1000 databases? The servers are 4 GB, dual Xeon, Postgres 8.1.4 on Linux FC4. Thanks, Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
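A quick way to sanity-check the relation count behind a max_fsm_relations setting like this is to ask pg_class directly ('r' = table, 'i' = index, 'S' = sequence, 't' = TOAST table). This is only a rough guide, and note that the free space map tracks tables and indexes, not sequences:

    SELECT relkind, count(*)
      FROM pg_class
     GROUP BY relkind
     ORDER BY relkind;

Since pg_class is per-database, run it in each database and add the totals together.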
Re: [PERFORM] Best COPY Performance
Spiegelberg, Greg wrote: The data are on two different computers, and I do processing of the data as it passes through the application. Otherwise, the INSERT INTO ... SELECT is my first choice. Would dblink() help in any way? It might if perl wasn't so damned good at this. ;-) Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
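For anyone curious what the dblink() route would look like, a minimal sketch is below, assuming contrib/dblink is installed on the receiving side; the connection string, table names, and column names are all placeholders. It avoids the round trip through the client, but it also gives up the chance to transform each row in Perl along the way:

    INSERT INTO local_copy (id, payload)
    SELECT id, payload
      FROM dblink('host=otherhost dbname=sourcedb user=copyuser',
                  'SELECT id, payload FROM remote_table')
           AS t(id integer, payload text);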
Re: [PERFORM] Best COPY Performance
Jim C. Nasby wrote: Wait... so you're using perl to copy data between two tables? And using a cursor to boot? I can't think of any way that could be more inefficient... What's wrong with a plain old INSERT INTO ... SELECT? Or if you really need to break it into multiple transaction blocks, at least don't shuffle the data from the database into perl and then back into the database; do an INSERT INTO ... SELECT with that same where clause. The data are on two different computers, and I do processing of the data as it passes through the application. Otherwise, the INSERT INTO ... SELECT is my first choice. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Best COPY Performance
Jim C. Nasby wrote: Well, given that perl is using an entire CPU, it sounds like you should start looking either at ways to remove some of the overhead from perl, or to split that perl into multiple processes. I use Perl for big database copies (usually with some processing/transformation along the way) and I've never seen 100% CPU usage except for brief periods, even when copying BLOBS and such. My typical copy divides operations into blocks, for example doing N = 0 while (more rows to go) { begin transaction select ... where primary_key > N order by primary_key limit 1000 while (fetch a row) insert into ... N = (highest value found in last block) commit } Doing it like this in Perl should keep Postgres busy, with Perl using only moderate resources. If you're seeing high Perl CPU usage, I'd look first at the Perl code. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
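Spelled out in SQL, one iteration of that block loop is nothing more exotic than the sketch below (table and column names are invented, and 42000 stands for whatever key value the previous block ended on). Because each block is keyed on the primary key rather than an OFFSET, every block is a cheap index range scan no matter how deep into the table you are:

    BEGIN;
    SELECT primary_key, payload
      FROM source_table
     WHERE primary_key > 42000
     ORDER BY primary_key
     LIMIT 1000;
    -- the client transforms each row, INSERTs it into the target table,
    -- remembers the highest primary_key it saw, then COMMITs and repeats
    COMMIT;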
Re: [PERFORM] Best COPY Performance
Jim C. Nasby wrote: http://stats.distributed.net used to use a perl script to do some transformations before loading data into the database. IIRC, when we switched to using C we saw 100x improvement in speed, so I suspect that if you want performance perl isn't the way to go. I think you can compile perl into C, so maybe that would help some. I use Perl extensively, and have never seen a performance problem. I suspect the perl-to-C "100x improvement" was due to some other factor, like a slight change in the schema, indexes, or the fundamental way the client (C vs Perl) handled the data during the transformation, or just plain bad Perl code. Modern scripting languages like Perl and Python make programmers far, far more productive than the bad old days of C/C++. Don't shoot yourself in the foot by reverting to low-level languages like C/C++ until you've exhausted all other possibilities. I only use C/C++ for intricate scientific algorithms. In many cases, Perl is *faster* than C/C++ code that I write, because I can't take the time (for example) to write the high-performance string manipulation that has been fine-tuned and extensively optimized in Perl. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Optimization of this SQL sentence
Alexander Staubo wrote: On Oct 17, 2006, at 17:10, Craig A. James wrote: These tables are particularly egregious examples of ignorant database design. You need to understand the relational model ... This email is a *particularly* egregious example of rudeness. You owe Mr. Staubo, and the Postgres community, an apology. I'm sorry you feel that way, but I don't think I was out of line. ... If you think the database design in question is *not* ignorant database design, please do explain why, but on technical grounds. (Ignorance, of course, is not a sin.) This is not about design. It's about someone who came for help, and got a derogatory remark. Is it really so hard to be helpful *and* use polite, encouraging language? Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Optimization of this SQL sentence
These tables are particularly egregious examples of ignorant database design. You need to understand the relational model ... This email is a *particularly* egregious example of rudeness. You owe Mr. Staubo, and the Postgres community, an apology. There is absolutely no reason to insult people who come to this forum for help. That's why the forum is here, to help people who are "ignorant" and want to improve their knowledge. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Hints proposal
So let's cut to the bone: If someone thinks a proposal is a bad idea, and they're volunteering their time on an open-source project, why would they implement the proposal? In all the heat and smoke, I believe there are two basic conclusions we all agree on. 1. Optimizer: a) A perfect optimizer would be a wonderful thing b) Optimization is a hard problem c) Any problem that can be solved by improving the optimizer *should* be solved by improving the optimizer. 2. Hints a) On an aesthetic/theoretical level, hints suck. They're ugly and rude b) On a practical level, introducing hints will cause short- and long-term problems c) Hints would help DBAs solve urgent problems for which there is no other solution The disagreements revolve around the degree to which 1 conflicts with 2. 1. Developers feel very strongly about 2(a) and 2(b). 2. DBAs "in the trenches" feel very strongly about 2(c). So my question is: Is there any argument that can be made to persuade those of you who are volunteering your time on the optimizer to even consider a HINTS proposal? Has all this discussion changed your perspective on 2(c), and why it really matters to some of us? Are we just wasting our time, or is this a fruitful discussion? Thanks, Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Hints proposal
Josh Berkus wrote: I actually think the way to attack this issue is to discuss the kinds of errors the planner makes, and what tweaks we could do to correct them. Here's the ones I'm aware of: -- Incorrect selectivity of WHERE clause -- Incorrect selectivity of JOIN -- Wrong estimate of rows returned from SRF -- Incorrect cost estimate for index use Can you think of any others? The one that started this discussion: Lack of cost information for functions. I think this feature is a good idea independent of the whole HINTS discussion. At a minimum, a rough categorization is needed, such as "Lightning fast / Fast / Medium / Slow / Ludicrously slow", with some sort of milliseconds or CPU cycles associated with each category. Or perhaps something like, "This is (much faster|faster|same as|slower|much slower) than reading a block from the disk." If I understand Tom and others, the planner already is capable of taking advantage of this information, it just doesn't have it yet. It could be part of the CREATE FUNCTION command. CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text AS '/usr/local/pgsql/lib/foobar.so', 'foobar' COST LUDICROUSLY_SLOW LANGUAGE 'C' STRICT; Better yet ('tho I have no idea how hard this would be to implement...) would be an optional second function with the same parameter signature as the main function, but it would return a cost estimate: CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text AS '/usr/local/pgsql/lib/foobar.so', 'foobar' COST foobar_cost LANGUAGE 'C' STRICT; The planner could call it with the same parameters it was about to use, and get an accurate estimate for the specific operation that is about to be done. In my particular case (running an NP-complete problem), there are cases where I can determine ahead of time that the function will be fast, but in most cases it is *really* slow. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
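As it happens, PostgreSQL 8.3 and later accept a numeric COST clause (and a ROWS estimate for set-returning functions) on CREATE FUNCTION; the number is measured in units of cpu_operator_cost, with a default of 1 for C-language functions and 100 for procedural-language functions. In that syntax the example above becomes something like the sketch below, where 100000 is only an illustrative value:

    CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text
        AS '/usr/local/pgsql/lib/foobar.so', 'foobar'
        LANGUAGE C STRICT
        COST 100000;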
Re: [PERFORM] Simple join optimized badly?
Brian Herlihy wrote: PG does support hints actually.. The only thing is, the hints are expressed in an obscure, ad-hoc and implementation-dependent language. For example, the "Don't use index X" hint (the one I used) can be accessed by replacing your index with an index on values derived from the actual index... And then there's select ... from (select ... offset 0) where the "offset 0" prevents any rewriting between the two levels of query. This replaces joins and AND clauses where the planner makes the wrong choice of join order or filtering. I grepped my code and found four of these (all workarounds for the same underlying problem). Imagine I got run over by a train, and someone was reading my code. Which would be easier for them to maintain: Code with weird SQL, or code with sensible, well-written SQL and explicit hints? Luckily for my (hypothetical, I hope) successor, I put massive comments in my code explaining the strange SQL. The bad applications are ALREADY HERE. And they're WORSE to maintain than if we had a formal hint language. The argument that hints lead to poor applications is true. But lack of hints leads to worse applications. Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
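For anyone who hasn't run into it, the "offset 0" trick looks roughly like the sketch below (table and column names invented for the example). The planner will not flatten a subquery containing OFFSET or LIMIT into the outer query, so the filter inside the subquery is evaluated where it is written instead of being reordered:

    SELECT b.id, b.payload
      FROM (SELECT id
              FROM small_table
             WHERE expensive_condition
            OFFSET 0) AS s
      JOIN big_table b ON b.id = s.id;

It works, but it is a side effect of the subquery-flattening rules rather than a documented, supported hint, which is exactly the maintenance problem being described.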
Re: [PERFORM] Simple join optimized badly?
Mark Kirkwood wrote: The result? I can't use my function in any WHERE clause that involves any other conditions or joins. Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead. this is an argument for cost-for-functions rather than hints AFAICS. Perhaps you scanned past what I wrote a couple paragraphs farther down. I'm going to repeat it because it's the KEY POINT I'm trying to make: Craig James wrote: Now you might argue that function-cost needs to be added to the optimizer's arsenal of tricks. And I'd agree with you: That WOULD be a better solution than hints. But I need my problem solved TODAY, not next year. Hints can help solve problems NOW that can be brought to the PG team's attention later, and in the mean time let me get my application to work. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Simple join optimized badly?
Bruce Momjian wrote: I can do 100! on my computer, but can't do it in my head. A poor example. 100! is a simple repetitive calculation, something computers are very good at. Optimizing an SQL query is very difficult, and a completely different class of problem. The fact is the PG team has done a remarkable job with the optimizer so far. I'm usually very happy with its plans. But humans still beat computers at many tasks, and there are unquestionably areas where the PG optimizer is not yet fully developed. When the optimizer reaches its limits, and you have to get your web site running, a HINT can be invaluable. I said something in a previous version of this topic, which I'll repeat here. The PG documentation for HINTs should be FILLED with STRONG ADMONITIONS to post the problematic queries here before resorting to hints. There will always be fools who abuse hints. Too bad for them, but don't make the rest of us suffer for their folly. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Simple join optimized badly?
... and add 100 other problems. Hints are used because the DBA thinks that they are smarter than the optimizer; 99% of the time, they are wrong. Just try manually optimizing a complex query, you'll see -- with three join types, several scan types, aggregates, bitmaps, internal and external sorts, and the ability to collapse subqueries it's significantly more than a human can figure out accurately. Sorry, this is just wrong, wrong, wrong. I've heard this from several PG developers every time hints have come up in my roughly eighteen months as a PG application developer. And in between every assertion that "the application programmers aren't as smart as the optimizer", there are a dozen or two examples where posters to this list are told to increase this setting, decrease that one, adjust these other two, and the end result is to get the plan that the application programmer -- AND the PG professionals -- knew was the right plan to start with. People are smarter than computers. Period. Now I'll agree that the majority, perhaps the great majority, of questions to this group should NOT be solved with hints. You're absolutely right that in most cases hints are a really bad idea. People will resort to hints when they should be learning better ways to craft SQL, and when they should have read the configuration guides. But that doesn't alter the fact that many, perhaps most, complicated application will, sooner or later, run into a showstopper case where PG just optimizes wrong, and there's not a damned thing the app programmer can do about it. My example, discussed previously in this forum, is a classic. I have a VERY expensive function (it's in the class of NP-complete problems, so there is no faster way to do it). There is no circumstance when my function should be used as a filter, and no circumstance when it should be done before a join. But PG has no way of knowing the cost of a function, and so the optimizer assigns the same cost to every function. Big disaster. The result? I can't use my function in any WHERE clause that involves any other conditions or joins. Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead. The interesting thing is that PG works pretty well for me on big tables -- it does the join first, then applies my expensive functions. But with a SMALL (like 50K rows) table, it applies my function first, then does the join. A search that completes in 1 second on a 5,000,000 row database can take a minute or more on a 50,000 row database. Instead, I have to separate the WHERE terms into two SQL statements, and do the join myself. I do the first half of my query, suck it all into memory, do the second half, suck it into memory, build a hash table and join the two lists in memory, then take the joined results and apply my function to it. This is not how a relational database should work. It shouldn't fall over dead just when a table's size SHRINKS beyond some threshold that causes the planner to switch to a poor plan. Since these tables are all in the same database, adjusting configuration parameters doesn't help me. And I suppose I could use SET to disable various plans, but how is that any different from a HINT feature? Now you might argue that function-cost needs to be added to the optimizer's arsenal of tricks. And I'd agree with you: That WOULD be a better solution than hints. But I need my problem solved TODAY, not next year. 
Hints can help solve problems NOW that can be brought to the PG team's attention later, and in the mean time let me get my application to work. Sorry if I seem particularly hot under the collar on this one. I think you PG designers have created a wonderful product. It's not the lack of hints that bothers me, it's the "You app developers are dumber than we are" attitude. We're not. Some of us know what we're doing, and we need hints. If it is just a matter of resources, that's fine. I understand that these things take time. But please don't keep dismissing the repeated and serious requests for this feature. It's important. Thanks for listening. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Simple join optimized badly?
I have two tables, SAMPLE and HITLIST that when joined, generate a monsterous sort. HITLIST_ROWS has about 48,000 rows SAMPLE has about 16 million rows The joined column is indexed in SAMPLE HITLIST_ROWS is a scratch table which is used a few times then discarded. HITLIST_ROWS has no indexes at all There are two plans below. The first is before an ANALYZE HITLIST_ROWS, and it's horrible -- it looks to me like it's sorting the 16 million rows of the SEARCH table. Then I run ANALYZE HITLIST_ROWS, and the plan is pretty decent. First question: HITLIST_ROWS so small, I don't understand why the lack of ANALYZE should cause SAMPLE's contents to be sorted. Second question: Even though ANALYZE brings it down from 26 minutes to 47 seconds, a huge improvement, it still seems slow to me. Its going at roughly 1 row per millisecond -- are my expectations too high? This is a small-ish Dell computer (Xeon), 4 GB memory, with a four-disk SATA software RAID0 (bandwidth limited to about 130 MB/sec due to PCI cards). Other joins of a similar size seem much faster. It looks like I'll need to do an ANALYZE every time I modify HITLIST_ROWS, which seems like a waste because HITLIST_ROWS is rarely used more than once or twice before being truncated and rebuilt with new content. (HITLIST_ROWS can't be an actual temporary table, though, because it's a web application and each access is from a new connection.) This is Postgres 8.0.3. (We're upgrading soon.) Thanks, Craig explain analyze select t.SAMPLE_ID from SAMPLE t, HITLIST_ROWS ph where t.VERSION_ID = ph.ObjectID); QUERY PLAN --- Merge Join (cost=4782.35..1063809.82 rows=613226 width=4) (actual time=174.212..1593886.582 rows=176294 loops=1) Merge Cond: ("outer".version_id = "inner".objectid) -> Index Scan using i_sample_version_id on sample t (cost=0.00..1008713.68 rows=16446157 width=8) (actual time=0.111..1571911.208 rows=16446157 loops=1) -> Sort (cost=4782.35..4910.39 rows=51216 width=4) (actual time=173.669..389.496 rows=176329 loops=1) Sort Key: ph.objectid -> Seq Scan on hitlist_rows_378593 ph (cost=0.00..776.16 rows=51216 width=4) (actual time=0.015..90.059 rows=48834 loops=1) Total runtime: 1594093.725 ms (7 rows) chmoogle2=> analyze HITLIST_ROWS; ANALYZE chmoogle2=> explain analyze select t.SAMPLE_ID from SAMPLE t, HITLIST_ROWS ph where t.VERSION_ID = ph.ObjectID; QUERY PLAN --- Hash Join (cost=874.43..457976.83 rows=584705 width=4) (actual time=302.792..47796.719 rows=176294 loops=1) Hash Cond: ("outer".version_id = "inner".objectid) -> Seq Scan on sample t (cost=0.00..369024.57 rows=16446157 width=8) (actual time=46.344..26752.343 rows=16446157 loops=1) -> Hash (cost=752.34..752.34 rows=48834 width=4) (actual time=149.548..149.548 rows=0 loops=1) -> Seq Scan on hitlist_rows_378593 ph (cost=0.00..752.34 rows=48834 width=4) (actual time=0.048..80.721 rows=48834 loops=1) Total runtime: 47988.572 ms (6 rows) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] RAID 0 not as fast as expected
Alan Hodgson wrote: On Thursday 14 September 2006 11:05, "Craig A. James" <[EMAIL PROTECTED]> wrote: I'm experiment with RAID, looking for an inexpensive way to boost performance. I bought 4 Seagate 7200.9 120 GB SATA drives and two SIIG dual-port SATA cards. (NB: I don't plan to run RAID 0 in production, probably RAID 10, so no need to comment on the failure rate of RAID 0.) Are those PCI cards? If yes, it's just a bus bandwidth limit. Ok, that makes sense. One SATA disk = 52 MB/sec 4-disk RAID0 = 106 MB/sec PCI at 33 MHz x 32 bits (4 bytes) = 132 MB/sec. I guess getting to 80% of the theoretical speed is as much as I should expect. Thanks, Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] RAID 0 not as fast as expected
I'm experimenting with RAID, looking for an inexpensive way to boost performance. I bought 4 Seagate 7200.9 120 GB SATA drives and two SIIG dual-port SATA cards. (NB: I don't plan to run RAID 0 in production, probably RAID 10, so no need to comment on the failure rate of RAID 0.) I used this raw serial-speed test: time sh -c "dd if=/dev/zero of=./bigfile bs=8k count=100 && sync" (unmount/remount) time sh -c "dd if=./bigfile of=/dev/null bs=8k count=100 && sync" Which showed that the RAID 0 4-disk array was almost exactly twice as fast as each disk individually. I expected 4X performance for a 4-disk RAID 0. My suspicion is that each of these budget SATA cards is bandwidth limited; they can't actually handle two disks simultaneously, and I'd need to get four separate SATA cards to get 4X performance (or a more expensive card such as the Areca someone mentioned the other day). On the other hand, it "feels like" (using our application) the seek performance is quite a bit better, which I'd expect given my hypothesis about the SIIG cards. I don't have concrete benchmarks on seek speed. Thanks, Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Poor performance on seq scan
tin tout KB/t tps MB/s KB/t tps MB/s us ni sy in id 1 14 128.00 1 0.10 128.00 1 0.10 5 0 94 1 0 0 12 123.98 104 12.56 123.74 104 12.56 8 0 90 2 0 0 12 125.66 128 15.75 125.26 128 15.68 10 0 85 6 0 0 12 124.66 129 15.67 124.39 129 15.64 12 0 85 3 0 0 12 117.13 121 13.87 117.95 121 13.96 12 0 84 5 0 0 12 104.84 118 12.05 105.84 118 12.19 10 0 87 2 0 Why is that showing 85+ percent *system* CPU time?? I could believe a lot of idle CPU if the query is I/O bound, or a lot of user time if PG was being a hog about doing the ~~ comparisons (not too unlikely BTW). But if the kernel is eating all the CPU, there's something very wrong, and I don't think it's Postgres' fault. There IS a bug for SATA disk drives in some versions of the Linux kernel. On a lark I ran some of the I/O tests in this thread, and much to my surprise discovered my write speed was 6 MB/sec ... ouch! On an identical machine, different kernel, the write speed was 54 MB/sec. A couple of hours of research turned up this: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=168363 The fix for me was to edit /boot/grub/grub.conf, like this: kernel /vmlinuz-2.6.12-1.1381_FC3 ro root=LABEL=/ rhgb quiet \ ramdisk_size=1200 ide0=noprobe ide1=noprobe Notice the "ideX=noprobe". Instant fix -- after reboot the disk write speed jumped to what I expected. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Moving a tablespace
Suppose, hypothetically of course, someone lacked foresight, and put a tablespace somewhere with a dumb name, like "/disk2", instead of using a symbolic link with a more descriptive name. And then /disk2 needs to be renamed, say to "/postgres_data", and this (hypothetical) DBA realizes he has made a dumb mistake. Is there a way to move a tablespace to a new location without a dump/restore? I, er, this hypothetical guy, knows he can move it and put a symbolic link in for /disk2, but this is somewhat unsatisfactory since "/disk2" would have to exist forever. Thanks, Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Forcing using index instead of sequential scan?
Michael Stone wrote: On Sat, Jul 22, 2006 at 10:26:53AM -0700, Craig A. James wrote: This causes massive file-system activity and flushes all files that the kernel has cached. If you run this between each Postgres test (let it run for a couple minutes), it gives you an apples-to-apples comparison between successive benchmarks, and eliminates the effects of caching. Assuming a system with small ram or an unusually large system installation. Unmounting is a much more realiable mechanism. Indeed, but it only works if you can. For example, in my small-ish installation, my WAL and system tables are mounted on the root disk. Or someone might not have super-user access. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Forcing using index instead of sequential scan?
The real issue here is caching across successive queries, an effect that Postgres doesn't deal with very well at the moment. If you run these queries from a standing start (freshly booted machine) you'll likely find that the indexscan plan is indeed slower than the seqscan/hash plan, just like the planner thinks. Here's a little trick I learned to speed up this test. find / -type f -exec grep foobar {} \; This causes massive file-system activity and flushes all files that the kernel has cached. If you run this between each Postgres test (let it run for a couple minutes), it gives you an apples-to-apples comparison between successive benchmarks, and eliminates the effects of caching. If you run this as a regular user (NOT super-user or 'postgres'), you won't have permission to access your Postgres files, so you're guaranteed they'll be flushed from the cache. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] increment Rows in an SQL Result Set postgresql
Hassan, 1. I have a function that given two parameters produces an arbitrary id, and text. However arbitrary the id and text are, they are in a certain order. i.e. it is imperative that whatever processing I do, the order is preserved. What type of function is this? Did you write it in C? An SQL procedure? If the function is written in C, you can create a static local variable which you increment every time you call your function, and which you return along with your other two values. As long as your client is connected to the back-end server, you're guaranteed that it's a single process, and it's not multi-threaded, so this is a safe approach. However, note that if you disconnect and reconnect, your counter will be reset to zero. If your function is written in a different language or is a procedure, you might create a sequence that your function can query. The trick is that it is the function itself that must return the incremented value, i.e. you must return three, not two, values from your function. That way, you're not relying on any specific features of the planner, so your three values will stick together. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
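A sketch of the sequence-based variant, assuming a set-returning function gen_rows(a, b) declared to return SETOF record and yielding (id, txt) pairs; all names here are invented, only the pattern matters. The query restarts a scratch sequence and numbers the rows as they are produced, so the counter travels with the data:

    CREATE SEQUENCE gen_rows_counter;

    SELECT setval('gen_rows_counter', 1, false);   -- restart numbering at 1
    SELECT nextval('gen_rows_counter') AS ord, g.id, g.txt
      FROM gen_rows('param1', 'param2') AS g(id integer, txt text);

If this result is then pushed through further sub-selects the server is free to reorder it, so anything downstream that cares about the original order should sort on ord explicitly.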
Re: [PERFORM] Kill a session
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: Bottom line is that I was expecting "instant death" with SIGTERM, but instead got an agonizing, drawn out -- but safe -- death of the query. What was the query exactly? Our expectation is that all or at least most queries should respond to SIGINT or SIGTERM interrupts pretty rapidly, say on a less-than-a-second timescale. However there are various loops in the backend that fail to execute CHECK_FOR_INTERRUPTS sufficiently often :-(. We've been gradually finding and fixing these, and will be glad to fix your case if you provide enough details to pin it down. You might be interested in this current thread about a similar problem: http://archives.postgresql.org/pgsql-patches/2006-07/msg00039.php Thanks, this is good information. The qsort is a distinct possibility. The query is a big insert into some_hitlist (select id from another_hitlist join data_table on (...)) where the hitlists are unindexed. So it may be using a merge-join with qsort. When I have a few minutes, I'll turn on logging in the app and find the exact SQL, and run an EXPLAIN ANALYZE and see what's really happening. It's also possible that the INSERT itself is the problem, or adds to the problem. The SIGINT may come after a few million rows have been inserted, so it would have to clean that up, right? Thanks, Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Kill a session
Steinar H. Gunderson wrote: On Wed, Jul 12, 2006 at 08:43:18AM -0700, Craig A. James wrote: Then you killed the wrong backend... No queries run in postmaster. They all run in postgres backends. The postmaster does very little actual work, other than keeping track of everybody else. It turns out I was confused by this: ps(1) reports a process called "postgres", but top(1) reports a process called "postmaster", but they both have the same pid. I guess postmaster replaces its own name in the process table when it's executing a query, and it's not really the postmaster even though top(1) calls it postmaster. So "kill -15 <pid>" is NOT killing the process -- to kill the process, I have to use signal 9. But if I do that, ALL queries in progress are aborted. I might as well shut down and restart the database, which is an unacceptable solution for a web site. I don't follow your logic here. If you do "kill -15 <pid>" of the postmaster doing the work, the query should be aborted without taking down the entire cluster. I don't see why you'd need -9 (which is a really bad idea anyhow)... I've solved this mystery. "kill -15" doesn't immediately kill the job -- it aborts the query, but it might take 15-30 seconds to clean up. This confused me, because the query I was using to test took about 30 seconds, so the SIGTERM didn't seem to make a difference. But when I used a harder query, one that would run for 5-10 minutes, SIGTERM still stopped it after 15 seconds, which isn't great but it's acceptable. Bottom line is that I was expecting "instant death" with SIGTERM, but instead got an agonizing, drawn out -- but safe -- death of the query. At least that's my deduction based on experiments. I haven't dug into the source to confirm. Thanks everyone for your answers. My "kill this query" feature is now acceptable. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Kill a session
Thanks for your reply, Mark: I'm back to my original question: How do you kill a runaway query without bringing down the whole database? Is there really no answer to this? ... if you actually want to disconnect a backend process then there is nothing to let you do this remotely. I recently did a patch for Bizgres that just implements the pg_terminate_backend(pid) function (currently #ifdef'ed out of the codebase) as a contrib so it can be easily installed. See http://pgfoundry.org/pipermail/bizgres-general/2006-May/000484.html This answers my question. I've finally got a statement in concrete terms, Postgres has no way to kill a backend process via an SQL statement. "If Mark had to resort to this, then there is no other way." If you want to try it out, please read the README (it discusses possible dangers associated with sending SIGTERM to backends). And I would certainly be interested in hearing what level of success (or otherwise) you have with it! Thanks, but I've already implemented my own, which is essentially identical in concept to yours, but simpler in the sense of being even less safe than yours -- I just let anyone send the signal, since I have no users other than my own app. I'll keep my version since it's embedded in my own plug-in. That way I won't have to keep remembering to modify the Postgres code when I upgrade. I like to keep Postgres "stock". Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] [Fwd: Delivery Status Notification (Failure)]
I wrote: I can't find an address to complain about the mailing list itself, so apologies but I'm posting directly to this list. Every time I post to this group, I get returned mails about OTHER subscribers' invalid accounts, like the one below. Michael Glaesemann replied: Is this when you're replying to a post or creating a new post? If the former, and you're using reply-to-all, you'll be sending one message to the list and another directly to the poster of the message you're responding to. And Richard Broersma Jr replied: Does the message come from postgresql.org or is the bounced email coming from these specific users when you include them in reply-all? Thanks to both for your answers. But no -- It's for new posts. In fact, when writing the email that started this thread, it was only to pgsql-performance@postgresql.org (I double-checked by using emacs on my Thunderbird "Sent" folder), yet I still got another "undeliverable" reply along with your message: This is an automatically generated Delivery Status Notification. Delivery to the following recipients failed. [EMAIL PROTECTED] Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] [Fwd: Delivery Status Notification (Failure)]
I can't find an address to complain about the mailing list itself, so apologies but I'm posting directly to this list. Every time I post to this group, I get returned mails about OTHER subscribers' invalid accounts, like the one below. What's up? This seems to be a new phenomenon. Should the [EMAIL PROTECTED] be getting these and discarding them? Thanks, Craig Original Message Subject: Delivery Status Notification (Failure) Date: Wed, 12 Jul 2006 13:15:16 -0400 From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] This is an automatically generated Delivery Status Notification. Delivery to the following recipients failed. [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Kill a session
Magnus Hagander wrote: This raises the question: Why doesn't Postgres have a "kill session" command that works? Oracle has it, and it's invaluable; there is no substitute. Various writers to these PG lists have raised the question repeatedly. Is it just a matter that nobody has had the time to do it (which I respect!), or is there a reason why the Postgres team decided a "kill session" is a bad idea? I believe the function to kill a backend is actually in the codebase, it's just commented out because it's considered dangerous. There are some possible issues (see -hackers archives) about sending SIGTERM without actually shutting down the whole cluster. Doing the client-side function to call is the easy part. In many cases you just need to cancel a query, in which case you can use pg_cancel_backend() for example. If you need to actually kill it, your only supported way is to restart postgresql. In other words, are you confirming that there is no way to kill a query from another process, other than shutting down the database? My understanding of the documentation tells me I can't use cancel, because the process doing the killing isn't the original process. But in spite of earlier postings in these forums that said killing the backend was the way to go, this doesn't really work. First, even though the "postgres" backend job is properly killed, a "postmaster" job keeps running at 99% CPU, which is pretty useless. Killing the client's backend didn't kill the process actually doing the work! Then you killed the wrong backend... No queries run in postmaster. They all run in postgres backends. The postmaster does very little actual work, other than keeping track of everybody else. It turns out I was confused by this: ps(1) reports a process called "postgres", but top(1) reports a process called "postmaster", but they both have the same pid. I guess postmaster replaces its own name in the process table when it's executing a query, and it's not really the postmaster even though top(1) calls it postmaster. So "kill -15 <pid>" is NOT killing the process -- to kill the process, I have to use signal 9. But if I do that, ALL queries in progress are aborted. I might as well shut down and restart the database, which is an unacceptable solution for a web site. I'm back to my original question: How do you kill a runaway query without bringing down the whole database? Is there really no answer to this? Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
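For the cancel-only half of this, the built-in route mentioned above looks like the sketch below; pg_cancel_backend() sends SIGINT, which aborts the backend's current query but leaves its connection alive, and in these releases it can only be called by a superuser. The 12345 is just a placeholder for whatever PID pg_stat_activity reports for the runaway backend:

    SELECT procpid, usename, current_query, query_start
      FROM pg_stat_activity
     ORDER BY query_start;

    SELECT pg_cancel_backend(12345);

Actually terminating the connection, as opposed to cancelling its query, is the part that still required the unsupported SIGTERM approach discussed in this thread.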
[PERFORM] Kill a session
There have been dozens, perhaps hundreds, of entries in the pg-admin, pg-general, and pg-performance lists regarding killing a session, but as far as I can tell, there is no Postgres solution. Did I miss something? This raises the question: Why doesn't Postgres have a "kill session" command that works? Oracle has it, and it's invaluable; there is no substitute. Various writers to these PG lists have raised the question repeatedly. Is it just a matter that nobody has had the time to do it (which I respect!), or is there a reason why the Postgres team decided a "kill session" is a bad idea? The rest of this email is just to illustrate the convoluted solution I've had to adopt, and even with this, I can't get it to work quite right. Background: In our web app, we give our users a fair amount of power to formulate difficult queries. These long-running queries are fork/exec'd from the Apache CGI, and we give the user a "job status" page, with the option to kill the job. I can kill off the CGI, since Apache owns the process. But the "stock answer" of kill -2 backend-pid won't work, because I don't want my Apache jobs running as super-user (!) or as setuid processes. So here's my solution: Install a couple of C extensions like this: Datum get_session_id(PG_FUNCTION_ARGS) { PG_RETURN_INT32(getpid()); } Datum kill_session(PG_FUNCTION_ARGS) { int4 session_id, status; session_id = PG_GETARG_INT32(0); fprintf(stderr, "KILLING SESSION: %d, 15\n", session_id); status = kill(session_id, 15); PG_RETURN_BOOL((status == 0) ? true : false); } These are installed with the appropriate "CREATE OR REPLACE ..." sql. Although this is dangerous (anyone who can log in to Postgres can kill any Postgres job!), it's safe enough in a controlled environment. It allows an Apache CGI to issue the kill(2) command through the Postgres backend, which is running as the Postgres user, and thus has permission to do the deed. When I start a job, I record the backend's PID, which allows another process to connect and kill the first one. Alright, it's a hack, but it's the best I could think of. But in spite of earlier postings in these forums that said killing the backend was the way to go, this doesn't really work. First, even though the "postgres" backend job is properly killed, a "postmaster" job keeps running at 99% CPU, which is pretty useless. Killing the client's backend didn't kill the process actually doing the work! Second, the "KILLING SESSION" message to stderr is only printed in the PG log file sporadically. This confuses me, since the "KILLING SESSION" is printed by a *different* process than the one being killed, so it shouldn't be affected. So what happens to fprintf()'s output? Most of the time, I just get "unexpected EOF on client connection" in the log which is presumably the postmaster complaining that the postgres child process died. I know the kill_session() is working because it returns "true", and the job is in fact killed. But the query keeps running in postmaster (or is it something else, like a rollback?), and the stderr output disappears. Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
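For completeness, the "CREATE OR REPLACE ..." half of that hack is just the usual declarations for C-language functions, something along these lines (the shared-object path and file name are made up for the sketch; use whatever the .so was actually installed as):

    CREATE OR REPLACE FUNCTION get_session_id() RETURNS integer
        AS '/usr/local/pgsql/lib/killsession.so', 'get_session_id'
        LANGUAGE C;

    CREATE OR REPLACE FUNCTION kill_session(integer) RETURNS boolean
        AS '/usr/local/pgsql/lib/killsession.so', 'kill_session'
        LANGUAGE C STRICT;

The C source also needs a PG_FUNCTION_INFO_V1(get_session_id) and PG_FUNCTION_INFO_V1(kill_session) line for the version-1 calling convention the snippets above are written in.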
[PERFORM] need vacuum after insert/truncate/insert?
If I insert a bunch of rows, then truncate, then insert a bunch more rows, do I need to vacuum? I've been assuming that TRUNCATE TABLE is a brute-force technique that more-or-less tosses the old table and starts fresh so that no vacuum is necessary. Second question: Same scenario as above, but now the table has indexes. Is a reindex needed, or are the indexes "truncated" too? Thanks, Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Is postgresql ca do the job for software deployed in
Richard Broersma Jr wrote: Each table with-in the database is assigned an OID and is located inside the DB directory. So if there is a file-system limitation on the number of files with-in a given directory it would also be a limit to the number of tables that could be created for each database. You could handle this with tablespaces. For example, create ten tablespaces, and then assign customer databases to them in round-robin fashion. This also lets you assign databases to different disks to balance the I/O load. Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
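A sketch of what that round-robin assignment looks like in SQL; the directory paths and database names here are invented for the example:

    CREATE TABLESPACE ts_disk1 LOCATION '/disk1/pgdata';
    CREATE TABLESPACE ts_disk2 LOCATION '/disk2/pgdata';
    CREATE TABLESPACE ts_disk3 LOCATION '/disk3/pgdata';

    CREATE DATABASE customer_0001 TABLESPACE ts_disk1;
    CREATE DATABASE customer_0002 TABLESPACE ts_disk2;
    CREATE DATABASE customer_0003 TABLESPACE ts_disk3;
    -- ...and so on, assigning each new customer database to the next tablespace

Each database's files then live under its own tablespace directory, so the databases (and their I/O) are spread across the disks instead of all landing in one place.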
[PERFORM] Sort order in sub-select
Here is a subtle question about SQL. I have a one-to-many pair of tables (call them "P" and "C" for parent and child). For each row of P, there are many rows in C with data, and I want to sort P on the min(c.data). The basic query is simple: select p_id, min(data) as m from c group by p_id order by m; Now the problem: I also want to store this, in sorted order, as a "hitlist", so I have a table like this: create table hitlist(p_id integer, sortorder integer); and a sequence to go with it. The first thing I tried doesn't work: insert into hitlist(p_id, sortorder) (select p_id, nextval('hitlist_seq') from (select p_id, min(data) as m from c group by p_id order by m) as foo); Apparently, the sort order returned by the innermost select is NOT maintained as you go through the next select statement -- the rows seem to come out in random order. This surprised me. But in thinking about the definition of SQL itself, I guess there's no guarantee that sort order is maintained across sub-selects. I was caught by this because in Oracle, this same query works "correctly" (i.e. the hitlist ends up in sorted order), but I suspect that was just the luck of their implementation. Can anyone confirm this, that the sort order is NOT guaranteed to be maintained through layers of SELECT statements? The obvious solution is to make the hitlist.sortorder column have nextval() as its default and eliminate the first sub-select. But I thought the two would be equivalent. Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
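A sketch of that solution, using the table, sequence, and column names from the post above. With the default in place the insert needs no explicit nextval() call and no extra level of sub-select, so the rows are numbered as they arrive from the sorted select (which is how PostgreSQL behaves in practice, though as noted the standard does not strictly promise an order here either):

    CREATE SEQUENCE hitlist_seq;
    CREATE TABLE hitlist (
        p_id      integer,
        sortorder integer DEFAULT nextval('hitlist_seq')
    );

    INSERT INTO hitlist (p_id)
    SELECT p_id
      FROM c
     GROUP BY p_id
     ORDER BY min(data);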
[PERFORM] explain analyze reports 20x more time than actual
I have a query that needs to run faster, with the obvious solution being to add an index. But to confirm this, I ran explain analyze. When I run the actual query, it consistently takes 6-7 seconds by the wall clock. My application with a "verbose" mode enabled reports 6.6 seconds consistently. However, when I run EXPLAIN ANALYZE, it takes 120 seconds! This is 20x longer, and it leads me to distrust the plan that it claims to be executing. How can the actual run time be so much faster than that claimed by EXPLAIN ANALYZE? How can I find out the actual plan it's using? Thanks, Craig Details: Postgres 8.0.3 shared_buffers = 2 work_mem = 50 effective_cache_size = 43 Dell w/ Xeon Linux kernel 2.6.9-1.667smp 4 GB memory => explain analyze select SAMPLE.SAMPLE_ID, SAMPLE.VERSION_ID,SAMPLE.SUPPLIER_ID,SAMPLE.CATALOGUE_ID,SAMPLE.PREP_ID from HITLIST_ROWS_281430 join SAMPLE on (HITLIST_ROWS_281430.OBJECTID = SAMPLE.SAMPLE_ID) where SAMPLE.VERSION_ID in (7513672,7513650,7513634,7513620,7513592,7513590,7513582,7513576,7513562,7513560) order by HITLIST_ROWS_281430.SortOrder; QUERY PLAN Sort (cost=234964.38..234964.52 rows=58 width=24) (actual time=120510.842..120510.889 rows=10 loops=1) Sort Key: hitlist_rows_281430.sortorder -> Hash Join (cost=353.68..234962.68 rows=58 width=24) (actual time=81433.194..120510.753 rows=10 loops=1) Hash Cond: ("outer".objectid = "inner".sample_id) -> Seq Scan on hitlist_rows_281430 (cost=0.00..177121.61 rows=11497361 width=8) (actual time=0.008..64434.110 rows=11497361 loops=1) -> Hash (cost=353.48..353.48 rows=82 width=20) (actual time=0.293..0.293 rows=0 loops=1) -> Index Scan using i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id on sample (cost=0.00..353.48 rows=82 width=20) (actual time=0.042..0.201 rows=12 loops=1) Index Cond: ((version_id = 7513672) OR (version_id = 7513650) OR (version_id = 7513634) OR (version_id = 7513620) OR (version_id = 7513592) OR (version_id = 7513590) OR (version_id = 7513582) OR (version_id = 7513576) OR (version_id = 7513562) OR (version_id = 7513560)) Total runtime: 120511.485 ms (9 rows) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL
Arjen van der Meijden wrote: First of all, this graph has no origin. It's a bit difficult to test with less than one cpu. Sure it does. I ran all the tests. They all took infinite time, and I got zero results. And my results are 100% accurate and reliable. It's perfectly valid data. :-) Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] function not called if part of aggregate
Greg Stark wrote: However that's not enough to explain what you've shown. How about you show the actual query and actual plan you're working with? The plan you've shown can't result from the query you sent. Mea culpa, sort of. But ... in fact, the plan I sent *was* from query I sent, with the table/column names changed for clarity. This time I'll send the plan "raw". (This is PG 8.0.1.) chm=> explain select count(1) from (select normalize_add_salt(smiles) from chm(> salt_smiles order by db_no) as foo; QUERY PLAN Aggregate (cost=69.95..69.95 rows=1 width=0) -> Subquery Scan foo (cost=0.00..67.93 rows=806 width=0) -> Index Scan using salt_smiles_pkey on salt_smiles (cost=0.00..59.87 rows=806 width=30) (3 rows) As pointed out by Tom and others, this query DOES in fact call the normalize_add_salt() function. Now here's the weird part. (And where my original posting went wrong -- sorry for the error! I got the two queries mixed up.) I originally had a more complex query, the purpose being to guarantee that the function was called on the strings in the order specified. (More on this below.) Here is the original query I used: chm=> explain select count(1) from (select normalize_add_salt(smiles) chm(> from (select smiles from salt_smiles order by db_no) as foo) as bar; QUERY PLAN Aggregate (cost=67.94..67.94 rows=1 width=0) -> Subquery Scan foo (cost=0.00..65.92 rows=806 width=0) -> Index Scan using salt_smiles_pkey on salt_smiles (cost=0.00..57.86 rows=806 width=30) (3 rows) Notice that the plans are essentially identical, yet in this one the function does NOT get called. I proved this by brute force, inserting "char **p = NULL; *p = "foo";" into the C code to guarantee a segmentation violation if the function gets called. In the first case it does SIGSEGV, and in the second case it does not. Now the reason for this more-complex query with an additional subselect is that the SMILES (which, by the way, are a lexical way of representing chemical structures - see www.daylight.com), must be passed to the function in a particular order (hence the ORDER BY). In retrospect I realize the optimizer apparently flattens this query anyway (hence the identical plans, above). But the weird thing is that, in spite of flattening, which would appear to make the queries equivalent, the function gets called in one case, and not in the other. Steinar H. Gunderson asked: select count(1) from (select foo_init(value) from foo_init_table order by value_id) as foo; Why not just count(foo_init(value))? Because the SMILES must be processed in a specific order, hence the more complex queries. The simple answer to this whole problem is what Steinar wrote: This works well, but it requires me to actually retrieve the function's value 800 times. Is this actually a problem? No, it's just a nuisance. It occurs to me that in spite of the ORDER BY expression, Postgres is free to evaluate the function first, THEN sort the results, which means the SMILES would be processed in random order anyway. I.e. my ORDER BY clause is useless for the intended purpose. So the only way I can see to get this right is to pull the SMILES into my application with the ORDER BY to ensure I have them in the correct order, then send them back one at a time via a "select normalize_add_salt(smiles)", meaning I'll retrieve 800 strings and then send them back. I just thought there ought to be a way to do this all on the PG server instead of sending all these strings back and forth. I'd like to say to Postgres, "Just do it this way, OK?" 
But the optimizer can't be turned off, so I guess I have to do it the slow way. The good news is that this is just an initialization step, after which I typically process thousands of molecules, so the extra overhead won't kill me. Thanks to all for your help. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] function not called if part of aggregate
My application has a function, call it "foo()", that requires initialization from a table of about 800 values. Rather than build these values into the C code, it seemed like a good idea to put them on a PG table and create a second function, call it "foo_init()", which is called for each value, like this: select foo_init(value) from foo_init_table order by value_id; This works well, but it requires me to actually retrieve the function's value 800 times. So I thought I'd be clever: select count(1) from (select foo_init(value) from foo_init_table order by value_id) as foo; And indeed, count() returns 800, as expected. But my function foo_init() never gets called! Apparently the optimizer figures out that foo_init() must return one value for each row, so it doesn't bother to actually call the function. db=> explain select count(1) from (select foo_init(value) from foo_init_table order by db_no) as foo; query plan aggregate (cost=69.95..69.95 rows=1 width=0) -> Subquery Scan foo (cost=0.00..67.93 rows=806 width=0) -> Index Scan using foo_init_table_pkey on foo_init_table (cost=0.00..59.87 rows=806 width=30) This doesn't seem right to me -- how can the optimizer possibly know that a function doesn't have a side effect, as in my case? Functions could do all sorts of things, such as logging activity, filling in other tables, etc, etc. Am I missing something here? Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
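One way to keep this entirely on the server and still guarantee both the ordering and the per-row call is a small plpgsql wrapper, sketched below with the foo_init()/foo_init_table names from the post above; PERFORM evaluates the expression and discards the result, and nothing in the loop can be planned away:

    CREATE OR REPLACE FUNCTION run_foo_init() RETURNS void AS $$
    DECLARE
        r RECORD;
    BEGIN
        FOR r IN SELECT value FROM foo_init_table ORDER BY value_id LOOP
            PERFORM foo_init(r.value);
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

    SELECT run_foo_init();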
Re: [PERFORM] Some queries starting to hang
Tom Lane wrote: The idea I just had was: why do we need EXPLAIN ANALYZE to run to completion? In severe cases like this thread, we might be able to discover the root cause by a *partial* execution of the plan, as long as it was properly instrumented. That way, the OP might have been able to discover the root cause himself... I don't think that helps, as it just replaces one uncertainty by another: how far did the EXPLAIN really get towards completion of the plan? You still don't have any hard data. But at least you have some data, which is better than no data. Even knowing that the plan got stuck on a particular node of the query plan could be vital information. For a query that never finishes, you can't even find out where it's getting stuck. That's why Simon's proposal might help in some particularly difficult situations. Regards, Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Some queries starting to hang
Simon Riggs wrote: Well, it's a big query. If it ought to take a second or two, and instead is taking an hour or two (1800 times the expected runtime), that might be close enough to "never" to exhaust Chris' patience. Besides, we don't know whether the 1800 might itself be an underestimate (too bad Chris didn't provide EXPLAIN ANALYZE results). This is a good example of a case where the inefficiency of EXPLAIN ANALYZE would be a contributory factor to it not actually being available for diagnosing a problem. This is a frustration I have, but Simon expressed it much more concisely. The first question one gets in this forum is, "did you run EXPLAIN ANALYZE?" But if EXPLAIN ANALYZE never finishes, you can't get the information you need to diagnose the problem. Simon's proposal, e.g. EXPLAIN ANALYZE ERRLIMIT 10 SELECT ... or something similar, would be a big help. I.e. "If you can't finish in a reasonable time, at least tell me as much as you can." Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Query hanging/not finishing inconsistently
Meetesh Karia wrote: Hi all, We've recently started having a problem where a query that normally executes in ~15ms starts to take upwards of 20s to complete. When the connection that ran query is returned to the connection pool, it appears as though a transaction is still in progress so the connection pool tries to cancel the transaction and close the connection. This fails and the connection is removed from the connection pool. At this point, the situation rapidly degrades and we run out of connections to the postgres server. An inspection of the pg_stat_activity table shows that practically every connection is running the above-mentioned query and some of those queries have been active for many minutes! We've looked at the pg_locks table as well and the only exclusive locks are on transactions that are open. All other locks are AccessShareLocks. Also, as far as we can tell (from looking at the Hibernate stats), every db session that is opened is closed. When this happens, if I kill one of the running postgres processes (just by picking the last process returned from "ps -ef | grep postgres"), the other queries will immediately finish and the system will respond. However, within 15 minutes, we'll be back in the same state as before. At that point, I've cycled Apache, Tomcat and Postgres and the system then seems to come back. This sounds suspiciously like a question I asked a few weeks ago, on April 4. I have a process that just gets stuck. After some questions from various of the experts in this forum, I used gdb(1) to attach to one of the frozen Postgress backend processes, and here's what I found: On 5/12/2006, I wrote: Thanks, good advice. You're absolutely right, it's stuck on a mutex. After doing what you suggest, I discovered that the query in progress is a user-written function (mine). When I log in as root, and use "gdb -p " to attach to the process, here's what I find. Notice the second function in the stack, a mutex lock: (gdb) bt #0 0x0087f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6 #2 0x008ff67b in _L_mutex_lock_3220 () from /lib/tls/libc.so.6 #3 0x4f5fc1b4 in ?? () #4 0x00dc5e64 in std::string::_Rep::_S_empty_rep_storage () from /usr/local/pgsql/lib/libchmoogle.so #5 0x009ffcf0 in ?? () from /usr/lib/libz.so.1 #6 0xbfe71c04 in ?? () #7 0xbfe71e50 in ?? () #8 0xbfe71b78 in ?? () #9 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #10 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #11 0x009f8b7c in inflateEnd () from /usr/lib/libz.so.1 #12 0x00c670a2 in ~basic_unzip_streambuf (this=0xbfe71be0) at zipstreamimpl.h:332 #13 0x00c60b61 in OpenBabel::OBConversion::Read (this=0x1, pOb=0xbfd923b8, pin=0xffea) at istream:115 #14 0x00c60fd8 in OpenBabel::OBConversion::ReadString (this=0x8672b50, pOb=0xbfd923b8) at obconversion.cpp:780 #15 0x00c19d69 in chmoogle_ichem_mol_alloc () at stl_construct.h:120 #16 0x00c1a203 in chmoogle_ichem_normalize_parent () at stl_construct.h:120 #17 0x00c1b172 in chmoogle_normalize_parent_sdf () at vector.tcc:243 #18 0x0810ae4d in ExecMakeFunctionResult () #19 0x0810de2e in ExecProject () #20 0x08115972 in ExecResult () #21 0x08109e01 in ExecProcNode () #22 0x0020 in ?? () #23 0xbed4b340 in ?? () #24 0xbf92d9a0 in ?? () #25 0xbed4b0c0 in ?? () #26 0x in ?? () It looks to me like my code is trying to read the input parameter (a fairly long string, maybe 2K) from a buffer that was gzip'ed by Postgres for the trip between the client and server... 
somewhere along the way, a mutex gets set, and then ... it's stuck forever. ps(1) shows that this thread had been running for about 7 hours, and the job status showed that this function had been successfully called about 1 million times before this mutex lock occurred. This issue has never been resolved; nobody had seen it before. Tom Lane suggested it might be a libc/c++ bug, but unfortunately in my case this lockup occurs so rarely (every few days) that it will be very difficult to know whether we've fixed the problem. If gdb(1) reveals that your process is stuck in a mutex, then you might have a better chance of testing this hypothesis, since your problem happens within 15 minutes or so. Did this start recently, perhaps right after a kernel update? Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
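(For anyone hitting a similar freeze and trying to tell a stuck mutex apart from an ordinary lock wait, a quick first check -- sketched here for 8.x, where pg_stat_activity.current_query is only populated if stats_command_string = true -- is to look for ungranted locks:

select l.pid, l.relation::regclass as relation, l.mode, l.granted, a.current_query
from pg_locks l
join pg_stat_activity a on a.procpid = l.pid   -- procpid is the backend's process id
where not l.granted;

If nothing shows up as waiting on a lock while a backend sits frozen, the problem is below the lock manager -- a mutex, as in the backtrace above -- and gdb is the right next tool.)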
Re: [PERFORM] Optimizer: limit not taken into account
Tom Lane wrote: There is not anything in there that considers whether the table's physical order is so nonrandom that the search will take much longer than it would given uniform distribution. It might be possible to do something with the correlation statistic in simple cases ... In this case, the rows are not random at all, in fact they're inserted from a sequence, then rows are deleted as they are processed. If the planner is hoping for random physical distribution, this particular case is exactly wrong. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Optimizer: limit not taken into account
Here's a "corner case" that might interest someone. It tripped up one of our programmers. We have a table with > 10 million rows. The ID column is indexed, the table has been vacuum/analyzed. Compare these two queries: select * from tbl where id >= 1000 limit 1; select * from tbl where id >= 1000 order by id limit 1; The first takes 4 seconds, and uses a full table scan. The second takes 32 msec and uses the index. Details are below. I understand why the planner makes the choices it does -- the "id > 1000" isn't very selective and under normal circumstances a full table scan is probably the right choice. But the "limit 1" apparently doesn't alter the planner's strategy at all. We were surprised by this. Adding the "order by" was a simple solution. Craig pg=> explain analyze select url, url_digest from url_queue where priority >= 1000 limit 1; QUERY PLAN -- Limit (cost=0.00..0.65 rows=1 width=108) (actual time=4036.113..4036.117 rows=1 loops=1) -> Seq Scan on url_queue (cost=0.00..391254.35 rows=606176 width=108) (actual time=4036.101..4036.101 rows=1 loops=1) Filter: (priority >= 1000) Total runtime: 4036.200 ms (4 rows) pg=> explain analyze select url, url_digest from url_queue where priority >= 1000 order by priority limit 1; QUERY PLAN -- Limit (cost=0.00..2.38 rows=1 width=112) (actual time=32.445..32.448 rows=1 loops=1) -> Index Scan using url_queue_priority on url_queue (cost=0.00..1440200.41 rows=606176 width=112) (actual time=32.434..32.434 rows=1 loops=1) Index Cond: (priority >= 1000) Total runtime: 32.566 ms ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Speed Up Offset and Limit Clause
Christian Paul Cosinas wrote: I am creating an application that gets the value of a large table and writes it to a file. Why I want to use offset and limit is for me to create a threaded application so that they will not get the same results. For example: Thread 1 : gets offset 0 limit 5000 Thread 2 : gets offset 5000 limit 5000 Thread 3 : gets offset 1 limit 5000 And so on... Would there be any other faster way than what I thought? In order to return rows 1 to 15000, it must select all rows from zero to 15000 and then discard the first 1 -- probably not what you were hoping for. You might add a "thread" column. Say you want to run ten threads:

create sequence thread_seq
    increment by 1 minvalue 1 maxvalue 10 cycle start with 1;

create table mytable(
    column1 integer,
    ... other columns ...,
    thread integer default nextval('thread_seq')
);

create index i_mytable_thread on mytable(thread);

Now whenever you insert into mytable, you get a value in mytable.thread between 1 and 10, and it's indexed, so each thread can find its share of the rows efficiently. So your query becomes: Thread 1: select ... from mytable where ... and thread = 1; Thread 2: select ... from mytable where ... and thread = 2; ... and so forth. Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Postgres gets stuck
Tom Lane wrote: >My suspicion is that it's an incompatibility between malloc() >libraries. On Linux there's only supposed to be one malloc, ie, glibc's version. On other platforms I'd be worried about threaded vs non-threaded libc (because the backend is not threaded), but not Linux. I guess I misinterpreted the Postgress manual, which says (in 31.9, "C Language Functions"), "When allocating memory, use the PostgreSQL functions palloc and pfree instead of the corresponding C library functions malloc and free." I imagined that perhaps palloc/pfree used mutexes for something. But if I understand you, palloc() and pfree() are just wrappers around malloc() and free(), and don't (for example) make their own separate calls to brk(2), sbrk(2), or their kin. If that's the case, then you answered my question - it's all ordinary malloc/free calls in the end, and that's not the source of the problem. There may be a more basic threading problem here, though, rooted in the precise fact that the backend isn't threaded. If you're trying to use any libraries that assume they can have multiple threads, I wouldn't be at all surprised to see things go boom. No threading anywhere. None of the libraries use threads or mutexes. It's just plain old vanilla C/C++ scientific algorithms. C++ exception handling could be problematic too. No C++ exceptions are thrown anywhere in the code, 'tho I suppose one of the I/O libraries could throw an exception, e.g. when reading from a file. But there's no evidence of this after millions of identical operations succeeded. In addition, the stack trace shows it to be stuck in a memory operation, not an I/O operation. Or it could be a garden variety glibc bug. How up-to-date is your platform? I guess this is the next place to look. From the few answers I've gotten, it sounds like this isn't a known Postgres issue, and my stack trace doesn't seem to be familiar to anyone on this forum. Oh well... thanks for your help. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Postgres gets stuck
Chris wrote: This is a deadly bug, because our web site goes dead when this happens, ... Sounds like a deadlock issue. ... stats_command_string = true and restart postgresql. then you'll be able to: select * from pg_stat_activity; to see what queries postgres is running and that might give you some clues. Thanks, good advice. You're absolutely right, it's stuck on a mutex. After doing what you suggest, I discovered that the query in progress is a user-written function (mine). When I log in as root, and use "gdb -p " to attach to the process, here's what I find. Notice the second function in the stack, a mutex lock: (gdb) bt #0 0x0087f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6 #2 0x008ff67b in _L_mutex_lock_3220 () from /lib/tls/libc.so.6 #3 0x4f5fc1b4 in ?? () #4 0x00dc5e64 in std::string::_Rep::_S_empty_rep_storage () from /usr/local/pgsql/lib/libchmoogle.so #5 0x009ffcf0 in ?? () from /usr/lib/libz.so.1 #6 0xbfe71c04 in ?? () #7 0xbfe71e50 in ?? () #8 0xbfe71b78 in ?? () #9 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #10 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #11 0x009f8b7c in inflateEnd () from /usr/lib/libz.so.1 #12 0x00c670a2 in ~basic_unzip_streambuf (this=0xbfe71be0) at zipstreamimpl.h:332 #13 0x00c60b61 in OpenBabel::OBConversion::Read (this=0x1, pOb=0xbfd923b8, pin=0xffea) at istream:115 #14 0x00c60fd8 in OpenBabel::OBConversion::ReadString (this=0x8672b50, pOb=0xbfd923b8) at obconversion.cpp:780 #15 0x00c19d69 in chmoogle_ichem_mol_alloc () at stl_construct.h:120 #16 0x00c1a203 in chmoogle_ichem_normalize_parent () at stl_construct.h:120 #17 0x00c1b172 in chmoogle_normalize_parent_sdf () at vector.tcc:243 #18 0x0810ae4d in ExecMakeFunctionResult () #19 0x0810de2e in ExecProject () #20 0x08115972 in ExecResult () #21 0x08109e01 in ExecProcNode () #22 0x0020 in ?? () #23 0xbed4b340 in ?? () #24 0xbf92d9a0 in ?? () #25 0xbed4b0c0 in ?? () #26 0x in ?? () It looks to me like my code is trying to read the input parameter (a fairly long string, maybe 2K) from a buffer that was gzip'ed by Postgres for the trip between the client and server. My suspicion is that it's an incompatibility between malloc() libraries. libz (gzip compression) is calling something called zcfree, which then appears to be intercepted by something that's (probably statically) linked into my library. And somewhere along the way, a mutex gets set, and then ... it's stuck forever. ps(1) shows that this thread had been running for about 7 hours, and the job status showed that this function had been successfully called about 1 million times, before this mutex lock occurred. Any ideas? Thanks, Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Postgres gets stuck
I'm having a rare but deadly problem. On our web servers, a process occasionally gets stuck, and can't be unstuck. Once it's stuck, all Postgres activities cease. "kill -9" is required to kill it -- signals 2 and 15 don't work, and "/etc/init.d/postgresql stop" fails. Here's what the process table looks like: $ ps -ef | grep postgres postgres 30713 1 0 Apr24 ?00:02:43 /usr/local/pgsql/bin/postmaster -p 5432 -D /disk3/postgres/data postgres 25423 30713 0 May08 ?00:03:34 postgres: writer process postgres 25424 30713 0 May08 ?00:00:02 postgres: stats buffer process postgres 25425 25424 0 May08 ?00:00:02 postgres: stats collector process postgres 11918 30713 21 07:37 ?02:00:27 postgres: production webuser 127.0.0.1(21772) SELECT postgres 31624 30713 0 16:11 ?00:00:00 postgres: production webuser [local] idle postgres 31771 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12422) idle postgres 31772 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12421) idle postgres 31773 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12424) idle postgres 31774 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12425) idle postgres 31775 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12426) idle postgres 31776 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12427) idle postgres 31777 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12428) idle The SELECT process is the one that's stuck. top(1) and other indicators show that nothing is going on at all (no CPU usage, normal memory usage); the process seems to be blocked waiting for something. (The "idle" processes are attached to a FastCGI program.) This has happened on *two different machines*, both doing completely different tasks. The first one is essentially a read-only warehouse that serves lots of queries, and the second one is the server we use to load the warehouse. In both cases, Postgres has been running for a long time, and is issuing SELECT statements that it's issued millions of times before with no problems. No other processes are accessing Postgres, just the web services. This is a deadly bug, because our web site goes dead when this happens, and it requires an administrator to log in and kill the stuck postgres process then restart Postgres. We've installed failover system so that the web site is diverted to a backup server, but since this has happened twice in one week, we're worried. Any ideas? Details: Postgres 8.0.3 Linux 2.6.12-1.1381_FC3smp i686 i386 Dell 2-CPU Xeon system (hyperthreading is enabled) 4 GB memory 2 120 GB disks (SATA on machine 1, IDE on machine 2) Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Memory and/or cache issues?
mcelroy, tim wrote: Sorry, been up all night and maybe provided too much information or not the right information and only confused folks, tired I guess. When I say 'in use' I am referring to the 'used' column. Thanks to all who have responded to this inquiry, I appreciate it. Here's free from PROD001:

[EMAIL PROTECTED] kernel]# free -k -t
             total       used       free     shared    buffers     cached
Mem:       7643536    6975772     667764          0     165496    5393396
-/+ buffers/cache:    1416880    6226656
Swap:      8185108       5208    8179900
Total:    15828644    6980980    8847664

On Linux (unlike most Unix systems), "used" includes both processes AND the kernel's file-system buffers, which means "used" will almost always be close to 100%. Starting with a freshly-booted system, you can issue almost any command that scans files, and "used" will go up and STAY at nearly 100% of memory. For example, reboot and try "tar cf - / >/dev/null" and you'll see the same sort of "used" numbers. In My Humble Opinion, this is a mistake in Linux. This confuses just about everyone the first time they see it (including me), because the file-system buffers are dynamic and will be relinquished by the kernel if another process needs memory. On Unix systems, "used" means, "someone else is using it and you can't have it", which is what most of us really want to know. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE
Jim C. Nasby wrote: No, I don't agree with this. Too many people waste time designing for "what if..." scenarios that never happen. You don't want to be dumb and design something that locks out a foreseeable and likely future need, but referential integrity doesn't meet this criterion. There's nothing to keep you from changing from app-managed to database-managed referential integrity if your needs change. In this case your argument makes no sense, because you will spend far more time re-creating RI capability inside an application than if you just use what the database offers natively. But one of the specific conditions in my original response was, "You have application-specific knowledge about when you can skip referential integrity and thereby greatly improve performance." If you can't do that, I agree with you. Anyway, this discussion is probably going on too long, and I'm partly to blame. I think we all agree that in almost all situations, using the database to do referential integrity is the right choice, and that you should only violate this rule if you have a really, really good reason, and you've thought out the implications carefully, and you know you may have to pay a steep price later if your requirements change. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE
Jim C. Nasby wrote: 1. You have only one application that modifies the data. (Otherwise, you have to duplicate the rules across many applications, leading to a code-maintenance nightmare). You forgot something: 1a: You know that there will never, ever, ever, ever, be any other application that wants to talk to the database. I know tons of people that get burned because they go with something that's "good enough for now", and then regret that decision for years to come. No, I don't agree with this. Too many people waste time designing for "what if..." scenarios that never happen. You don't want to be dumb and design something that locks out a foreseeable and likely future need, but referential integrity doesn't meet this criterion. There's nothing to keep you from changing from app-managed to database-managed referential integrity if your needs change. Design for your current requirements. Let us be of good cheer, remembering that the misfortunes hardest to bear are those which never happen. - James Russell Lowell (1819-1891) Therefore do not be anxious about tomorrow, for tomorrow will be anxious for itself. Let the day's own trouble be sufficient for the day. - Matthew 6:34 Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
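(For what it's worth, the switch from app-managed to database-managed integrity later on really is cheap to express -- one statement per relationship. A minimal sketch with made-up table and column names; the main cost is that PostgreSQL scans the child table once to validate existing rows when the constraint is added:

alter table order_line
    add constraint order_line_order_fk
    foreign key (order_id) references orders (order_id);   -- hypothetical tables

So deferring the decision doesn't paint you into a corner, as long as the data is actually consistent when you flip the switch.)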
Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE
I think this is an old question, but I want to know if it really is well worth to not create some foreign keys an deal with the referential integrity at application-level? Trust me : do it in the application and you'll enter a world of hurt. I'm doing it with some mysql apps, and it's a nightmare ; doing cascaded delete's by hand, etc, you always forget something, you have to modify a million places in your code everytime you add a new table, your ORM bloats, you get to write cleanup cron scripts which take forever to run, your website crashes etc. All good advice, but... there are no absolutes in this world. Application-enforced referential integrity makes sense if (and probably ONLY if): 1. You have only one application that modifies the data. (Otherwise, you have to duplicate the rules across many applications, leading to a code-maintenance nightmare). 2. If your application crashes and leaves a mess, it's not a catastrophe, and you have a good way to clean it up. For example, a bank shouldn't do this, but it might be OK for a computer-aided-design application, or the backend of a news web site. 3. You have application-specific knowledge about when you can skip referential integrity and thereby greatly improve performance. For example, you may have batch operations where large numbers of rows are temporarily inconsistent. If your application doesn't meet ALL of these criteria, you probably should use the database for referential integrity. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Decide between Postgresql and Mysql (help of
This is off-topic for this group so I'll just give a brief reply; I'm happy to carry on more just between the two of us... Gorshkov wrote: That being said ... what *is* the difference between coding a website - major or otherwise - in an "old-fashioned" compiled language and a non-compiled language, except for the amount of horsepower and memory you require? Old-fashioned doesn't mean bad, inappropriate, or inferior. It's just not the latest-and-greatest, however it's currently defined by the geek fashion police. Our experience coding web sites with C/C++ versus Perl is about a factor of ten in productivity. We only use C/C++ for CPU-intensive calculations, such as scientific prediction code. Everything else is Perl or Java. I recently re-coded 10,000 lines of C into 650 lines of Perl. Why? String handling, hash tables, and the simplicity of DBD/DBI. And there was no loss of performance, because the app was strictly I/O bound (that is, Postgres was I/O bound). Sure, the old app may not have been optimal, but we're talking about a factor of 15 reduction in lines of code. That's not "geek fashion", it's good engineering. Pick the best tool for the job, and learn how to use it. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Decide between Postgresql and Mysql (help of
Gorshkov wrote: /flame on if you were *that* worried about performance, you wouldn't be using PHP or *any* interperted language /flame off sorry - couldn't resist it :-) I hope this was just a joke. You should be sure to clarify - there might be some newbie out there who thinks you are seriously suggesting coding major web sites in some old-fashioned compiled language. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Poor performance o
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: It looks to me like the problem is the use of nested loops when a hash join should be used, but I'm no expert at query planning. Given the sizes of the tables involved, you'd likely have to boost up work_mem before the planner would consider a hash join. What nondefault configuration settings do you have, anyway? shared_buffers = 2 work_mem = 32768 effective_cache_size = 30 This is on a 4GB machine. Is there a guideline for work_mem that's related to table size? Something like, "allow 2 MB per million rows"? I'm also curious why the big difference between my "Query #1" and "Query #2". Even though it does a nested loop, #2's outer loop only returns one result from a very tiny table, so shouldn't it be virtually indistinguishable from #1? Thanks, Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
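(A cheap way to test the work_mem theory without touching postgresql.conf, sketched with an arbitrary value -- work_mem is in kilobytes, so the setting below is 256 MB for the current session only:

set work_mem = 262144;
explain analyze
select A.A_ID from A join B on (A.A_ID = B.A_ID) where B.B_ID = 9;
reset work_mem;

If the plan switches from a nested loop to a hash join and the runtime collapses, work_mem was the limiting factor; if the plan doesn't change, the row-count estimates are the more likely suspect.)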
[PERFORM] Poor performance o
I'm reposting this -- I sent this out a month ago but never got a response, and hope someone can shed some light on this. Thanks, Craig -- This is a straightforward query that should be fairly quick, but takes about 30 minutes. It's a query across three tables, call them A, B, and C. The tables are joined on indexed columns. Here's a quick summary: Table A -> Table B -> Table C A_ID B_ID C_ID A_ID NAME C_ID Tables A and B have 6 million rows each. Table C is small: 67 names, no repeats. All columns involved in the join are indexed. The database has been full-vacuumed and analyzed. Summary: 1. Query B only:2.7 seconds, 302175 rows returned 2. Join B and C:4.3 seconds, exact same answer 3. Join A and B:7.2 minutes, exact same answer 4. Join A, B, C:32.7 minutes, exact same answer Looking at these: Query #1 is doing the real work: finding the rows of interest. Queries #1 and #2 ought to be virtually identical, since Table C has just one row with C_ID = 9, but the time almost doubles. Query #3 should take a bit longer than Query #1 because it has to join 300K rows, but the indexes should make this take just a few seconds, certainly well under a minute. Query #4 should be identical to Query #3, again because there's only one row in Table C. 32 minutes is pretty horrible for such a straightforward query. It looks to me like the problem is the use of nested loops when a hash join should be used, but I'm no expert at query planning. This is psql 8.0.3. Table definitions are at the end. Hardware is a Dell, 2-CPU Xeon, 4 GB memory, database is on a single SATA 7200RPM disk. These table and column names are altered to protect the guilty, otherwise these are straight from Postgres. QUERY #1: - explain analyze select B.A_ID from B where B.B_ID = 9; Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=0.158..1387.251 rows=302175 loops=1) Index Cond: (B_ID = 9) Total runtime: 2344.053 ms QUERY #2: - explain analyze select B.A_ID from B join C on (B.C_ID = C.C_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=0.349..3392.532 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.232..0.336 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_C_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=0.102..1290.002 rows=302175 loops=1) Index Cond: (B.C_ID = "outer".C_ID) Total runtime: 4373.916 ms QUERY #3: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID)where B.B_ID = 9; Nested Loop (cost=0.00..711336.41 rows=131236 width=4) (actual time=37.118..429419.347 rows=302175 loops=1) -> Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=27.344..8858.489 rows=302175 loops=1) Index Cond: (B_ID = 9) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=1.372..1.376 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 430467.686 ms QUERY #4: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) join C on (B.B_ID = C.B_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..1012793.38 rows=177741 width=4) (actual time=70.184..1960112.247 rows=302175 loops=1) -> Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=52.114..17753.638 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.109..0.176 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_B_ID on B (cost=0.00..254387.31 rows=328137 width=8) 
(actual time=51.985..15566.896 rows=302175 loops=1) Index Cond: (B.B_ID = "outer".B_ID) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=6.407..6.412 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 1961200.079 ms TABLE DEFINITIONS: -- xxx => \d a Table "xxx.a" Column | Type | Modifiers --++--- a_id | integer| not null ... more columns Indexes: "pk_a_id" PRIMARY KEY, btree (a_id) ... more indexes on other columns xxx => \d b Table "xxx.b" Column | Type | Modifiers -++--- b_id | integer| not null a_id | integer| not null c_id | integer| not null ... more columns Indexes: "b_pkey" PRIMARY KEY, btree (b_id) "i_b_a_id" btree (a_id) "i_b_c_id" btree (c_id) xxx=> \d c Table "xxx.c" Column | Type | Modifiers -
[PERFORM] update == delete + insert?
I've seen it said here several times that "update == delete + insert". On the other hand, I've noticed that "alter table [add|drop] column ..." is remarkably fast, even for very large tables, which leads me to wonder whether each column's contents are in a file specifically for that column. My question: Suppose I have a very "wide" set of data, say 100 columns, and one of those columns will be updated often, but the others are fairly static. I have two choices:

Design 1:
create table a (
    id integer,
    frequently_updated integer);
create table b(
    id integer,
    infrequently_updated_1 integer,
    infrequently_updated_2 integer,
    infrequently_updated_3 integer,
    ... etc.
    infrequently_updated_99 integer);

Design 2:
create table c(
    id integer,
    frequently_updated integer,
    infrequently_updated_1 integer,
    infrequently_updated_2 integer,
    infrequently_updated_3 integer,
    ... etc.
    infrequently_updated_99 integer);

If "update == delete + insert" is strictly true, then "Design 2" would be poor since 99 columns would be moved around with each update. But if columns are actually stored in separate files, then Designs 1 and 2 would be essentially equivalent when it comes to vacuuming. Thanks, Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
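(One way to see how the rows are actually laid out -- a sketch using the example table names above -- is to look up each table's file node in pg_class; each table maps to its own data file(s) holding whole rows, not one file per column:

select relname, relfilenode
from pg_class
where relname in ('a', 'b', 'c');

Since a row is stored as a unit, an update rewrites the whole row, which argues for Design 1 when one narrow column changes far more often than the other 99. ALTER TABLE ADD/DROP COLUMN is fast because it only touches the system catalogs, not because columns live in separate files.)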
Re: [PERFORM] Migration study, step 1: bulk write performance optimization
Mikael Carneholm wrote: I am responsible for an exciting project of evaluating migration of a medium/large application for a well-known Swedish car&truck manufacturer ... The goal right now is to find the set of parameters that gives as short bulk insert time as possible, minimizing downtime while the data itself is migrated. If you haven't explored the COPY command yet, check it out. It is stunningly fast compared to normal INSERT commands. http://www.postgresql.org/docs/8.1/static/sql-copy.html pg_dump and pg_restore make use of the COPY command. Since you're coming from a different vendor, you'd have to dump the data into a COPY-compatible set of files yourself. But it will be worth the effort. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
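(A minimal sketch of the load itself; the file path and table name are placeholders, and CSV mode is available from 8.0 on:

copy migrated_table from '/tmp/migrated_table.csv' with csv;
-- or, from psql on a client machine without access to the server's filesystem:
-- \copy migrated_table from 'migrated_table.csv' with csv

It also pays to create indexes and foreign-key constraints after the COPY rather than before, so the load isn't paying index-maintenance and trigger costs for every row.)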
Re: [PERFORM] No vacuum for insert-only database?
Alvaro Herrera wrote: If I only insert data into a table, never update or delete, then I should never have to vacuum it. Is that correct? You still need to vacuum eventually, to avoid transaction Id wraparound issues. But not as often. Thanks. Any suggestions for what "not as often" means? For example, if my database will never contain more than 10 million rows, is that a problem? 100 million rows? When does transaction ID wraparound become a problem? Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
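(Wraparound is driven by the number of transactions consumed, not by the number of rows, so a 10-million-row insert-only table isn't the issue by itself; what matters is doing a database-wide VACUUM at least once every billion transactions or so. The routine-maintenance chapter of the docs suggests a query along these lines to see how close each database is:

select datname, age(datfrozenxid) from pg_database;

If the age column is creeping toward two billion, it is past time to vacuum; autovacuum (contrib in 8.0, built in from 8.1) will normally take care of this if it's running.)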
[PERFORM] No vacuum for insert-only database?
If I only insert data into a table, never update or delete, then I should never have to vacuum it. Is that correct? Thanks, Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Reliability recommendations
Joshua D. Drake wrote: I find this strains credibility, that this major manufacturer of PC's would do something deceptive that hurts performance, when it would be easily detected and widely reported. Can anyone cite a specific instances where this has happened? Such as, "I bought Dell model XYZ, which was advertised to have these parts and these specs, but in fact had these other parts and here are the actual specs." I can :) Feb 20 07:33:52 master kernel: [4294682.803000] Vendor: MegaRAID Model: LD 0 RAID1 51G Rev: 196T --- snip --- This machine... if you run it in raid 5 will only get 7-9 megabytes a second READ! performance. That is with 6 SCSI drives. If you run it in RAID 10 you get a more reasonable 50-55 megabytes per second. But you don't say how this machine was advertised. Are there components in that list that were not as advertised? Was the machine advertised as capable of RAID 5? Were performance figures published for RAID 5? If Dell advertised that the machine could do what you asked, then you're right -- they screwed you. But if it was designed for and advertised to a different market, then I've made my point: People are blaming Dell for something that's not their fault. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Reliability recommendations
Bruce Momjian wrote: Dell often says part X is included, but part X is not the exact same as part X sold by the original manufacturer. To hit a specific price point, Dell is willing to strip thing out of commodity hardware, and often does so even when performance suffers. For many people, this is unacceptable. I find this strains credibility, that this major manufacturer of PC's would do something deceptive that hurts performance, when it would be easily detected and widely reported. Can anyone cite a specific instances where this has happened? Such as, "I bought Dell model XYZ, which was advertised to have these parts and these specs, but in fact had these other parts and here are the actual specs." Dell seems to take quite a beating in this forum, and I don't recall seeing any other manufacturer blasted this way. Is it that they are deceptive, or simply that their "servers" are designed to be office servers, not database servers? There's nothing wrong with Dell designing their servers for a different market than ours; they need to go for the profits, and that may not include us. But it's not fair for us to claim Dell is being deceptive unless we have concrete evidence. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Cost Issue - How do I force a Hash Join
"Virag Saksena" <[EMAIL PROTECTED]> writes: The individual queries run in 50-300 ms. However the optimizer is choosing a nested loop to join them rather than a Hash join... I have what appears to be the identical problem. This is a straightforward query that should be fairly quick, but takes about 30 minutes. It's a query across three tables, call them A, B, and C. The tables are joined on indexed columns. Here's a quick summary: Table A -> Table B -> Table C A_ID B_ID C_ID A_ID NAME C_ID Tables A and B have 6 million rows each. Table C is small: 67 names, no repeats. All columns involved in the join are indexed. Summary: 1. Query B only:2.7 seconds, 302175 rows returned 2. Join B and C:4.3 seconds, exact same answer 3. Join A and B:7.2 minutes, exact same answer 4. Join A, B, C:32.7 minutes, exact same answer Looking at these: Query #1 is doing the real work: finding the rows of interest. Queries #1 and #2 ought to be virtually identical, since Table C has just one row with C_ID = 9, but the time almost doubles. Query #3 should take a bit longer than Query #1 because it has to join 300K rows, but the indexes should make this take just a few seconds, certainly well under a minute. Query #4 should be identical to Query #3, again because there's only one row in Table C. 32 minutes is pretty horrible for such a straightforward query. It looks to me like the problem is the use of nested loops when a hash join should be used, but I'm no expert at query planning. This is psql 8.0.3. Table definitions are at the end. (Table and column names are altered to protect the guilty, otherwise these are straight from Postgres.) I ran "vacuum full analyze" after the last data were added. Hardware is a Dell, 2-CPU Xeon, 4 GB memory, database is on a single SATA 7200RPM disk. 
Thanks, Craig QUERY #1: - explain analyze select B.A_ID from B where B.B_ID = 9; Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=0.158..1387.251 rows=302175 loops=1) Index Cond: (B_ID = 9) Total runtime: 2344.053 ms QUERY #2: - explain analyze select B.A_ID from B join C on (B.C_ID = C.C_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=0.349..3392.532 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.232..0.336 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_C_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=0.102..1290.002 rows=302175 loops=1) Index Cond: (B.C_ID = "outer".C_ID) Total runtime: 4373.916 ms QUERY #3: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) where B.B_ID = 9; Nested Loop (cost=0.00..711336.41 rows=131236 width=4) (actual time=37.118..429419.347 rows=302175 loops=1) -> Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=27.344..8858.489 rows=302175 loops=1) Index Cond: (B_ID = 9) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=1.372..1.376 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 430467.686 ms QUERY #4: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) join C on (B.B_ID = C.B_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..1012793.38 rows=177741 width=4) (actual time=70.184..1960112.247 rows=302175 loops=1) -> Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=52.114..17753.638 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.109..0.176 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_B_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=51.985..15566.896 rows=302175 loops=1) Index Cond: (B.B_ID = "outer".B_ID) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=6.407..6.412 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 1961200.079 ms TABLE DEFINITIONS: -- xxx => \d a Table "xxx.a" Column | Type | Modifiers ---++--- a_id | integer| not null ... more columns Indexes: "pk_a_id" PRIMARY KEY, btree (a_id) ... more indexes on other columns xxx => \d b Table "xxx.b" Column | Type | Modifiers --++--- b_id | integer| not null a_id | integer| not null c_id | inte