[PERFORM] Replication
Looking for replication solutions, I find:

Slony-I: Seems good, but it's single-master only, the master is a single point of failure, and there's no good failover system for electing a new master or having a failed master rejoin the cluster. Slave databases are mostly for safety or for parallelizing queries for performance. Suffers from O(N^2) communications (N = cluster size).

Slony-II: Seems brilliant, with a solid theoretical foundation, at the forefront of computer science. But I can't find the project status: when will it be available? Is it a pipe dream, or a nearly-ready reality?

PGReplication: Appears to be a page that someone forgot to erase from the old GBorg site.

PGCluster: Seems pretty good, but the web site is not current, there are releases in use that are not on the web site, and it also seems to always be a couple of steps behind the current release of Postgres. Two single points of failure: the load balancer and the data replicator.

Is this a good summary of the status of replication? Have I missed any important solutions or mischaracterized anything?

Thanks!
Craig

(Sorry about the premature send of this message earlier, please ignore.)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning
Greg Smith wrote:
If you're going to the trouble of building a tool for offering configuration advice, it can be wildly more effective if you look inside the database after it's got data in it, and preferably after it's been running under load for a while, and make your recommendations based on all that information.

There are two completely different problems that are getting mixed together in this discussion. Several people have tried to distinguish them, but let's be explicit:

1. Generating a reasonable starting configuration for neophyte users who have installed Postgres for the first time.

2. Generating an optimal configuration for a complex, running system that's loaded with data.

The first problem is easy: Any improvement would be welcome and would give most users a better initial experience. The second problem is nearly impossible. Forget the second problem (or put it on the "let's find someone doing a PhD project" list), and focus on the first.

From my limited experience, a simple questionnaire could be used to create a pretty good starting configuration file. Furthermore, many of the answers can be discovered automatically:

1. How much memory do you have?
2. How many disks do you have?
   a. Which disk contains the OS?
   b. Which disk(s) have swap space?
   c. Which disks are "off limits" (not to be used by Postgres)?
3. What is the general nature of your database?
   a. Mostly static (few updates, lots of access)
   b. Mostly archival (lots of writes, few reads)
   c. Very dynamic (data are added, updated, and deleted a lot)
4. Do you have a lot of small, fast transactions or a few big, long transactions?
5. How big do you expect your database to be?
6. How many simultaneous users do you expect?
7. What are the users you want configured initially?
8. Do you want local access only, or network access?

With these few questions (and perhaps a couple more), a decent set of startup files could be created that would give good, 'tho not optimal, performance for most people just getting started.

I agree with an opinion posted a couple days ago: The startup configuration is one of the weakest features of Postgres. It's not rocket science, but there are several files, and it's not obvious to the newcomer that the files even exist.

Here's just one example: A coworker installed Postgres and couldn't get it to work at all. He struggled for hours. When he contacted me, I tried his installation and it worked fine. He tried it, and he couldn't connect. I asked him, "Are you using localhost?" He said yes, but what he meant was that he was using the local *network*, 192.168.0.5, whereas I was using "localhost". He didn't have network access enabled. So, four hours wasted.

This is the sort of thing that makes experienced users say, "Well, duh!" But there are any number of these little traps and obscure configuration parameters that make the initial Postgres experience a poor one. It wouldn't take much to make a big difference to new users.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: Filesystem fragmentation (Re: [PERFORM] Fragmentation of WAL files)
Bill Moran wrote:
In response to Heikki Linnakangas <[EMAIL PROTECTED]>:
Can anyone else confirm this? I don't know if this is a windows-only issue, but I don't know of a way to check fragmentation in unix.

I can confirm that it's only a Windows problem. No UNIX filesystem that I'm aware of suffers from fragmentation.

What do you mean by suffering? All filesystems fragment files at some point. When and how differs from filesystem to filesystem. And some filesystems might be smarter than others in placing the fragments.

To clarify my viewpoint: To my knowledge, there is no Unix filesystem that _suffers_ from fragmentation. Specifically, all filesystems have some degree of fragmentation that occurs, but every Unix filesystem that I am aware of has built-in mechanisms to mitigate this and prevent it from becoming a performance issue.

More specifically, this problem was solved on UNIX file systems way back in the 1970's and 1980's. No UNIX file system (including Linux) since then has had significant fragmentation problems, unless the file system gets close to 100% full. If you run below 90% full, fragmentation shouldn't ever be a significant performance problem. The word "fragmentation" would have dropped from the common parlance if it weren't for MS Windoz.

Craig

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Basic Q on superfluous primary keys
Merlin Moncure wrote:
In the context of this debate, I see this argument all the time, with the implied suffix: 'If only we used integer keys we would not have had this problem...'. Either the customer identifies parts with a part number or they don't...and if they do identify parts with a number and recycle the numbers, you have a problem...period.

On the contrary. You create a new record with the same part number. You mark the old part number "obsolete". Everything else (the part's description, and all the relationships that it's in, such as order history, catalog inclusion, revision history, etc.) is unaffected. New orders are placed against the new part number's DB record; for safety the old part number can have a trigger that prevents new orders from being placed.

Since the part number is NOT the primary key, duplicate part numbers are not a problem. If you had instead used the part number as the primary key, you'd be dead in the water.

You can argue that the customer is making a dumb decision by reusing catalog numbers, and I'd agree. But they do it, and as database designers we have to handle it. In my particular system, we aggregate information from several hundred companies, and this exact scenario happens frequently. Since we're only aggregating information, we have no control over the data that these companies provide. If we'd used catalog numbers for primary keys, we'd have big problems.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
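A rough sketch of the design described above; the table, column, and function names here are hypothetical, not from the actual system:

-- The surrogate key is the primary key; the customer's part number is
-- ordinary data and may be recycled over time.
CREATE TABLE part (
    part_id     serial PRIMARY KEY,          -- information-free surrogate key
    part_number text NOT NULL,               -- customer's catalog number
    description text,
    obsolete    boolean NOT NULL DEFAULT false
);

CREATE TABLE part_order (
    order_id   serial PRIMARY KEY,
    part_id    integer NOT NULL REFERENCES part(part_id),
    ordered_on date NOT NULL DEFAULT current_date
);

-- For safety, refuse new orders placed against an obsolete part.
-- (On 8.x you may need "createlang plpgsql yourdb" first.)
CREATE FUNCTION reject_obsolete_part() RETURNS trigger AS $$
BEGIN
    IF (SELECT obsolete FROM part WHERE part_id = NEW.part_id) THEN
        RAISE EXCEPTION 'part % is obsolete; order a current part instead', NEW.part_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER part_order_check
    BEFORE INSERT ON part_order
    FOR EACH ROW EXECUTE PROCEDURE reject_obsolete_part();

When the customer recycles a part number, you insert a new part row and mark the old row obsolete; order history, catalog inclusion, and so on keep pointing at the old part_id and are unaffected.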
Re: [PERFORM] Basic Q on superfluous primary keys
Merlin Moncure wrote: Since the part number is NOT the primary key, duplicate part numbers are not a problem. If you had instead used the part number as the primary key, you'd be dead in the water. You are redefining the primary key to be (part_number, obsoletion_date). Now, if you had not anticipated that in the original design (likely enough), you do have to refactor queries that join on the table...so what? If that's too much work, you can use a view to take care of the problem (which may be a good idea anyways). *you have to refactor the system anyways because you are now allowing duplicate part numbers where previously (from the perspective of the user), they were unique *. The hidden advantage of pushing the full key through the database is it tends to expose holes in the application/business logic. Chances are some query is not properly distinguishing obsoleted parts and now the real problems come...surrogate keys do not remove complexity, they simply sweep it under the rug. This really boils down to an object-oriented perspective. I have an object, a customer's catalog entry. It has properties such as catalog number, description, etc, and whether it's obsolete or not. Management of the object (its relation to other objects, its history, etc.) should NOT depend on the object's specific definition. This is true whether the object is represented in Lisp, C++, Perl, or (in this case) an SQL schema. Good object oriented design abstracts the object and its behavior from management of the object. In C++, Perl, etc., we manage objects via a pointer or object reference. In SQL, we reference objects by an *arbitrary* integer that is effectively a pointer to the object. What you're suggesting is that I should break the object-oriented encapsulation by pulling out specific fields of the object, exposing those internal object details to the applications, and spreading those details across the whole schema. And I argue that this is wrong, because it breaks encapsulation. By exposing the details of the object, if the details change, *all* of your relationships break, and all of your applications have to change. And I've never seen a system where breaking object-oriented encapsulation was a good long-term solution. Systems change, and object-oriented techniques were invented to help manage change. This is one of the reasons the Postgres project was started way back when: To bring object-oriented techniques to the relational-database world. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
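Continuing the hypothetical schema from the sketch above, the two ideas raised here can be expressed directly: a partial unique index keeps part numbers unique among the non-obsolete rows (so sloppy queries can't silently create duplicate "current" parts), and a view hides obsoleted parts from code that only cares about current ones:

-- At most one non-obsolete row per customer part number.
CREATE UNIQUE INDEX part_number_active_uniq
    ON part (part_number)
    WHERE NOT obsolete;

-- Queries that should only ever see current parts can join against this.
CREATE VIEW current_part AS
    SELECT part_id, part_number, description
    FROM part
    WHERE NOT obsolete;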
Re: [PERFORM] Basic Q on superfluous primary keys
Merlin Moncure wrote:
Using surrogate keys is dangerous and can lead to very bad design habits that are unfortunately so prevalent in the software industry they are virtually taught in schools. ... While there is nothing wrong with them in principle (you are exchanging one key for another as a performance optimization), they make it all too easy to create denormalized designs and tables with no real identifying criteria, etc,...

Wow, that's the opposite of everything I've ever been taught, and all my experience in the last few decades.

I can't recall ever seeing a "natural" key that was immutable. In my business (chemistry), we've seen several disastrous situations where companies picked keys they thought were natural and immutable, and years down the road they discovered (for example) that chemical compounds they thought were pure were in fact isotopic mixtures, or simply the wrong molecule (as analytical techniques improved). Or during a corporate takeover, they discovered that two companies using the same "natural" keys had as much as 10% differences in their multi-million-compound databases. These errors led to six-month to year-long delays, as each of the conflicting chemical records had to be examined by hand by a PhD chemist to reclassify it.

In other businesses, almost any natural identifier you pick is subject to simple typographical errors. When you discover the errors in a field you've used as a primary key, it can be quite hard to fix, particularly if you have distributed data across several systems and schemas.

We've always recommended to our customers that all primary keys be completely information free. They should not be based on any information or combination of information from the data records. Every time the customer has not followed this advice, they've later regretted it.

I'm sure there are situations where a natural key is appropriate, but I haven't seen it in my work.

Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
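To make the typographical-error point concrete, a tiny hypothetical example (table, column names, and values are invented): when the "natural" identifier is ordinary data rather than the primary key, correcting a typo is a single UPDATE, and nothing that references the row has to change.

-- compound_id is an information-free surrogate key; the registry number is just data.
CREATE TABLE compound (
    compound_id serial PRIMARY KEY,
    registry_no text,      -- "natural" identifier, subject to typos and reclassification
    name        text
);

-- Fixing a mistyped identifier touches exactly one row; every table that
-- references compound(compound_id) is unaffected.
UPDATE compound SET registry_no = '50-00-0' WHERE compound_id = 42;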
Re: [PERFORM] DELETE with filter on ctid
Spiegelberg, Greg wrote:
We have a query which generates a small set of rows (~1,000) which are to be used in a DELETE on the same table. The problem we have is that we need to join on 5 different columns and it takes far too long.

You may have encountered the same problem I did: You *must* run ANALYZE on a temporary table before you use it in another query. It's surprising that this is true even for very small tables (a few hundred to a few thousand rows), but it is. I had a case where I created a "scratch" table like yours, and the before/after ANALYZE performance was the difference between 30 seconds and a few milliseconds for the same query.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
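A minimal sketch of the pattern being suggested, with invented table and column names (the real query joins on five columns; two are shown here):

-- Build the scratch table of rows to delete.
CREATE TEMP TABLE doomed_rows AS
    SELECT call_id, line_no
    FROM call_detail
    WHERE call_time < '2007-01-01';

-- Without this, the planner guesses badly even for a tiny scratch table.
ANALYZE doomed_rows;

-- DELETE ... USING (available in 8.1 and later) joins against the scratch table.
DELETE FROM call_detail
USING doomed_rows d
WHERE call_detail.call_id = d.call_id
  AND call_detail.line_no = d.line_no;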
[PERFORM] Can't drop tablespace or user after disk gone
I had a 'scratch' database for testing, which I deleted, and then the disk went out. No problem, no precious data. But now I can't drop the tablespace, or the user who had that as the default tablespace.

I thought about removing the tablespace from pg_tablespaces, but it seems wrong to be monkeying with the system tables. I still can't drop the user, and can't drop the tablespace.

What's the right way to clear out Postgres when a disk fails and there's no reason to repair the disk?

Thanks,
Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] [HACKERS] EXISTS optimization
Kevin Grittner wrote:
Management has simply given a mandate that the software be independent of OS and database vendor, and to use Java to help with the OS independence. ... we write all of our queries in ANSI SQL in our own query tool, parse it, and generate Java classes to run it.

A better solution, and one I've used for years, is to use OS- or database-specific features, but carefully encapsulate them in a single module, for example, "database_specific.java".

For example, when I started supporting both Oracle and Postgres, I encountered the MAX() problem, which (at the time) was very slow in Postgres, but could be replaced by "select X from MYTABLE order by X desc limit 1". So I created a function, "GetColumnMax()", that encapsulates the database-specific code for this. Similar functions encapsulate a number of other database-specific optimizations.

Another excellent example: I have a function called "TableExists(name)". To the best of my knowledge, there simply is no ANSI SQL for this, so what do you do? Encapsulate it in one place.

The result? When I port to a new system, I know exactly where to find all of the non-ANSI SQL. I started this habit years ago with C/C++ code, which has the same problem: System calls are not consistent across the variants of Unix, Windows, and other OS's. So you put them all in one file called "machine_dependent.c".

Remember the old adage: There is no such thing as portable code, only code that has been ported.

Cheers,
Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
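For what it's worth, the Postgres-specific SQL hiding behind wrappers like these might look roughly as follows; the wrapper names come from the post above, but the queries are one plausible implementation, not the original code:

-- Behind a hypothetical GetColumnMax(): avoids the old slow MAX() on pre-8.1 servers.
SELECT x FROM mytable ORDER BY x DESC LIMIT 1;

-- Behind a hypothetical TableExists('mytable'):
SELECT EXISTS (
    SELECT 1
    FROM information_schema.tables
    WHERE table_schema = 'public'
      AND table_name   = 'mytable'
);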
Re: [PERFORM] Performance of count(*)
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: Steve Atkins wrote: As long as you're ordering by some row in the table then you can do that in straight SQL. select a, b, ts from foo where (stuff) and foo > X order by foo limit 10 Then, record the last value of foo you read, and plug it in as X the next time around. We've been over this before in this forum: It doesn't work as advertised. Look for postings by me regarding the fact that there is no way to tell the optimizer the cost of executing a function. There's one, for example, on Oct 18, 2006. You mean http://archives.postgresql.org/pgsql-performance/2006-10/msg00283.php ? I don't see anything there that bears on Steve's suggestion. (The complaint is obsolete as of CVS HEAD anyway.) Mea culpa, it's October 8, not October 18: http://archives.postgresql.org/pgsql-performance/2006-10/msg00143.php The relevant part is this: "My example, discussed previously in this forum, is a classic. I have a VERY expensive function (it's in the class of NP-complete problems, so there is no faster way to do it). There is no circumstance when my function should be used as a filter, and no circumstance when it should be done before a join. But PG has no way of knowing the cost of a function, and so the optimizer assigns the same cost to every function. Big disaster. "The result? I can't use my function in any WHERE clause that involves any other conditions or joins. Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead. "The interesting thing is that PG works pretty well for me on big tables -- it does the join first, then applies my expensive functions. But with a SMALL (like 50K rows) table, it applies my function first, then does the join. A search that completes in 1 second on a 5,000,000 row database can take a minute or more on a 50,000 row database." Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
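This is not from the original exchange, but one workaround people have used for exactly this situation is to wrap the cheap conditions and joins in a subquery with OFFSET 0, which the 8.x planners treat as an optimization fence, so the expensive function can't be pushed down and used as an early filter. All table and function names here are hypothetical:

-- Do the join and cheap filters first, then apply the expensive function
-- to the (hopefully small) intermediate result.
SELECT *
FROM (
    SELECT t.id, t.smiles
    FROM small_table t
    JOIN other_table o ON o.id = t.id
    WHERE o.flag = true
    OFFSET 0            -- fence: keeps the subquery from being pulled up
) sub
WHERE expensive_match(sub.smiles, 'Brc1ccc2nc(cn2c1)C(=O)O');   -- hypothetical costly function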
Re: [PERFORM] Performance of count(*)
Steve Atkins wrote: As long as you're ordering by some row in the table then you can do that in straight SQL. select a, b, ts from foo where (stuff) and foo > X order by foo limit 10 Then, record the last value of foo you read, and plug it in as X the next time around. We've been over this before in this forum: It doesn't work as advertised. Look for postings by me regarding the fact that there is no way to tell the optimizer the cost of executing a function. There's one, for example, on Oct 18, 2006. I think the problem is more that most web developers aren't very good at using the database, and tend to fall back on simplistic, wrong, approaches to displaying the data. There's a lot of monkey-see, monkey-do in web UI design too, which doesn't help. Thanks, I'm sure your thoughtful comments will help me solve my problem. Somehow. ;-) Craig ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
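Spelled out a little more fully, the pattern Steve describes looks like this (table and column names are illustrative); whether it helps depends on having a stable ordering column, which is part of what Craig objects to:

-- First page:
SELECT a, b, ts
FROM foo
WHERE b = 'widget'                  -- the "(stuff)" in Steve's example
ORDER BY ts
LIMIT 10;

-- Next page: pass in the last ts value you displayed.
SELECT a, b, ts
FROM foo
WHERE b = 'widget'
  AND ts > '2007-03-22 13:30:00'    -- last ts from the previous page
ORDER BY ts
LIMIT 10;

In practice the ordering column needs to be unique (or you add a tie-breaker column to the ORDER BY and the WHERE clause) so that rows with equal ts values aren't skipped or repeated between pages.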
Re: [PERFORM] Performance of count(*)
Tino Wildenhain wrote:
You guys can correct me if I'm wrong, but the key feature that's missing from Postgres's flexible indexing is the ability to maintain state across queries. Something like this: select a, b, my_index_state() from foo where ... offset 100 limit 10 using my_index(prev_my_index_state);

Yes, you are wrong :-) The technique is called "CURSOR" if you maintain a persistent connection per session (e.g. stand-alone application or clever pooling web application).

That's my whole point: If relational databases had a simple mechanism for storing their internal state in an external application, the need for cursors, connection pools, and all those other tricks would be eliminated. As I said earlier, relational technology was invented in an earlier era, and hasn't caught up with the reality of modern web apps.

If it's a naive web application you just store your session in tables where you can easily maintain the scroll state as well.

One thing I've learned in 25 years of software development is that people who use my software have problems I never imagined. I've been the one who was naive when I said similar things about my customers, and was later embarrassed to learn that their problems were more complex than I ever imagined.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
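For completeness, the cursor approach Tino is referring to looks something like this (the query and names are illustrative). WITH HOLD lets the cursor outlive the transaction, although it still ties the web session to a live connection, which is exactly the difficulty being discussed:

BEGIN;
DECLARE page_cur CURSOR WITH HOLD FOR
    SELECT a, b, ts FROM foo ORDER BY ts;
COMMIT;                            -- a WITH HOLD cursor survives the commit

FETCH FORWARD 10 FROM page_cur;    -- rows 1-10
MOVE FORWARD 90 IN page_cur;       -- skip ahead without returning rows
FETCH FORWARD 10 FROM page_cur;    -- rows 101-110
CLOSE page_cur;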
Re: [PERFORM] Performance of count(*)
Tino Wildenhain wrote:
Craig A. James schrieb:
... In our case (for a variety of reasons, but this one is critical), we actually can't use Postgres indexing at all -- we wrote an entirely separate indexing system for our data...

...There is no need to store or maintain this information along with postgres when you can store and maintain it directly in postgres as well.

Whether we store our data inside or outside Postgres misses the point (in fact, most of our data is stored IN Postgres). It's the code that actually performs the index operation that has to be external to Postgres.

On top of that, postgres has a very flexible and extensible index system.

You guys can correct me if I'm wrong, but the key feature that's missing from Postgres's flexible indexing is the ability to maintain state across queries. Something like this:

select a, b, my_index_state() from foo where ... offset 100 limit 10 using my_index(prev_my_index_state);

The my_index_state() function would issue something like a "cookie", an opaque text or binary object that would record information about how it got from row 1 through row 99. When you issue the query above, it could start looking for row 100 WITHOUT reexamining rows 1-99.

This could be tricky in an OLTP environment, where the "cookie" could be invalidated by changes to the database. But in warehouse read-mostly or read-only environments, it could yield vastly improved performance for database web applications. If I'm not mistaken, Postgres can't do this (and neither can Oracle, MySQL, or any other RDBMS). I would love to be corrected.

The problem is that relational databases were invented before the web and its stateless applications. In the "good old days", you could connect to a database and work for hours, and in that environment cursors and such work well -- the RDBMS maintains the internal state of the indexing system. But in a web environment, state information is very difficult to maintain. There are all sorts of systems that try (Enterprise Java Beans, for example), but they're very complex.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance of count(*)
Brian Hurt wrote:
One of our biggest single problems is this very thing. It's not a Postgres problem specifically, but more embedded in the idea of a relational database: There are no "job status" or "rough estimate of results" or "give me part of the answer" features that are critical to many real applications.

For the "give me part of the answer", I'm wondering if cursors wouldn't work (and if not, why not)?

There is no mechanism in Postgres (or any RDB that I know of) to say, "Give me rows 1000 through 1010", that doesn't also execute the query on rows 1-1000. In other words, the RDBMS does the work for 1010 rows, when only 10 are needed -- 100 times more work than is necessary. Limit/Offset will return the correct 10 rows, but at the cost of doing the previous 1000 rows and discarding them.

Web applications are stateless. To use a cursor, you'd have to keep it around for hours or days, and create complex "server affinity" code to direct a user back to the same server of your server farm (where that cursor is being held), all on the chance that the user will come back and ask for rows 1000 through 1010. So a cursor isn't up to the task.

Craig

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance of count(*)
Michael Stone wrote: On Thu, Mar 22, 2007 at 01:30:35PM +0200, [EMAIL PROTECTED] wrote: approximated count? why? who would need it? where you can use it? Do a google query. Look at the top of the page, where it says "results N to M of about O". For user interfaces (which is where a lot of this count(*) stuff comes from) you quite likely don't care about the exact count... Right on, Michael. One of our biggest single problems is this very thing. It's not a Postgres problem specifically, but more embedded in the idea of a relational database: There are no "job status" or "rough estimate of results" or "give me part of the answer" features that are critical to many real applications. In our case (for a variety of reasons, but this one is critical), we actually can't use Postgres indexing at all -- we wrote an entirely separate indexing system for our data, one that has the following properties: 1. It can give out "pages" of information (i.e. "rows 50-60") without rescanning the skipped pages the way "limit/offset" would. 2. It can give accurate estimates of the total rows that will be returned. 3. It can accurately estimate the time it will take. For our primary business-critical data, Postgres is merely a storage system, not a search system, because we have to do the "heavy lifting" in our own code. (To be fair, there is no relational database that can handle our data.) Many or most web-based search engines face these exact problems. Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Determining server load from client
Dan Harris wrote:
I've found that it would be helpful to be able to tell how busy my dedicated PG server is ... I have seen some other nice back-end things exposed through PG functions (e.g. database size on disk) and wondered if there was anything applicable to this.

I'd write a simple pg-perl function to do this. You can access operating-system calls to find out the system's load. But notice that you need "Untrusted Perl" to do this, so you can only do it on a system where you trust every application that connects to your database. Something like this:

create or replace function get_stats() returns text as '
  # /proc/loadavg holds the 1-, 5-, and 15-minute load averages plus process counts
  open(STAT, "</proc/loadavg");
  my @stats = <STAT>;
  close STAT;
  return join("", @stats);
' language plperlu;

See http://www.postgresql.org/docs/8.1/interactive/plperl-trusted.html

Craig

---(end of broadcast)---
TIP 6: explain analyze is your friend
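Used from SQL it would look something like this; split_part is just one way to pull the 1-minute load average out of the string the function returns:

-- assumes the plperlu function above was created by a superuser
SELECT get_stats();

-- first whitespace-separated field of /proc/loadavg is the 1-minute load average
SELECT split_part(get_stats(), ' ', 1) AS load_1min;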
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Carlos,

Now, yet another thing that you (Craig) seem to be missing: you're simply putting the expense of all this time under the expenses column in exchange for solving the particular problem...

More like I was trying to keep my response short ;-). I think we're all in agreement on pretty much everything:

1. Understand your problem
2. Find potential solutions
3. Find the technical, economic AND situational tradeoffs
4. Choose the best course of action

My original comment was directed at item #3. I was trying to remind everyone that a simple cost analysis may point to solutions that simply aren't possible, given business constraints. I know we also agree that we should constantly fight corporate stupidity and short-sighted budgetary oversight. But that's a second battle, one that goes on forever. Sometimes you just have to get the job done within the current constraints.

'Nuff said.

Craig

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] strange performance regression between 7.4 and 8.1
Rodrigo Madera wrote: I would just like to note here that this is an example of inefficient strategy. We could all agree (up to a certain economical point) that Alex saved the most expensive one thousand dollars of his life. I don't know the financial status nor the size of your organization, but I'm sure that you have selected the path that has cost you more. In the future, an investment on memory for a (let's say) rather small database should be your first attempt. Alex may have made the correct, rational choice, given the state of accounting at most corporations. Corporate accounting practices and the budgetary process give different weights to cash and labor. Labor is fixed, and can be grossly wasted without (apparently) affecting the quarterly bottom line. Cash expenditures come directly off profits. It's shortsighted and irrational, but nearly 100% of corporations operate this way. You can waste a week of your time and nobody complains, but spend a thousand dollars, and the company president is breathing down your neck. When we answer a question on this forum, we need to understand that the person who needs help may be under irrational, but real, constraints, and offer appropriate advice. Sure, it's good to fight corporate stupidity, but sometimes you just want to get the system back online. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Estimate the size of the SQL file generated by pg_dump utility
Bill Moran wrote: I'm curious, what problem does the disclaimer cause? I wrote the following TOS for my personal system: https://www.potentialtech.com/cms/node/9 Excerpt of the relevant part: "If you send me email, you are granting me the unrestricted right to use the contents of that email however I see fit, unless otherwise agreed in writing beforehand. You have no rights to the privacy of any email that you send me. If I feel the need, I will forward emails to authorities or make their contents publicly available. By sending me email you consent to this policy and agree that it overrides any disclaimers or policies that may exist elsewhere." I have no idea if that's legally binding or not, but I've talked to a few associates who have some experience in law, and they all argue that email disclaimers probably aren't legally binding anyway -- so the result is undefined. No, it's not legally binding. Agreements are only binding if both parties agree, and someone sending you email has not consented to your statement. If I send you something with a copyright mark, you'd better respect it unless you have a signed agreement granting you rights. Federal law always wins. Disclaimers are bad for two reasons. First, they're powerless. Just because Acme Corp. attaches a disclaimer doesn't mean they've absolved themselves of responsibility for the actions of their employees. Second, they're insulting to the employees. It's a big red flag saying, "We, Acme Corp., hire clowns we don't trust, and THIS person may be one of them!" Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Identical Queries
Stephan Szabo wrote:
I tried posting to the bugs, and they said this is a better question for here. I have two queries. One runs in about 2 seconds. The other takes upwards of 2 minutes. I have a temp table that is created with 2 columns. This table is joined with the larger database of call detail records. However, these 2 queries are handled very differently.

Even for a temporary table, you should run ANALYZE on it after you fill it but before you query or join to it. I found out (the hard way) that a temporary table of just 100 rows will generate dramatically different plans before and after ANALYZE.

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] How to debug performance problems
Ray, I'd like to have a toolbox prepared for when performance goes south. I'm clueless. Would someone mind providing some detail about how to measure these four items Craig listed: I hope I didn't give the impression that these were the only thing to look at ... those four items just popped into my head, because they've come up repeatedly in this forum. There are surely more things that could be suspect; perhaps others could add to your list. You can find the answers to each of the four topics I mentioned by looking through the archives of this list. It's a lot of work. It would be really nice if there was some full-time employee somewhere whose job was to monitor this group and pull out common themes that were put into a nice, tidy manual. But this is open-source development, and there is no such person, so you have to dig in and find it yourself. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] How to debug performance problems
Andreas Tille wrote:
My web application was running fine for years without any problem and the performance was satisfying. Some months ago I added a table containing 4500000 data rows ... Since about two weeks the application became *drastically* slower and I urgently have to bring back the old performance. As I said I'm talking about functions accessing tables that did not increase over several years and should behave more or less the same.

Don't assume that the big table you added is the source of the problem. It might be, but more likely it's something else entirely. You indicated that the problem didn't coincide with creating the large table.

There are a number of recurring themes on this discussion group:

* A long-running transaction keeps vacuum from working.
* A table grows just enough to pass a threshold in the planner and a drastically different plan is generated.
* An index has become bloated and/or corrupted, and you need to run the REINDEX command.

And several other common problems.

The first thing is to find out which query is taking a lot of time. I'm no expert, but there have been several explanations on this forum recently how to find your top time-consuming queries. Once you find them, then EXPLAIN ANALYZE should get you started.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
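As a concrete starting point for "find out which query is taking a lot of time", something along these lines works on the 8.x releases; it assumes stats_command_string is turned on so that current_query is populated:

-- Backends that have been working on the same statement for more than a minute.
SELECT procpid, usename, datname,
       now() - query_start AS runtime,
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
  AND now() - query_start > interval '1 minute'
ORDER BY runtime DESC;

Setting log_min_duration_statement in postgresql.conf is another common way to capture the slowest queries in the server log.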
Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
Craig A. James wrote: The "idiom" to replace count() was "select col from tbl order by col desc limit 1". It worked miracles for my app. Sorry, I meant to write, "the idiom to replace MAX()", not count()... MAX() was the function that was killing me, 'tho count() also gave me problems. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
Guy,

The application is fairly straightforward, but as you say, what is working okay with BigDBMS isn't working as well under PG. I'm going to try other configuration suggestions made by others before I attempt logic changes. The core logic is unchangeable; millions of rows of data in a single table will be updated throughout the day. If PG can't handle high volume updates well, this may be a brick wall.

Here are a couple things I learned.

ANALYZE is VERY important, surprisingly so even for small tables. I had a case last week where a temporary "scratch" table with just 100 rows was joined to two more tables of 6 and 12 million rows. You might think that a 100-row table wouldn't need to be analyzed, but it does: Without the ANALYZE, Postgres generated a horrible plan that took many minutes to run; with the ANALYZE, it took milliseconds. Any time a table's contents change dramatically, ANALYZE it, ESPECIALLY if it's a small table. After all, changing 20 rows in a 100-row table has a much larger effect on its statistics than changing 20 rows in a million-row table.

Postgres functions like count() and max() are "plug ins", which has huge architectural advantages. But in pre-8.1 releases, there was a big speed penalty for this: functions like count() were very, very slow, requiring a full table scan. I think this is vastly improved from 8.0x to 8.1 and forward; others might be able to comment whether count() is now as fast in Postgres as Oracle. The "idiom" to replace count() was "select col from tbl order by col desc limit 1". It worked miracles for my app.

Postgres has explicit garbage collection via VACUUM, and you have to design your application with this in mind. In Postgres, update is delete+insert, meaning updates create garbage. If you have very "wide" tables, but only a subset of the columns are updated frequently, put these columns in a separate table with an index to join the two tables. For example, my original design was something like this:

   integer primary key
   very large text column
   ... a bunch of integer columns, float columns, and small text columns

The properties were updated by the application, but the large text column never changed. This led to huge garbage-collection problems as the large text field was repeatedly deleted and reinserted by the updates. By separating these into two tables, one with the large text column, and the other table with the dynamic, but smaller, columns, garbage is massively reduced, and performance increased, both immediately (smaller data set to update) and long term (smaller vacuums). You can use views to recreate your original combined columns, so the changes to your app are limited to where updates occur.

If you have a column that is *frequently* updated (say, for example, a user's last-access timestamp each time s/he hits your web server) then you definitely want this in its own table, not mixed in with the user's name, address, etc.

Partitioning in Postgres is more powerful than in Oracle. Use it if you can.

Partial indexes are VERY nice in Postgres, if your data is poorly distributed (for example, a mostly-NULL column with a small percentage of very important values).

I'm sure there are more things that others can contribute.

Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
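A bare-bones sketch of the "split the wide table" idea described above; all names are invented for illustration, and the view at the end recreates the original combined shape for read-only code:

-- Static part: the big text column lives here and is rarely updated.
CREATE TABLE item_static (
    item_id  serial PRIMARY KEY,
    big_text text
);

-- Dynamic part: small, frequently-updated columns only, so each UPDATE
-- creates a small dead row instead of recopying the big text value.
CREATE TABLE item_dynamic (
    item_id   integer PRIMARY KEY REFERENCES item_static(item_id),
    score     float,
    hit_count integer,
    last_seen timestamp
);

-- Read-only code can keep selecting the original combined shape.
CREATE VIEW item AS
    SELECT s.item_id, s.big_text, d.score, d.hit_count, d.last_seen
    FROM item_static s
    JOIN item_dynamic d USING (item_id);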
Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS
Guy Rouillier wrote:
I've got back access to my test system. I ran another test run with the same input data set. This time I put pg_xlog on a different RAID volume (the unused one that I suspect is a software RAID), and I turned fsync=off in postgresql.conf. I left the rest of the configuration alone (all foreign keys removed), etc. Unfortunately, this only dropped elapsed time down to about 28000 seconds (from 30000), still significantly more than BigDBMS. Additional info inline below.

Although tuning is extremely important, you also have to look at the application itself. I discovered (the hard way) that there's simply no substitute for a bit of redesign/rewriting of the schema and/or SQL statements.

Many of us who "grew up" on Oracle assume that their SQL is standard stuff, and that Oracle's optimizer is "the way it's done." But in fact most Oracle applications are tweaked and tuned to take advantage of Oracle's strengths and avoid its weaknesses. If you designed an application from the ground up to use Postgres, then migrated to Oracle, you would probably be equally frustrated by Oracle's poor performance on your Postgres-tuned application.

I don't know if you have access to the application's SQL, or the time to experiment a bit, but unless your schema is trivial and your SQL is boneheaded simple, you're not going to get equal performance from Postgres until you do some analysis of your application under real-world conditions, and optimize the problem areas.

In my case, I found just a few specific SQL constructs that, with a bit of tuning, made massive differences in performance.

Craig

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Trivial function query optimized badly
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: CREATE OR REPLACE FUNCTION cansmiles(text) RETURNS text AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles' LANGUAGE 'C' STRICT IMMUTABLE; Umm ... this is a single-argument function. db=> explain analyze select version_id, 'Brc1ccc2nc(cn2c1)C(=O)O' from version where version.isosmiles = cansmiles('Brc1ccc2nc(cn2c1)C(=O)O', 1); And this query is invoking some other, two-argument function; which apparently hasn't been marked IMMUTABLE, else it'd have been folded to a constant. Good catch, mystery solved. There are two definitions for this function, the first just a "wrapper" for the second with the latter parameter defaulting to "1". The second definition was missing the "IMMUTABLE" keyword. Thanks! Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
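If I'm reading this correctly, the fix amounted to adding IMMUTABLE to the two-argument declaration; the exact signature and entry point below are an assumption based on the description above, not the actual code:

-- hypothetical corrected declaration of the two-argument variant
CREATE OR REPLACE FUNCTION cansmiles(text, integer) RETURNS text
    AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles'
    LANGUAGE 'C' STRICT IMMUTABLE;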
Re: [PERFORM] Trivial function query optimized badly
Adam Rich wrote: Craig, What version of postgres are you using? I just tested this on PG 8.1.2 and was unable to reproduce these results. I wrote a simple function that returns the same text passed to it, after sleeping for 1 second. I use it in a where clause, like your example below, and regardless of the number of rows in the table, it still takes roughly 1 second, indicating to me the function is only called once. Sorry, I forgot that critical piece of info: I'm using 8.1.4. Your results would indicate that 8.1.2 creates a different plan than 8.1.4, or else there's some configuration parameter that's different between your installation and mine that causes a radically different plan to be used. I assume you vacuum/analyzed the table before you ran the query. Is it possible that your function really isn't immutable? Would PG realize this and fall back to treating it as VOLATILE ? Now that you say this, this seems more like a bug with the definition of IMMUTABLE. The function should only be called once if it's given a constant string, right? So the fact that Postgres called it once per row is just wrong. Craig ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] Trivial function query optimized badly
Well, once again I'm hosed because there's no way to tell the optimizer the cost for a user-defined function. I know this issue has already been raised (by me!) several times, but I have to remind everyone about this. I frequently must rewrite my SQL to work around this problem.

Here is the function definition:

CREATE OR REPLACE FUNCTION cansmiles(text) RETURNS text
  AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles'
  LANGUAGE 'C' STRICT IMMUTABLE;

Here is the bad optimization:

db=> explain analyze select version_id, 'Brc1ccc2nc(cn2c1)C(=O)O' from version where version.isosmiles = cansmiles('Brc1ccc2nc(cn2c1)C(=O)O', 1);
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Seq Scan on version  (cost=0.00..23.41 rows=1 width=4) (actual time=1434.281..1540.253 rows=1 loops=1)
   Filter: (isosmiles = cansmiles('Brc1ccc2nc(cn2c1)C(=O)O'::text, 1))
 Total runtime: 1540.347 ms
(3 rows)

I've had to break it up into two separate queries. Ironically, for large databases, Postgres does the right thing -- it computes the function, then uses the index on the "isosmiles" column. It's blazingly fast and very satisfactory. But for small databases, it apparently decides to recompute the function once per row, making the query N times slower (N = number of rows) than it should be! In this instance, there are 1000 rows, and a factor of 10^4 is a pretty dramatic slowdown... To make it work, I had to call the function separately then use its result to do the select.

db=> explain analyze select cansmiles('Brc1ccc2nc(cn2c1)C(=O)O', 1);
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.692..1.694 rows=1 loops=1)
 Total runtime: 1.720 ms
(2 rows)

db=> explain analyze select version_id, 'Brc1ccc2nc(cn2c1)C(=O)O' from version where version.isosmiles = 'Brc1ccc2nc(cn2c1)C(=O)O';
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Index Scan using i_version_isosmiles on version  (cost=0.00..5.80 rows=1 width=4) (actual time=0.114..0.117 rows=1 loops=1)
   Index Cond: (isosmiles = 'Brc1ccc2nc(cn2c1)C(=O)O'::text)
 Total runtime: 0.158 ms
(3 rows)

Craig

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] glibc double-free error
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: I'm using pg_dump/pg_restore to quickly copy databases between servers. But my server keeps crashing when I run pg_restore: glibc detected *** double free or corruption (!prev): 0x0a00b1a0 What can I do to help diagnose this problem? Either dig into it yourself with gdb, or send me a not-too-large example dump file (off-list)... Hmmm ... after moving to our production server, four hours of work copying a dozen databases, there hasn't been a single glibc problem. The development server is Fedora Core 3, the productions server is Fedora Core 4. Unless it happens on FC4, I'm diagnosing that it's a glibc bug or incompatibility that was already fixed. Thanks, Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] glibc double-free error
I'm using pg_dump/pg_restore to quickly copy databases between servers. But my server keeps crashing when I run pg_restore:

  glibc detected *** double free or corruption (!prev): 0x0a00b1a0

Postgres: 8.1.4
Linux: 2.6.12-1.1381_FC3
glibc: 2.3.6-0.fc3.1
Server: Dell
CPU: Xeon 2.80GHz
Memory: 4 GB

This is pretty repeatable. Any particular pg_dump file that causes the crash will cause it every time it is used, and it happens with a lot of my databases.

What can I do to help diagnose this problem?

Craig

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] New to PostgreSQL, performance considerations
Ron wrote:
We are not going to get valuable contributions nor help people become more valuable to the community by "flaming them into submission". Let's support getting definitive evidence. No matter who brings it to the table ;-)

Thanks, Ron, for a voice of respect and reason. Since I first started using Usenet back in 1984, inexplicable rudeness has been a plague on otherwise civilized people. We're a community, and we're all in this to help one another. Sometimes we give good advice, and sometimes even those "wearing the mantle of authority" can make boneheaded comments. I know I do, and when it happens, I always appreciate it when I'm taken to task with good humor and tolerance.

When someone comes to this forum with an idea you disagree with, no matter how brash or absurd their claims, it's so easy to challenge them with grace and good humor, rather than chastising with harsh words and driving someone from our community. If you're right, you will have taught a valuable lesson to someone. And if on occasion a newcomer shows us something new, then we've all learned. Either way, we have a new friend and contributor to the community.

Craig

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Bad iostat numbers
Alex Turner wrote: The problem I see with software raid is the issue of a battery backed unit: If the computer loses power, then the 'cache' which is held in system memory, goes away, and fubars your RAID. I'm not sure I see the difference. If data are cached, they're not written whether it is software or hardware RAID. I guess if you're writing RAID 1, the N disks could be out of sync, but the system can synchronize them once the array is restored, so that's no different than a single disk or a hardware RAID. If you're writing RAID 5, then the blocks are inherently error detecting/correcting, so you're still OK if a partial write occurs, right? I'm not familiar with the inner details of software RAID, but the only circumstance I can see where things would get corrupted is if the RAID driver writes a LOT of blocks to one disk of the array before synchronizing the others, but my guess (and it's just a guess) is that the writes to the N disks are tightly coupled. If I'm wrong about this, I'd like to know, because I'm using software RAID 1 and 1+0, and I'm pretty happy with it. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] When to vacuum a table?
Rod Taylor wrote: Just a minor clarification here: INSERT does not create dead rows, only UPDATE and DELETE do. Thus, if you only insert rows, you do not need to vacuum (although you probably need to analyze). Is there no real-time garbage collection at all in Postgres? And if so, is this because nobody has had time to implement garbage collection, or for a more fundamental reason, or because VACUUM is seen as sufficient? I'm just curious ... Vacuum has always seemed to me like an ugly wart on the pretty face of Postgres. (I say this even though I implemented an identical solution on a non-relational chemistry database system a long time ago. I didn't like it then, either.) Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Postgres server crash
You realize that it had to be turned on explicitly on IRIX, right? But don't let facts get in the way of a good rant... On the contrary, with Irix 4 and earlier it was the default, but it caused so many problems that SGI switched the default to OFF in IRIX 5. But because it had been available for so long, many important apps had come to rely on it, so most sites had to immediately re-enable virtual swap on every IRIX 5 server that came in. Admins just got used to doing it, so it became a "default" at most sites, and admins often couldn't be convinced to disable it for database server machines, because "That's our standard for IRIX configuration." I worked at a big molecular modeling/visualization company; our visualization programs *required* virtual swap, and our server programs *prohibited* virtual swap. Imagine how our sales people felt about that, telling customers that they'd have to buy two $30,000 machines just because of one kernel parameter. Of course, they didn't, and our server apps took the heat as being "unreliable." SGI called it "virtual swap" which I always thought was a hoot. You have virtual memory, which is really your swap space, and then virtual swap, which is some kind of dark hole... Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Postgres server crash
Michael Stone wrote:
At one point someone complained about the ability to configure, e.g., IRIX to allow memory overcommit. I worked on some large IRIX installations where full memory accounting would have required on the order of 100s of gigabytes of swap, due to large shared memory allocations.

These were mostly scientific and graphical apps where reliability took a back seat to performance and to program complexity. They would allocate 100's of GB of swap space rather than taking the time to design proper data structures. If the program crashed every week or two, no big deal -- just run it again. Overallocating memory is a valuable technique for such applications.

But overallocating memory has no place in a server environment. When memory overcommitment is allowed, it is impossible to write a reliable application, because no matter how carefully and correctly you craft your code, someone else's program that leaks memory like Elmer Fudd's rowboat after his shotgun goes off, can kill your well-written application. Installing Postgres on such a system makes Postgres unreliable.

Tom Lane wrote:
That might have been right when it was written (note the reference to a 2.2 Linux kernel), but it's 100% wrong now. [Setting /proc/sys/vm/overcommit_memory to] 0 is the default, not-safe setting.

I'm surprised that the Linux kernel people take such an uncritical view of reliability that they set, as *default*, a feature that makes Linux an unreliable platform for servers.

And speaking of SGI, this very issue was among the things that sank the company. As the low-end graphics cards ate into their visualization market, they tried to become an Oracle Server platform. Their servers were *fast*. But they crashed -- a lot. And memory-overcommit was one of the reasons. IRIX admins would brag that their systems only crashed every couple of weeks. I had HP and Sun systems that would run for years.

Craig

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Postgres server crash
Richard Troy wrote:
I did that - spent about two f-ing hours looking for what I wanted. (Guess I entered poor choices for my searches. -frown- ) There are a LOT of articles that TALK ABOUT OOM, but precious few actually tell you what you can do about it.

Trying to save you some time: On linux you can use the sysctl utility to muck with vm.overcommit_memory; you can disable the "feature." Google _that_ for more info!

Here's something I found googling for "memory overcommitment"+linux:

http://archives.neohapsis.com/archives/postfix/2000-04/0512.html

From /usr/src/linux/Documentation/sysctl/vm.txt:

"overcommit_memory: This value contains a flag that enables memory overcommitment. When this flag is 0, the kernel checks before each malloc() to see if there's enough memory left. If the flag is nonzero, the system pretends there's always enough memory."

This flag is settable in /proc/sys/vm. Lo and behold, here it is on my system:

$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio
50

Craig

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] Postgres server crash
By the way, in spite of my questions and concerns, I was *very* impressed by the recovery process. I know it might seem like old hat to you guys to watch the WAL in action, and I know on a theoretical level it's supposed to work, but watching it recover 150 separate databases, and find and fix a couple of problems was very impressive. It gives me great confidence that I made the right choice to use Postgres. Richard Huxton wrote: 2. Why didn't the database recover? Why are there two processes that couldn't be killed? I'm guessing it didn't recover *because* there were two processes that couldn't be killed. Responsibility for that falls to the operating-system. I've seen it most often with faulty drivers or hardware that's being communicated with/written to. However, see below. It can't be a coincidence that these were the only two processes in a SELECT operation. Does the server disable signals at critical points? I'd make a wild guess that this is some sort of deadlock problem -- these two servers have disabled signals for a critical section of SELECT, and are waiting for something from the postmaster, but postmaster is dead. This is an ordinary system, no hardware problems, stock RH FC3 kernel, stock PG 8.1.4, with 4 GB memory, and at the moment the database is running on a single SATA disk. I'm worried that a production server can get into a state that requires manual intervention to recover. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Postgres server crash
Russell Smith wrote: For the third time today, our server has crashed... I would guess it's the linux OOM if you are running linux. You need to turn off killing of processes when you run out of memory. Are you getting close to running out of memory? Good suggestion, it was a memory leak in an add-on library that we plug in to the Postgres server. OOM? Can you give me a quick pointer to what this acronym stands for and how I can reconfigure it? It sounds like a "feature" old UNIX systems like SGI IRIX had, where the system would allocate virtual memory that it didn't really have, then kill your process if you tried to use it. I.e. malloc() would never return NULL even if swap space was over allocated. Is this what you're talking about? Having this enabled on a server is deadly for reliability. Thanks, Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] Postgres server crash
For the third time today, our server has crashed, or frozen, actually something in between. Normally there are about 30-50 connections because of mod_perl processes that keep connections open. After the crash, there are three processes remaining: # ps -ef | grep postgres postgres 23832 1 0 Nov11 pts/1 00:02:53 /usr/local/pgsql/bin/postmaster -D /postgres/main postgres 1200 23832 20 14:28 pts/1 00:58:14 postgres: pubchem pubchem 66.226.76.106(58882) SELECT postgres 4190 23832 25 14:33 pts/1 01:09:12 postgres: asinex asinex 66.226.76.106(56298) SELECT But they're not doing anything: No CPU time consumed, no I/O going on, no progress. If I try to connect with psql(1), it says: psql: FATAL: the database system is in recovery mode And the server log has: LOG: background writer process (PID 23874) was terminated by signal 9 LOG: terminating any other active server processes LOG: statistics collector process (PID 23875) was terminated by signal 9 WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. HINT: In a moment you should be able to reconnect to the database and repeat your command. WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally ... repeats about 50 times, one per process. Questions: 1. Any idea what happened and how I can avoid this? It's a *big* problem. 2. Why didn't the database recover? Why are there two processes that couldn't be killed? 3. Where did the "signal 9" come from? (Nobody but me ever logs in to the server machine.) Help! Thanks, Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Hundreds of database and FSM
A few months ago a couple guys got "bragging rights" for having the most separate databases. A couple guys claimed several hundred databases and one said he had several thousand databases. The consensus was that Postgres has no problem handling many separate databases. I took that to heart and redesigned our system; we now have about 150 "primary data sources" that are used to build a couple of "warehouses" that our customers actually search. Each database has about 20 tables. The total size (all databases and all tables together) is not huge, about 40 million rows. Eventually the warehouse (customer accessible) databases will be moved to separate servers, configured and indexed specifically for the task. The only problem I've encountered is messages in the log: NOTICE: number of page slots needed (131904) exceeds max_fsm_pages (100000) HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 131904. So I dutifully followed this advice: max_fsm_pages = 320000 max_fsm_relations = 20000 This is based on our current 150 databases times 20 tables, or 3000 tables total. But I wasn't sure if sequences count as "relations", which would double the number. So I set it at 20K relations to allow for growth. Is there anything else I need to worry about? What happens if I go to, say, 500 databases (aside from increasing the FSM numbers even more)? 1000 databases? The servers are 4 GB, dual Xeon, Postgres 8.1.4 on Linux FC4. Thanks, Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
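A quick way to sanity-check the relation count behind a max_fsm_relations setting like this is to ask pg_class directly ('r' = table, 'i' = index, 'S' = sequence, 't' = TOAST table). This is only a rough guide, and note that the free space map tracks tables and indexes, not sequences:

    SELECT relkind, count(*)
      FROM pg_class
     GROUP BY relkind
     ORDER BY relkind;

Since pg_class is per-database, run it in each database and add the totals together.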
Re: [PERFORM] Best COPY Performance
Spiegelberg, Greg wrote: The data are on two different computers, and I do processing of the data as it passes through the application. Otherwise, the INSERT INTO ... SELECT is my first choice. Would dblink() help in any way? It might if perl wasn't so damned good at this. ;-) Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
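For anyone curious what the dblink() route would look like, a minimal sketch is below, assuming contrib/dblink is installed on the receiving side; the connection string, table names, and column names are all placeholders. It avoids the round trip through the client, but it also gives up the chance to transform each row in Perl along the way:

    INSERT INTO local_copy (id, payload)
    SELECT id, payload
      FROM dblink('host=otherhost dbname=sourcedb user=copyuser',
                  'SELECT id, payload FROM remote_table')
           AS t(id integer, payload text);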
Re: [PERFORM] Best COPY Performance
Jim C. Nasby wrote: Wait... so you're using perl to copy data between two tables? And using a cursor to boot? I can't think of any way that could be more inefficient... What's wrong with a plain old INSERT INTO ... SELECT? Or if you really need to break it into multiple transaction blocks, at least don't shuffle the data from the database into perl and then back into the database; do an INSERT INTO ... SELECT with that same where clause. The data are on two different computers, and I do processing of the data as it passes through the application. Otherwise, the INSERT INTO ... SELECT is my first choice. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Best COPY Performance
Jim C. Nasby wrote: Well, given that perl is using an entire CPU, it sounds like you should start looking either at ways to remove some of the overhead from perl, or to split that perl into multiple processes. I use Perl for big database copies (usually with some processing/transformation along the way) and I've never seen 100% CPU usage except for brief periods, even when copying BLOBS and such. My typical copy divides operations into blocks, for example doing N = 0 while (more rows to go) { begin transaction select ... where primary_key > N order by primary_key limit 1000 while (fetch a row) insert into ... N = (highest value found in last block) commit } Doing it like this in Perl should keep Postgres busy, with Perl using only moderate resources. If you're seeing high Perl CPU usage, I'd look first at the Perl code. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
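Spelled out in SQL, one iteration of that block loop is nothing more exotic than the sketch below (table and column names are invented, and 42000 stands for whatever key value the previous block ended on). Because each block is keyed on the primary key rather than an OFFSET, every block is a cheap index range scan no matter how deep into the table you are:

    BEGIN;
    SELECT primary_key, payload
      FROM source_table
     WHERE primary_key > 42000
     ORDER BY primary_key
     LIMIT 1000;
    -- the client transforms each row, INSERTs it into the target table,
    -- remembers the highest primary_key it saw, then COMMITs and repeats
    COMMIT;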
Re: [PERFORM] Best COPY Performance
Jim C. Nasby wrote: http://stats.distributed.net used to use a perl script to do some transformations before loading data into the database. IIRC, when we switched to using C we saw 100x improvement in speed, so I suspect that if you want performance perl isn't the way to go. I think you can compile perl into C, so maybe that would help some. I use Perl extensively, and have never seen a performance problem. I suspect the perl-to-C "100x improvement" was due to some other factor, like a slight change in the schema, indexes, or the fundamental way the client (C vs Perl) handled the data during the transformation, or just plain bad Perl code. Modern scripting languages like Perl and Python make programmers far, far more productive than the bad old days of C/C++. Don't shoot yourself in the foot by reverting to low-level languages like C/C++ until you've exhausted all other possibilities. I only use C/C++ for intricate scientific algorithms. In many cases, Perl is *faster* than C/C++ code that I write, because I can't take the time (for example) to write the high-performance string manipulation that has been fine-tuned and extensively optimized in Perl. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Optimization of this SQL sentence
Alexander Staubo wrote: On Oct 17, 2006, at 17:10, Craig A. James wrote: These tables are particularly egregious examples of ignorant database design. You need to understand the relational model ... This email is a *particularly* egregious example of rudeness. You owe Mr. Staubo, and the Postgres community, an apology. I'm sorry you feel that way, but I don't think I was out of line. ... If you think the database design in question is *not* ignorant database design, please do explain why, but on technical grounds. (Ignorance, of course, is not a sin.) This is not about design. It's about someone who came for help, and got a derogatory remark. Is it really so hard to be helpful *and* use polite, encouraging language? Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Optimization of this SQL sentence
These tables are particularly egregious examples of ignorant database design. You need to understand the relational model ... This email is a *particularly* egregious example of rudeness. You owe Mr. Staubo, and the Postgres community, an apology. There is absolutely no reason to insult people who come to this forum for help. That's why the forum is here, to help people who are "ignorant" and want to improve their knowledge. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Hints proposal
So let's cut to the bone: If someone thinks a proposal is a bad idea, and they're volunteering their time on an open-source project, why would they implement the proposal? In all the heat and smoke, I believe there are two basic conclusions we all agree on. 1. Optimizer: a) A perfect optimizer would be a wonderful thing b) Optimization is a hard problem c) Any problem that can be solved by improving the optimizer *should* be solved by improving the optimizer. 2. Hints a) On an aesthetic/theoretical level, hints suck. They're ugly and rude b) On a practical level, introducing hints will cause short- and long-term problems c) Hints would help DBAs solve urgent problems for which there is no other solution The disagreements revolve around the degree to which 1 conflicts with 2. 1. Developers feel very strongly about 2(a) and 2(b). 2. DBAs "in the trenches" feel very strongly about 2(c). So my question is: Is there any argument that can be made to persuade those of you who are volunteering your time on the optimizer to even consider a HINTS proposal? Has all this discussion changed your perspective on 2(c), and why it really matters to some of us? Are we just wasting our time, or is this a fruitful discussion? Thanks, Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Hints proposal
Josh Berkus wrote: I actually think the way to attack this issue is to discuss the kinds of errors the planner makes, and what tweaks we could do to correct them. Here's the ones I'm aware of: -- Incorrect selectivity of WHERE clause -- Incorrect selectivity of JOIN -- Wrong estimate of rows returned from SRF -- Incorrect cost estimate for index use Can you think of any others? The one that started this discussion: Lack of cost information for functions. I think this feature is a good idea independent of the whole HINTS discussion. At a minimum, a rough categorization is needed, such as "Lightning fast / Fast / Medium / Slow / Ludicrously slow", with some sort of milliseconds or CPU cycles associated with each category. Or perhaps something like, "This is (much faster|faster|same as|slower|much slower) than reading a block from the disk." If I understand Tom and others, the planner already is capable of taking advantage of this information, it just doesn't have it yet. It could be part of the CREATE FUNCTION command. CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text AS '/usr/local/pgsql/lib/foobar.so', 'foobar' COST LUDICROUSLY_SLOW LANGUAGE 'C' STRICT; Better yet ('tho I have no idea how hard this would be to implement...) would be an optional second function with the same parameter signature as the main function, but it would return a cost estimate: CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text AS '/usr/local/pgsql/lib/foobar.so', 'foobar' COST foobar_cost LANGUAGE 'C' STRICT; The planner could call it with the same parameters it was about to use, and get an accurate estimate for the specific operation that is about to be done. In my particular case (running an NP-complete problem), there are cases where I can determine ahead of time that the function will be fast, but in most cases it is *really* slow. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
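As it happens, PostgreSQL 8.3 and later accept a numeric COST clause (and a ROWS estimate for set-returning functions) on CREATE FUNCTION; the number is measured in units of cpu_operator_cost, with a default of 1 for C-language functions and 100 for procedural-language functions. In that syntax the example above becomes something like the sketch below, where 100000 is only an illustrative value:

    CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text
        AS '/usr/local/pgsql/lib/foobar.so', 'foobar'
        LANGUAGE C STRICT
        COST 100000;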
Re: [PERFORM] Simple join optimized badly?
Brian Herlihy wrote: PG does support hints actually.. The only thing is, the hints are expressed in an obscure, ad-hoc and implementation-dependent language. For example, the "Don't use index X" hint (the one I used) can be accessed by replacing your index with an index on values derived from the actual index... And then there's select ... from (select ... offset 0) where the "offset 0" prevents any rewriting between the two levels of query. This replaces joins and AND clauses where the planner makes the wrong choice of join order or filtering. I grepped my code and found four of these (all workarounds for the same underlying problem). Imagine I got run over by a train, and someone was reading my code. Which would be easier for them to maintain: Code with weird SQL, or code with sensible, well-written SQL and explicit hints? Luckily for my (hypothetical, I hope) successor, I put massive comments in my code explaining the strange SQL. The bad applications are ALREADY HERE. And they're WORSE to maintain than if we had a formal hint language. The argument that hints lead to poor applications is true. But lack of hints leads to worse applications. Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
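For anyone who hasn't run into it, the "offset 0" trick looks roughly like the sketch below (table and column names invented for the example). The planner will not flatten a subquery containing OFFSET or LIMIT into the outer query, so the filter inside the subquery is evaluated where it is written instead of being reordered:

    SELECT b.id, b.payload
      FROM (SELECT id
              FROM small_table
             WHERE expensive_condition
            OFFSET 0) AS s
      JOIN big_table b ON b.id = s.id;

It works, but it is a side effect of the subquery-flattening rules rather than a documented, supported hint, which is exactly the maintenance problem being described.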
Re: [PERFORM] Simple join optimized badly?
Mark Kirkwood wrote: The result? I can't use my function in any WHERE clause that involves any other conditions or joins. Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead. this is an argument for cost-for-functions rather than hints AFAICS. Perhaps you scanned past what I wrote a couple paragraphs farther down. I'm going to repeat it because it's the KEY POINT I'm trying to make: Craig James wrote: Now you might argue that function-cost needs to be added to the optimizer's arsenal of tricks. And I'd agree with you: That WOULD be a better solution than hints. But I need my problem solved TODAY, not next year. Hints can help solve problems NOW that can be brought to the PG team's attention later, and in the mean time let me get my application to work. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Simple join optimized badly?
Bruce Momjian wrote: I can do 100! on my computer, but can't do it in my head. A poor example. 100! is a simple repetitive calculation, something computers are very good at. Optimizing an SQL query is very difficult, and a completely different class of problem. The fact is the PG team has done a remarkable job with the optimizer so far. I'm usually very happy with its plans. But humans still beat computers at many tasks, and there are unquestionably areas where the PG optimizer is not yet fully developed. When the optimizer reaches its limits, and you have to get your web site running, a HINT can be invaluable. I said something in a previous version of this topic, which I'll repeat here. The PG documentation for HINTs should be FILLED with STRONG ADMONITIONS to post the problematic queries here before resorting to hints. There will always be fools who abuse hints. Too bad for them, but don't make the rest of us suffer for their folly. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Simple join optimized badly?
... and add 100 other problems. Hints are used because the DBA thinks that they are smarter than the optimizer; 99% of the time, they are wrong. Just try manually optimizing a complex query, you'll see -- with three join types, several scan types, aggregates, bitmaps, internal and external sorts, and the ability to collapse subqueries it's significantly more than a human can figure out accurately. Sorry, this is just wrong, wrong, wrong. I've heard this from several PG developers every time hints have come up in my roughly eighteen months as a PG application developer. And in between every assertion that "the application programmers aren't as smart as the optimizer", there are a dozen or two examples where posters to this list are told to increase this setting, decrease that one, adjust these other two, and the end result is to get the plan that the application programmer -- AND the PG professionals -- knew was the right plan to start with. People are smarter than computers. Period. Now I'll agree that the majority, perhaps the great majority, of questions to this group should NOT be solved with hints. You're absolutely right that in most cases hints are a really bad idea. People will resort to hints when they should be learning better ways to craft SQL, and when they should have read the configuration guides. But that doesn't alter the fact that many, perhaps most, complicated application will, sooner or later, run into a showstopper case where PG just optimizes wrong, and there's not a damned thing the app programmer can do about it. My example, discussed previously in this forum, is a classic. I have a VERY expensive function (it's in the class of NP-complete problems, so there is no faster way to do it). There is no circumstance when my function should be used as a filter, and no circumstance when it should be done before a join. But PG has no way of knowing the cost of a function, and so the optimizer assigns the same cost to every function. Big disaster. The result? I can't use my function in any WHERE clause that involves any other conditions or joins. Only by itself. PG will occasionally decide to use my function as a filter instead of doing the join or the other WHERE conditions first, and I'm dead. The interesting thing is that PG works pretty well for me on big tables -- it does the join first, then applies my expensive functions. But with a SMALL (like 50K rows) table, it applies my function first, then does the join. A search that completes in 1 second on a 5,000,000 row database can take a minute or more on a 50,000 row database. Instead, I have to separate the WHERE terms into two SQL statements, and do the join myself. I do the first half of my query, suck it all into memory, do the second half, suck it into memory, build a hash table and join the two lists in memory, then take the joined results and apply my function to it. This is not how a relational database should work. It shouldn't fall over dead just when a table's size SHRINKS beyond some threshold that causes the planner to switch to a poor plan. Since these tables are all in the same database, adjusting configuration parameters doesn't help me. And I suppose I could use SET to disable various plans, but how is that any different from a HINT feature? Now you might argue that function-cost needs to be added to the optimizer's arsenal of tricks. And I'd agree with you: That WOULD be a better solution than hints. But I need my problem solved TODAY, not next year. 
Hints can help solve problems NOW that can be brought to the PG team's attention later, and in the mean time let me get my application to work. Sorry if I seem particularly hot under the collar on this one. I think you PG designers have created a wonderful product. It's not the lack of hints that bothers me, it's the "You app developers are dumber than we are" attitude. We're not. Some of us know what we're doing, and we need hints. If it is just a matter of resources, that's fine. I understand that these things take time. But please don't keep dismissing the repeated and serious requests for this feature. It's important. Thanks for listening. Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] Simple join optimized badly?
I have two tables, SAMPLE and HITLIST that when joined, generate a monsterous sort. HITLIST_ROWS has about 48,000 rows SAMPLE has about 16 million rows The joined column is indexed in SAMPLE HITLIST_ROWS is a scratch table which is used a few times then discarded. HITLIST_ROWS has no indexes at all There are two plans below. The first is before an ANALYZE HITLIST_ROWS, and it's horrible -- it looks to me like it's sorting the 16 million rows of the SEARCH table. Then I run ANALYZE HITLIST_ROWS, and the plan is pretty decent. First question: HITLIST_ROWS so small, I don't understand why the lack of ANALYZE should cause SAMPLE's contents to be sorted. Second question: Even though ANALYZE brings it down from 26 minutes to 47 seconds, a huge improvement, it still seems slow to me. Its going at roughly 1 row per millisecond -- are my expectations too high? This is a small-ish Dell computer (Xeon), 4 GB memory, with a four-disk SATA software RAID0 (bandwidth limited to about 130 MB/sec due to PCI cards). Other joins of a similar size seem much faster. It looks like I'll need to do an ANALYZE every time I modify HITLIST_ROWS, which seems like a waste because HITLIST_ROWS is rarely used more than once or twice before being truncated and rebuilt with new content. (HITLIST_ROWS can't be an actual temporary table, though, because it's a web application and each access is from a new connection.) This is Postgres 8.0.3. (We're upgrading soon.) Thanks, Craig explain analyze select t.SAMPLE_ID from SAMPLE t, HITLIST_ROWS ph where t.VERSION_ID = ph.ObjectID); QUERY PLAN --- Merge Join (cost=4782.35..1063809.82 rows=613226 width=4) (actual time=174.212..1593886.582 rows=176294 loops=1) Merge Cond: ("outer".version_id = "inner".objectid) -> Index Scan using i_sample_version_id on sample t (cost=0.00..1008713.68 rows=16446157 width=8) (actual time=0.111..1571911.208 rows=16446157 loops=1) -> Sort (cost=4782.35..4910.39 rows=51216 width=4) (actual time=173.669..389.496 rows=176329 loops=1) Sort Key: ph.objectid -> Seq Scan on hitlist_rows_378593 ph (cost=0.00..776.16 rows=51216 width=4) (actual time=0.015..90.059 rows=48834 loops=1) Total runtime: 1594093.725 ms (7 rows) chmoogle2=> analyze HITLIST_ROWS; ANALYZE chmoogle2=> explain analyze select t.SAMPLE_ID from SAMPLE t, HITLIST_ROWS ph where t.VERSION_ID = ph.ObjectID; QUERY PLAN --- Hash Join (cost=874.43..457976.83 rows=584705 width=4) (actual time=302.792..47796.719 rows=176294 loops=1) Hash Cond: ("outer".version_id = "inner".objectid) -> Seq Scan on sample t (cost=0.00..369024.57 rows=16446157 width=8) (actual time=46.344..26752.343 rows=16446157 loops=1) -> Hash (cost=752.34..752.34 rows=48834 width=4) (actual time=149.548..149.548 rows=0 loops=1) -> Seq Scan on hitlist_rows_378593 ph (cost=0.00..752.34 rows=48834 width=4) (actual time=0.048..80.721 rows=48834 loops=1) Total runtime: 47988.572 ms (6 rows) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] RAID 0 not as fast as expected
Alan Hodgson wrote: On Thursday 14 September 2006 11:05, "Craig A. James" <[EMAIL PROTECTED]> wrote: I'm experiment with RAID, looking for an inexpensive way to boost performance. I bought 4 Seagate 7200.9 120 GB SATA drives and two SIIG dual-port SATA cards. (NB: I don't plan to run RAID 0 in production, probably RAID 10, so no need to comment on the failure rate of RAID 0.) Are those PCI cards? If yes, it's just a bus bandwidth limit. Ok, that makes sense. One SATA disk = 52 MB/sec 4-disk RAID0 = 106 MB/sec PCI at 33 MHz x 32 bits (4 bytes) = 132 MB/sec. I guess getting to 80% of the theoretical speed is as much as I should expect. Thanks, Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] RAID 0 not as fast as expected
I'm experimenting with RAID, looking for an inexpensive way to boost performance. I bought 4 Seagate 7200.9 120 GB SATA drives and two SIIG dual-port SATA cards. (NB: I don't plan to run RAID 0 in production, probably RAID 10, so no need to comment on the failure rate of RAID 0.) I used this raw serial-speed test: time sh -c "dd if=/dev/zero of=./bigfile bs=8k count=100 && sync" (unmount/remount) time sh -c "dd if=./bigfile of=/dev/null bs=8k count=100 && sync" Which showed that the RAID 0 4-disk array was almost exactly twice as fast as each disk individually. I expected 4X performance for a 4-disk RAID 0. My suspicion is that each of these budget SATA cards is bandwidth limited; they can't actually handle two disks simultaneously, and I'd need to get four separate SATA cards to get 4X performance (or a more expensive card such as the Areca someone mentioned the other day). On the other hand, it "feels like" (using our application) the seek performance is quite a bit better, which I'd expect given my hypothesis about the SIIG cards. I don't have concrete benchmarks on seek speed. Thanks, Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Poor performance on seq scan
tin tout KB/t tps MB/s KB/t tps MB/s us ni sy in id 1 14 128.00 1 0.10 128.00 1 0.10 5 0 94 1 0 0 12 123.98 104 12.56 123.74 104 12.56 8 0 90 2 0 0 12 125.66 128 15.75 125.26 128 15.68 10 0 85 6 0 0 12 124.66 129 15.67 124.39 129 15.64 12 0 85 3 0 0 12 117.13 121 13.87 117.95 121 13.96 12 0 84 5 0 0 12 104.84 118 12.05 105.84 118 12.19 10 0 87 2 0 Why is that showing 85+ percent *system* CPU time?? I could believe a lot of idle CPU if the query is I/O bound, or a lot of user time if PG was being a hog about doing the ~~ comparisons (not too unlikely BTW). But if the kernel is eating all the CPU, there's something very wrong, and I don't think it's Postgres' fault. There IS a bug for SATA disk drives in some versions of the Linux kernel. On a lark I ran some of the I/O tests in this thread, and much to my surprise discovered my write speed was 6 MB/sec ... ouch! On an identical machine, different kernel, the write speed was 54 MB/sec. A couple of hours of research turned up this: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=168363 The fix for me was to edit /boot/grub/grub.conf, like this: kernel /vmlinuz-2.6.12-1.1381_FC3 ro root=LABEL=/ rhgb quiet \ ramdisk_size=1200 ide0=noprobe ide1=noprobe Notice the "ideX=noprobe". Instant fix -- after reboot the disk write speed jumped to what I expected. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PERFORM] Moving a tablespace
Suppose, hypothetically of course, someone lacked foresight, and put a tablespace somewhere with a dumb name, like "/disk2", instead of using a symbolic link with a more descriptive name. And then /disk2 needs to be renamed, say to "/postgres_data", and this (hypothetical) DBA realizes he has made a dumb mistake. Is there a way to move a tablespace to a new location without a dump/restore? I, er, this hypothetical guy, knows he can move it and put a symbolic link in for /disk2, but this is somewhat unsatisfactory since "/disk2" would have to exist forever. Thanks, Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Forcing using index instead of sequential scan?
Michael Stone wrote: On Sat, Jul 22, 2006 at 10:26:53AM -0700, Craig A. James wrote: This causes massive file-system activity and flushes all files that the kernel has cached. If you run this between each Postgres test (let it run for a couple minutes), it gives you an apples-to-apples comparison between successive benchmarks, and eliminates the effects of caching. Assuming a system with small ram or an unusually large system installation. Unmounting is a much more realiable mechanism. Indeed, but it only works if you can. For example, in my small-ish installation, my WAL and system tables are mounted on the root disk. Or someone might not have super-user access. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Forcing using index instead of sequential scan?
The real issue here is caching across successive queries, an effect that Postgres doesn't deal with very well at the moment. If you run these queries from a standing start (freshly booted machine) you'll likely find that the indexscan plan is indeed slower than the seqscan/hash plan, just like the planner thinks. Here's a little trick I learned to speed up this test. find / -type f -exec grep foobar {} \; This causes massive file-system activity and flushes all files that the kernel has cached. If you run this between each Postgres test (let it run for a couple minutes), it gives you an apples-to-apples comparison between successive benchmarks, and eliminates the effects of caching. If you run this as a regular user (NOT super-user or 'postgres'), you won't have permission to access your Postgres files, so you're guaranteed they'll be flushed from the cache. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] increment Rows in an SQL Result Set postgresql
Hassan, 1. I have a function that given two parameters produces an arbitrary id, and text. However arbitrary the id and text are, they are in a certain order. i.e. it is imperative that whatever processing I do, the order is preserved. What type of function is this? Did you write it in C? An SQL procedure? If the function is written in C, you can create a static local variable which you increment every time you call your function, and which you return along with your other two values. As long as your client is connected to the back-end server, you're guaranteed that it's a single process, and it's not multi-threaded, so this is a safe approach. However, note that if you disconnect and reconnect, your counter will be reset to zero. If your function is written in a different language or is a procedure, you might create a sequence that your function can query. The trick is that it is the function itself that must return the incremented value, i.e. you must return three, not two, values from your function. That way, you're not relying on any specific features of the planner, so your three values will stick together. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
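A sketch of the sequence-based variant, assuming a set-returning function gen_rows(a, b) declared to return SETOF record and yielding (id, txt) pairs; all names here are invented, only the pattern matters. The query restarts a scratch sequence and numbers the rows as they are produced, so the counter travels with the data:

    CREATE SEQUENCE gen_rows_counter;

    SELECT setval('gen_rows_counter', 1, false);   -- restart numbering at 1
    SELECT nextval('gen_rows_counter') AS ord, g.id, g.txt
      FROM gen_rows('param1', 'param2') AS g(id integer, txt text);

If this result is then pushed through further sub-selects the server is free to reorder it, so anything downstream that cares about the original order should sort on ord explicitly.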
Re: [PERFORM] Kill a session
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: Bottom line is that I was expecting "instant death" with SIGTERM, but instead got an agonizing, drawn out -- but safe -- death of the query. What was the query exactly? Our expectation is that all or at least most queries should respond to SIGINT or SIGTERM interrupts pretty rapidly, say on a less-than-a-second timescale. However there are various loops in the backend that fail to execute CHECK_FOR_INTERRUPTS sufficiently often :-(. We've been gradually finding and fixing these, and will be glad to fix your case if you provide enough details to pin it down. You might be interested in this current thread about a similar problem: http://archives.postgresql.org/pgsql-patches/2006-07/msg00039.php Thanks, this is good information. The qsort is a distinct possibility. The query is a big insert into some_hitlist (select id from another_hitlist join data_table on (...)) where the hitlists are unindexed. So it may be using a merge-join with qsort. When I have a few minutes, I'll turn on logging in the app and find the exact SQL, and run an EXPLAIN ANALYZE and see what's really happening. It's also possible that the INSERT itself is the problem, or adds to the problem. The SIGINT may come after a few million rows have been inserted, so it would have to clean that up, right? Thanks, Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Kill a session
Steinar H. Gunderson wrote: On Wed, Jul 12, 2006 at 08:43:18AM -0700, Craig A. James wrote: Then you killed the wrong backend... No queries run in postmaster. They all run in postgres backends. The postmaster does very little actual work, other than keeping track of everybody else. It turns out I was confused by this: ps(1) reports a process called "postgres", but top(1) reports a process called "postmaster", but they both have the same pid. I guess postmaster replaces its own name in the process table when it's executing a query, and it's not really the postmaster even though top(1) calls it postmaster. So "kill -15 <pid>" is NOT killing the process -- to kill the process, I have to use signal 9. But if I do that, ALL queries in progress are aborted. I might as well shut down and restart the database, which is an unacceptable solution for a web site. I don't follow your logic here. If you do "kill -15 <pid>" of the postmaster doing the work, the query should be aborted without taking down the entire cluster. I don't see why you'd need -9 (which is a really bad idea anyhow)... I've solved this mystery. "kill -15" doesn't immediately kill the job -- it aborts the query, but it might take 15-30 seconds to clean up. This confused me, because the query I was using to test took about 30 seconds, so the SIGTERM didn't seem to make a difference. But when I used a harder query, one that would run for 5-10 minutes, SIGTERM still stopped it after 15 seconds, which isn't great but it's acceptable. Bottom line is that I was expecting "instant death" with SIGTERM, but instead got an agonizing, drawn out -- but safe -- death of the query. At least that's my deduction based on experiments. I haven't dug into the source to confirm. Thanks everyone for your answers. My "kill this query" feature is now acceptable. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Kill a session
Thanks for your reply, Mark: I'm back to my original question: How do you kill a runaway query without bringing down the whole database? Is there really no answer to this? ... if you actually want to disconnect a backend process then there is nothing to let you do this remotely. I recently did a patch for Bizgres that just implements the pg_terminate_backend(pid) function (currently #ifdef'ed out of the codebase) as a contrib so it can be easily installed. See http://pgfoundry.org/pipermail/bizgres-general/2006-May/000484.html This answers my question. I've finally got a statement in concrete terms, Postgres has no way to kill a backend process via an SQL statement. "If Mark had to resort to this, then there is no other way." If you want to try it out, please read the README (it discusses possible dangers associated with sending SIGTERM to backends). And I would certainly be interested in hearing what level of success (or otherwise) you have with it! Thanks, but I've already implemented my own, which is essentially identical in concept to yours, but simpler in the sense of being even less safe than yours -- I just let anyone send the signal, since I have no users other than my own app. I'll keep my version since it's embedded in my own plug-in. That way I won't have to keep remembering to modify the Postgres code when I upgrade. I like to keep Postgres "stock". Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] [Fwd: Delivery Status Notification (Failure)]
I wrote: I can't find an address to complain about the mailing list itself, so apologies but I'm posting directly to this list. Every time I post to this group, I get returned mails about OTHER subscribers' invalid accounts, like the one below. Michael Glaesemann replied: Is this when you're replying to a post or creating a new post? If the former, and you're using reply-to-all, you'll be sending one message to the list and another directly to the poster of the message you're responding to. And Richard Broersma Jr replied: Does the message come from postgresql.org or is the bounced email coming from these specific users when you include them in reply-all? Thanks to both for your answers. But no -- It's for new posts. In fact, when writing the email that started this thread, it was only to pgsql-performance@postgresql.org (I double-checked by using emacs on my Thunderbird "Sent" folder), yet I still got another "undeliverable" reply along with your message: This is an automatically generated Delivery Status Notification. Delivery to the following recipients failed. [EMAIL PROTECTED] Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PERFORM] [Fwd: Delivery Status Notification (Failure)]
I can't find an address to complain about the mailing list itself, so apologies but I'm posting directly to this list. Every time I post to this group, I get returned mails about OTHER subscribers' invalid accounts, like the one below. What's up? This seems to be a new phenomenon. Should the [EMAIL PROTECTED] be getting these and discarding them? Thanks, Craig Original Message Subject: Delivery Status Notification (Failure) Date: Wed, 12 Jul 2006 13:15:16 -0400 From: [EMAIL PROTECTED] To: [EMAIL PROTECTED] This is an automatically generated Delivery Status Notification. Delivery to the following recipients failed. [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Kill a session
Magnus Hagander wrote: This raises the question: Why doesn't Postgres have a "kill session" command that works? Oracle has it, and it's invaluable; there is no substitute. Various writers to these PG lists have raised the question repeatedly. Is it just a matter that nobody has had the time to do it (which I respect!), or is there a reason why the Postgres team decided a "kill session" is a bad idea? I believe the function to kill a backend is actually in the codebase, it's just commented out because it's considered dangerous. There are some possible issues (see -hackers archives) about sending SIGTERM without actually shutting down the whole cluster. Doing the client-side function to call is the easy part. In many cases you just need to cancel a query, in which case you can use pg_cancel_backend() for example. If you need to actually kill it, your only supported way is to restart postgresql. In other words, are you confirming that there is no way to kill a query from another process, other than shutting down the database? My understanding of the documentation tells me I can't use cancel, because the process doing the killing isn't the original process. But in spite of earlier postings in these forums that said killing the backend was the way to go, this doesn't really work. First, even though the "postgres" backend job is properly killed, a "postmaster" job keeps running at 99% CPU, which is pretty useless. Killing the client's backend didn't kill the process actually doing the work! Then you killed the wrong backend... No queries run in postmaster. They all run in postgres backends. The postmaster does very little actual work, other than keeping track of everybody else. It turns out I was confused by this: ps(1) reports a process called "postgres", but top(1) reports a process called "postmaster", but they both have the same pid. I guess postmaster replaces its own name in the process table when it's executing a query, and it's not really the postmaster even though top(1) calls it postmaster. So "kill -15 <pid>" is NOT killing the process -- to kill the process, I have to use signal 9. But if I do that, ALL queries in progress are aborted. I might as well shut down and restart the database, which is an unacceptable solution for a web site. I'm back to my original question: How do you kill a runaway query without bringing down the whole database? Is there really no answer to this? Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
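For the cancel-only half of this, the built-in route mentioned above looks like the sketch below; pg_cancel_backend() sends SIGINT, which aborts the backend's current query but leaves its connection alive, and in these releases it can only be called by a superuser. The 12345 is just a placeholder for whatever PID pg_stat_activity reports for the runaway backend:

    SELECT procpid, usename, current_query, query_start
      FROM pg_stat_activity
     ORDER BY query_start;

    SELECT pg_cancel_backend(12345);

Actually terminating the connection, as opposed to cancelling its query, is the part that still required the unsupported SIGTERM approach discussed in this thread.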
[PERFORM] Kill a session
There have been dozens, perhaps hundreds, of entries in the pg-admin, pg-general, and pg-performance lists regarding killing a session, but as far as I can tell, there is no Postgres solution. Did I miss something? This raises the question: Why doesn't Postgres have a "kill session" command that works? Oracle has it, and it's invaluable; there is no substitute. Various writers to these PG lists have raised the question repeatedly. Is it just a matter that nobody has had the time to do it (which I respect!), or is there a reason why the Postgres team decided a "kill session" is a bad idea? The rest of this email is just to illustrate the convoluted solution I've had to adopt, and even with this, I can't get it to work quite right. Background: In our web app, we give our users a fair amount of power to formulate difficult queries. These long-running queries are fork/exec'd from the Apache CGI, and we give the user a "job status" page, with the option to kill the job. I can kill off the CGI, since Apache owns the process. But the "stock answer" of kill -2 backend-pid won't work, because I don't want my Apache jobs running as super-user (!) or as setuid processes. So here's my solution: Install a couple of C extensions like this: Datum get_session_id(PG_FUNCTION_ARGS) { PG_RETURN_INT32(getpid()); } Datum kill_session(PG_FUNCTION_ARGS) { int4 session_id, status; session_id = PG_GETARG_INT32(0); fprintf(stderr, "KILLING SESSION: %d, 15\n", session_id); status = kill(session_id, 15); PG_RETURN_BOOL((status == 0) ? true : false); } These are installed with the appropriate "CREATE OR REPLACE ..." sql. Although this is dangerous (anyone who can log in to Postgres can kill any Postgres job!), it's safe enough in a controlled environment. It allows an Apache CGI to issue the kill(2) command through the Postgres backend, which is running as the Postgres user, and thus has permission to do the deed. When I start a job, I record the backend's PID, which allows another process to connect and kill the first one. Alright, it's a hack, but it's the best I could think of. But in spite of earlier postings in these forums that said killing the backend was the way to go, this doesn't really work. First, even though the "postgres" backend job is properly killed, a "postmaster" job keeps running at 99% CPU, which is pretty useless. Killing the client's backend didn't kill the process actually doing the work! Second, the "KILLING SESSION" message to stderr is only printed in the PG log file sporadically. This confuses me, since the "KILLING SESSION" is printed by a *different* process than the one being killed, so it shouldn't be affected. So what happens to fprintf()'s output? Most of the time, I just get "unexpected EOF on client connection" in the log which is presumably the postmaster complaining that the postgres child process died. I know the kill_session() is working because it returns "true", and the job is in fact killed. But the query keeps running in postmaster (or is it something else, like a rollback?), and the stderr output disappears. Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
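For completeness, the "CREATE OR REPLACE ..." half of that hack is just the usual declarations for C-language functions, something along these lines (the shared-object path and file name are made up for the sketch; use whatever the .so was actually installed as):

    CREATE OR REPLACE FUNCTION get_session_id() RETURNS integer
        AS '/usr/local/pgsql/lib/killsession.so', 'get_session_id'
        LANGUAGE C;

    CREATE OR REPLACE FUNCTION kill_session(integer) RETURNS boolean
        AS '/usr/local/pgsql/lib/killsession.so', 'kill_session'
        LANGUAGE C STRICT;

The C source also needs a PG_FUNCTION_INFO_V1(get_session_id) and PG_FUNCTION_INFO_V1(kill_session) line for the version-1 calling convention the snippets above are written in.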
[PERFORM] need vacuum after insert/truncate/insert?
If I insert a bunch of rows, then truncate, then insert a bunch more rows, do I need to vacuum? I've been assuming that TRUNCATE TABLE is a brute-force technique that more-or-less tosses the old table and starts fresh so that no vacuum is necessary. Second question: Same scenario as above, but now the table has indexes. Is a reindex needed, or are the indexes "truncated" too? Thanks, Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Is postgresql ca do the job for software deployed in
Richard Broersma Jr wrote: Each table with-in the database is assigned an OID and is located inside the DB directory. So if there is a file-system limitation on the number of files with-in a given directory it would also be a limit to the number of tables that could be created for each database. You could handle this with tablespaces. For example, create ten tablespaces, and then assign customer databases to them in round-robin fashion. This also lets you assign databases to different disks to balance the I/O load. Craig ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
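A sketch of what that round-robin assignment looks like in SQL; the directory paths and database names here are invented for the example:

    CREATE TABLESPACE ts_disk1 LOCATION '/disk1/pgdata';
    CREATE TABLESPACE ts_disk2 LOCATION '/disk2/pgdata';
    CREATE TABLESPACE ts_disk3 LOCATION '/disk3/pgdata';

    CREATE DATABASE customer_0001 TABLESPACE ts_disk1;
    CREATE DATABASE customer_0002 TABLESPACE ts_disk2;
    CREATE DATABASE customer_0003 TABLESPACE ts_disk3;
    -- ...and so on, assigning each new customer database to the next tablespace

Each database's files then live under its own tablespace directory, so the databases (and their I/O) are spread across the disks instead of all landing in one place.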
[PERFORM] Sort order in sub-select
Here is a subtle question about SQL. I have a one-to-many pair of tables (call them "P" and "C" for parent and child). For each row of P, there are many rows in C with data, and I want to sort P on the min(c.data). The basic query is simple: select p_id, min(data) as m from c group by p_id order by m; Now the problem: I also want to store this, in sorted order, as a "hitlist", so I have a table like this: create table hitlist(p_id integer, sortorder integer); and a sequence to go with it. The first thing I tried doesn't work: insert into hitlist(p_id, sortorder) (select p_id, nextval('hitlist_seq') from (select p_id, min(data) as m from c group by p_id order by m) as foo); Apparently, the sort order returned by the innermost select is NOT maintained as you go through the next select statement -- the rows seem to come out in random order. This surprised me. But in thinking about the definition of SQL itself, I guess there's no guarantee that sort order is maintained across sub-selects. I was caught by this because in Oracle, this same query works "correctly" (i.e. the hitlist ends up in sorted order), but I suspect that was just the luck of their implementation. Can anyone confirm this, that the sort order is NOT guaranteed to be maintained through layers of SELECT statements? The obvious solution is to make the hitlist.sortorder column have nextval() as its default and eliminate the first sub-select. But I thought the two would be equivalent. Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
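A sketch of that solution, using the table, sequence, and column names from the post above. With the default in place the insert needs no explicit nextval() call and no extra level of sub-select, so the rows are numbered as they arrive from the sorted select (which is how PostgreSQL behaves in practice, though as noted the standard does not strictly promise an order here either):

    CREATE SEQUENCE hitlist_seq;
    CREATE TABLE hitlist (
        p_id      integer,
        sortorder integer DEFAULT nextval('hitlist_seq')
    );

    INSERT INTO hitlist (p_id)
    SELECT p_id
      FROM c
     GROUP BY p_id
     ORDER BY min(data);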
[PERFORM] explain analyze reports 20x more time than actual
I have a query that needs to run faster, with the obvious solution being to add an index. But to confirm this, I ran explain analyze. When I run the actual query, it consistently takes 6-7 seconds by the wall clock. My application with a "verbose" mode enabled reports 6.6 seconds consistently. However, when I run EXPLAIN ANALYZE, it takes 120 seconds! This is 20x longer, and it leads me to distrust the plan that it claims to be executing. How can the actual run time be so much faster than that claimed by EXPLAIN ANALYZE? How can I find out the actual plan it's using? Thanks, Craig Details: Postgres 8.0.3 shared_buffers = 2 work_mem = 50 effective_cache_size = 43 Dell w/ Xeon Linux kernel 2.6.9-1.667smp 4 GB memory => explain analyze select SAMPLE.SAMPLE_ID, SAMPLE.VERSION_ID,SAMPLE.SUPPLIER_ID,SAMPLE.CATALOGUE_ID,SAMPLE.PREP_ID from HITLIST_ROWS_281430 join SAMPLE on (HITLIST_ROWS_281430.OBJECTID = SAMPLE.SAMPLE_ID) where SAMPLE.VERSION_ID in (7513672,7513650,7513634,7513620,7513592,7513590,7513582,7513576,7513562,7513560) order by HITLIST_ROWS_281430.SortOrder; QUERY PLAN Sort (cost=234964.38..234964.52 rows=58 width=24) (actual time=120510.842..120510.889 rows=10 loops=1) Sort Key: hitlist_rows_281430.sortorder -> Hash Join (cost=353.68..234962.68 rows=58 width=24) (actual time=81433.194..120510.753 rows=10 loops=1) Hash Cond: ("outer".objectid = "inner".sample_id) -> Seq Scan on hitlist_rows_281430 (cost=0.00..177121.61 rows=11497361 width=8) (actual time=0.008..64434.110 rows=11497361 loops=1) -> Hash (cost=353.48..353.48 rows=82 width=20) (actual time=0.293..0.293 rows=0 loops=1) -> Index Scan using i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id, i_sample_version_id on sample (cost=0.00..353.48 rows=82 width=20) (actual time=0.042..0.201 rows=12 loops=1) Index Cond: ((version_id = 7513672) OR (version_id = 7513650) OR (version_id = 7513634) OR (version_id = 7513620) OR (version_id = 7513592) OR (version_id = 7513590) OR (version_id = 7513582) OR (version_id = 7513576) OR (version_id = 7513562) OR (version_id = 7513560)) Total runtime: 120511.485 ms (9 rows) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [PERFORM] Sun Donated a Sun Fire T2000 to the PostgreSQL
Arjen van der Meijden wrote: First of all, this graph has no origin. It's a bit difficult to test with less than one cpu. Sure it does. I ran all the tests. They all took infinite time, and I got zero results. And my results are 100% accurate and reliable. It's perfectly valid data. :-) Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] function not called if part of aggregate
Greg Stark wrote: However that's not enough to explain what you've shown. How about you show the actual query and actual plan you're working with? The plan you've shown can't result from the query you sent. Mea culpa, sort of. But ... in fact, the plan I sent *was* from query I sent, with the table/column names changed for clarity. This time I'll send the plan "raw". (This is PG 8.0.1.) chm=> explain select count(1) from (select normalize_add_salt(smiles) from chm(> salt_smiles order by db_no) as foo; QUERY PLAN Aggregate (cost=69.95..69.95 rows=1 width=0) -> Subquery Scan foo (cost=0.00..67.93 rows=806 width=0) -> Index Scan using salt_smiles_pkey on salt_smiles (cost=0.00..59.87 rows=806 width=30) (3 rows) As pointed out by Tom and others, this query DOES in fact call the normalize_add_salt() function. Now here's the weird part. (And where my original posting went wrong -- sorry for the error! I got the two queries mixed up.) I originally had a more complex query, the purpose being to guarantee that the function was called on the strings in the order specified. (More on this below.) Here is the original query I used: chm=> explain select count(1) from (select normalize_add_salt(smiles) chm(> from (select smiles from salt_smiles order by db_no) as foo) as bar; QUERY PLAN Aggregate (cost=67.94..67.94 rows=1 width=0) -> Subquery Scan foo (cost=0.00..65.92 rows=806 width=0) -> Index Scan using salt_smiles_pkey on salt_smiles (cost=0.00..57.86 rows=806 width=30) (3 rows) Notice that the plans are essentially identical, yet in this one the function does NOT get called. I proved this by brute force, inserting "char **p = NULL; *p = "foo";" into the C code to guarantee a segmentation violation if the function gets called. In the first case it does SIGSEGV, and in the second case it does not. Now the reason for this more-complex query with an additional subselect is that the SMILES (which, by the way, are a lexical way of representing chemical structures - see www.daylight.com), must be passed to the function in a particular order (hence the ORDER BY). In retrospect I realize the optimizer apparently flattens this query anyway (hence the identical plans, above). But the weird thing is that, in spite of flattening, which would appear to make the queries equivalent, the function gets called in one case, and not in the other. Steinar H. Gunderson asked: select count(1) from (select foo_init(value) from foo_init_table order by value_id) as foo; Why not just count(foo_init(value))? Because the SMILES must be processed in a specific order, hence the more complex queries. The simple answer to this whole problem is what Steinar wrote: This works well, but it requires me to actually retrieve the function's value 800 times. Is this actually a problem? No, it's just a nuisance. It occurs to me that in spite of the ORDER BY expression, Postgres is free to evaluate the function first, THEN sort the results, which means the SMILES would be processed in random order anyway. I.e. my ORDER BY clause is useless for the intended purpose. So the only way I can see to get this right is to pull the SMILES into my application with the ORDER BY to ensure I have them in the correct order, then send them back one at a time via a "select normalize_add_salt(smiles)", meaning I'll retrieve 800 strings and then send them back. I just thought there ought to be a way to do this all on the PG server instead of sending all these strings back and forth. I'd like to say to Postgres, "Just do it this way, OK?" 
But the optimizer can't be turned off, so I guess I have to do it the slow way. The good news is that this is just an initialization step, after which I typically process thousands of molecules, so the extra overhead won't kill me. Thanks to all for your help. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] function not called if part of aggregate
My application has a function, call it "foo()", that requires initialization from a table of about 800 values. Rather than build these values into the C code, it seemed like a good idea to put them on a PG table and create a second function, call it "foo_init()", which is called for each value, like this: select foo_init(value) from foo_init_table order by value_id; This works well, but it requires me to actually retrieve the function's value 800 times. So I thought I'd be clever: select count(1) from (select foo_init(value) from foo_init_table order by value_id) as foo; And indeed, count() returns 800, as expected. But my function foo_init() never gets called! Apparently the optimizer figures out that foo_init() must return one value for each row, so it doesn't bother to actually call the function. db=> explain select count(1) from (select foo_init(value) from foo_init_table order by db_no) as foo; query plan aggregate (cost=69.95..69.95 rows=1 width=0) -> Subquery Scan foo (cost=0.00..67.93 rows=806 width=0) -> Index Scan using foo_init_table_pkey on foo_init_table (cost=0.00..59.87 rows=806 width=30) This doesn't seem right to me -- how can the optimizer possibly know that a function doesn't have a side effect, as in my case? Functions could do all sorts of things, such as logging activity, filling in other tables, etc, etc. Am I missing something here? Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
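One way to keep this entirely on the server and still guarantee both the ordering and the per-row call is a small plpgsql wrapper, sketched below with the foo_init()/foo_init_table names from the post above; PERFORM evaluates the expression and discards the result, and nothing in the loop can be planned away:

    CREATE OR REPLACE FUNCTION run_foo_init() RETURNS void AS $$
    DECLARE
        r RECORD;
    BEGIN
        FOR r IN SELECT value FROM foo_init_table ORDER BY value_id LOOP
            PERFORM foo_init(r.value);
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

    SELECT run_foo_init();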
Re: [PERFORM] Some queries starting to hang
Tom Lane wrote: The idea I just had was: why do we need EXPLAIN ANALYZE to run to completion? In severe cases like this thread, we might be able to discover the root cause by a *partial* execution of the plan, as long as it was properly instrumented. That way, the OP might have been able to discover the root cause himself... I don't think that helps, as it just replaces one uncertainty by another: how far did the EXPLAIN really get towards completion of the plan? You still don't have any hard data. But at least you have some data, which is better than no data. Even knowing that the plan got stuck on a particular node of the query plan could be vital information. For a query that never finishes, you can't even find out where it's getting stuck. That's why Simon's proposal might help in some particularly difficult situations. Regards, Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Some queries starting to hang
Simon Riggs wrote: Well, it's a big query. If it ought to take a second or two, and instead is taking an hour or two (1800 times the expected runtime), that might be close enough to "never" to exhaust Chris' patience. Besides, we don't know whether the 1800 might itself be an underestimate (too bad Chris didn't provide EXPLAIN ANALYZE results). This is a good example of a case where the inefficiency of EXPLAIN ANALYZE would be a contributory factor to it not actually being available for diagnosing a problem. This is a frustration I have, but Simon expressed it much more concisely. The first question one gets in this forum is, "did you run EXPLAIN ANALYZE?" But if EXPLAIN ANALYZE never finishes, you can't get the information you need to diagnose the problem. Simon's proposal, e.g. EXPLAIN ANALYZE ERRLIMIT 10 SELECT ... or something similar, would be a big help. I.e. "If you can't finish in a reasonable time, at least tell me as much as you can." Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Query hanging/not finishing inconsistently
Meetesh Karia wrote: Hi all, We've recently started having a problem where a query that normally executes in ~15ms starts to take upwards of 20s to complete. When the connection that ran query is returned to the connection pool, it appears as though a transaction is still in progress so the connection pool tries to cancel the transaction and close the connection. This fails and the connection is removed from the connection pool. At this point, the situation rapidly degrades and we run out of connections to the postgres server. An inspection of the pg_stat_activity table shows that practically every connection is running the above-mentioned query and some of those queries have been active for many minutes! We've looked at the pg_locks table as well and the only exclusive locks are on transactions that are open. All other locks are AccessShareLocks. Also, as far as we can tell (from looking at the Hibernate stats), every db session that is opened is closed. When this happens, if I kill one of the running postgres processes (just by picking the last process returned from "ps -ef | grep postgres"), the other queries will immediately finish and the system will respond. However, within 15 minutes, we'll be back in the same state as before. At that point, I've cycled Apache, Tomcat and Postgres and the system then seems to come back. This sounds suspiciously like a question I asked a few weeks ago, on April 4. I have a process that just gets stuck. After some questions from various of the experts in this forum, I used gdb(1) to attach to one of the frozen Postgress backend processes, and here's what I found: On 5/12/2006, I wrote: Thanks, good advice. You're absolutely right, it's stuck on a mutex. After doing what you suggest, I discovered that the query in progress is a user-written function (mine). When I log in as root, and use "gdb -p " to attach to the process, here's what I find. Notice the second function in the stack, a mutex lock: (gdb) bt #0 0x0087f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6 #2 0x008ff67b in _L_mutex_lock_3220 () from /lib/tls/libc.so.6 #3 0x4f5fc1b4 in ?? () #4 0x00dc5e64 in std::string::_Rep::_S_empty_rep_storage () from /usr/local/pgsql/lib/libchmoogle.so #5 0x009ffcf0 in ?? () from /usr/lib/libz.so.1 #6 0xbfe71c04 in ?? () #7 0xbfe71e50 in ?? () #8 0xbfe71b78 in ?? () #9 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #10 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #11 0x009f8b7c in inflateEnd () from /usr/lib/libz.so.1 #12 0x00c670a2 in ~basic_unzip_streambuf (this=0xbfe71be0) at zipstreamimpl.h:332 #13 0x00c60b61 in OpenBabel::OBConversion::Read (this=0x1, pOb=0xbfd923b8, pin=0xffea) at istream:115 #14 0x00c60fd8 in OpenBabel::OBConversion::ReadString (this=0x8672b50, pOb=0xbfd923b8) at obconversion.cpp:780 #15 0x00c19d69 in chmoogle_ichem_mol_alloc () at stl_construct.h:120 #16 0x00c1a203 in chmoogle_ichem_normalize_parent () at stl_construct.h:120 #17 0x00c1b172 in chmoogle_normalize_parent_sdf () at vector.tcc:243 #18 0x0810ae4d in ExecMakeFunctionResult () #19 0x0810de2e in ExecProject () #20 0x08115972 in ExecResult () #21 0x08109e01 in ExecProcNode () #22 0x0020 in ?? () #23 0xbed4b340 in ?? () #24 0xbf92d9a0 in ?? () #25 0xbed4b0c0 in ?? () #26 0x in ?? () It looks to me like my code is trying to read the input parameter (a fairly long string, maybe 2K) from a buffer that was gzip'ed by Postgres for the trip between the client and server... 
somewhere along the way, a mutex gets set, and then ... it's stuck forever. ps(1) shows that this thread had been running for about 7 hours, and the job status showed that this function had been successfully called about 1 million times before this mutex lock occurred. This issue has never been resolved; nobody had seen it before. Tom Lane suggested it might be a libc/c++ bug, but unfortunately in my case this lockup occurs so rarely (every few days) that it will be very difficult to know whether we've fixed the problem. If gdb(1) reveals that your process is stuck in a mutex, then you might have a better chance of testing this hypothesis, since your problem happens within 15 minutes or so. Did this start recently, perhaps right after a kernel update? Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
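(For anyone hitting a similar freeze and trying to tell a stuck mutex apart from an ordinary lock wait, a quick first check -- sketched here for 8.x, where pg_stat_activity.current_query is only populated if stats_command_string = true -- is to look for ungranted locks:

select l.pid, l.relation::regclass as relation, l.mode, l.granted, a.current_query
from pg_locks l
join pg_stat_activity a on a.procpid = l.pid   -- procpid is the backend's process id
where not l.granted;

If nothing shows up as waiting on a lock while a backend sits frozen, the problem is below the lock manager -- a mutex, as in the backtrace above -- and gdb is the right next tool.)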
Re: [PERFORM] Optimizer: limit not taken into account
Tom Lane wrote: There is not anything in there that considers whether the table's physical order is so nonrandom that the search will take much longer than it would given uniform distribution. It might be possible to do something with the correlation statistic in simple cases ... In this case, the rows are not random at all, in fact they're inserted from a sequence, then rows are deleted as they are processed. If the planner is hoping for random physical distribution, this particular case is exactly wrong. Craig ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Optimizer: limit not taken into account
Here's a "corner case" that might interest someone. It tripped up one of our programmers. We have a table with > 10 million rows. The ID column is indexed, the table has been vacuum/analyzed. Compare these two queries: select * from tbl where id >= 1000 limit 1; select * from tbl where id >= 1000 order by id limit 1; The first takes 4 seconds, and uses a full table scan. The second takes 32 msec and uses the index. Details are below. I understand why the planner makes the choices it does -- the "id > 1000" isn't very selective and under normal circumstances a full table scan is probably the right choice. But the "limit 1" apparently doesn't alter the planner's strategy at all. We were surprised by this. Adding the "order by" was a simple solution. Craig pg=> explain analyze select url, url_digest from url_queue where priority >= 1000 limit 1; QUERY PLAN -- Limit (cost=0.00..0.65 rows=1 width=108) (actual time=4036.113..4036.117 rows=1 loops=1) -> Seq Scan on url_queue (cost=0.00..391254.35 rows=606176 width=108) (actual time=4036.101..4036.101 rows=1 loops=1) Filter: (priority >= 1000) Total runtime: 4036.200 ms (4 rows) pg=> explain analyze select url, url_digest from url_queue where priority >= 1000 order by priority limit 1; QUERY PLAN -- Limit (cost=0.00..2.38 rows=1 width=112) (actual time=32.445..32.448 rows=1 loops=1) -> Index Scan using url_queue_priority on url_queue (cost=0.00..1440200.41 rows=606176 width=112) (actual time=32.434..32.434 rows=1 loops=1) Index Cond: (priority >= 1000) Total runtime: 32.566 ms ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Speed Up Offset and Limit Clause
Christian Paul Cosinas wrote: I am creating an application that gets the value of a large table and writes it to a file. Why I want to use offset and limit is for me to create a threaded application so that they will not get the same results. For example: Thread 1 : gets offset 0 limit 5000 Thread 2 : gets offset 5000 limit 5000 Thread 3 : gets offset 1 limit 5000 And so on... Would there be any other faster way than what I thought? In order to return rows 1 to 15000, it must select all rows from zero to 15000 and then discard the first 1 -- probably not what you were hoping for. You might add a "thread" column. Say you want to run ten threads:

create sequence thread_seq
    increment by 1 minvalue 1 maxvalue 10 cycle start with 1;

create table mytable(
    column1 integer,
    ... other columns ...,
    thread integer default nextval('thread_seq')
);

create index i_mytable_thread on mytable(thread);

Now whenever you insert into mytable, you get a value in mytable.thread between 1 and 10, and it's indexed, so each thread can find its share of the rows efficiently. So your query becomes: Thread 1: select ... from mytable where ... and thread = 1; Thread 2: select ... from mytable where ... and thread = 2; ... and so forth. Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Postgres gets stuck
Tom Lane wrote: >My suspicion is that it's an incompatibility between malloc() >libraries. On Linux there's only supposed to be one malloc, ie, glibc's version. On other platforms I'd be worried about threaded vs non-threaded libc (because the backend is not threaded), but not Linux. I guess I misinterpreted the Postgress manual, which says (in 31.9, "C Language Functions"), "When allocating memory, use the PostgreSQL functions palloc and pfree instead of the corresponding C library functions malloc and free." I imagined that perhaps palloc/pfree used mutexes for something. But if I understand you, palloc() and pfree() are just wrappers around malloc() and free(), and don't (for example) make their own separate calls to brk(2), sbrk(2), or their kin. If that's the case, then you answered my question - it's all ordinary malloc/free calls in the end, and that's not the source of the problem. There may be a more basic threading problem here, though, rooted in the precise fact that the backend isn't threaded. If you're trying to use any libraries that assume they can have multiple threads, I wouldn't be at all surprised to see things go boom. No threading anywhere. None of the libraries use threads or mutexes. It's just plain old vanilla C/C++ scientific algorithms. C++ exception handling could be problematic too. No C++ exceptions are thrown anywhere in the code, 'tho I suppose one of the I/O libraries could throw an exception, e.g. when reading from a file. But there's no evidence of this after millions of identical operations succeeded. In addition, the stack trace shows it to be stuck in a memory operation, not an I/O operation. Or it could be a garden variety glibc bug. How up-to-date is your platform? I guess this is the next place to look. From the few answers I've gotten, it sounds like this isn't a known Postgres issue, and my stack trace doesn't seem to be familiar to anyone on this forum. Oh well... thanks for your help. Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Postgres gets stuck
Chris wrote: This is a deadly bug, because our web site goes dead when this happens, ... Sounds like a deadlock issue. ... stats_command_string = true and restart postgresql. then you'll be able to: select * from pg_stat_activity; to see what queries postgres is running and that might give you some clues. Thanks, good advice. You're absolutely right, it's stuck on a mutex. After doing what you suggest, I discovered that the query in progress is a user-written function (mine). When I log in as root, and use "gdb -p " to attach to the process, here's what I find. Notice the second function in the stack, a mutex lock: (gdb) bt #0 0x0087f7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 #1 0x0096cbfe in __lll_mutex_lock_wait () from /lib/tls/libc.so.6 #2 0x008ff67b in _L_mutex_lock_3220 () from /lib/tls/libc.so.6 #3 0x4f5fc1b4 in ?? () #4 0x00dc5e64 in std::string::_Rep::_S_empty_rep_storage () from /usr/local/pgsql/lib/libchmoogle.so #5 0x009ffcf0 in ?? () from /usr/lib/libz.so.1 #6 0xbfe71c04 in ?? () #7 0xbfe71e50 in ?? () #8 0xbfe71b78 in ?? () #9 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #10 0x009f7019 in zcfree () from /usr/lib/libz.so.1 #11 0x009f8b7c in inflateEnd () from /usr/lib/libz.so.1 #12 0x00c670a2 in ~basic_unzip_streambuf (this=0xbfe71be0) at zipstreamimpl.h:332 #13 0x00c60b61 in OpenBabel::OBConversion::Read (this=0x1, pOb=0xbfd923b8, pin=0xffea) at istream:115 #14 0x00c60fd8 in OpenBabel::OBConversion::ReadString (this=0x8672b50, pOb=0xbfd923b8) at obconversion.cpp:780 #15 0x00c19d69 in chmoogle_ichem_mol_alloc () at stl_construct.h:120 #16 0x00c1a203 in chmoogle_ichem_normalize_parent () at stl_construct.h:120 #17 0x00c1b172 in chmoogle_normalize_parent_sdf () at vector.tcc:243 #18 0x0810ae4d in ExecMakeFunctionResult () #19 0x0810de2e in ExecProject () #20 0x08115972 in ExecResult () #21 0x08109e01 in ExecProcNode () #22 0x0020 in ?? () #23 0xbed4b340 in ?? () #24 0xbf92d9a0 in ?? () #25 0xbed4b0c0 in ?? () #26 0x in ?? () It looks to me like my code is trying to read the input parameter (a fairly long string, maybe 2K) from a buffer that was gzip'ed by Postgres for the trip between the client and server. My suspicion is that it's an incompatibility between malloc() libraries. libz (gzip compression) is calling something called zcfree, which then appears to be intercepted by something that's (probably statically) linked into my library. And somewhere along the way, a mutex gets set, and then ... it's stuck forever. ps(1) shows that this thread had been running for about 7 hours, and the job status showed that this function had been successfully called about 1 million times, before this mutex lock occurred. Any ideas? Thanks, Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] Postgres gets stuck
I'm having a rare but deadly problem. On our web servers, a process occasionally gets stuck, and can't be unstuck. Once it's stuck, all Postgres activities cease. "kill -9" is required to kill it -- signals 2 and 15 don't work, and "/etc/init.d/postgresql stop" fails. Here's what the process table looks like: $ ps -ef | grep postgres postgres 30713 1 0 Apr24 ?00:02:43 /usr/local/pgsql/bin/postmaster -p 5432 -D /disk3/postgres/data postgres 25423 30713 0 May08 ?00:03:34 postgres: writer process postgres 25424 30713 0 May08 ?00:00:02 postgres: stats buffer process postgres 25425 25424 0 May08 ?00:00:02 postgres: stats collector process postgres 11918 30713 21 07:37 ?02:00:27 postgres: production webuser 127.0.0.1(21772) SELECT postgres 31624 30713 0 16:11 ?00:00:00 postgres: production webuser [local] idle postgres 31771 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12422) idle postgres 31772 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12421) idle postgres 31773 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12424) idle postgres 31774 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12425) idle postgres 31775 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12426) idle postgres 31776 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12427) idle postgres 31777 30713 0 16:12 ?00:00:00 postgres: production webuser 127.0.0.1(12428) idle The SELECT process is the one that's stuck. top(1) and other indicators show that nothing is going on at all (no CPU usage, normal memory usage); the process seems to be blocked waiting for something. (The "idle" processes are attached to a FastCGI program.) This has happened on *two different machines*, both doing completely different tasks. The first one is essentially a read-only warehouse that serves lots of queries, and the second one is the server we use to load the warehouse. In both cases, Postgres has been running for a long time, and is issuing SELECT statements that it's issued millions of times before with no problems. No other processes are accessing Postgres, just the web services. This is a deadly bug, because our web site goes dead when this happens, and it requires an administrator to log in and kill the stuck postgres process then restart Postgres. We've installed failover system so that the web site is diverted to a backup server, but since this has happened twice in one week, we're worried. Any ideas? Details: Postgres 8.0.3 Linux 2.6.12-1.1381_FC3smp i686 i386 Dell 2-CPU Xeon system (hyperthreading is enabled) 4 GB memory 2 120 GB disks (SATA on machine 1, IDE on machine 2) Thanks, Craig ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Memory and/or cache issues?
mcelroy, tim wrote: Sorry, been up all night and maybe provided too much information or not the right information and only confused folks, tired I guess. When I say 'in use' I am referring to the 'used' column. Thanks to all who have responded to this inquiry, I appreciate it. Here's free from PROD001:

[EMAIL PROTECTED] kernel]# free -k -t
             total       used       free     shared    buffers     cached
Mem:       7643536    6975772     667764          0     165496    5393396
-/+ buffers/cache:    1416880    6226656
Swap:      8185108       5208    8179900
Total:    15828644    6980980    8847664

On Linux (unlike most Unix systems), "used" includes both processes AND the kernel's file-system buffers, which means "used" will almost always be close to 100%. Starting with a freshly-booted system, you can issue almost any command that scans files, and "used" will go up and STAY at nearly 100% of memory. For example, reboot and try "tar cf - / >/dev/null" and you'll see the same sort of "used" numbers. In My Humble Opinion, this is a mistake in Linux. This confuses just about everyone the first time they see it (including me), because the file-system buffers are dynamic and will be relinquished by the kernel if another process needs memory. On Unix systems, "used" means, "someone else is using it and you can't have it", which is what most of us really want to know. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE
Jim C. Nasby wrote: No, I don't agree with this. Too many people waste time designing for "what if..." scenarios that never happen. You don't want to be dumb and design something that locks out a foreseeable and likely future need, but referential integrity doesn't meet this criterion. There's nothing to keep you from changing from app-managed to database-managed referential integrity if your needs change. In this case your argument makes no sense, because you will spend far more time re-creating RI capability inside an application than if you just use what the database offers natively. But one of the specific conditions in my original response was, "You have application-specific knowledge about when you can skip referential integrity and thereby greatly improve performance." If you can't do that, I agree with you. Anyway, this discussion is probably going on too long, and I'm partly to blame. I think we all agree that in almost all situations, using the database to do referential integrity is the right choice, and that you should only violate this rule if you have a really, really good reason, and you've thought out the implications carefully, and you know you may have to pay a steep price later if your requirements change. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE
Jim C. Nasby wrote: 1. You have only one application that modifies the data. (Otherwise, you have to duplicate the rules across many applications, leading to a code-maintenance nightmare). You forgot something: 1a: You know that there will never, ever, ever, ever, be any other application that wants to talk to the database. I know tons of people that get burned because they go with something that's "good enough for now", and then regret that decision for years to come. No, I don't agree with this. Too many people waste time designing for "what if..." scenarios that never happen. You don't want to be dumb and design something that locks out a foreseeable and likely future need, but referential integrity doesn't meet this criterion. There's nothing to keep you from changing from app-managed to database-managed referential integrity if your needs change. Design for your current requirements. Let us be of good cheer, remembering that the misfortunes hardest to bear are those which never happen. - James Russell Lowell (1819-1891) Therefore do not be anxious about tomorrow, for tomorrow will be anxious for itself. Let the day's own trouble be sufficient for the day. - Matthew 6:34 Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
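(For what it's worth, the switch from app-managed to database-managed integrity later on really is cheap to express -- one statement per relationship. A minimal sketch with made-up table and column names; the main cost is that PostgreSQL scans the child table once to validate existing rows when the constraint is added:

alter table order_line
    add constraint order_line_order_fk
    foreign key (order_id) references orders (order_id);   -- hypothetical tables

So deferring the decision doesn't paint you into a corner, as long as the data is actually consistent when you flip the switch.)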
Re: [PERFORM] FOREIGN KEYS vs PERFORMANCE
I think this is an old question, but I want to know if it really is well worth to not create some foreign keys an deal with the referential integrity at application-level? Trust me : do it in the application and you'll enter a world of hurt. I'm doing it with some mysql apps, and it's a nightmare ; doing cascaded delete's by hand, etc, you always forget something, you have to modify a million places in your code everytime you add a new table, your ORM bloats, you get to write cleanup cron scripts which take forever to run, your website crashes etc. All good advice, but... there are no absolutes in this world. Application-enforced referential integrity makes sense if (and probably ONLY if): 1. You have only one application that modifies the data. (Otherwise, you have to duplicate the rules across many applications, leading to a code-maintenance nightmare). 2. If your application crashes and leaves a mess, it's not a catastrophe, and you have a good way to clean it up. For example, a bank shouldn't do this, but it might be OK for a computer-aided-design application, or the backend of a news web site. 3. You have application-specific knowledge about when you can skip referential integrity and thereby greatly improve performance. For example, you may have batch operations where large numbers of rows are temporarily inconsistent. If your application doesn't meet ALL of these criteria, you probably should use the database for referential integrity. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Decide between Postgresql and Mysql (help of
This is off-topic for this group so I'll just give a brief reply; I'm happy to carry on more just between the two of us... Gorshkov wrote: That being said ... what *is* the difference between coding a website - major or otherwise - in an "old-fashioned" compiled language and a non-compiled language, except for the amount of horsepower and memory you require? Old-fashioned doesn't mean bad, inappropriate, or inferior. It's just not the latest-and-greatest, however it's currently defined by the geek fashion police. Our experience coding web sites with C/C++ versus Perl is about a factor of ten in productivity. We only use C/C++ for CPU-intensive calculations, such as scientific prediction code. Everything else is Perl or Java. I recently re-coded 10,000 lines of C into 650 lines of Perl. Why? String handling, hash tables, and the simplicity of DBD/DBI. And there was no loss of performance, because the app was strictly I/O bound (that is, Postgres was I/O bound). Sure, the old app may not have been optimal, but we're talking about a factor of 15 reduction in lines of code. That's not "geek fashion", it's good engineering. Pick the best tool for the job, and learn how to use it. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Decide between Postgresql and Mysql (help of
Gorshkov wrote: /flame on if you were *that* worried about performance, you wouldn't be using PHP or *any* interperted language /flame off sorry - couldn't resist it :-) I hope this was just a joke. You should be sure to clarify - there might be some newbie out there who thinks you are seriously suggesting coding major web sites in some old-fashioned compiled language. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Poor performance o
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: It looks to me like the problem is the use of nested loops when a hash join should be used, but I'm no expert at query planning. Given the sizes of the tables involved, you'd likely have to boost up work_mem before the planner would consider a hash join. What nondefault configuration settings do you have, anyway? shared_buffers = 2 work_mem = 32768 effective_cache_size = 30 This is on a 4GB machine. Is there a guideline for work_mem that's related to table size? Something like, "allow 2 MB per million rows"? I'm also curious why the big difference between my "Query #1" and "Query #2". Even though it does a nested loop, #2's outer loop only returns one result from a very tiny table, so shouldn't it be virtually indistinguishable from #1? Thanks, Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
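(A cheap way to test the work_mem theory without touching postgresql.conf, sketched with an arbitrary value -- work_mem is in kilobytes, so the setting below is 256 MB for the current session only:

set work_mem = 262144;
explain analyze
select A.A_ID from A join B on (A.A_ID = B.A_ID) where B.B_ID = 9;
reset work_mem;

If the plan switches from a nested loop to a hash join and the runtime collapses, work_mem was the limiting factor; if the plan doesn't change, the row-count estimates are the more likely suspect.)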
[PERFORM] Poor performance o
I'm reposting this -- I sent this out a month ago but never got a response, and hope someone can shed some light on this. Thanks, Craig -- This is a straightforward query that should be fairly quick, but takes about 30 minutes. It's a query across three tables, call them A, B, and C. The tables are joined on indexed columns. Here's a quick summary: Table A -> Table B -> Table C A_ID B_ID C_ID A_ID NAME C_ID Tables A and B have 6 million rows each. Table C is small: 67 names, no repeats. All columns involved in the join are indexed. The database has been full-vacuumed and analyzed. Summary: 1. Query B only:2.7 seconds, 302175 rows returned 2. Join B and C:4.3 seconds, exact same answer 3. Join A and B:7.2 minutes, exact same answer 4. Join A, B, C:32.7 minutes, exact same answer Looking at these: Query #1 is doing the real work: finding the rows of interest. Queries #1 and #2 ought to be virtually identical, since Table C has just one row with C_ID = 9, but the time almost doubles. Query #3 should take a bit longer than Query #1 because it has to join 300K rows, but the indexes should make this take just a few seconds, certainly well under a minute. Query #4 should be identical to Query #3, again because there's only one row in Table C. 32 minutes is pretty horrible for such a straightforward query. It looks to me like the problem is the use of nested loops when a hash join should be used, but I'm no expert at query planning. This is psql 8.0.3. Table definitions are at the end. Hardware is a Dell, 2-CPU Xeon, 4 GB memory, database is on a single SATA 7200RPM disk. These table and column names are altered to protect the guilty, otherwise these are straight from Postgres. QUERY #1: - explain analyze select B.A_ID from B where B.B_ID = 9; Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=0.158..1387.251 rows=302175 loops=1) Index Cond: (B_ID = 9) Total runtime: 2344.053 ms QUERY #2: - explain analyze select B.A_ID from B join C on (B.C_ID = C.C_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=0.349..3392.532 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.232..0.336 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_C_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=0.102..1290.002 rows=302175 loops=1) Index Cond: (B.C_ID = "outer".C_ID) Total runtime: 4373.916 ms QUERY #3: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID)where B.B_ID = 9; Nested Loop (cost=0.00..711336.41 rows=131236 width=4) (actual time=37.118..429419.347 rows=302175 loops=1) -> Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=27.344..8858.489 rows=302175 loops=1) Index Cond: (B_ID = 9) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=1.372..1.376 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 430467.686 ms QUERY #4: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) join C on (B.B_ID = C.B_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..1012793.38 rows=177741 width=4) (actual time=70.184..1960112.247 rows=302175 loops=1) -> Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=52.114..17753.638 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.109..0.176 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_B_ID on B (cost=0.00..254387.31 rows=328137 width=8) 
(actual time=51.985..15566.896 rows=302175 loops=1) Index Cond: (B.B_ID = "outer".B_ID) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=6.407..6.412 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 1961200.079 ms TABLE DEFINITIONS: -- xxx => \d a Table "xxx.a" Column | Type | Modifiers --++--- a_id | integer| not null ... more columns Indexes: "pk_a_id" PRIMARY KEY, btree (a_id) ... more indexes on other columns xxx => \d b Table "xxx.b" Column | Type | Modifiers -++--- b_id | integer| not null a_id | integer| not null c_id | integer| not null ... more columns Indexes: "b_pkey" PRIMARY KEY, btree (b_id) "i_b_a_id" btree (a_id) "i_b_c_id" btree (c_id) xxx=> \d c Table "xxx.c" Column | Type | Modifiers -
[PERFORM] update == delete + insert?
I've seen it said here several times that "update == delete + insert". On the other hand, I've noticed that "alter table [add|drop] column ..." is remarkably fast, even for very large tables, which leads me to wonder whether each column's contents are in a file specifically for that column. My question: Suppose I have a very "wide" set of data, say 100 columns, and one of those columns will be updated often, but the others are fairly static. I have two choices:

Design 1:
create table a (
    id integer,
    frequently_updated integer);
create table b(
    id integer,
    infrequently_updated_1 integer,
    infrequently_updated_2 integer,
    infrequently_updated_3 integer,
    ... etc.
    infrequently_updated_99 integer);

Design 2:
create table c(
    id integer,
    frequently_updated integer,
    infrequently_updated_1 integer,
    infrequently_updated_2 integer,
    infrequently_updated_3 integer,
    ... etc.
    infrequently_updated_99 integer);

If "update == delete + insert" is strictly true, then "Design 2" would be poor since 99 columns would be moved around with each update. But if columns are actually stored in separate files, then Designs 1 and 2 would be essentially equivalent when it comes to vacuuming. Thanks, Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
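(One way to see how the rows are actually laid out -- a sketch using the example table names above -- is to look up each table's file node in pg_class; each table maps to its own data file(s) holding whole rows, not one file per column:

select relname, relfilenode
from pg_class
where relname in ('a', 'b', 'c');

Since a row is stored as a unit, an update rewrites the whole row, which argues for Design 1 when one narrow column changes far more often than the other 99. ALTER TABLE ADD/DROP COLUMN is fast because it only touches the system catalogs, not because columns live in separate files.)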
Re: [PERFORM] Migration study, step 1: bulk write performance optimization
Mikael Carneholm wrote: I am responsible for an exciting project of evaluating migration of a medium/large application for a well-known Swedish car&truck manufacturer ... The goal right now is to find the set of parameters that gives as short bulk insert time as possible, minimizing downtime while the data itself is migrated. If you haven't explored the COPY command yet, check it out. It is stunningly fast compared to normal INSERT commands. http://www.postgresql.org/docs/8.1/static/sql-copy.html pg_dump and pg_restore make use of the COPY command. Since you're coming from a different vendor, you'd have to dump the data into a COPY-compatible set of files yourself. But it will be worth the effort. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
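(A minimal sketch of the load itself; the file path and table name are placeholders, and CSV mode is available from 8.0 on:

copy migrated_table from '/tmp/migrated_table.csv' with csv;
-- or, from psql on a client machine without access to the server's filesystem:
-- \copy migrated_table from 'migrated_table.csv' with csv

It also pays to create indexes and foreign-key constraints after the COPY rather than before, so the load isn't paying index-maintenance and trigger costs for every row.)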
Re: [PERFORM] No vacuum for insert-only database?
Alvaro Herrera wrote: If I only insert data into a table, never update or delete, then I should never have to vacuum it. Is that correct? You still need to vacuum eventually, to avoid transaction Id wraparound issues. But not as often. Thanks. Any suggestions for what "not as often" means? For example, if my database will never contain more than 10 million rows, is that a problem? 100 million rows? When does transaction ID wraparound become a problem? Craig ---(end of broadcast)--- TIP 6: explain analyze is your friend
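(Wraparound is driven by the number of transactions consumed, not by the number of rows, so a 10-million-row insert-only table isn't the issue by itself; what matters is doing a database-wide VACUUM at least once every billion transactions or so. The routine-maintenance chapter of the docs suggests a query along these lines to see how close each database is:

select datname, age(datfrozenxid) from pg_database;

If the age column is creeping toward two billion, it is past time to vacuum; autovacuum (contrib in 8.0, built in from 8.1) will normally take care of this if it's running.)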
[PERFORM] No vacuum for insert-only database?
If I only insert data into a table, never update or delete, then I should never have to vacuum it. Is that correct? Thanks, Craig ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Reliability recommendations
Joshua D. Drake wrote: I find this strains credibility, that this major manufacturer of PC's would do something deceptive that hurts performance, when it would be easily detected and widely reported. Can anyone cite a specific instances where this has happened? Such as, "I bought Dell model XYZ, which was advertised to have these parts and these specs, but in fact had these other parts and here are the actual specs." I can :) Feb 20 07:33:52 master kernel: [4294682.803000] Vendor: MegaRAID Model: LD 0 RAID1 51G Rev: 196T --- snip --- This machine... if you run it in raid 5 will only get 7-9 megabytes a second READ! performance. That is with 6 SCSI drives. If you run it in RAID 10 you get a more reasonable 50-55 megabytes per second. But you don't say how this machine was advertised. Are there components in that list that were not as advertised? Was the machine advertised as capable of RAID 5? Were performance figures published for RAID 5? If Dell advertised that the machine could do what you asked, then you're right -- they screwed you. But if it was designed for and advertised to a different market, then I've made my point: People are blaming Dell for something that's not their fault. Craig ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Reliability recommendations
Bruce Momjian wrote: Dell often says part X is included, but part X is not the exact same as part X sold by the original manufacturer. To hit a specific price point, Dell is willing to strip thing out of commodity hardware, and often does so even when performance suffers. For many people, this is unacceptable. I find this strains credibility, that this major manufacturer of PC's would do something deceptive that hurts performance, when it would be easily detected and widely reported. Can anyone cite a specific instances where this has happened? Such as, "I bought Dell model XYZ, which was advertised to have these parts and these specs, but in fact had these other parts and here are the actual specs." Dell seems to take quite a beating in this forum, and I don't recall seeing any other manufacturer blasted this way. Is it that they are deceptive, or simply that their "servers" are designed to be office servers, not database servers? There's nothing wrong with Dell designing their servers for a different market than ours; they need to go for the profits, and that may not include us. But it's not fair for us to claim Dell is being deceptive unless we have concrete evidence. Craig ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Cost Issue - How do I force a Hash Join
"Virag Saksena" <[EMAIL PROTECTED]> writes: The individual queries run in 50-300 ms. However the optimizer is choosing a nested loop to join them rather than a Hash join... I have what appears to be the identical problem. This is a straightforward query that should be fairly quick, but takes about 30 minutes. It's a query across three tables, call them A, B, and C. The tables are joined on indexed columns. Here's a quick summary: Table A -> Table B -> Table C A_ID B_ID C_ID A_ID NAME C_ID Tables A and B have 6 million rows each. Table C is small: 67 names, no repeats. All columns involved in the join are indexed. Summary: 1. Query B only:2.7 seconds, 302175 rows returned 2. Join B and C:4.3 seconds, exact same answer 3. Join A and B:7.2 minutes, exact same answer 4. Join A, B, C:32.7 minutes, exact same answer Looking at these: Query #1 is doing the real work: finding the rows of interest. Queries #1 and #2 ought to be virtually identical, since Table C has just one row with C_ID = 9, but the time almost doubles. Query #3 should take a bit longer than Query #1 because it has to join 300K rows, but the indexes should make this take just a few seconds, certainly well under a minute. Query #4 should be identical to Query #3, again because there's only one row in Table C. 32 minutes is pretty horrible for such a straightforward query. It looks to me like the problem is the use of nested loops when a hash join should be used, but I'm no expert at query planning. This is psql 8.0.3. Table definitions are at the end. (Table and column names are altered to protect the guilty, otherwise these are straight from Postgres.) I ran "vacuum full analyze" after the last data were added. Hardware is a Dell, 2-CPU Xeon, 4 GB memory, database is on a single SATA 7200RPM disk. 
Thanks, Craig QUERY #1: - explain analyze select B.A_ID from B where B.B_ID = 9; Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=0.158..1387.251 rows=302175 loops=1) Index Cond: (B_ID = 9) Total runtime: 2344.053 ms QUERY #2: - explain analyze select B.A_ID from B join C on (B.C_ID = C.C_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=0.349..3392.532 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.232..0.336 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_C_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=0.102..1290.002 rows=302175 loops=1) Index Cond: (B.C_ID = "outer".C_ID) Total runtime: 4373.916 ms QUERY #3: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) where B.B_ID = 9; Nested Loop (cost=0.00..711336.41 rows=131236 width=4) (actual time=37.118..429419.347 rows=302175 loops=1) -> Index Scan using i_B_B_ID on B (cost=0.00..154401.36 rows=131236 width=4) (actual time=27.344..8858.489 rows=302175 loops=1) Index Cond: (B_ID = 9) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=1.372..1.376 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 430467.686 ms QUERY #4: - explain analyze select A.A_ID from A join B on (A.A_ID = B.A_ID) join C on (B.B_ID = C.B_ID) where C.name = 'Joe'; Nested Loop (cost=0.00..1012793.38 rows=177741 width=4) (actual time=70.184..1960112.247 rows=302175 loops=1) -> Nested Loop (cost=0.00..258501.92 rows=177741 width=4) (actual time=52.114..17753.638 rows=302175 loops=1) -> Seq Scan on C (cost=0.00..12.90 rows=1 width=4) (actual time=0.109..0.176 rows=1 loops=1) Filter: ((name)::text = 'Joe'::text) -> Index Scan using i_B_B_ID on B (cost=0.00..254387.31 rows=328137 width=8) (actual time=51.985..15566.896 rows=302175 loops=1) Index Cond: (B.B_ID = "outer".B_ID) -> Index Scan using pk_A_test on A (cost=0.00..4.23 rows=1 width=4) (actual time=6.407..6.412 rows=1 loops=302175) Index Cond: (A.A_ID = "outer".A_ID) Total runtime: 1961200.079 ms TABLE DEFINITIONS: -- xxx => \d a Table "xxx.a" Column | Type | Modifiers ---++--- a_id | integer| not null ... more columns Indexes: "pk_a_id" PRIMARY KEY, btree (a_id) ... more indexes on other columns xxx => \d b Table "xxx.b" Column | Type | Modifiers --++--- b_id | integer| not null a_id | integer| not null c_id | inte