Re: [PERFORM] POSIX file updates

2008-04-02 Thread Andreas Kostyrka

Am Mittwoch, den 02.04.2008, 20:10 +0100 schrieb James Mansion:
 It strikes me as odd that fsync_writethrough isn't the most preferred 
 option where
 it is implemented. The postgres approach of *requiring* that there be no 
 cache
 below the OS is problematic, especially since the battery backup on internal
 array controllers is hardly the handiest solution when you find the mobo 
 has died.

Well, this might sound brutal, but I'm having a blunt day today.

There are some items here.

1.) PostgreSQL relies on filesystem semantics. That might be better or
worse than the raw devices other RDBMSs use as an interface, but in the
end it is just an interface. How well it works out depends strongly on
your hardware selection, your OS selection and so on. DB tuning is a
scientific art form ;) Besides, the fact that raw devices work better on
hardware X/OS Y than, say, filesystems is only of limited interest,
namely if you already have an investment in X or Y. In the end the
questions are: is the performance good enough, is the data safety good
enough, and at what cost (in money, work, ...).

2.) Data safety requirements vary strongly. In many (if not most) cases,
recovering the data from the failed hardware itself is not what matters.
Hint: being down until somebody figures out what failed, whether the rest
of the system is still stable, and so on is not acceptable at all. Meaning:
the moment the database server has any problem, one of the hot standbys
takes over. The thing you worry about is whether all data has made it to
the replication servers, not whether some data might get lost in the
hardware cache of a controller. (Actually, talk to your local computer
forensics guru; there are a number of ways to keep electronics powered
while moving them.)

3.) A controller cache is an issue whether or not you have a filesystem
in your data path. If you do raw I/O and the stupid hardware caches the
writes anyway, that is just as stupid as if it had cached filesystem
writes.

Andreas


 And especially when the inability to flush caches on modern SATA and SAS
 drives would appear to be more a failing in some operating systems than in
 the drives themselves..
 
 The links I've been accumulating into my bibliography include:
 
 http://www.h2database.com/html/advanced.html#transaction_isolation
 http://lwn.net/Articles/270891/
 http://article.gmane.org/gmane.linux.kernel/646040
 http://lists.apple.com/archives/darwin-dev/2005/Feb/msg00072.html
 http://brad.livejournal.com/2116715.html
 
 And your handy document on wal tuning, of course.
 
 James
 
 



Re: [PERFORM] PostgreSQL 8.0 occasionally slow down

2007-06-28 Thread Andreas Kostyrka
Two points:

* We need more information about the circumstances.

* Could it be that autovacuum is hitting you?

Andreas

-- Original message --
Subject: [PERFORM] PostgreSQL 8.0 occasionally slow down
From: Ho Fat Tsang [EMAIL PROTECTED]
Date: 28.06.2007 06:56

Hi,

   I am new to the postgresql server. I am now working on a project which
requires postgreSQL 8.0 and Java. I don't know why the server occasionally
slows down a bit every 3 minutes.
I have changed the log configuration so that it logs all statements taking
longer than 1000 ms, and the result is shown below:


elf2 2007-06-28 14:30:25 HKT 46835574.7a64 LOG:  duration: 1494.109 ms
statement: commit;begin;
elf2 2007-06-28 14:33:34 HKT 468354a8.7415 LOG:  duration: 1048.429 ms
statement: commit;begin;
elf2 2007-06-28 14:33:35 HKT 468354a9.7418 LOG:  duration: 1580.120 ms
statement: commit;begin;
elf2 2007-06-28 14:33:37 HKT 468354a9.7418 LOG:  duration: 1453.620 ms
statement: commit;begin;
elf2 2007-06-28 14:36:51 HKT 468354a9.7419 LOG:  duration: 1430.019 ms
statement: commit;begin;
elf2 2007-06-28 14:36:53 HKT 468354a9.7418 LOG:  duration: 1243.886 ms
statement: commit;begin;
elf2 2007-06-28 14:36:54 HKT 468354a9.7419 LOG:  duration: 1491.821 ms
statement: commit;begin;
elf2 2007-06-28 14:36:54 HKT 468354a9.7418 LOG:  duration: 1266.516 ms
statement: commit;begin;
...
...
elf2 2007-06-28 14:40:54 HKT 468354a9.741b LOG:  duration: 1776.466 ms
statement: commit;begin;
elf2 2007-06-28 14:40:54 HKT 468357ec.d5a LOG:  duration: 1500.132 ms
statement: commit;begin;
...
...
elf2 2007-06-28 14:44:07 HKT 46835477.73b7 LOG:  duration: 1011.216 ms
statement: commit;begin;
elf2 2007-06-28 14:44:12 HKT 46835477.73b7 LOG:  duration: 1009.187 ms
statement: commit;begin;
elf2 2007-06-28 14:44:13 HKT 468352f9.7194 LOG:  duration: 1086.769 ms
statement: commit;begin;
elf2 2007-06-28 14:44:14 HKT 46835477.73b7 LOG:  duration: 1481.627 ms
statement: commit;begin;
   ...
   ...
elf2 2007-06-28 14:47:44 HKT 468354a9.7419 LOG:  duration: 10513.208 ms
statement: commit;begin;
elf2 2007-06-28 14:48:22 HKT 468354a9.7419 LOG:  duration: 38126.708 ms
statement: commit;begin;



Every 3 ~ 4 minutes there are many transactions which require more than 1
second to execute. This is strange to me since the tables are quite
small (~10K - 20K rows). I can say the rate of incoming transactions is
quite steady throughout the testing. So I am quite confused about why the
performance degrades every 3 ~ 4 minutes. I am wondering if there is any
default scheduled task in postgreSQL 8.0.

The configuration values which I have amended in postgresql.conf:

max_fsm_pages = 10
vacuum_cost_delay = 10

The machine used:
512 MB RAM
Gentoo Linux

Can anyone help me with this, or suggest a resolution for a sudden
performance degradation? (The application I need to develop is quite
time-critical.)

Thanks,
Twinsen




Re: [PERFORM] PREPARE and stuff

2007-06-23 Thread Andreas Kostyrka
Well, that's not completely trivial: the plan might depend upon the concrete
values of $1, $2 and $3.

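As an illustration (table and column names are made up), a prepared statement is planned without knowledge of the actual parameter values, so one generic plan has to serve every later EXECUTE:

PREPARE foo(int, int, int) AS
    SELECT *
    FROM   orders          -- hypothetical table
    WHERE  customer_id = $1
      AND  status      = $2
      AND  priority    = $3;

EXECUTE foo(1, 2, 3);       -- reuses the generic plan
DEALLOCATE foo;

If the selectivity of those columns varies a lot between values, the generic plan can be noticeably worse than what you'd get from planning the literal query each time.
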
Andreas

-- Original message --
Subject: [PERFORM] PREPARE and stuff
From: PFC [EMAIL PROTECTED]
Date: 23.06.2007 21:31


Suppose a web application with persistent database connections.
I have some queries which take longer to plan than to execute!

I wish there was a way to issue a PREPARE (like a PERSISTENT PREPARE).
All Postgres connections would then know that the prepared statement
foo($1, $2, $3) corresponds to some SQL query, but it wouldn't plan it
yet, just like a SQL function.
When invoking EXECUTE foo(1, 2, 3) on any given connection, the statement
would get prepared and planned. Then on subsequent invocations I'd just
get the previously prepared plan.

Is this planned?



Re: [PERFORM] PITR Backups

2007-06-22 Thread Andreas Kostyrka

Wasn't it select pg_start_backup('backuplabel');?
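
For a snapshot-style base backup the usual sequence is something along these lines (the label is arbitrary; run it in psql on the server):

SELECT pg_start_backup('filer-snapshot');   -- marks the start of the backup

-- ... trigger the LUN/filesystem snapshot on the filer here ...

SELECT pg_stop_backup();                    -- marks the end of the backup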

Andreas

Kurt Overberg wrote:
 You can use the psql command line to run:
 
 select pg_start_backup();
 
 ...then when you're done,
 
 select pg_stop_backup();
 
 if you want an example from the unix command line:
 
 psql -c "select pg_start_backup();" database_name
 
 then
 
 psql -c "select pg_stop_backup();" database_name
 
 /kurt
 
 
 On Jun 22, 2007, at 7:51 AM, Dan Gorman wrote:
 
 Ah okay. I understand now. So how can I signal postgres I'm about to
 take a backup ? (read doc from previous email ? )

 Regards,
 Dan Gorman

 On Jun 22, 2007, at 4:38 AM, Simon Riggs wrote:

 On Fri, 2007-06-22 at 04:10 -0700, Dan Gorman wrote:
 This snapshot is done at the LUN (filer) level; postgres is unaware
 we're creating a backup, so I'm not sure how pg_start_backup() plays
 into this ...

 Postgres *is* completely unaware that you intend to take a backup, that
 is *exactly* why you must tell the server you intend to make a backup,
 using pg_start_backup() and pg_stop_backup(). That way Postgres will
 flush its buffers, so that they are present on storage when you make the
 backup.

 Is the procedure for Oracle or any other transactional RDBMS any
 different?

 --  Simon Riggs
   EnterpriseDB   http://www.enterprisedb.com






Re: [PERFORM] Maintenance question / DB size anomaly...

2007-06-20 Thread Andreas Kostyrka



Kurt Overberg wrote:
 OOookay.   Since the discussion has wandered a bit I just
 wanted to restate things in an effort to clear the problem in my head.
 
 Okay, so the sl_log_1 TABLE looks okay.  It's the indexes that seem to be
 messed up; specifically, sl_log_1_idx1 seems to think that there are
 300,000 rows in the table it's associated with.  I just want to fix the
 index, really.  So my question remains:
 
 Is it okay to dump and recreate that index (or reindex it) while the
 servers are down and the database is not being accessed?

Well, I would probably stop the slon daemons first: dropping indexes
that slony needs can lead to quite a slowdown, and worse, the slowdown
happens because the database server is doing things the wrong way. But
that's mostly all you need to do.

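The reindex itself is a single statement once the daemons are stopped; a minimal sketch (the "_mycluster" schema name is just a placeholder for your Slony cluster schema):

REINDEX INDEX "_mycluster".sl_log_1_idx1;   -- rebuild just the bloated index
-- or rebuild every index on the table in one go:
REINDEX TABLE "_mycluster".sl_log_1;

Both block normal use of the table while they run, which is fine while the servers are down anyway.
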
OTOH, depending upon the size of your database, you might consider
starting out from a scratch database.

Andreas

 
 Tom, Bill, Chris and Richard, thank you so much for your thoughts on
 this matter so far.  It helps to not feel so alone when dealing
 with difficult issues (for me anyway) on a system I don't know so much
 about.

[EMAIL PROTECTED], [EMAIL PROTECTED] are quite helpful, and
sometimes faster than mail.

Andreas




Re: [PERFORM] [ADMIN] Postgres VS Oracle

2007-06-18 Thread Andreas Kostyrka

It's even harder, as Oracle disallows publishing benchmark figures in
its license. As a cynic, I might ask: what is Oracle fearing?

Andreas

Jonah H. Harris wrote:
 On 6/18/07, David Tokmatchi [EMAIL PROTECTED] wrote:
 Scalability? Performance? Benchmarks? Availability? Architecture?
 Limitations: users, volumes? Resources needed? Support?
 
 Aside from the Wikipedia database comparison, I'm not aware of any
 direct PostgreSQL-to-Oracle comparison.
 



Re: [PERFORM] [ADMIN] Postgres VS Oracle

2007-06-18 Thread Andreas Kostyrka



Jonah H. Harris wrote:
 On 6/18/07, Andreas Kostyrka [EMAIL PROTECTED] wrote:
 As a cynic, I might ask, what Oracle is fearing?
 
 As a realist, I might ask, how many times do we have to answer this
 type of anti-commercial-database flamewar-starting question?
 

Well, my experience when working with certain DBs is much like what I had
some years ago, when I was forced to work with various SCO Unix legacy
boxes: "Why do I have to put up with this silliness?", and with
databases there is no way to get a sensible tool set by shopping
around and installing GNU packages en masse :(

Furthermore, not being allowed to talk about performance is a really hard
misfeature, like DRM. Consider:

1.) Performance is certainly an important aspect of my work as a DBA.
2.) Gaining experience as a DBA is not trivial, it's clearly a
discipline that cannot be learned from a book, you need experience. As a
developer I can gain experience on my own. As a DBA, I need some nice
hardware and databases that are big enough to be nontrivial.
3.) The above points make it vital to be able to discuss my experiences.
4.) Oracle's license NDA makes exchanging experience harder.

So the net effect is that the limited number of playing grounds (#2 above)
keeps hourly rates for DBAs high. Oracle's NDA limits secondary
knowledge effects, so in effect it keeps the price for Oracle know-how
potentially even higher.

Or put bluntly, the NDA mindset benefits completely and only Oracle, and
is a clear drawback for customers. It makes Oracle-supplied consultants
gods, no matter how much hot air they produce. They've got the benefit
of having internal peer knowledge, and as a consumer there is not much
that I can do to counter it. I'm not even allowed to document externally
the pitfalls and experiences I've had, so the next poor sob will walk
onto the same landmine.

Andreas



Re: [GENERAL] [pgsql-advocacy] [PERFORM] [ADMIN] Postgres VS Oracle

2007-06-18 Thread Andreas Kostyrka



PFC wrote:
 
 2. Oracle, Microsoft, and IBM have a lot to fear in the sense of a
 database like PostgreSQL. We can compete in 90-95% of cases where
 people would traditionally purchase a proprietary system for many,
 many thousands (if not hundreds of thousands) of dollars.
 
 Oracle also fears benchmarks made by people who don't know how to
 tune Oracle properly...

Well, bad results are as interesting as good results. And this problem
applies to all other databases as well.

Andreas



Re: [pgsql-advocacy] [PERFORM] [ADMIN] Postgres VS Oracle

2007-06-18 Thread Andreas Kostyrka



Jonah H. Harris wrote:
 
 All of us have noticed the anti-MySQL bashing based on problems with
 MySQL 3.23... Berkus and others (including yourself, if I am correct),
 have corrected people on not making invalid comparisons against
 ancient versions.  I'm only doing the same where Oracle, IBM, and
 Microsoft are concerned.
 

My, my, I fear my asbestos suit is about to feel some warmth ;)

Well, there is not much MySQL bashing going around. And MySQL 5 has
enough features, and current MySQL AB support for it is so good, that
there is no need to bash MySQL based on V3 problems: MySQL 5 is still a
joke on its own, and one can quite safely predict the answers to support
tickets with well over a 50% hit rate.

(Hint: I don't consider "redo your schema" to be a satisfactory answer.
And philosophically, the query optimizer in MySQL is near perfect; OTOH,
considering the fact that many operations in MySQL still have just one
way to execute, it's easy to choose the fastest plan, isn't it *g*)

Andreas



Re: [pgsql-advocacy] [PERFORM] [ADMIN] Postgres VS Oracle

2007-06-18 Thread Andreas Kostyrka



Jonah H. Harris wrote:
 Certainly, but can one expect to get a realistic answer to an "is
 Oracle fearing something" question on the PostgreSQL list?  Or was it
 just a backhanded attempt at pushing the topic again?  My vote is for
 the latter; it served no purpose other than to push the
 competitiveness topic again.

Well, I'm a cynic at heart, really. So there was no bad intent behind it.

And it was a nicely worded comment: had I based it on my personal
experiences with certain vendors, it wouldn't have been nearly as nice.

The original question was about comparisons between PG and Oracle.

Now, I could answer this question from my personal experiences with the
product and support. That would be way more strongly worded than my
small cynical question.

Another thing: Joshua posted a guesstimate that PG can compete with
Oracle in 90-95% of cases. Because Oracle insists on secrecy, I'm somehow
inclined to believe the side that talks openly. And while I don't like
to question Joshua's comment, I think he overlooked one set of problems,
namely the cases where Oracle is not able to compete with PG. It's hard
to quantify how many of these cases there are performance-wise, well,
because Oracle insists on that silly NDA, but there are clearly cases
where PG is superior.

Andreas



Re: [PERFORM] Replication

2007-06-14 Thread Andreas Kostyrka

 Most of our data is replicated offline using custom tools tailored to
 our loading pattern, but we have a small amount of global information,
 such as user signups, system configuration, advertisements, and such,
 that go into a single small (~5-10 MB) global database used by all
 servers.

Slony provides near-instantaneous failover (in the single-digit-seconds
range). You can script an automatic failover if the master server
becomes unreachable. That leaves you the problem of restarting your app
(or making it reconnect) to point at the new master.

5-10MB of data implies such a fast initial replication that making a
server rejoin the cluster by setting it up from scratch is not an issue.


 The problem is, there don't seem to be any "vote a new master" type of
 tools for Slony-I, and also, if the original master comes back online,
 it has no way to know that a new master has been elected.  So I'd have
 to write a bunch of SOAP services or something to do all of this.

You don't need SOAP services, and you do not need to elect a new master.
If dbX goes down, dbY takes over; you should be able to decide on a
static takeover pattern easily enough.

The point here is that the servers need to react to a problem, but you
probably want to get the admin on duty to look at the situation as
quickly as possible anyway. With 5-10MB of data in the database, a
complete rejoin of the cluster from scratch is measured in minutes.

Furthermore, you need to check out pgpool carefully; I seem to remember
that it has some bad habits in routing queries. (E.g. it wants to apply
write queries to all nodes, but slony makes the other nodes read-only.
Furthermore, anything inside a BEGIN is sent to the master node, which
is bad with some ORMs that by default wrap any access in a transaction.)

Andreas



Re: [PERFORM] Replication

2007-06-14 Thread Andreas Kostyrka
Ok, slony supports two kinds of operation here: failover (which moves the
master role to a new node without the old master node being present, and
also drops the old node from replication) and move set (which moves the
master role with the cooperation of the old master).

The use cases for these two are slightly different: one is for all kinds of
scheduled maintenance, while the other is what you do when you've got a
hardware failure.

Andreas

-- Original message --
Subject: Re: [PERFORM] Replication
From: Craig James [EMAIL PROTECTED]
Date: 15.06.2007 01:48

Andreas Kostyrka wrote:
 Slony provides near instantaneous failovers (in the single digit seconds
  range). You can script an automatic failover if the master server
 becomes unreachable.

But Slony slaves are read-only, correct?  So the system isn't fully functional 
once the master goes down.

 That leaves you the problem of restarting your app
 (or making it reconnect) to the new master.

Don't you have to run a Slony app to convert one of the slaves into the master?

 5-10MB data implies such a fast initial replication, that making the
 server rejoin the cluster by setting it up from scratch is not an issue.

The problem is to PREVENT it from rejoining the cluster.  If you have some 
semi-automatic process that detects the dead server and converts a slave to the 
master, and in the mean time the dead server manages to reboot itself (or its 
network gets fixed, or whatever the problem was), then you have two masters 
sending out updates, and you're screwed.

 The problem is, there don't seem to be any "vote a new master" type of
 tools for Slony-I, and also, if the original master comes back online,
 it has no way to know that a new master has been elected.  So I'd have
 to write a bunch of SOAP services or something to do all of this.
 
 You don't need SOAP services, and you do not need to elect a new master.
 if dbX goes down, dbY takes over, you should be able to decide on a
 static takeover pattern easily enough.

I can't see how that is true.  Any self-healing distributed system needs 
something like the following:

  - A distributed system of nodes that check each other's health
  - A way to detect that a node is down and to transmit that
information across the nodes
  - An election mechanism that nominates a new master if the
master fails
  - A way for a node coming online to determine if it is a master
or a slave

Any solution less than this can cause corruption because you can have two nodes 
that both think they're master, or end up with no master and no process for 
electing a master.  As far as I can tell, Slony doesn't do any of this.  Is 
there a simpler solution?  I've never heard of one.

 The point here is, that the servers need to react to a problem, but you
 probably want to get the admin on duty to look at the situation as
 quickly as possible anyway.

No, our requirement is no administrator interaction.  We need instant, 
automatic recovery from failure so that the system stays online.

 Furthermore, you need to checkout pgpool, I seem to remember that it has
 some bad habits in routing queries. (E.g. it wants to apply write
 queries to all nodes, but slony makes the other nodes readonly.
 Furthermore, anything inside a BEGIN is sent to the master node, which
 is bad with some ORMs, that by default wrap any access into a transaction)

I should have been more clear about this.  I was planning to use PGPool in the 
PGPool-1 mode (not the new PGPool-2 features that allow replication).  So it 
would only be acting as a failover mechanism.  Slony would be used as the 
replication mechanism.

I don't think I can use PGPool as the replicator, because then it becomes a new 
single point of failure that could bring the whole system down.  If you're 
using it for INSERT/UPDATE, then there can only be one PGPool server.

I was thinking I'd put a PGPool server on every machine in failover mode only.  
It would have the Slony master as the primary connection, and a Slony slave as 
the failover connection.  The applications would route all INSERT/UPDATE 
statements directly to the Slony master, and all SELECT statements to the 
PGPool on localhost.  When the master failed, all of the PGPool servers would 
automatically switch to one of the Slony slaves.

This way, the system would keep running on the Slony slaves (so it would be 
read-only), until a sysadmin could get the master Slony back online.  And when 
the master came online, the PGPool servers would automatically reconnect and 
write-access would be restored.

Does this make sense?

Craig




Re: [OT] Re: [PERFORM] How much ram is too much

2007-06-08 Thread Andreas Kostyrka



Zoltan Boszormenyi wrote:
 Joshua D. Drake wrote:
 Zoltan Boszormenyi wrote:
 Dave Cramer wrote:
 It's an IBM x3850 using linux redhat 4.0

 Isn't that a bit old? I have a RedHat 4.2 somewhere
 that was bundled with Applixware 3. :-)

 He means redhat ES/AS 4 I assume.

 J
 
 I guessed that, hence the smiley.
 But it's very unfortunate that version numbers
 are reused - it can cause confusion.
 There was a RH 4.0 already, a long time ago,
 when the commercial and the community
 versions were the same. I think Microsoft
 will avoid reusing its version numbers when year 2095 comes. :-)

Well, RedHat Linux and RedHat Enterprise Linux ES/AS are clearly
different products :-P

And yes, I even owned Applix :)

Andreas




Re: My quick and dirty solution (Re: [PERFORM] Performance P roblem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Andreas Kostyrka
TOASTed means the value is stored outside of the main table. But AFAIK, only
rows bigger than about 2K are considered for toasting.

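If you want to see what is actually involved, something along these lines shows the TOAST table behind a given table (the table name is just an example, and pg_relation_size needs 8.1 or later):

SELECT reltoastrelid::regclass
FROM   pg_class
WHERE  relname = 'my_bytea_table';

SELECT pg_relation_size(reltoastrelid) AS toast_bytes
FROM   pg_class
WHERE  relname = 'my_bytea_table';
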
Andreas

-- Original message --
Subject: Re: My quick and dirty solution (Re: [PERFORM] Performance
Problem with Vacuum of bytea table (PG 8.0.13))
From: Bastian Voigt [EMAIL PROTECTED]
Date: 25.05.2007 14:13

Richard Huxton wrote:
 Could you check the output of vacuum verbose on that table and see how 
 much work it's doing? I'd have thought the actual bytea data would be 
 TOASTed away to a separate table for storage, leaving the vacuum with 
 very little work to do.
I'm quite new to postgres (actually I just ported our running 
application from MySQL...), so I don't know what TOAST means. But I 
noticed that vacuum also tried to clean up some toast relations or so. 
This was what took so long.

 It might well be your actual problem is your disk I/O is constantly 
 saturated and the vacuum just pushes it over the edge. In which case 
 you'll either need more/better disks or to find a quiet time once a 
 day to vacuum and just do so then.
Yes, that was definitely the case. But now everything runs smoothly 
again, so I don't think I need to buy new disks.

Regards
Bastian


-- 
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359





Re: [PERFORM] Tips & Tricks for validating hardware/os

2007-05-23 Thread Andreas Kostyrka
You forgot pulling some RAID drives at random times to see how the hardware 
deals with that. And how it deals with the rebuild afterwards. (Many RAID 
solutions leave you with the worst of both worlds, taking longer to rebuild than a 
restore from backup would take, while at the same time providing disc IO 
performance that is SO bad that the server becomes useless during the rebuild.)

Andreas

-- Original message --
Subject: Re: [PERFORM] Tips & Tricks for validating hardware/os
From: Greg Smith [EMAIL PROTECTED]
Date: 23.05.2007 05:15

On Tue, 22 May 2007, Stephane Bailliez wrote:

 Out of curiosity, can anyone share his tips & tricks to validate a machine 
 before labelling it as 'ready to use postgres - you probably won't trash my 
 data today'?

Write a little script that runs pgbench in a loop forever.  Set your 
shared_buffer cache to use at least 50% of the memory in the machine, and 
adjust the database size and concurrent clients so it's generating a 
substantial amount of disk I/O and using a fair amount of the CPU.

Install the script so that it executes on system startup, like adding it 
to rc.local.  Put the machine close to your desk.  Every time you walk by 
it, kill the power and then start it back up.  This will give you a mix of 
long overnight runs with no interruption to stress the overall system, 
with a nice dose of recovery trauma.  Skim the Postgres and OS log files 
every day.  Do that for a week, if it's still running your data should be 
safe under real conditions.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Drop table vs Delete record

2007-05-22 Thread Andreas Kostyrka
Consider table partitioning (it's described in the manual).
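
A minimal sketch of what that can look like here (names and date ranges are made up); dropping an expired partition is then a cheap metadata operation instead of a mass DELETE plus vacuum:

CREATE TABLE t1 (
    id         bigint,
    created_at timestamptz NOT NULL,
    payload    text
);

-- child tables inherit from t1; the CHECK constraints let the planner
-- skip irrelevant partitions when constraint_exclusion is enabled
CREATE TABLE t1_20070521 (
    CHECK (created_at >= '2007-05-21' AND created_at < '2007-05-22')
) INHERITS (t1);

CREATE TABLE t1_20070522 (
    CHECK (created_at >= '2007-05-22' AND created_at < '2007-05-23')
) INHERITS (t1);

-- expiring a day's worth of rows:
DROP TABLE t1_20070521;

The inserting thread has to target the right child table (directly, or via a rule or trigger on t1), which is the main extra complexity of this approach.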

Andreas

-- Original message --
Subject: [PERFORM] Drop table vs Delete record
From: Orhan Aglagul [EMAIL PROTECTED]
Date: 22.05.2007 18:42


My application has two threads: one inserts thousands of records per second 
into a table (t1), and the other thread periodically deletes expired records 
(also in thousands) from the same table. So we have one thread adding rows 
while the other thread is trying to delete rows. In a short time the overall 
performance of any sql statements on that instance degrades. (E.g. 
select count(*) from t1 takes more than a few seconds with less than 10K rows.)

My question is: would any sql statement perform better if I renamed the 
table to t1_%indx periodically, created a new table t1 (for new inserts) and 
just dropped the tables with expired records, rather than doing a delete? 
(t1 is a simple table with many rows and no constraints.)

(I know I could run vacuum analyze) 

Thanks,

Orhan A.   




Re: [PERFORM] Postgres Benchmark Results

2007-05-20 Thread Andreas Kostyrka

 I'm writing a full report, but I'm having a lot of problems with MySQL.
 I'd like to give it a fair chance, but it shows real obstinacy in NOT
 working.

Well, that matches up well with my experience. Better yet, file a 
performance bug with the commercial support and you'll get an explanation of why 
your schema (or your hardware, well, anything but the database software used) is 
the guilty factor.

But you know, these IT manager journals consider MySQL to be the relevant 
opensource database. Guess it matches their expectations better than PG or, 
say, MaxDB (the artist formerly known as SAP DB).

Andreas




Re: [PERFORM] Merging large volumes of data

2007-05-07 Thread Andreas Kostyrka

I think you'll have to stick with doing your sorting (or merging) in
your client. I don't think PG recognizes the fact that it's just a merge step.

Andreas

Ambrus Wagner (IJ/ETH) wrote:
 Dear All,
 
 I have several tables containing data sorted by 2 keys (neither are keys in 
 db terms (not unique), however). I would like to retrieve all rows from all 
 tables sorted by the same keys, essentially merging the contents of the 
 tables together. While I am completely aware of sort order not being a 
 (fundamental) property of an RDBMS table, I am also aware of indices and 
 clustering (in fact, data is inserted into the tables into the correct order, 
 and not consequently modified in any way). I have a union query like this one:
 
 select a,b,c,d,e from table1 union all
 select a,b,c,d,e from table2 union all
 etc...
 select a,b,c,d,e from tablen order by a,b;
 
 Is there a way to prevent PostgreSQL from doing a full sort on the result set 
 after the unions have been completed? Even if I write
 
 (select a,b,c,d,e from table1 order by a,b) union all
 (select a,b,c,d,e from table2 order by a,b)  union all
 etc...
 (select a,b,c,d,e from tablen order by a,b)  order by a,b;
 
 PostgreSQL does not seem to realise (maybe it should not be able to do this 
 trick anyway) that the last order by clause is merely a final merge step on 
 the ordered data sets.
 
 Is there a workaround for this within PostgreSQL (another type of query, 
 parameter tuning, stored procedure, anything) or should I use my back-up plan 
 of making separate queries and merging the results in the target language?
 
 Thanks a lot,
 Ambrus
 
 --
 Wagner, Ambrus (IJ/ETH/GBD)
 Tool Designer
 GSDC Hungary
 
 Location: Science Park, A2 40 008
 Phone: +36 1 439 5282 
 



Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-05-06 Thread Andreas Kostyrka

Just some problems with this that come to mind:

a) Hardware is sometimes changed behind the customer's back without telling
them, even for server-level hardware. (Been there.)

b) Hardware recommendations would get stale quickly. What use is a
hardware spec that specifies certain versions of Xeons once the supply
dries up? (The example is not contrived; certain versions of PG and
Xeons with certain usage patterns don't work that well together. Google
for "context switch storms".)

c) All of that depends upon the PG version too, so with every new
version somebody would have to reverify that the recommendations are
still valid. (Big example: partitioned tables got way better support
in recent versions, so a setup that anticipated seqscans over big tables
might suddenly perform way better. OTOH, there are sometimes performance
regressions as well.)

d) And to add insult to injury, all that tuning (on the hardware and software
side) is sensitive to your workload. Before you start yelling: well,
have you ever rolled back an application version because you noticed
what stupidities the developers have added? (And yes, you can try to
avoid this with better staging in your processes, but it's really,
really hard to set up a staging environment that has the same performance
characteristics as production.)

So, while it's a nice idea to have a set of recommended hardware setups,
I don't see much of a point. What makes a sensible database server is
not exactly a secret. Sizing is slightly harder. And after that one enters
the realm of fine-tuning the complete system, which does not end at the
socket on port 5432.

Andreas

Jim Nasby wrote:
 On May 4, 2007, at 12:11 PM, Josh Berkus wrote:
 Sebastian,
 Before inventing a hyper tool, we might consider providing 3-5 example
 scenarios for common hardware configurations. This consumes less time
 and can be discussed and defined in a couple of days. This is of course not
 the correct option for a brand-new 20-spindle SATA 10,000 RPM RAID 10 system,
 but those are probably not the target for default configurations.

 That's been suggested a number of times, but some GUCs are really tied
 to the *exact* amount of RAM you have available.  So I've never seen how
 example configurations could help.
 
 Uh... what GUCs are that exacting on the amount of memory? For a decent,
 base-line configuration, that is.
 -- 
 Jim Nasby[EMAIL PROTECTED]
 EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)
 
 
 



Re: [PERFORM] View is not using a table index

2007-04-24 Thread Andreas Kostyrka
* Dan Shea [EMAIL PROTECTED] [070424 19:33]:
 Version is  PWFPM_DEV=# select version();
                          version
 ------------------------------------------------------------------------
  PostgreSQL 7.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.3
  20030502 (Red Hat Linux 3.2.3-20)
 (1 row)
 
 We used the rpm source from postgresql-7.4-0.5PGDG.
 
 You make it sound so easy.  Our database size is at 308 GB.  We actually
 have 8.2.3 running and would like to transfer in the future.  We have to
 investigate the best way to do it.

That depends upon your requirements for the uptime.

Andreas



Re: [PERFORM] not using indexes on large table

2007-04-21 Thread Andreas Kostyrka
* Jeroen Kleijer [EMAIL PROTECTED] [070421 23:10]:
 
 Hi all,
 
 I'm a bit new to PostgreSQL and database design in general so forgive me
 for asking stupid questions. ;-)
 
 I've set up a PostgreSQL database on a Linux machine (2 processors, 1GB
 mem) and while the database itself resides on a NetApp filer, via NFS,
 this doesn't seem to impact the performance too drastically.
 
 I basically use it for indexed tables without any relation between 'em;
 so far this has worked perfectly.
 
 For statistics I've created the following table:
 volume varchar(30),
 qtree varchar(255),
 file varchar(512),
 ctime timestamp,
 mtime timestamp,
 atime timestamp
 annd created separate indexes on the volume and qtree columns.
 
 This table gets filled with the copy command and about 2 hours and
 some 40 million records later I issue a reindex command to make sure the
 indexes are accurate. (for good interest, there are some 35 values for
 volume and some 1450 for qtrees)
 
 While filling of this table, my database grows to an (expected) 11.5GB.
 
 The problem comes when I try to do a query without a where clause,
 because then it completely discards the indexes and does a complete
 table scan which takes over half an hour! (40,710,725 rows, 1,110,258
 pages, 1715 seconds)
 
 I've tried several things but doing a query like:
 select distinct volume from project_access_times
 or
 select distinct qtree from project_access_times
 always result in a full sequential table scan even after a 'vacuum' and
 'vacuum analyze'.

Try:
select volume from project_access_times group by volume;

And no matter what, running a database over NFS smells like a dead rat.

Hopefully you've mounted it hard, but NFS still does not have normal
semantics, e.g. locking, etc.

Next thing: as you've got only one client for that NFS mount, try to
make it cache metadata aggressively. The ac-prefixed options in
nfs(5) come to mind.

Andreas



Re: [PERFORM] SCSI vs SATA

2007-04-06 Thread Andreas Kostyrka
* Charles Sprickman [EMAIL PROTECTED] [070407 00:49]:
 On Fri, 6 Apr 2007, [EMAIL PROTECTED] wrote:
 
 On Fri, 6 Apr 2007, Scott Marlowe wrote:
 
 Based on experience I think that on average server drives are more
 reliable than consumer grade drives, and can take more punishment.
 
 this I am not sure about
 
 I think they should survey Tivo owners next time.
 
 Perfect stress-testing environment.  Mine runs at over 50C most of the time, 
 and it's writing 2 video streams 24/7.  What more could you do to punish a 
 drive? :)

Well, there are a couple of things, actually what my Dreambox does ;)

-) read/write 2 streams at the same time (which means quite a bit of
seeking under pressure)
-) and even worse, standby and sleep states, and powering up the drive
when needed.

Andreas



Re: [PERFORM] SCSI vs SATA

2007-04-04 Thread Andreas Kostyrka
* Peter Kovacs [EMAIL PROTECTED] [070404 14:40]:
 This may be a silly question but: will not 3 times as many disk drives
 mean 3 times higher probability for disk failure? Also rumor has it
 that SATA drives are more prone to fail than SCSI drives. More
 failures will result, in turn, in more administration costs.
Actually, the newest research papers show that all discs (be they
desktop or high-end SCSI) have basically the same failure statistics.

But yes, having 3 times the discs will increase the fault probability.

Andreas
 
 Thanks
 Peter
 
 On 4/4/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 On Tue, 3 Apr 2007, Geoff Tolley wrote:
 
 
  Ron wrote:
   At 07:07 PM 4/3/2007, Ron wrote:
For random IO, the 3ware cards are better than PERC
  
 Question: will 8*15k 73GB SCSI drives outperform 24*7K 320GB SATA II
drives?
  
Nope.  Not even if the 15K 73GB HDs were the brand new Savvio 15K
screamers.
  
Example assuming 3.5 HDs and RAID 10 = 4 15K 73GB vs 12 7.2K 320GB
The 15K's are 2x faster rpm, but they are only ~23% the density =
advantage per HD to SATAs.
Then there's the fact that there are 1.5x as many 7.2K spindles as 15K
spindles...
   Oops make that =3x= as many 7.2K spindles as 15K spindles...
 
  I don't think the density difference will be quite as high as you seem to
  think: most 320GB SATA drives are going to be 3-4 platters, the most that a
  73GB SCSI is going to have is 2, and more likely 1, which would make the
  SCSIs more like 50% the density of the SATAs. Note that this only really
  makes a difference to theoretical sequential speeds; if the seeks are 
  random
  the SCSI drives could easily get there 50% faster (lower rotational latency
  and they certainly will have better actuators for the heads). Individual 
  15K
  SCSIs will trounce 7.2K SATAs in terms of i/os per second.
 
 true, but with 3x as many drives (and 4x the capacity per drive) the SATA
 system will have to do far less seeking
 
 for that matter, with 20ish 320G drives, how large would a partition be
 that only used the outer physical track of each drive? (almost certainly
 multiple logical tracks) if you took the time to set this up you could
 eliminate seeking entirely (at the cost of not using your capacity, but
 since you are considering a 12x range in capacity, it's obviously not your
 primary concern)
 
  If you care about how often you'll have to replace a failed drive, then the
  SCSI option no question, although check the cases for hot-swapability.
 
 note that the CMU and Google studies both commented on being surprised at
 the lack of difference between the reliability of SCSI and SATA drives.
 
 David Lang
 



Re: [PERFORM] SCSI vs SATA

2007-04-04 Thread Andreas Kostyrka
* Alvaro Herrera [EMAIL PROTECTED] [070404 15:42]:
 Peter Kovacs escribió:
  But if an individual disk fails in a disk array, sooner than later you
  would want to purchase a new fitting disk, walk/drive to the location
  of the disk array, replace the broken disk in the array and activate
  the new disk. Is this correct?
 
 Ideally you would have a spare disk to let the array controller replace
 the broken one as soon as it breaks, but yeah, that would be more or
Well, no matter what, you need to test this procedure. I'd expect in
many cases the disc IO during the rebuild of the array to be so much
slower that the database server won't be able to cope with the load.

Andreas



Re: [PERFORM] SCSI vs SATA

2007-04-04 Thread Andreas Kostyrka
* Joshua D. Drake [EMAIL PROTECTED] [070404 17:40]:
 
 Good point. On another note, I am wondering why nobody's brought up the 
 command-queuing perf benefits (yet). Is this because sata vs scsi are at 
 
 SATAII has similar features.
 
 par here? I'm finding conflicting information on this -- some calling sata's 
 ncq mostly crap, others stating the real-world results are negligible. I'm 
 inclined to believe SCSI's 
 pretty far ahead here but am having trouble finding recent articles on this.
 
 What I find is, a bunch of geeks sit in a room and squabble about a few 
 percentages one way or the other. One side feels very l33t because their 
 white paper looks like the latest 
 swimsuit edition.
 
 Real world specs and real world performance shows that SATAII performs, very, 
 very well. It is kind of like X86. No chip engineer that I know has ever 
 said, X86 is elegant but guess
 which chip design is conquering all others in the general and enterprise 
 marketplace?

Actually, to second that, we did have very similar servers with
SCSI/SATA drives, and I did not notice any relevant measurable
difference. OTOH, the SCSI discs were way less reliable than the SATA
discs, but that might have been bad luck.

Andreas



Re: [PERFORM] How to determine which indexes are not using or using seldom in database

2007-04-02 Thread Andreas Kostyrka
* Denis  Lishtovny [EMAIL PROTECTED] [070402 09:20]:
   Hello All.
 
   I have a lot of tables and indexes in the database. I need to determine
   which indexes are not used, or used only seldom, in the database. I enabled
   all possible statistics in the config but I can't understand how to do this.
   Thanks.
 
   p.s. For example, I need this to reduce the database size to increase backup
   and restore speed.
Indexes are not backed up, and you can increase restore speed by
temporarily dropping them. Current pg_dumps should be fine from this
aspect.

Discovering which tables are unused via the database suggests more of
a software engineering problem IMHO. And it is bound to be imprecise and
dangerous; tables might get read from:

*) triggers. That means some tables might only be consulted if user X
is doing something. Or there's a full moon. Or the Chi of the DBA barked
3 times that day.

*) during application startup only (easy to solve by forcing all clients
to restart)

*) during a cron job (daily, weekly, monthly, bi-monthly)

*) only during human orginated processes.

Not a good thing to decide to drop tables just because nothing has
accessed them for half an hour. Or even a week.

Worse, some tables might have relationships that are missing in the
database (a foreign key constraint forgotten, or relationships that are
hard to express with SQL constraints).

OTOH, if you just want to get a feel for which parts of the database are
active, you can start by enabling SQL statement logging and analyzing
some of that output.

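For the indexes themselves, the statistics views give a rough starting point, assuming the statistics collector is enabled; a sketch (the counters only cover activity since they were last reset, so all the caveats above still apply):

SELECT schemaname, relname, indexrelname, idx_scan
FROM   pg_stat_user_indexes
ORDER  BY idx_scan;

Indexes whose idx_scan stays at 0 over a representative period are candidates for a closer look, not for blind dropping.
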
Andreas




Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Andreas Kostyrka
* Andreas Tille [EMAIL PROTECTED] [070322 12:07]:
 Hi,
 
 I'm just trying to find out why a simple count(*) takes that long.
 At first I tried explain, which rather quickly knows how many rows
 it has to check, but the actual count is two orders of magnitude slower.

Which version of PG?

The basic problem is that explain knows quickly, because it has its
statistics.

The select proper, OTOH, has to go through the whole table to make
sure which rows are valid for your transaction.

That's the reason why PG (check the newest releases, I seem to
remember that there have been some aggregate optimizations there) does
a SeqScan for select count(*) from table. Btw, depending upon your
data, doing a select count(*) from table where user=X will use an
index, but will still need to fetch the rows proper to validate them.

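To illustrate that last point (the column and index names here are just examples, borrowed from the where user=X case above):

CREATE INDEX agiraw_user_idx ON agiraw ("user");

EXPLAIN SELECT count(*) FROM agiraw WHERE "user" = 'X';
-- expect an index or bitmap scan feeding the aggregate; the heap rows
-- are still visited, because row visibility is not stored in the index

So the index narrows down which rows to look at, but it cannot by itself answer the count.
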
Andreas

 
 My MS_SQL server using colleague can't believe that.
 
 $ psql InfluenzaWeb -c 'explain SELECT count(*) from agiraw ;'
   QUERY PLAN 
 ---
  Aggregate  (cost=196969.77..196969.77 rows=1 width=0)
-  Seq Scan on agiraw  (cost=0.00..185197.41 rows=4708941 width=0)
 (2 rows)
 
 real0m0.066s
 user0m0.024s
 sys 0m0.008s
 
 $ psql InfluenzaWeb -c 'SELECT count(*) from agiraw ;'
   count
 ---------
  4708941
 (1 row)
 
 real0m4.474s
 user0m0.036s
 sys 0m0.004s
 
 
 Any explanation?
 
 Kind regards
 
  Andreas.
 
 -- 
 http://fam-tille.de
 


Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Andreas Kostyrka
* Andreas Tille [EMAIL PROTECTED] [070322 13:24]:
 On Thu, 22 Mar 2007, Andreas Kostyrka wrote:
 
 Which version of PG?
 
 Ahh, sorry, forgot that.  The issue occurs in Debian (Etch) packaged
 version 7.4.16.  I plan to switch soon to 8.1.8.
I'd recommend 8.2 if at all possible :)
 
 That's the reason why PG (check the newest releases, I seem to
 remember that there has been some aggregate optimizations there),
 
 I'll verify this once I moved to the new version.
8.1 won't help you I'd guess. ;)

 
 does
 a SeqScan for select count(*) from table. btw, depending upon your
 data, doing a select count(*) from table where user=X will use an
 Index, but will still need to fetch the rows proper to validate them.
 
 I have an index on three (out of 7 columns) of this table.  Is there
 any chance to optimize indexing regarding this.
Well, that depends upon your query patterns. It's an art and a science
at the same time ;)
 
 Well, to be honest I'm not really interested in the performance of
 count(*).  I was just discussing general performance issues on the
 phone and when my colleague asked me about the size of the
 database he just wondered why this takes so long for a job where his
 MS-SQL server is much faster.  So in principle I was just asking
 a first question that is easy to ask.  Perhaps I'll come up with
 more difficult optimisation questions.

Simple. MSSQL is optimized for this case and uses older
data structures. PG uses MVCC storage, which is not optimized for
this use case. It's quite fast for other kinds of queries.

The basic trouble here is that MVCC makes it a little harder to decide
what is valid for your transaction, plus the indexes seem to be
designed for lookup, not for data fetching. (Basically, PG can use
indexes only to locate potential data, but cannot return data directly
out of an index.)

Andreas



Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Andreas Kostyrka
* Mario Weilguni [EMAIL PROTECTED] [070322 15:59]:
 Am Donnerstag, 22. März 2007 15:33 schrieb Jonah H. Harris:
  On 3/22/07, Merlin Moncure [EMAIL PROTECTED] wrote:
   As others suggest select count(*) from table is very special case
   which non-mvcc databases can optimize for.
 
  Well, other MVCC database still do it faster than we do.  However, I
  think we'll be able to use the dead space map for speeding this up a
  bit wouldn't we?
 
 Which MVCC DB do you mean? Just curious...
Well, MySQL claims InnoDB to be MVCC ;)

Andreas



Re: [PERFORM] Insert performance

2007-03-06 Thread Andreas Kostyrka
* Richard Huxton dev@archonet.com [070306 12:22]:
 2. You can do a COPY from libpq - is it really not possible?
 
 Not really but i have been testing it and inserts are flying (about
 10 inserts/sec) !!
 
 What's the problem with the COPY? Could you COPY into one table then insert 
 from that to your target table?
Well, there are some issues. First, your client needs to support it.
E.g. psycopg2 supports only some specific CSV formatting in its
methods. (Plus I sometimes had random psycopg2 crashes, but guarding against
these is cheap compared to the speedup of COPY versus INSERT.)
Plus you need to be sure that your data will apply cleanly (which in
my app was not the case), or you need to code a fallback that
localizes the row that doesn't work.

And the worst thing is that COPY ignores RULES on the tables, which
sucks if you use them ;) (e.g. for table partitioning).

Andreas



Re: [PERFORM] Insert performance

2007-03-06 Thread Andreas Kostyrka
* Richard Huxton dev@archonet.com [070306 13:47]:
 Andreas Kostyrka wrote:
 * Richard Huxton dev@archonet.com [070306 12:22]:
 2. You can do a COPY from libpq - is it really not possible?
 
 Not really but i have been testing it and inserts are flying (about
 10 inserts/sec) !!
 What's the problem with the COPY? Could you COPY into one table then insert 
 from that to your target table?
 Well, there are some issues. First your client needs to support it.
 E.g. psycopg2 supports only some specific CSV formatting in it's
 methods. (plus I had sometimes random psycopg2 crashes, but guarding against
 these is cheap compared to the speedup from COPY versus INSERT)
 Plus you need to be sure that your data will apply cleanly (which in
 my app was not the case), or you need to code a fallback that
 localizes the row that doesn't work.
 And the worst thing is, that it ignores RULES on the tables, which
 sucks if you use them ;) (e.g. table partitioning).
 
 Ah, but two things deal with these issues:
 1. Joel is using libpq
 2. COPY into a holding table, tidy data and INSERT ... SELECT

Clearly COPY is the way to bulk load data, BUT you asked, so I
wanted to point out some problems and brittle points with COPY.

(And the COPY into a holding table doesn't completely solve the
problem of dirty, inconsistent data.)

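For reference, the holding-table approach looks roughly like this (table and column names are made up); the final INSERT goes through the normal executor, so rules such as partitioning rules do apply to it:

CREATE TEMP TABLE staging (LIKE target INCLUDING DEFAULTS);

COPY staging FROM '/tmp/batch.csv' WITH CSV;

-- tidy/validate here, then move the rows across
INSERT INTO target
SELECT * FROM staging
WHERE  some_column IS NOT NULL;   -- whatever cleanup your data needs
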
Andreas



Re: [PERFORM] Scaling concerns

2006-12-16 Thread Andreas Kostyrka
* tsuraan [EMAIL PROTECTED] [061216 18:26]:
 I'm writing a webmail-type application that is meant to be used in a
 corporate environment.  The core of my system is a Postgres database
 that is used as a message header cache.  The two (relevant) tables
 being used are pasted into the end of this message.  My problem is
 that, as the messages table increases to tens of millions of rows,
 pgsql slows down considerably.  Even an operation like select
 count(*) from messages can take minutes, with a totally idle system.
 Postgres seems to be the most scalable Free database out there, so I
 must be doing something wrong.

select count(*) from table is the worst case in PostgreSQL. (MVC
systems in general I guess).

If you really need to run count(*), you need to think about the
required isolation level of these operations and maintain an aggregate
table yourself.
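
A minimal sketch of such an aggregate table, maintained by triggers;
messages_count and the function name are made up, plpgsql is assumed to be
installed, and the single-row counter serializes concurrent writers, so
treat it purely as an illustration:

CREATE TABLE messages_count (n bigint NOT NULL);
INSERT INTO messages_count SELECT count(*) FROM messages;

CREATE FUNCTION messages_count_trig() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE messages_count SET n = n + 1;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE messages_count SET n = n - 1;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER messages_count_trig AFTER INSERT OR DELETE ON messages
    FOR EACH ROW EXECUTE PROCEDURE messages_count_trig();

-- the cheap replacement for select count(*) from messages:
SELECT n FROM messages_count;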

(btw, select aggregate(*) from bigtable is something that no database
likes, it's just the degree of slowness that sometimes is different).

For scaling you should consider slony. Either hangout on #slony on
Freenode.net or ask on the mailing list if you have questions.

 As for the most common strategy of having a slower (more rows)
 archival database and a smaller, faster live database, all the
 clients in the company are using their normal corporate email server
 for day-to-day email handling.  The webmail is used for access email
 that's no longer on the corporate server, so it's not really simple to
 say which emails should be considered live and which are really
 out-of-date.
 
 My postgres settings are entirely default with the exception of
 shared_buffers being set to 40,000 and max_connections set to 400.
 I'm not sure what the meaning of most of the other settings are, so I
 haven't touched them.  The machines running the database servers are
 my home desktop (a dual-core athlon 3200+ with 2GB RAM and a 120GB
 SATA II drive), and a production server with two dual-core Intel

Intel chips = please be more specific. There are Intel boxes known to have
issues with PostgreSQL under specific load scenarios (again, with specific
versions). To make it even more fun, these are really hard to track
down ;)

 chips, 4 GB RAM, and a RAID 5 array of SATA II drives on a 3Ware 9550
 controller.  Both machines are running Gentoo Linux with a 2.6.1x
 kernel, and both exhibit significant performance degradation when I
 start getting tens of millions of records.
 
 Any advice would be most appreciated.  Thanks in advance!

Cluster: one box that applies changes, and multiple boxes that read
the data.

If you cannot afford multiple boxes from the start, still design your
application to work with two connections: one connection as a user
with read/write permissions, and one as a user having only SELECT
permissions. This way you can later easily add a load balancer to the
mix and use multiple Postgres boxes for the read-only work.
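
A minimal sketch of that split, with made-up role names (CREATE ROLE assumes
8.1 or later; the GRANTs would be repeated for each table):

CREATE ROLE app_rw LOGIN PASSWORD 'changeme-rw';
CREATE ROLE app_ro LOGIN PASSWORD 'changeme-ro';

GRANT SELECT, INSERT, UPDATE, DELETE ON messages TO app_rw;
GRANT SELECT ON messages TO app_ro;

-- The application opens two connections, one per role, and routes all
-- plain SELECT traffic through the app_ro connection, so a load balancer
-- or read-only replica can later be slotted in behind it.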

Andreas



Re: [PERFORM] SQL_CALC_FOUND_ROWS in POSTGRESQL / Some one can

2006-12-10 Thread Andreas Kostyrka
* Chris [EMAIL PROTECTED] [061211 07:01]:
 select SQL_CALC_FOUND_ROWS userid, username, password from users limit 10;
 
 will do:
 
 select userid, username, password from users limit 10;
 
 and calculate this:
 
 select userid, username, password from users;
 
 and tell you how many rows that will return (so you can call 'found_rows()').
 
 
 the second one does do a lot more because it has to send the results across 
 to the client program - whether the client uses that info or not doesn't 
 matter.
Not really. Sending the data to the client is usually (unless you are
on some low-bandwidth connection) a trivial cost compared
to calculating the number of rows.

(Our tables involve hundreds of millions of rows, and the connectivity is a
private internal gigabit network; returning the data never seems to be
an issue. Reading it from disk and selecting the rows are the issues,
not sending the data.)

Actually, if you think that sending the data is an issue, PG offers
the more generic concept of cursors.
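
A minimal sketch of the cursor route, reusing the column names from the
example above (the page size of 10 is arbitrary):

BEGIN;

DECLARE users_cur CURSOR FOR
    SELECT userid, username, password FROM users ORDER BY userid;

FETCH 10 FROM users_cur;   -- first page
FETCH 10 FROM users_cur;   -- next page, without re-running the query

CLOSE users_cur;
COMMIT;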

Andreas



Re: RES: [PERFORM] Priority to a mission critical transaction

2006-11-28 Thread Andreas Kostyrka
* Carlos H. Reimer [EMAIL PROTECTED] [061128 20:02]:
 Hi,
 
 There is an article about Lowering the priority of a PostgreSQL query
 (http://weblog.bignerdranch.com/?p=11) that explains how to use the
 setpriority() to lower PostgreSQL processes.
 
 I'm wondering how effective it would be for i/o bound systems.
 
 Will the setpriority() system call affect i/o queue too?

Nope, and in fact the article shows the way not to do it.

See http://en.wikipedia.org/wiki/Priority_inversion

Basically, lowering the priority of one backend in PostgreSQL can lead
to reduced performance of all, especially also the backends with
higher priorities.

(Think of priority inversion as a timed soft deadlock. It will
eventually resolve, because it's not a real deadlock, but it might
mean halting important stuff for quite some time.)

Taking the example above, consider the following processes and nice
values:

19x backends A nice = 0
 1x backend B nice = 10 (doing maintenance work)
 1x updatedb nice = 5 (running as a cronjob at night)
 
 Now, it is possible (the probability depends upon your specific
situation) that backend B grabs some internal lock that is needed, and
then gets preempted by higher-priority stuff. Well, the A
backends need that lock too, so they cannot run; instead we wait until
updatedb (which updates the locate search database and walks through the
whole filesystem of the server) is finished.

Luckily, most if not all of these processes are disk-I/O bound, so they
get interrupted anyway and low-priority processes don't starve.
But replace updatedb with something hogging the CPU, and rethink the
situation.

Andreas



Re: [PERFORM] Slow SELECT on three or more clients

2006-11-15 Thread Andreas Kostyrka
* AMIR FRANCO D. JOVEN [EMAIL PROTECTED] [061115 12:44]:
 Hi!
 
 I'm new to PostgreSQL.
 
 My current project uses PostgreSQL 7.3.4.
Ancient. Upgrade it, especially if it's a new database.

 
 the problem is like this:
 
 I have a table with 94 fields and a select with only one resultset in only
 one client consumes about 0.86 seconds.
 The client executes three 'select' statements to perform the task which
 consumes 2.58 seconds.
 With only one client this is acceptable, but the real problem is as i add
 more clients, it goes more and more slower.
That depends upon:
a) your table schema.
b) the data in the tables, e.g. how big the rows are and how many there are.
c) the size of the result sets.
d) your indexes.
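
A minimal sketch of how to collect that information (all names are
placeholders):

-- in psql, \d the_table shows the schema and its indexes

-- the actual plan and timing of one of the three SELECTs:
EXPLAIN ANALYZE
SELECT col_a, col_b
FROM the_table
WHERE some_column = 42;

-- rough table size:
SELECT count(*) FROM the_table;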

Andreas



Re: [PERFORM] Context switch storm

2006-11-14 Thread Andreas Kostyrka
* Cosimo Streppone [EMAIL PROTECTED] [061114 10:52]:
 Richard Huxton wrote:
 Cosimo Streppone wrote:
 Richard Huxton wrote:
 
 The average context switching for this server as vmstat shows is 1
 but when the problem occurs it goes to 25.
 
 I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server
 upgrade from 8.0.1 - the most recent is 8.0.9 iirc
 [...]
 Are you seeing a jump in context-switching in top? You'll know when you do - 
 it's a *large* jump. That's the key diagnosis. Otherwise it might simply be 
 your configuration settings 
 aren't ideal for that workload.
 
 Sorry for the delay.
 
 I have logged vmstat results for the last 3 days.
 Max context switches figure is 20500.
 
 If I understand correctly, this does not mean a storm,
Nope, 20500 is an order of magnitude too low compared to the storms we were experiencing.

 but only that the 2 Xeons are overloaded.
 Probably, I can do a good thing switching off the HyperThreading.
 I get something like 12/15 *real* concurrent processes hitting
 the server.

Actually, for the storms we had, both the number of concurrent processes
AND the workload matter:

many processes all doing different things = overloaded server
many processes all running the same queries = storm.

Basically, it seems that PostgreSQL's locking implementation is on
quite unfriendly terms with the Xeon memory subsystem. Googling
around might provide more details.

 
 I must say I lowered shared_buffers to 8192, as it was before.
 I tried raising it to 16384, but I can't seem to find a relationship
 between shared_buffers and performance level for this server.
 
 Well, the client I saw it with just bought a dual-opteron server and used 
 their quad-Xeon for something else. However, I do remember that 8.1 seemed 
 better than 7.4 before they 
 switched. Part of that might just have been better query-planning and other 
 efficiences though.
 
 An upgrade to 8.1 is definitely the way to go.
 Any 8.0 - 8.1 migration advice?
Simple, there are basically two ways:
a) you can take downtime: pg_dump + restore
b) you cannot take downtime: install Slony, install your new 8.1
server, replicate into it, then switch over to the new server.

If you can get new hardware for the 8.1 box, you get several benefits:
a) order Opterons. That doesn't solve the overload problem as such,
but these pesky cs storms seem to have gone away this way.
(That was basically the free advice from an external consultant,
which luckily matched my own ideas about what the problem could be. A
cheap solution at $3k :) )
b) you can still use the older box as a readonly replica.
c) you've got a hot backup of your db.

Andreas



Re: [PERFORM] Setting nice values

2006-11-03 Thread Andreas Kostyrka
Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe:
 Sometimes it's the simple solutions that work best.  :)  Welcome to the
 world of pgsql, btw...

OTOH, there are also non-simple solutions to this, which might make
sense anyway: Install slony, and run your queries against a readonly
replica of your data.

Andreas





Re: [PERFORM] Context switch storm

2006-11-03 Thread Andreas Kostyrka
The solution for us has been twofold:

upgrade to the newest PG version available at the time, while we waited
for our new Opteron-based DB hardware to arrive (the move to Opterons
being the second half of the fix).

Andreas


Am Freitag, den 03.11.2006, 13:29 + schrieb Richard Huxton:
 Cosimo Streppone wrote:
  Richard Huxton wrote:
  
  [EMAIL PROTECTED] wrote:
 
  The average context switching for this server as vmstat shows is 1
  but when the problem occurs it goes to 25.
 
  You'll tend to see it when you have multiple clients and most queries 
  can use RAM rather than disk I/O. My understanding of what happens is 
  that PG requests data from RAM - it's not in cache so the process gets 
  suspended to wait. The next process does the same, with the same 
  result.   You end up with lots of processes all fighting over what 
  data is in the cache and no-one gets much work done.
  
  Does this happen also with 8.0, or is specific to 8.1 ?
 
 All versions suffer to a degree - they just push the old Xeon in the 
 wrong way. However, more recent versions *should* be better than older 
 versions. I believe some work was put in to prevent contention on 
 various locks which should reduce context-switching across the board.
 
  I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server
 
 upgrade from 8.0.1 - the most recent is 8.0.9 iirc
 
  when I raise `shared_buffers' from 8192 to 4.
  I would expect an increase in tps/concurrent clients, but I see an average
  performance below a certain threshold of users, and when concurrent users
  get above that level, performance starts to drop, no matter what I do.
 
 Are you seeing a jump in context-switching in top? You'll know when you 
 do - it's a *large* jump. That's the key diagnosis. Otherwise it might 
 simply be your configuration settings aren't ideal for that workload.
 
  Server logs and io/vm statistics seem to indicate that there is little
  or no disk activity but machine loads increases to 7.0/8.0.
  After some minutes, the problem goes away, and performance returns
  to acceptable levels.
 
 That sounds like it. Query time increases across the board as all the 
 clients fail to get any data back.
 
  When the load increases, *random* database queries show this slowness,
  even if they are perfectly planned and indexed.
  
  Is there anything we can do?
 
 Well, the client I saw it with just bought a dual-opteron server and 
 used their quad-Xeon for something else. However, I do remember that 8.1 
 seemed better than 7.4 before they switched. Part of that might just 
 have been better query-planning and other efficiences though.
 




Re: [PERFORM] Context switch storm

2006-11-03 Thread Andreas Kostyrka
Am Freitag, den 03.11.2006, 14:38 + schrieb Richard Huxton:
 [EMAIL PROTECTED] wrote:
  If you can keep your numbers of clients down below the critical 
  level, you should find the overall workload is fine.
  
  We have at about 600 connections. Is this a case to use a connection
  pool (pg_pool) system?
 
 Possibly - that should help. I'm assuming that most of your queries are 
 very short, so you could probably get that figure down a lot lower. 
 You'll keep the same amount of queries running through the system, just 
 queue them up.
Ah, yes, now that you mention it: avoid running many queries with a
similar timing behaviour; PG 8 seems to have a lock design that's very
bad for the memory architecture of the Xeons.

So running SELECT * FROM table WHERE id=1234567890; from 600 clients in
parallel can be worse than, say, a complicated 6-way join :(

Andreas

 
  And why this happens only with 8.0 and 8.1 and not with the 7.4?
 
 Not sure. Maybe 8.x is making more intensive use of your memory, 
 possibly with a change in your plans.
 




Re: [PERFORM] big transaction slows down over time - but disk

2006-11-01 Thread Andreas Kostyrka
Am Dienstag, den 31.10.2006, 21:58 -0800 schrieb Ben:
 I've got a long-running, update-heavy transaction that increasingly slows 
 down the longer it runs. I would expect that behavior, if there was some 
 temp file creation going on. But monitoring vmstat over the life of the 
 transaction shows virtually zero disk activity. Instead, the system has 
 its CPU pegged the whole time.
 
 So why the slowdown? Is it a MVCC thing? A side effect of calling 
 stored proceedures a couple hundred thousand times in a single 

Memory usage? Have you tried to checkpoint your transaction from time to
time?

Andreas

 transaction? Or am I just doing something wrong?
 




Re: [PERFORM] Index ignored on column containing mostly 0 values

2006-10-31 Thread Andreas Kostyrka
Am Dienstag, den 31.10.2006, 13:04 +0900 schrieb Leif Mortenson:
 Hello,
 I have been having a problem with the following query ignoring an index
 on the foos.bar column.
 
 SELECT c.id
 FROM foos c, bars r
 WHERE r.id != 0
 AND r.modified_time > '2006-10-20 10:00:00.000'
 AND r.modified_time <= '2006-10-30 15:20:00.000'
 AND c.bar = r.id
 
 The bars table contains 597 rows, while the foos table contains 5031203
 rows.
 
 After much research I figured out that the problem is being caused by the
 PG planner deciding that my foos.bar index is not useful. The data in the
 foos.bar column contains 5028698 0 values and 2505 that are ids in the bars
 table.
 
 Both tables have just been analyzed.
 
 When I EXPLAIN ANALYZE the above query, I get the following:
 
 Hash Join  (cost=3.06..201642.49 rows=25288 width=8) (actual time=0.234..40025.514 rows=11 loops=1)
   Hash Cond: (outer.bar = inner.id)
   ->  Seq Scan on foos c  (cost=0.00..176225.03 rows=5032303 width=16) (actual time=0.007..30838.623 rows=5031203 loops=1)
   ->  Hash  (cost=3.06..3.06 rows=3 width=8) (actual time=0.117..0.117 rows=20 loops=1)
         ->  Index Scan using bars_index_modified_time on bars r  (cost=0.00..3.06 rows=3 width=8) (actual time=0.016..0.066 rows=20 loops=1)
               Index Cond: ((modified_time > '2006-10-20 10:00:00'::timestamp without time zone) AND (modified_time <= '2006-10-30 15:20:00'::timestamp without time zone))
               Filter: (id <> 0)
 Total runtime: 40025.629 ms
 
 The solution I found was to change the statistics on my foos.bar column from
 the default -1 to 1000. When I do this, reanalyze the table, and rerun
 the above
 query, I get the following expected result.
 
 Nested Loop  (cost=0.00..25194.66 rows=25282 width=8) (actual time=13.035..23.338 rows=11 loops=1)
   ->  Index Scan using bars_index_modified_time on bars r  (cost=0.00..3.06 rows=3 width=8) (actual time=0.063..0.115 rows=20 loops=1)
         Index Cond: ((modified_time > '2006-10-20 10:00:00'::timestamp without time zone) AND (modified_time <= '2006-10-30 15:20:00'::timestamp without time zone))
         Filter: (id <> 0)
   ->  Index Scan using foos_index_bar on foos c  (cost=0.00..6824.95 rows=125780 width=16) (actual time=1.141..1.152 rows=1 loops=20)
         Index Cond: (c.bar = outer.id)
 Total runtime: 23.446 ms
 
 Having to do this concerns me as I am not sure what a good statistics value
 should be. Also we expect this table to grow much larger and I am concerned
 that it may not continue to function correctly. I tried a value of 100
 and that
 works when the number of bars records is small, but as soon as I increase
 them, the query starts ignoring the index again.
 
 Is increasing the statistics value the best way to resolve this problem? How
 can I best decide on a good statistics value?
 
 Having a column containing large numbers of null or 0 values seems fairly
 common. Is there way to tell Postgres to create an index of all values with
 meaning. Ie all non-0 values? None that I could find.
Have you tried

CREATE INDEX partial ON foos (bar) WHERE bar IS NOT NULL;
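
Two untested variations on that theme, using the table and column from the
question: a partial index restricted to the non-0 values, and the per-column
way of raising the statistics target instead of guessing a global value:

CREATE INDEX foos_bar_nonzero ON foos (bar) WHERE bar <> 0;

ALTER TABLE foos ALTER COLUMN bar SET STATISTICS 1000;
ANALYZE foos;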

Andreas





Re: [PERFORM] partitioned table performance

2006-10-30 Thread Andreas Kostyrka
Am Montag, den 30.10.2006, 08:18 + schrieb Simon Riggs:
 On Sun, 2006-10-29 at 00:28 +0200, Andreas Kostyrka wrote:
 
  Any ideas how to make the optimizer handle partitioned tables more
  sensible? 
 
 Yes, those are known inefficiencies in the current implementation which
 we expect to address for 8.3.

Any ideas to force the current optimizer to do something sensible?

Andreas

 




Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-29 Thread Andreas Kostyrka
Am Sonntag, den 29.10.2006, 10:34 -0500 schrieb Andrew Sullivan:
 On Sun, Oct 29, 2006 at 03:08:26PM +, Gavin Hamill wrote:
  
  This is interesting, but I don't understand.. We've done a full restore
  from one of these pg_dump backups before now and it worked just great.
  
  Sure I had to DROP SCHEMA _replication CASCADE to clear out all the
  slony-specific triggers etc., but the new-master ran fine, as did
  firing up new replication to the other nodes :)
  
  Was I just lucky?
 
 Yes.  Slony alters data in the system catalog for a number of
 database objects on the replicas.  It does this in order to prevent,
 for example, triggers from firing both on the origin and the replica. 
 (That is the one that usually bites people hardest, but IIRC it's not
 the only such hack in there.)  This was a bit of a dirty hack that
 was supposed to be cleaned up, but that hasn't been yet.  In general,
 you can't rely on a pg_dump of a replica giving you a dump that, when
 restored, actually works.

Actually, you need to get the schema from the master node, and you can take
the data from a slave. When mixing dumps like that, you must realize that
there are two separate parts in the schema dump: table definitions and
constraints. To get a restorable backup you need to put the table
definitions before your data, and the constraints after the data
copy.

Andreas

 
 A
 




Re: [PERFORM] VACUUMs take twice as long across all nodes

2006-10-29 Thread Andreas Kostyrka
Am Sonntag, den 29.10.2006, 11:43 -0500 schrieb Andrew Sullivan:
 On Sun, Oct 29, 2006 at 05:24:33PM +0100, Andreas Kostyrka wrote:
  Actually, you need to get the schema from the master node, and can take
  the data from a slave. In mixing dumps like that, you must realize that
  there are two seperate parts in the schema dump: table definitions and
  constraints. Do get a restorable backup you need to put the table
  definitions stuff before your data, and the constraints after the data
  copy.
 
 This will work, yes, but you don't get a real point-in-time dump this
But one does, because one can dump all the data in one pg_dump call. And
with Slony enabled, schema changes won't happen by mistake; they tend to
be a thing for the Slony High Priest, nothing for mere developers ;)

Andreas





[PERFORM] partitioned table performance

2006-10-28 Thread Andreas Kostyrka
Hi!

I'm just wondering, I've got a table that is partitioned into monthly
tables:

media_downloads -> media_downloads_YYYYMM
 |- id (primary key)
 \- created_on (timestamp; the criterion for the monthly table split)

There are constraints upon the created_on column, and all needed
insert-instead rules are defined too.
One additional hardship is that ids are not monotone with respect to
created_on: id1 < id2 does not imply created_on1 <= created_on2 :(
The table contains almost 100M rows, and the number is growing (the
table will be about a 12GB pg_dump).
All relevant indexes (primary key on id, index on created_on) are
defined too.
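
For reference, a minimal sketch of how one such monthly child table could be
declared under 8.1; the column list is abbreviated and the names of the
rule and indexes are made up:

CREATE TABLE media_downloads_200610 (
    CHECK (created_on >= DATE '2006-10-01' AND created_on < DATE '2006-11-01')
) INHERITS (media_downloads);

ALTER TABLE media_downloads_200610 ADD PRIMARY KEY (id);
CREATE INDEX media_downloads_200610_created_on
    ON media_downloads_200610 (created_on);

CREATE RULE media_downloads_ins_200610 AS
    ON INSERT TO media_downloads
    WHERE (NEW.created_on >= DATE '2006-10-01'
           AND NEW.created_on < DATE '2006-11-01')
    DO INSTEAD
    INSERT INTO media_downloads_200610 VALUES (NEW.*);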

The good thing is that queries like "all rows in the last 7 days" work
reasonably fast; the optimizer just checks the last one or two monthly tables.
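
Presumably that pruning comes from constraint_exclusion (new in 8.1), which
only works when the created_on comparison value is a plan-time constant the
planner can check against the CHECK constraints; a sketch:

SET constraint_exclusion = on;   -- off by default in 8.1

EXPLAIN SELECT *
FROM media_downloads
WHERE created_on >= DATE '2006-10-21';
-- now() - interval '7 days' is not folded to a constant at plan time,
-- so a literal (or a value computed by the application) is needed for
-- the planner to skip the other monthly tables.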

Using postgres 8.1.4-0ubuntu1, I've got to implement the following
queries in a reasonably fast way:

-- sequential reading of rows
SELECT * FROM media_downloads WHERE id > 100 ORDER BY id LIMIT 100;

Against the same monolithic table with about 16.5M rows, I'm getting a
cost of 20.6 pages (index scan).

Against the partitioned tables, I'm getting a cost of 5406822 pages.
Now, I understand that without any additional conditions PostgreSQL
needs to run the query against all subtables first, but EXPLAIN against
the individual subtables shows costs of 4-5 pages.
events=# explain select * from media_downloads where id > 9000 order by id limit 100;

                                  QUERY PLAN
------------------------------------------------------------------------------------------
 Limit  (cost=5406822.39..5406822.64 rows=100 width=1764)
   ->  Sort  (cost=5406822.39..5413639.50 rows=2726843 width=1764)
         Sort Key: public.media_downloads.id
         ->  Result  (cost=0.00..115960.71 rows=2726843 width=1764)
               ->  Append  (cost=0.00..115960.71 rows=2726843 width=1764)
                     ->  Seq Scan on media_downloads  (cost=0.00..10.50 rows=13 width=1764)
                           Filter: (id > 9000)
                     ->  Index Scan using media_downloads_200510_pkey on media_downloads_200510 media_downloads  (cost=0.00..3.75 rows=14 width=243)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200511_pkey on media_downloads_200511 media_downloads  (cost=0.00..72.19 rows=172 width=239)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200512_pkey on media_downloads_200512 media_downloads  (cost=0.00..603.64 rows=172 width=240)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200601_pkey on media_downloads_200601 media_downloads  (cost=0.00..19.33 rows=232 width=239)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200602_pkey on media_downloads_200602 media_downloads  (cost=0.00..56.82 rows=316 width=240)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200603_pkey on media_downloads_200603 media_downloads  (cost=0.00..18.88 rows=270 width=243)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200604_pkey on media_downloads_200604 media_downloads  (cost=0.00..1194.16 rows=939 width=298)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200605_pkey on media_downloads_200605 media_downloads  (cost=0.00..79.28 rows=672 width=326)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200606_pkey on media_downloads_200606 media_downloads  (cost=0.00..75.26 rows=1190 width=314)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200607_pkey on media_downloads_200607 media_downloads  (cost=0.00..55.29 rows=1238 width=319)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200608_pkey on media_downloads_200608 media_downloads  (cost=0.00..73.95 rows=1305 width=319)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200609_pkey on media_downloads_200609 media_downloads  (cost=0.00..144.10 rows=1575 width=324)
                           Index Cond: (id > 9000)
                     ->  Index Scan using media_downloads_200610_pkey on media_downloads_200610 media_downloads  (cost=0.00..113532.57 rows=2718709 width=337)
                           Index Cond: (id > 9000)
                     ->  Seq Scan on media_downloads_200611 media_downloads  (cost=0.00..10.50 rows=13 width=1764)
                           Filter: (id > 9000)
                     ->  Seq Scan on media_downloads_200612