Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-21 Thread Matt Clark
Presumably it can't _ever_ know without being explicitly told, because even for a plain SELECT there might be triggers involved that update tables, or it might be a select of a stored proc, etc. So in the general case, you can't assume that a select doesn't cause an update, and you can't be

Re: [PERFORM] PostgreSQL clustering VS MySQL clustering

2005-01-21 Thread Matt Clark
that pgpool couldn't make a good guess in the majority of cases! M Joshua D. Drake wrote: Matt Clark wrote: Presumably it can't _ever_ know without being explicitly told, because even for a plain SELECT there might be triggers involved that update tables, or it might be a select of a stored

Re: [PERFORM] Swapping on Solaris

2005-01-19 Thread Matt Clark
This page may be of use: http://www.serverworldmagazine.com/monthly/2003/02/solaris.shtml From personal experience, for god's sake don't think Solaris' VM/swap implementation is easy - it's damn good, but it ain't easy! Matt Kevin Schroeder wrote: I think it's probably just reserving them. I

Re: [PERFORM] Speed in V8.0

2004-12-24 Thread Matt Clark
Another man working to the bitter end this Christmas! There could be many reasons, but maybe first you should look at the amount of RAM available? If the tables fit in RAM on the production server but not on the dev server, then that will easily defeat the improvement due to using the native DB

Re: [PERFORM] ext3 journalling type

2004-11-08 Thread Matt Clark
Am I right to assume that writeback is both fastest and at the same time as safe to use as ordered? Maybe any of you did some benchmarks? It should be fastest because it is the least overhead, and safe because postgres does it's own write-order guaranteeing through fsync(). You should also

Re: [PERFORM] Strange (?) Index behavior?

2004-11-05 Thread Matt Clark
For some reason it's a requirement that partial wildcard searches are done on this field, such as SELECT ... WHERE field LIKE 'A%' I thought an interesting way to do this would be to simply create partial indexes for each letter on that field, and it works when the query matches the

Re: [PERFORM] Strange (?) Index behavior?

2004-11-05 Thread Matt Clark
With that many rows, and a normal index on the field, postgres figures the best option for say I% is not an index scan, but a sequential scan on the table, with a filter -- quite obviously this is slow as heck, and yes, I've run analyze several times and in fact have the vacuum analyze

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
I have a dual processor system that can support over 150 concurrent connections handling normal traffic and load. Now suppose I setup Apache to spawn all of it's children instantly, what will ... This will spawn 150 children in a short order of time and as this takes Doctor, it hurts

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
Apache::DBI overall works better to what I require, even if it is not a pool per sey. Now if pgpool supported variable rate pooling like Apache does with it's children, it might help to even things out. That and you'd still get the spike if you have to start the webserver and

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
Case in point: A first time visitor hits your home page. A dynamic page is generated (in about 1 second) and served (taking 2 more seconds) which contains links to 20 additional The gain from an accelerator is actually even more that that, as it takes essentially zero seconds for Apache to

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
Correct the 75% of all hits are on a script that can take anywhere from a few seconds to a half an hour to complete.The script essentially auto-flushes to the browser so they get new information as it arrives creating the illusion of on demand generation. This is more like a

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
1- You have a query that runs for half an hour and you spoon feed the results to the client ? (argh) 2- Your script looks for new data every few seconds, sends a packet, then sleeps, and loops ? If it's 2 I have a readymade solution for you, just ask. I'm guessing (2) - PG

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
These are CGI scripts at the lowest level, nothing more and nothing less. While I could probably embed a small webserver directly into the perl scripts and run that as a daemon, it would take away the portability that the scripts currently offer. If they're CGI *scripts* then they just use

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
In your webpage include an iframe with a Javascript to refresh it every five seconds. The iframe fetches a page from the server which brings in the new data in form of generated JavaScript which writes in the parent window. Thus, you get a very short request every 5 seconds to fetch

Re: [PERFORM] appropriate indexing

2004-11-04 Thread Matt Clark
- ITEM table will, grow, grow, grow (sold items are not deleted) WHERE PRODUCT.SECTION_USED_FK IS NOT NULL AND ITEM.STATUS=1 and (ITEM.KIND=2 or ITEM.KIND=3) Partial index on item.status ? ---(end of broadcast)--- TIP 1: subscribe and unsubscribe

Re: [PERFORM] Better Hardware, worst Results

2004-11-04 Thread Matt Clark
All 3 plans have crappy estimates. Run ANALYZE in production, then send another explain analyze (as an attachment please, to avoid linewrap). Er, no other possible answer except Rod's :-) ---(end of broadcast)--- TIP 2: you can get off all lists

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
Javascript is too powerful to turn for any random web page. It is only essential for web pages because people write their web pages to only work with javascript. Hmm... I respectfully disagree. It is so powerful that it is impossible to ignore when implementing a sophisticated app. And it

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
A note though : you'll have to turn off HTTP persistent connections in your server (not in your proxy) or youre back to square one. I hadn't considered that. On the client side it would seem to be up to the client whether to use a persistent connection or not. If it does, then yeah, a

Re: [PERFORM] Restricting Postgres

2004-11-04 Thread Matt Clark
Pierre-Frédéric Caillaud wrote: check this marvelus piece of 5 minutes of work : http://boutiquenumerique.com/test/iframe_feed.html cela m'a fait le sourire :-) (apologies for bad french) M ---(end of broadcast)--- TIP 7: don't forget to increase

Re: [PERFORM] preloading indexes

2004-11-03 Thread Matt Clark
Title: Message The best way to get all the stuff needed by a query into RAM is to run the query. Is it more that you want to 'pin' the data in RAM so it doesn't get overwritten by other queries? -Original Message-From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of

Re: [PERFORM] can't handle large number of INSERT/UPDATEs

2004-10-27 Thread Matt Clark
and certainly anyone who's been around a computer more than a week or two knows which direction in and out are customarily seen from. regards, tom lane Apparently not whoever wrote the man page that everyone copied ;-) Interesting. I checked this on several machines. They actually

Re: [PERFORM] can't handle large number of INSERT/UPDATEs

2004-10-26 Thread Matt Clark
I don't have iostat on that machine, but vmstat shows a lot of writes to the drives, and the runnable processes are more than 1: 6 1 0 3617652 292936 279192800 0 52430 1347 4681 25 19 20 37 Assuming that's the output of 'vmstat 1' and not some other delay, 50MB/second of

[PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Matt Clark
performance for the WAL? * optimised scattered writes for checkpointing? * Knowledge that FSYNC is being used for preserving ordering a lot of the time, rather than requiring actual writes to disk (so long as the writes eventually happen in order...)? Matt Matt Clark Ymogen Ltd P: 0845 130 4531 W

Re: [PERFORM] OS desicion

2004-10-20 Thread Matt Clark
You are asking the wrong question. The best OS is the OS you (and/or the customer) knows and can administer competently. The real performance differences between unices are so small as to be ignorable in this context. The context switching bug is not OS-dependent, but varys in severity

Re: [PERFORM] how much mem to give postgres?

2004-10-20 Thread Matt Clark
How would I turn that off? In the kernel config? Not too familiar with that. I have a 2 proc xeon with 4 gigs of mem on the way for postgres, so I hope HT isn't a problem. If HT is turned off, does it just not use the other half of the processor? Or does the processor just work as one unit?

Re: [PERFORM] OS desicion

2004-10-20 Thread Matt Clark
The real performance differences between unices are so small as to be ignorable in this context. <> Well, at least the difference between Linux and BSD. There are substantial tradeoffs should you chose to use Solaris or UnixWare. Yes, quite right, I should have said

Re: [PERFORM] how much mem to give postgres?

2004-10-20 Thread Matt Clark
OT Hyperthreading is actually an excellent architectural feature that can give significant performance gains when implemented well and used for an appropriate workload under a decently HT aware OS. IMO, typical RDBMS streams are not an obviously appropriate workload, Intel didn't implement it

Re: [PERFORM] Select with qualified join condition / Batch inserts

2004-10-15 Thread Matt Clark
SELECT cmp.WELL_INDEX, cmp.COMPOUND, con.CONCENTRATION FROM SCR_WELL_COMPOUND cmp, SCR_WELL_CONCENTRATION con WHERE cmp.BARCODE=con.BARCODE AND cmp.WELL_INDEX=con.WELL_INDEX AND cmp.MAT_ID=con.MAT_ID AND cmp.MAT_ID = 3

Re: [PERFORM] Opteron vs RHAT

2004-10-13 Thread Matt Clark
trainwreck... If you're going through IBM, then they won't want to respond to any issues if you're not running a bog-standard RHAS/RHES release from Red Hat. ... To be fair, we keep on actually running into things that _can't_ be backported, like fibrechannel drivers that were

Re: [PERFORM] IBM P-series machines

2004-10-12 Thread Matt Clark
As for vendor support for Opteron, that sure looks like a trainwreck... If you're going through IBM, then they won't want to respond to any issues if you're not running a bog-standard RHAS/RHES release from Red Hat. And that, on Opteron, is preposterous, because there's plenty of the bits of

Re: [PERFORM] Which plattform do you recommend I run PostgreSQL for

2004-10-12 Thread Matt Clark
In the MySQL manual it says that MySQL performs best with Linux 2.4 with ReiserFS on x86. Can anyone official, or in the know, give similar information regarding PostgreSQL? I'm neither official, nor in the know, but I do have a spare moment! I can tell you that any *NIX variant on any modern

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Matt Clark
OK, that'd work too... the point is if you're re-connecting all the time it doesn't really matter what else you do for performance. Yeah, although there is the chap who was asking questions on the list recently who had some very long-running code on his app servers, so was best off closing

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Matt Clark
Basically you set a default in seconds for the HTML results to be cached, and then have triggers set that force the cache to regenerate (whenever CRUD happens to the content, for example). Can't speak for Perl/Python/Ruby/.Net/Java, but Cache_Lite sure made a believer out of me! Nice to have

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Matt Clark
More to the point though, I think this is a feature that really really should be in the DB, because then it's trivial for people to use. How does putting it into PGPool make it any less trivial for people to use? The answers are at

Re: [PERFORM] Caching of Queries

2004-09-27 Thread Matt Clark
Any competently written application where caching results would be a suitable performance boost can already implement application or middleware caching fairly easily, and increase performance much more than putting result caching into the database would. I guess the performance increase is

Re: [PERFORM] Caching of Queries

2004-09-23 Thread Matt Clark
I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web based service, one connection per data-requiring connection). That's a non-sequitur. Most 'normal' high volume web apps have persistent DB connections, one per http server process. Are you

Re: [PERFORM] insert

2004-08-13 Thread Matt Clark
It is likely that you are missing an index on one of those foreign key'd items. I don't think that is too likely as a foreign key reference must be a unique key which would have an index. I think you must be thinking of primary keys, not foreign keys. All one-to-many relationships

Re: [PERFORM] Performance Bottleneck

2004-08-08 Thread Matt Clark
And this is exactly where the pgpool advantage lies. Especially with the TPC-W, the Apache is serving a mix of PHP (or whatever CGI technique is used) and static content like images. Since the 200+ Apache kids serve any of that content by random and the emulated browsers very much

Re: [PERFORM] Swapping in 7.4.3

2004-07-16 Thread Matt Clark
This is normal. My personal workstation has been up for 16 days, and it shows 65 megs used for swap. The linux kernel looks for things that haven't been accessed in quite a while and tosses them into swap to free up the memory for other uses. This isn't PostgreSQL's fault, or anything

Re: [PERFORM] Postgres over Linux NBD or NFS

2004-06-22 Thread Matt Clark
How about iSCSI? This is exactly what it's for - presenting a bunch of remote SCSI hardware as if it were local. There are several reference implementations on SourceForge from Intel, Cisco others. I've never tried it myself, but I would if I had the need. And let's face it there are some

Re: [PERFORM] Wierd context-switching issue on Xeon

2004-04-20 Thread Matt Clark
As a cross-ref to all the 7.4.x tests people have sent in, here's 7.2.3 (Redhat 7.3), Quad Xeon 700MHz/1MB L2 cache, 3GB RAM. Idle-ish (it's a production server) cs/sec ~5000 3 test queries running: procs memoryswap io system cpu r b w swpd

Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-24 Thread Matt Clark
Now if these vendors could somehow eliminate downtime due to human error we'd be talking *serious* reliablity. You mean making the OS smart enough to know when clearing the arp cache is a bonehead operation, or just making the hardware smart enough to realise that the keyswitch really

Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Matt Clark
If it's going to be write intensive then the RAID controller will be the most important thing. A dual p3/500 with a write-back cache will smoke either of the boxes you mention using software RAID on write performance. As for the compute intensive side (complex joins sorts etc), the Dell will

Re: [PERFORM] insert speed - Mac OSX vs Redhat

2004-01-15 Thread Matt Clark
On a variety of hardware with Redhat, and versions of postgres, we're not getting much better than 50 inserts per second. This is prior to moving WAL to another disk, and fsync is on. However, with postgres 7.4 on Mac OSX 10.2.3, we're getting an amazing 500 inserts per second. We can only

Re: [PERFORM] Adding RAM: seeking advice warnings of hidden gotchas

2003-12-17 Thread Matt Clark
If you have 3 1.5GB tables then you might as well go for 4GB while you're at it. Make sure you've got a bigmem kernel either running or available, and boost effective_cache_size by whatever amount you increase the RAM by. We run a Quad Xeon/4GB server on Redhat 7.3 and it's solid as a rock.

Re: [PERFORM] tuning questions

2003-12-09 Thread Matt Clark
I ended up going back to a default postgresql.conf and reapplying the various tunings one-by-one. Turns out that while setting fsync = false had little effect on the slow IDE box, it had a drastic effect on this faster SCSI box and performance is quite acceptable now (aside from the expected

Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
On Sat, Oct 04, 2003 at 12:29:55AM +0100, Matt Clark wrote: My real world experience on a *very* heavily updated OLTP type DB, following advice from this list (thanks guys!), is that there is essentially zero cost to going ahead and vacuuming as often as you feel like it. Go crazy

Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
The point is that a big seqscan (either VACUUM or a plain table scan) hits a lot of pages, and thereby tends to fill your cache with pages that aren't actually likely to get hit again soon, perhaps pushing out pages that will be needed again soon. This happens at both the shared-buffer and

Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
vacuum full does require exclusive lock, plain vacuum does not. I think I need full, because there are updates on the table. As I understand it, an update in pg is an insert/delete, so it needs to be garbage collected. Yes and no. You only need a plain VACUUM that is run often enough to

Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
In summary, I suspect that it is better from a UI perspective to bring down the app on Sat at 3 a.m and reimport with a fixed time period than to live through reindexing/vacuuming which may deadlock. Am I missing something? Consider running pg_autovacuum, and thereby do a little bit of

Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
Also, if you find that you need to run VACUUM FULL often, then you need to raise your max_fsm_pages. Yes and no. If it's run often enough then the number of tracked pages shouldn't need to be raised, but then again... ...max_fsm_pages should be raised anyway. I'm about to reclaim a Pentium

Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-03 Thread Matt Clark
Also, if you find that you need to run VACUUM FULL often, then you need to raise your max_fsm_pages. Yes and no. If it's run often enough then the number of tracked pages shouldn't need to be raised, but then again... Oops, sorry, didn't pay attention and missed the mention of FULL. My

Re: [PERFORM] advice on raid controller

2003-09-28 Thread Matt Clark
As others have mentioned, you really ought to get battery-backed cache if you're doing any volume of writes. The ability to do safe write-back caching makes an *insane* difference to write performance. The site you link to also has that for only 15% more money:

Re: [PERFORM] Index problem

2003-09-25 Thread Matt Clark
There are about 2500 rows in that table. 1st query explain analyze: Seq Scan on PRIORITY_STATISTICS (cost=0.00..491.44 rows=127 width=12) (actual time=98.58..98.58 rows=0 loops=1) Total runtime: 98.74 msec 2nd query explain analyze: NOTICE: QUERY PLAN: Index Scan using

[PERFORM] Is there a reason _not_ to vacuum continuously?

2003-09-17 Thread Matt Clark
I'm running a load of stress scripts against my staging environment to simulate user interactions, and watching the various boxen as time goes by. I noticed that the CPU utilisation on the DB server (PG 7.2.3, RH7.3, Dual PII 550MHz, 1GB RAM, 1GB database on disk, Single 10k SCSI drive) was

Re: [PERFORM] Is there a reason _not_ to vacuum continuously?

2003-09-17 Thread Matt Clark
the DB server is under no more apparent load. I can only assume some kind of locking issue. Is that fair? M -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of scott.marlowe Sent: 17 September 2003 20:55 To: Matt Clark Cc: [EMAIL PROTECTED] Subject

Re: [PERFORM] Is there a reason _not_ to vacuum continuously?

2003-09-17 Thread Matt Clark
2) Are you sure that ANALYZE is needed? Vacuum is required whenever lots of rows are updated, but analyze is needed only when the *distribution* of values changes significantly. You are right. I have a related qn in this thread about random vs. monotonic values in indexed fields. 3) using

Re: [PERFORM] Inconsistent performance

2003-09-16 Thread Matt Clark
... #effective_cache_size = 1000# typically 8KB each That's horribly wrong. It's telling PG that your OS is only likely to cache 8MB of the DB in RAM. If you've got 1GB of memory it should be between 64000 and 96000 ---(end of

Re: [PERFORM] best arrangement of 3 disks for (insert) performance

2003-09-12 Thread Matt Clark
the machine will be dealing with lots of inserts, basically as many as we can throw at it If you mean lots of _transactions_ with few inserts per transaction you should get a RAID controller w/ battery backed write-back cache. Nothing else will improve your write performance by nearly as

Re: [PERFORM] slow plan for min/max

2003-09-08 Thread Matt Clark
Actually, referring down to later parts of this thread, why can't this optimisation be performed internally for built-in types? I understand the issue with aggregates over user-defined types, but surely optimising max() for int4, text, etc is safe and easy? Sorry, missed the bit about

Re: [PERFORM] slow plan for min/max

2003-09-08 Thread Matt Clark
Matt Clark [EMAIL PROTECTED] writes: Actually, referring down to later parts of this thread, why can't this optimisation be performed internally for built-in types? I understand the issue with aggregates over user-defined types, but surely optimising max() for int4, text, etc is safe

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-28 Thread Matt Clark
Just how big do you expect your DB to grow? For a 1GB disk-space database, I'd probably just splurge for an SSD hooked up either via SCSI or FibreChannel. Heck, up to about 5Gb or so it is not that expensive (about $25k) and adding another 5Gb should set you back probably another $20k. I

Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-28 Thread Matt Clark
Ok.. I would be surprised if you needed much more actual CPU power. I suspect they're mostly idle waiting on data -- especially with a Quad Xeon (shared memory bus is it not?). In reality the CPUs get pegged: about 65% PG and 35% system. But I agree that memory throughput and latency is an