Presumably it can't _ever_ know without being explicitly told, because
even for a plain SELECT there might be triggers involved that update
tables, or it might be a select of a stored proc, etc. So in the
general case, you can't assume that a SELECT doesn't cause an update,
and you can't be sure. That's not to say that pgpool couldn't make a
good guess in the majority of cases!
M
Joshua D. Drake wrote:
Matt Clark wrote:
Presumably it can't _ever_ know without being explicitly told,
because even for a plain SELECT there might be triggers involved that
update tables, or it might be a select of a stored proc, etc.
This page may be of use:
http://www.serverworldmagazine.com/monthly/2003/02/solaris.shtml
From personal experience, for god's sake don't think Solaris' VM/swap
implementation is easy - it's damn good, but it ain't easy!
Matt
Kevin Schroeder wrote:
I think it's probably just reserving them. I
Another man working to the bitter end this Christmas!
There could be many reasons, but maybe first you should look at the amount
of RAM available? If the tables fit in RAM on the production server but not
on the dev server, then that will easily defeat the improvement due to using
the native DB
Am I right to assume that writeback is both fastest and at
the same time as safe to use as ordered? Maybe any of you
did some benchmarks?
It should be fastest because it is the least overhead, and safe because
postgres does its own write-order guaranteeing through fsync(). You should
also
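For ext3 that choice is just a mount option; a minimal sketch, assuming ext3 and with placeholder device and mount-point names:

```
# /etc/fstab -- device and mount point here are placeholders
/dev/sda2   /var/lib/pgsql   ext3   defaults,data=writeback   0 2
```

Since fsync() still forces the data itself to disk, writeback journaling only relaxes the ordering of writes that were never synced anyway.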
For some reason it's a requirement that partial wildcard
searches are done on this field, such as SELECT ... WHERE
field LIKE 'A%'
I thought an interesting way to do this would be to simply
create partial indexes for each letter on that field, and it
works when the query matches the
With that many rows, and a normal index on the field,
postgres figures the best option for say I% is not an index
scan, but a sequential scan on the table, with a filter --
quite obviously this is slow as heck, and yes, I've run
analyze several times and in fact have the vacuum analyze
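The per-letter scheme described above might be sketched like this (table and column names are assumptions, not from the original post):

```sql
-- One small partial index per leading letter; the planner can match a
-- "field LIKE 'A%'" predicate against the index's WHERE clause.
CREATE INDEX t_field_a ON t (field) WHERE field LIKE 'A%';
CREATE INDEX t_field_b ON t (field) WHERE field LIKE 'B%';
-- ... one per letter ...
SELECT * FROM t WHERE field LIKE 'A%';   -- candidate for t_field_a
```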
I have a dual processor system that can support over 150 concurrent
connections handling normal traffic and load. Now suppose I setup
Apache to spawn all of its children instantly, what will
...
This will spawn 150 children in short order, and as
this takes
Doctor, it hurts
Apache::DBI overall works better to what I require, even if
it is not a
pool per se. Now if pgpool supported variable rate pooling like
Apache does with its children, it might help to even things
out. That
and you'd still get the spike if you have to start the webserver and
Case in point: A first time visitor hits your home page. A
dynamic page is generated (in about 1 second) and served
(taking 2 more seconds) which contains links to 20 additional
The gain from an accelerator is actually even more than that, as it takes
essentially zero seconds for Apache to
Correct, 75% of all hits are on a script that can take
anywhere from
a few seconds to half an hour to complete. The script
essentially
auto-flushes to the browser so they get new information as it arrives,
creating the illusion of on-demand generation.
This is more like a
1- You have a query that runs for half an hour and you spoon feed
the results to the client ?
(argh)
2- Your script looks for new data every few seconds, sends a
packet, then sleeps, and loops ?
If it's 2 I have a readymade solution for you, just ask.
I'm guessing (2) - PG
These are CGI scripts at the lowest level, nothing more and nothing
less. While I could probably embed a small webserver directly into
the perl scripts and run that as a daemon, it would take away the
portability that the scripts currently offer.
If they're CGI *scripts* then they just use
In your webpage include an iframe with a Javascript to refresh it
every five seconds. The iframe fetches a page from the server which
brings in the new data in form of generated JavaScript which writes
in the parent window. Thus, you get a very short request every 5
seconds to fetch
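A minimal sketch of that iframe-polling technique (the file name, element id and JavaScript are illustrative assumptions):

```
<!-- parent page: invisible iframe that polls the server -->
<iframe src="/feed.cgi" width="0" height="0" style="border:none"></iframe>

<!-- /feed.cgi returns a tiny page like this, which pushes new data
     into the parent window and then re-fetches itself in 5 seconds -->
<script>
parent.document.getElementById("status").innerHTML = "...new data...";
setTimeout(function () { window.location.reload(); }, 5000);
</script>
```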
- ITEM table will, grow, grow, grow (sold items are not deleted)
WHERE PRODUCT.SECTION_USED_FK IS NOT NULL AND ITEM.STATUS=1 and
(ITEM.KIND=2 or ITEM.KIND=3)
Partial index on item.status ?
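That suggestion, sketched against the query above (the exact index definition is an assumption):

```sql
-- Index only the rows the query can match (status = 1), so the index
-- stays small even though sold items are never deleted.
CREATE INDEX item_active_kind_idx ON item (kind) WHERE status = 1;
```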
---(end of broadcast)---
TIP 1: subscribe and unsubscribe
All 3 plans have crappy estimates.
Run ANALYZE in production, then send another explain analyze (as an
attachment please, to avoid linewrap).
Er, no other possible answer except Rod's :-)
Javascript is too powerful to turn on for any random web page. It is only
essential for web pages because people write their web pages to only
work with javascript.
Hmm... I respectfully disagree. It is so powerful that it is impossible
to ignore when implementing a sophisticated app. And it
A note though : you'll have to turn off HTTP persistent
connections in your server (not in your proxy) or you're back to
square one.
I hadn't considered that. On the client side it would seem to be up to
the client whether to use a persistent connection or not. If it does,
then yeah, a
Pierre-Frédéric Caillaud wrote:
check this marvelus piece of 5 minutes of work :
http://boutiquenumerique.com/test/iframe_feed.html
cela m'a fait le sourire :-) ("that made me smile")
(apologies for bad french)
M
The best way to get all the stuff needed by a query into
RAM is to run the query. Is it more that you want to 'pin' the data in RAM
so it doesn't get overwritten by other
queries?
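In other words, cache warming is just reading the data; a sketch with a hypothetical table name:

```sql
-- A sequential read pulls the table's pages into the OS cache (and
-- some into shared buffers); there is no special 'pin' command.
SELECT count(*) FROM big_table;
```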
and certainly anyone who's been around a computer more than a week or
two knows which direction in and out are customarily seen from.
regards, tom lane
Apparently not whoever wrote the man page that everyone copied ;-)
Interesting. I checked this on several machines. They actually
I don't have iostat on that machine, but vmstat shows a lot of writes to
the drives, and the runnable processes are more than 1:
6 1 0 3617652 292936 279192800 0 52430 1347 4681 25 19 20 37
Assuming that's the output of 'vmstat 1' and not some other delay,
50MB/second of
performance for the WAL?
* optimised scattered writes for checkpointing?
* Knowledge that FSYNC is being used for preserving ordering a lot of the
time, rather than requiring actual writes to disk (so long as the writes
eventually happen in order...)?
Matt
Matt Clark
Ymogen Ltd
P: 0845 130 4531
W
You are asking the wrong question. The best OS is the OS you (and/or
the customer) knows and can administer competently. The real
performance differences between unices are so small as to be ignorable
in this context. The context switching bug is not OS-dependent, but
varies in severity
How would I turn that off? In the kernel config? Not too familiar with
that. I have a 2 proc xeon with 4 gigs of mem on the way for postgres,
so I hope HT isn't a problem. If HT is turned off, does it just not
use the other half of the processor? Or does the processor just work
as one unit?
The real
performance differences between unices are so small as to be ignorable
in this context.
Well, at least the difference between Linux and BSD. There are substantial
tradeoffs should you choose to use Solaris or UnixWare.
Yes, quite right, I should have said
OT
Hyperthreading is actually an excellent architectural feature that
can give significant performance gains when implemented well and used
for an appropriate workload under a decently HT aware OS.
IMO, typical RDBMS streams are not an obviously appropriate workload,
Intel didn't implement it
SELECT cmp.WELL_INDEX, cmp.COMPOUND, con.CONCENTRATION
FROM SCR_WELL_COMPOUND cmp, SCR_WELL_CONCENTRATION con
WHERE cmp.BARCODE=con.BARCODE
AND cmp.WELL_INDEX=con.WELL_INDEX
AND cmp.MAT_ID=con.MAT_ID
AND cmp.MAT_ID = 3
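If that join is slow, one thing worth checking is a composite index on the join keys of the larger table; a sketch using the table names from the query (such an index may of course already exist):

```sql
-- Composite index matching the three equi-join predicates above.
CREATE INDEX scr_well_concentration_join_idx
    ON scr_well_concentration (barcode, well_index, mat_id);
```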
trainwreck... If you're going through IBM, then they won't want to
respond to any issues if you're not running a
bog-standard RHAS/RHES
release from Red Hat.
... To be fair, we keep on actually running into things that
_can't_ be backported, like fibrechannel drivers that were
As for vendor support for Opteron, that sure looks like a
trainwreck... If you're going through IBM, then they won't want to
respond to any issues if you're not running a bog-standard RHAS/RHES
release from Red Hat. And that, on Opteron, is preposterous, because
there's plenty of the bits of
In the MySQL manual it says that MySQL performs best with Linux 2.4 with
ReiserFS on x86. Can anyone official, or in the know, give similar
information regarding PostgreSQL?
I'm neither official, nor in the know, but I do have a spare moment! I
can tell you that any *NIX variant on any modern
OK, that'd work too... the point is if you're re-connecting
all the time it doesn't really matter what else you do for
performance.
Yeah, although there is the chap who was asking questions on the list
recently who had some very long-running code on his app servers, so was best
off closing
Basically you set a default in seconds for the HTML results to be
cached, and then have triggers set that force the cache to regenerate
(whenever CRUD happens to the content, for example).
Can't speak for Perl/Python/Ruby/.Net/Java, but Cache_Lite sure made a
believer out of me!
Nice to have
More to the point though, I think this is a feature that really really
should be in the DB, because then it's trivial for people to use.
How does putting it into PGPool make it any less trivial for people to
use?
The answers are at
Any competently written application where caching results would be a
suitable performance boost can already implement application or
middleware caching fairly easily, and increase performance much more
than putting result caching into the database would.
I guess the performance increase is
I've looked at PREPARE, but apparently it only lasts
per-session - that's worthless in our case (web based
service, one connection per data-requiring connection).
That's a non-sequitur. Most 'normal' high volume web apps have persistent
DB connections, one per http server process. Are you
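With one long-lived connection per server process, session-scoped PREPARE is viable after all; a sketch with hypothetical statement and table names:

```sql
-- Prepared once when the connection is opened, then reused cheaply
-- for every request that process handles.
PREPARE get_item (int) AS SELECT * FROM item WHERE id = $1;
EXECUTE get_item(42);
```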
It is likely that you are missing an index on one of those foreign
key'd items.
I don't think that is too likely as a foreign key reference
must be a unique key which would have an index.
I think you must be thinking of primary keys, not foreign keys. All
one-to-many relationships
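The distinction matters because PostgreSQL indexes the referenced (unique) side automatically but not the referencing column; a sketch with hypothetical names:

```sql
-- orders.customer_id REFERENCES customers(id): customers(id) is indexed
-- via its unique constraint, but orders.customer_id is not until you
-- create the index yourself.
CREATE INDEX orders_customer_id_idx ON orders (customer_id);
```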
And this is exactly where the pgpool advantage lies.
Especially with the
TPC-W, the Apache is serving a mix of PHP (or whatever CGI
technique is
used) and static content like images. Since the 200+ Apache
kids serve
any of that content by random and the emulated browsers very much
This is normal. My personal workstation has been up for 16
days, and it shows 65 megs used for swap. The linux kernel
looks for things that haven't been accessed in quite a while
and tosses them into swap to free up the memory for other uses.
This isn't PostgreSQL's fault, or anything
How about iSCSI? This is exactly what it's for - presenting a bunch of
remote SCSI hardware as if it were local.
There are several reference implementations on SourceForge from Intel, Cisco
others.
I've never tried it myself, but I would if I had the need. And let's face
it there are some
As a cross-ref to all the 7.4.x tests people have sent in, here's 7.2.3 (Redhat 7.3),
Quad Xeon 700MHz/1MB L2 cache, 3GB RAM.
Idle-ish (it's a production server) cs/sec ~5000
3 test queries running:
procs   memory   swap   io   system   cpu
r b w swpd
Now if these vendors could somehow eliminate downtime due to human error
we'd be talking *serious* reliability.
You mean making the OS smart enough to know when clearing the arp
cache is a bonehead operation, or just making the hardware smart
enough to realise that the keyswitch really
If it's going to be write intensive then the RAID controller will be the most
important thing. A dual p3/500 with a write-back
cache will smoke either of the boxes you mention using software RAID on write
performance.
As for the compute intensive side (complex joins sorts etc), the Dell will
On a variety of hardware with Redhat, and versions of postgres, we're
not getting much better than 50 inserts per second. This is prior to
moving WAL to another disk, and fsync is on.
However, with postgres 7.4 on Mac OSX 10.2.3, we're getting an amazing
500 inserts per second.
We can only
If you have 3 1.5GB tables then you might as well go for 4GB while you're at
it. Make sure you've got a bigmem kernel either running or available, and
boost effective_cache_size by whatever amount you increase the RAM by. We
run a Quad Xeon/4GB server on Redhat 7.3 and it's solid as a rock.
I ended up going back to a default postgresql.conf and reapplying the
various tunings one-by-one. Turns out that while setting fsync = false
had little effect on the slow IDE box, it had a drastic effect on this
faster SCSI box and performance is quite acceptable now (aside from the
expected
On Sat, Oct 04, 2003 at 12:29:55AM +0100, Matt Clark wrote:
My real world experience on a *very* heavily updated OLTP type
DB, following
advice from this list (thanks guys!), is that there is
essentially zero cost
to going ahead and vacuuming as often as you feel like it. Go
crazy
The point is that a big seqscan (either VACUUM or a plain table scan)
hits a lot of pages, and thereby tends to fill your cache with pages
that aren't actually likely to get hit again soon, perhaps pushing out
pages that will be needed again soon. This happens at both the
shared-buffer and
vacuum full does require exclusive lock, plain vacuum does not.
I think I need full, because there are updates on the table. As I
understand it, an update in pg is an insert/delete, so it needs
to be garbage collected.
Yes and no. You only need a plain VACUUM that is run often enough to
In summary, I suspect that it is better from a UI perspective to
bring down the app on Sat at 3 a.m and reimport with a fixed time
period than to live through reindexing/vacuuming which may deadlock.
Am I missing something?
Consider running pg_autovacuum, and thereby do a little bit of
Also, if you find that you need to run VACUUM FULL often, then
you need to
raise your max_fsm_pages.
Yes and no. If it's run often enough then the number of tracked pages
shouldn't need to be raised, but then again...
...max_fsm_pages should be raised anyway. I'm about to reclaim a Pentium
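A postgresql.conf sketch (the numbers are illustrative; size the FSM to the pages your workload actually dirties between vacuums):

```
max_fsm_pages = 100000      # pages with free space that vacuum can remember
max_fsm_relations = 1000    # relations tracked in the free-space map
```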
Also, if you find that you need to run VACUUM FULL often, then
you need to
raise your max_fsm_pages.
Yes and no. If it's run often enough then the number of tracked pages
shouldn't need to be raised, but then again...
Oops, sorry, didn't pay attention and missed the mention of FULL. My
As others have mentioned, you really ought to get battery-backed cache if
you're doing any volume of writes. The ability to do safe write-back
caching makes an *insane* difference to write performance.
The site you link to also has that for only 15% more money:
There are about 2500 rows in that table.
1st query explain analyze: Seq Scan on PRIORITY_STATISTICS
(cost=0.00..491.44 rows=127 width=12) (actual time=98.58..98.58 rows=0
loops=1)
Total runtime: 98.74 msec
2nd query explain analyze: NOTICE: QUERY PLAN:
Index Scan using
I'm running a load of stress scripts against my staging environment to
simulate user interactions, and watching the various boxen as time goes by.
I noticed that the CPU utilisation on the DB server (PG 7.2.3, RH7.3, Dual
PII 550MHz, 1GB RAM, 1GB database on disk, Single 10k SCSI drive) was
the DB server is under no more
apparent load. I can only assume some kind of locking issue. Is that fair?
M
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of scott.marlowe
Sent: 17 September 2003 20:55
To: Matt Clark
Cc: [EMAIL PROTECTED]
Subject
2) Are you sure that ANALYZE is needed? Vacuum is required
whenever lots of
rows are updated, but analyze is needed only when the *distribution* of
values changes significantly.
You are right. I have a related qn in this thread about random vs. monotonic
values in indexed fields.
3) using
...
#effective_cache_size = 1000# typically 8KB each
That's horribly wrong. It's telling PG that your OS is only likely to cache
8MB of the DB in RAM. If you've got 1GB of memory it should be between
64000 and 96000
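For a 1GB machine that might look like the following (the setting is in 8KB pages on this PG version; the exact figure depends on how much RAM the OS really uses as disk cache):

```
effective_cache_size = 96000    # ~750MB of assumed OS disk cache
```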
the machine will be dealing with lots of inserts, basically as many as we can
throw at it
If you mean lots of _transactions_ with few inserts per transaction you should get a
RAID controller w/ battery backed write-back
cache. Nothing else will improve your write performance by nearly as
Actually, referring down to later parts of this thread, why can't this
optimisation be performed internally for built-in types? I understand the
issue with aggregates over user-defined types, but surely optimising max()
for int4, text, etc is safe and easy?
Sorry, missed the bit about
Matt Clark [EMAIL PROTECTED] writes:
Actually, referring down to later parts of this thread, why can't this
optimisation be performed internally for built-in types? I
understand the
issue with aggregates over user-defined types, but surely
optimising max()
for int4, text, etc is safe
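Until the planner learns to do this itself, the standard workaround is to rewrite max() as an ORDER BY ... LIMIT query, which can use an ordinary btree index (hypothetical names):

```sql
-- Equivalent to SELECT max(id) FROM t, but index-scannable:
SELECT id FROM t ORDER BY id DESC LIMIT 1;
```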
Just how big do you expect your DB to grow? For a 1GB disk-space
database, I'd probably just splurge for an SSD hooked up either via
SCSI or FibreChannel. Heck, up to about 5Gb or so it is not that
expensive (about $25k) and adding another 5Gb should set you back
probably another $20k. I
Ok.. I would be surprised if you needed much more actual CPU power. I
suspect they're mostly idle waiting on data -- especially with a Quad
Xeon (shared memory bus is it not?).
In reality the CPUs get pegged: about 65% PG and 35% system. But I agree that memory
throughput and latency is an