pectations, and it changes over
time. Scaling up/down has helped us cope.
how do you add another server without having to do a massive data copy in the
process?
David Lang
- Live relocation of databases helps with hardware upgrades and spreading
of load.
Main issues:
- We are not overpr
very predictably spiky load and you
can add/remove machines to meet that load, but if you end up needing to have the
machines running a significant percentage of the time, dedicated boxes are
cheaper (as well as faster)
David Lang
--
Sent via pgsql-performance mailing list (pgsql-performance
tial corruption.
note that the ext3, reiserfs, jfs, and xfs developers (at least) consider
fsck necessary even for journaling filesystems. they just let you get away
without it being mandatory after an unclean shutdown.
David Lang
---(end of broadcast)--
ease note, when I'm talking about support, it's not just postgresql
support, but also hardware/driver support that can run into these problems
David Lang
---(end of broadcast)---
TIP 4: Have you searched our list arch
would these be? As a representative of the most
prominent one in the US I can tell you that you are not speaking from a
knowledgeable position.
note I said many, not all. I am aware that your company does not fall into
this category.
David Lang
On Wed, 9 Aug 2006, Stephen Frost wrote:
* David Lang ([EMAIL PROTECTED]) wrote:
there's a huge difference between 'works on debian' and 'supported on
debian'. I do use debian extensively, (along with slackware on my personal
machines), so I am comfortable getting thing
hing ubuntu with great interest, it's debian under the covers,
but they're starting to get the recognition from the support groups of
companies)
David Lang
---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
e you get on it.
David Lang
---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
On Sat, 11 Mar 2006, Joost Kraaijeveld wrote:
Date: Sat, 11 Mar 2006 09:17:09 +0100
From: Joost Kraaijeveld <[EMAIL PROTECTED]>
To: David Lang <[EMAIL PROTECTED]>
Cc: Richard Huxton , pgsql-performance@postgresql.org
Subject: Re: [PERFORM] x206-x225
On Fri, 2006-03-10 at 23:57
rite (which you then wait for), so you can only do one transaction per
rotation.
David Lang
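To put rough numbers on the "one transaction per rotation" point, here is a minimal sketch. It assumes every commit has to wait for one full platter rotation and that no write cache or commit batching is involved; the function name is made up for illustration.

```python
def max_commits_per_second(rpm: float) -> float:
    """Upper bound on serialized synchronous commits for one spinning disk.

    Assumption: each commit waits for one full rotation of the platter,
    with no battery-backed cache and no grouping of commits.
    """
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    # One commit per rotation, and rpm / 60 rotations happen each second.
    return rpm / 60.0


for rpm in (7200, 10000, 15000):
    print(f"{rpm} rpm: at most about {max_commits_per_second(rpm):.0f} commits/sec")
```

This is only a ceiling for a single stream of commits; a controller with battery-backed cache changes the picture.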
---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faq
ff (a==b) )
if you could drop that constraint (the cost of which would be extra 'real'
compares within a bucket) then a helper function per datatype could work
as you are talking.
David Lang
---(end of broadcast)---
TIP 6: explain analyze is your friend
visualize
the problem in 2D or 3D, but I'm not sure that that geometric intuition
holds up in such a high-dimensional space as we have here.
I will say that I'm not understanding the problem well enough to
understand the multi-dimensional nature of this problem.
David Lang
ssentially eliminate disks for databases of this size.
David Lang
ameters and make sure it's not set to limit
the amount of memory used for cache (I'm not actually sure if there
is such a limit on Linux, but there definitely is on some other Unixen).
Linux doesn't have any ability to limit the amount of memory used for
caching (there are periodic
>/dev/null
will probably do a pretty good job of this (especially if large_file is
noticeably larger than the amount of ram you have)
David Lang
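The command quoted above is cut off, but it appears to stream large_file to /dev/null. Assuming the goal is to read a file bigger than RAM so that other data gets pushed out of the OS cache, the same read-and-discard step can be sketched like this (the function name and chunk size are illustrative only):

```python
def read_through(path: str, chunk_size: int = 1 << 20) -> int:
    """Read a file from start to finish and throw the data away.

    Same effect as streaming the file to /dev/null: the kernel has to
    pull every block through its cache. Returns the number of bytes read.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            total += len(chunk)
    return total
```

Whether this really displaces the cached data depends on the file being noticeably larger than RAM, as noted above.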
you can know when to use each one.
for example I have a situation I am looking at where RAID0 is looking
appropriate for a database (a multi-TB array that gets completely reloaded
every month or so as data expires and new data is loaded from the
authoritative source, adding another 16 drives t
that got us off on this tangent, when
doing new writes to an array you don't have to read the blocks as they are
blank, assuming your caching is enough so that you can write blocksize*n
before the system starts actually writing the data)
David Lang
Alex.
On 12/25/05, Michael Sto
Linux, not the controllers.
Thanks for the clarification, I knew that PATA didn't do hotswap, and I've
seen discussions on the linux-kernel list about SATA hotswap being worked
on, but I thought that SCSI handled it. how recent a kernel have you had
problems with?
David Lang
ight controller.
David Lang
r random writes as you state, but how does it do for
sequential writes (for example data mining where you do a large import at
one time, but seldom do other updates). I'm assuming a controller with a
reasonable amount of battery-backed cache.
David Lang
us speed demons, or dogs (and it could even be both, depending
on your workload)
David Lang
Thanks,
Juan
On Thursday 22 December 2005 22:12, David Lang wrote:
On Wed, 21 Dec 2005, Juan Casero wrote:
Date: Wed, 21 Dec 2005 22:31:54 -0500
From: Juan Casero <[EMAIL PROTECTED]>
ur workload
(I'm trying to get some started, but haven't had a chance yet)
David Lang
I know it
depends a lot on the system but for now this database is about 20 gigabytes.
Not too large right now but it may grow 5x in the next year.
Thanks,
Juan
On Wednesday 21 December 2005 22:09, Juan
on your
1.2G system, buying a dual opteron with 16gigs of ram will allow you to
work with much larger sets of data, and you can go beyond that if needed.
David Lang
On Tue, 20 Dec 2005, Alan Stange wrote:
David Lang wrote:
On Tue, 20 Dec 2005, Alan Stange wrote:
Jignesh K. Shah wrote:
I guess it depends on what you term as your metric for measurement.
If it is just one query execution time .. It may not be the best on
UltraSPARC T1.
But if you have
to 32 postgresql processes running in parallel on the
current systems (assuming the application can scale).
note that like hyperthreading, the strands aren't full processors, their
efficiency depends on how much the other threads sharing the core stall
waiting for external things.
On Mon, 19 Dec 2005, David Lang wrote:
this is getting dangerously close to being able to fit in ram. I saw an
article over the weekend that Samsung is starting to produce 8G DIMMs that
can go 8 to a controller (instead of 4 per as is currently done), when
motherboards come out that su
with other apps you run the
risk of slowing down your database significantly. you may be better off
with fewer, but dedicated drives rather than more, but shared drives.
David Lang
---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
tabases where the data is split between machines this would be a hook
that the cluster engine could use to put its own plan into place without
having to modify and recompile)
David Lang
forms (hardware and OS) that it can run on. and this by itself
can result in significant wins (does Oracle support Opteron CPUs in 64
bit mode yet? as of this summer it just wasn't an option)
David Lang
".
Mark, I've seen these config options listed as tweaking targets fairly
frequently, has anyone put any thought or effort into creating a test
program that could analyse the actual system and set the defaults based on
the measured performance?
David Lang
e was very different than it is today) so some way to
gather real-world stats and set the system defaults based on actual
hardware performance is really the right way to go (even for things like
sequential scan speed that are set in the config file today)
David Lang
aster drives are better (less time to read or write a track)
so the 15k drive option is better
one other note, you probably don't want to use all the disks in a raid10
array, you probably want to split a pair of them off into a separate raid1
array and put your WAL on it.
te
drive). you can test this (with significant data risk) by putting the WAL
on a ramdisk and see what your performance looks like.
David Lang
g things because it's
possible to make it much easier or much harder for the OS to optimize things.
David Lang
o a test table
(millions of narrow rows). I'm waiting to see what happens once I have
data/pg_xlog on the 2nd disk set.
in that case you logically have two disks, so see the post from Ron earlier
in this thread.
David Lang
it is something similarly drastic)
this is also the reason why it's so good to have a filesystem journal on a
different drive.
David Lang
---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
something that seems as obvious as separating the
sizeof from the data itself as you suggest above has a penalty, namely it
spreads the data that needs to be accessed to process a line between
different cache lines, so in some cases it won't be worth it)
David Lang
then it's a bug, parsing bugs could happen in the server as
well. (in fact, the server could parse things to the intermediate format
and then convert them, this sounds expensive, but given the high clock
multipliers in use, it may not end up being measurable)
David Lang
s having to do to
support this), but if you allow for multiple clients it easily becomes a
win.
David Lang
---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PR
would have to be
smarter to deal with the additional possibilities.
David Lang
them all working in
parallel throwing the data at one database than it is to throw more
hardware at the database server to speed it up (and yes, assuming that MPP
splits the parsing costs as well, it can be an answer for some types of
systems)
David Lang
but it's split between different machines, and the
binary representation of the data will probably reduce your network
traffic as a side effect.
and for things like date which get parsed in multiple ways until one is
found that seems sane, there's a significant amount of work that
On Fri, 2 Dec 2005, Qingqing Zhou wrote:
I don't have all the numbers readily available (and I didn't do all the
tests on every filesystem), but I found that even with only 1000
files/directory ext3 had some problems, and if you enabled dir_hash some
functions would speed up, but writing lots o
connect
to postgres on a different machine when doing the copy? (I'm thinking that
the first machine may be able to do a lot of the parsing and conversion,
leaving the second machine to just worry about doing the writes)
David Lang
low the system to use the CPU more and overlap it with your
seeking.
David Lang
clude tests that your own database has trouble with :)
David Lang
-- Forwarded message -- Date: Thu, 01 Dec 2005 16:14:25
David,
The choice of benchmark depends on what kind of application you would
like to see performance for.
Than someone speaks about one or other database
On Thu, 1 Dec 2005, Qingqing Zhou wrote:
"David Lang" <[EMAIL PROTECTED]> wrote
a few weeks ago I did a series of tests to compare different filesystems.
the test was for a different purpose so the particulars are not what I
would do for testing aimed at postgres, but I t
stop it every few days to defrag
things forever after.
David Lang
I can only think of two other options:
1. Change the database schema to reduce the number of tables involved.
I'm assuming that of the 3500 tables most hold the same data but for
different clients (or something similar). This
I'll see about
re-running the tests to get a complete set of benchmarks in the next few
days. My tests had their times vary from 4 min to 80 min depending on the
filesystem in use (ext3 with hash_dir posted the worst case). what testing
have other people done with different files
. if
you do an ls -l on the parent directory you will see that the size of the
directory is large if it's ever had lots of files in it, the only way to
shrink it is to mv the old directory to a new name, create a new directory
and move the files from the old directory to the new one.
David
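The rename, recreate, and move-back procedure described above can be sketched as follows. This is an illustration only: the ".old" suffix and the function name are made up, ownership of the new directory is not preserved, and it assumes nothing else is using the directory while it runs.

```python
import os
import stat


def rebuild_directory(path: str) -> int:
    """Shrink a bloated directory by recreating it and moving its entries back.

    Steps: rename the old directory aside, create a fresh directory under
    the original name, move every entry across, then remove the old one.
    Returns the number of entries moved.
    """
    path = os.path.abspath(path)      # also strips any trailing slash
    old = path + ".old"               # hypothetical temporary name
    if os.path.exists(old):
        raise FileExistsError(old)

    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.rename(path, old)              # mv the old directory to a new name
    os.mkdir(path)                    # create a new, small directory
    os.chmod(path, mode)              # keep the original permission bits

    moved = 0
    for name in os.listdir(old):
        # Same parent filesystem, so this is a cheap rename, not a copy.
        os.rename(os.path.join(old, name), os.path.join(path, name))
        moved += 1

    os.rmdir(old)                     # old directory is empty now
    return moved
```

Comparing the directory size in ls -l on the parent before and after shows whether the filesystem actually gave the space back.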
't remember who
and know nothing about them.
David Lang
t off topic for the postgres performance list (at least
until the postgres project itself starts implementing similar features :-)
David Lang
Thanks,
Brendan Duddridge | CTO | 403-277-5591 x24 | [EMAIL PROTECTED]
ClickSpac
On Sun, 27 Nov 2005, Andreas Pflug wrote:
David Lang wrote:
Postgres needs to work on the low end stuff as well as the high end stuff
or people will write their app to work with things that DO run on low end
hardware and they spend much more money than is needed to scale the
hardware up
just for a
seq scan, you'd better be in the multi-gb per second range.
if you truly need to scan the entire database then you are right, however
indexes should be able to cut the amount you need to scan drastically.
David Lang
eed is the bottleneck you still win with the small systems,
but for most other uses the large system would win easily. and in any case
it's not the open and shut case that you keep presenting it as.
David Lang
ale the
hardware up rather than re-writing their app.
Part of the reason that I made the post on /. to start this was the hope
that a reasonable set of benchmarks could be hammered out and then more
people than just me could run them to get a wider range of results.
David Lang
by the way, this is the discussion that prompted me to start this project
http://lwn.net/Articles/161323/
David Lang
sts. I know the tuning will be different for
different hardware, but if we can have a bunch of people run similar tests
we should learn a lot.
David Lang