> > It sounds to me like the "throw hardware at the problem" solution
> > is not even close to exhausted yet.  You are still using pretty
> > low-cost equipment.  Run the "top" program in an xterm on the database
> > server when the system is loaded.  What do you see?  Is the CPU at
> > 100%?  How much RAM is in use?  How much RAM is used for cache and
> > buffer?  Is any swap space in use?  How many Postgres backend
> > processes are typically running at once?
>
Here's a snapshot:
CPU states: 92.0% user,  7.9% system,  0.0% nice,  0.0% idle
Mem:  192984K av, 187244K used,   5740K free, 105624K shrd,  47128K buff
Swap: 104416K av,   1160K used, 103256K free                 79908K cached
The current DB server can handle about 15 backends reasonably. I set the
max at 30. Anything above that becomes futile.
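As a back-of-the-envelope reading of that snapshot: 0.0% idle means the CPU is saturated, and buff + cached together are the RAM the kernel is already using for file caching. A sketch of the arithmetic, with the figures hard-coded from the snapshot above:

```python
# Figures copied from the "top" snapshot above (all in KB).
mem_total = 192984
buff = 47128
cached = 79908
idle_pct = 0.0

cache_kb = buff + cached          # RAM the kernel uses for file caching
cache_frac = cache_kb / mem_total

print(f"cache: {cache_kb} KB ({cache_frac:.0%} of RAM)")
print("CPU-bound" if idle_pct < 5.0 else "CPU has headroom")
```

So about two thirds of the 192MB is already going to cache, and with no idle time left the box is CPU-bound, not RAM-starved.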


> > I don't think adding many $1k boxes is what your customers want.
> > Why do they care how many requests per hour you can handle if
> > each single request still takes 45 seconds?  No, what your customers
> > want is for _their_ request to have zero delay.  For that you want
> > one big fast computer.
> >
>
In non-peak hours, the average query takes 1-2 seconds. I think
this performance is acceptable. However, I only have 300K rows in my
biggest table. I will create a table with about 10 million rows and see
how it does. If it's slow, then I'd say you're right.
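If it helps anyone trying the same experiment: the fastest way I know to build a big test table is to generate a flat file and bulk-load it with COPY rather than millions of INSERTs. A sketch -- the table name, column layout, and file path are all made up for illustration:

```python
import random
import string

def make_rows(n, seed=0):
    """Yield tab-separated rows (id, int4 value, 8-char string) in a
    format Postgres COPY can load.  Columns are invented for this sketch."""
    rng = random.Random(seed)
    for i in range(n):
        val = rng.randint(0, 1_000_000)
        s = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
        yield f"{i}\t{val}\t{s}"

# Then, in psql:  COPY bigtable FROM '/tmp/bigtable.tsv';
# Bump 1000 up to 10_000_000 for the real test.
with open("/tmp/bigtable.tsv", "w") as f:
    for row in make_rows(1000):
        f.write(row + "\n")
```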

> > First off RAM helps a LOT.   192MB RAM is not much.  Look at "top"
> > and keep adding it until the cache and buffer sizes are quite large.
> > I would think 512MB or 1GB of RAM is what you need.  The idea is
> > to let the OS (Linux) cache a good portion of your database in
> > RAM.  Postgres has a buffer pool which you should set to some
> > huge size but OS level caching really helps speed things up.
> > My rule is to keep adding RAM until some of it is not used.
> >
>
Ok, I'll try 512MB RAM initially. Is the buffer pool the -B switch? Right
now, I start postmaster with -B 160 -N 80. So maybe with 512MB RAM, I
should allocate 256MB of buffer space, so -B = 256MB / single buffer size.
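For the arithmetic: assuming the default 8 KB Postgres buffer (block) size -- check BLCKSZ in your build -- a 256 MB pool works out as below. Note that the kernel's SysV shared-memory limit may also need raising for a pool that big.

```python
# -B sizing, assuming Postgres's default 8 KB buffer (block) size.
BUFFER_KB = 8

def buffers_for(pool_mb):
    """Number of shared buffers (-B value) for a pool of pool_mb megabytes."""
    return pool_mb * 1024 // BUFFER_KB

print(buffers_for(256))        # -> 32768, i.e. start with -B 32768
print(160 * BUFFER_KB / 1024)  # -> 1.25; the current -B 160 is only 1.25 MB
```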
>
> > Multiple CPUs IMO _will_ help.  Check again with "top".  If you
> > have enough RAM and fast enough disks then CPU will be your
> > bottleneck.  You can see that the CPU is in fact the bottleneck
> > if "top" shows a CPU utilization up above 85% to 100%.  Adding
> > a second (or 3rd, 4th or 10th) CPU will help if utilization is
> > close to 100%.  Just keep adding them until you have idle CPU time.
> > Same as above.  You want free RAM and idle CPU time.
> >
> > One reason I think a multi-CPU machine is better is that you
> > only need to buy that 512 or 1024MB of RAM once.  The RAM is for
> > caching so all four CPUs can share that big RAM cache.  Put each
> > CPU in its own box and they each get only a small cache or you
> > pay 4x for four big RAMs.  It gets worse as you have to add
> > overhead to keep all those databases in sync.
> >

Good point.

> > You may have to "bite the bullet" and realize you have outgrown
> > PC hardware.  Yes you could go with a four-way SMP Xeon and RAID
> > but for that kind of money (or less) you could also get a used
> > Sun E450.
> >
> > I run Postgres on both a Pentium with Linux and on a Sun SPARC
> > with Solaris 7.  I am pretty impressed with Sun hardware.  One
> > thing about Sun.  You will never outgrow it.  The 450 could likely
> > run both your web and database servers.   The cost is not all
> > that much in the grand scheme of things.  The E450 is about the
> > same price as just one car.  Sun will run Linux, NetBSD and Solaris.
> > As much as I am a Linux fan I think for this application (database
> > server) Solaris is the way to go.
>
I guess I'm pretty ignorant about big expensive servers. Is the
450 really that much faster than, say, a few dual PIIIs? I guess if I can
buy 10 dual PIIIs for the price of one 450, I'd have a pretty hard time
buying the 450. I feel like no matter how big a server I buy, I will
eventually outgrow it. I like the SETI example of using a bunch of little
bitty computers around the world to accomplish what a few huge mainframes
can't.

> >
> > So do your upgrade this weekend but then in two months, if you are
> > lucky, you will need to upgrade again. Maybe then think about a
> > non-PC that will have a three to five year life.   A good rule of
> > thumb is that when you install a new server it should run at 50%
> > capacity.  If it is at 100% on day one you have just wasted your
> > time and money as you are just going to have to upgrade again.
> >
Good rule of thumb, that 50%.

Thanks for your help,
Rich
>
> > PostgreSQL Server wrote:
> > >
> > > I'm glad someone asked this, because I'd like to try and optimize my
> > > database or the way I'm using it, and I've reached my personal limit
> > > on optimization at this point.
> > >
> > > 1) I'm using two machines so far for my e-commerce solution. One
> > > machine is running the web server and Java servlets, the other
> > > postgres latest from CVS. Server for psql is RH Linux 6, on PII 333,
> > > 192MB RAM, 104MB Swap. The HD psql is running on is 4GB UW SCSI. No
> > > RAID, just daily FTP and tape backups. I'm getting ready to upgrade
> > > the MB to the latest PIII whatever this weekend. I'm thinking a
> > > dual-CPU box would be a good idea; however, I'm not sure how much
> > > good a dual-CPU will really do. I feel that in another 2 months,
> > > I'll be in the same boat again. Whereas if I come up with a
> > > distributed solution, I can just keep adding those $1000 Linux
> > > boxes. :-)
> > >
> > > 2) One table gets 99% of the queries; it has about 300K rows. By
> > > watching the log, it seems to get about 120 queries per minute on
> > > average, peaking at 480 queries per minute during peak hours. Not
> > > sure what a join is, do you mean union? If so, then yes, almost
> > > every query to this table is of the form
> > > select a.*, b.1, b.2 where a.oid = b.id and a.int4 < x and a.str =
> > > 'String' union select a.*, c.1, c.2 where a.oid = c.id and
> > > a.int4 < x and a.str = 'String' limit 100;
> > > I defined indexes on all the fields which appear in a WHERE clause,
> > > using either hash or btree, as recommended by the docs. The only
> > > operator I couldn't find was for a money type.
> > > I don't use ORDER BY, although I'd like to, because the docs said
> > > that indexes are not used on ORDER BY queries. I use LIMIT also.
> > >
> > > I think writing a distribution/replication engine using Java
> > > servlets should be a medium-size task. Making it fault-tolerant and
> > > dealing with crash and error recovery issues could make it larger.
> > > If I end up writing it from scratch, I'll post the source for
> > > others to use.
> > >
> > > Thanks,
> > > Rich
> > >
> > > On Fri, 29 Oct 1999, Chris Albertson wrote:
> > >
> > > > >
> > > > > Date: Thu, 28 Oct 1999 01:11:32 -0700
> > > > > From: Rich Ryan <[EMAIL PROTECTED]>
> > > > > Subject: HSA (Highly Scalable Architecture) Distribution and
> > > > > replication
> > > > >
> > > > > Has anyone written an HSA e-commerce solution using postgres?
> > > > > In English, this means I want more than one server running
> > > > > postgres with the same database(s). I get lots and lots of
> > > > > queries, and it's just too much for one poor machine to handle.
> > > >
> > > >
> > > > Is this really true?  "too much for one poor machine to handle"?
> > > >
> > > > Could you be more specific?
> > > >   1) What kind of hardware are you using?  What CPU(s), how fast?
> > > >      How much RAM?, What kind of disk(s) (RAID?) and so on.
> > > >
> > > >   2) How big is your task?  How many rows and tables?  How many
> > > >      queries per minute?  Do you do many big joins?
> > > >
> > > > I think Postgres scales well if you throw hardware at it.  I wonder
> > > > if you've tried that yet.
> > > >
> > > > All that said, I'd be very much interested in distribution and
> > > > replication for another reason.  I am working with a large database.
> > > > (on order of a hundred million rows)  Our application is distributed
> > > > over the (much too slow) Internet.  The ability to periodically
> > > > synchronize servers would work for us.
> > > >
> > > >
> > > > PS Please cc any reply to me directly as I get the digest.
> > > > --
> > > >   Chris Albertson
> > > >
> > > >   [EMAIL PROTECTED]                  Voice: 626-351-0089  X127
> > > >   Logicon, Pasadena California            Fax:   626-351-0699
> > > >