Re: [PERFORM] Putting tables or indexes in SSD or RAM: avoiding double caching?

2009-05-26 Thread Kenny Gorman
Well, one thing on the ioDrive: be sure to use a filesystem that supports 
direct I/O, so you don't cache the data at the filesystem level and take up 
room that an object not on the SSD could use.  We use vxfs with 
mincache=direct as our filesystem for just this reason.  Also, there is an 
ioDrive tuning manual that discusses the same issue.  It's a good read if you 
don't already have it.  I do not know of a way to partition the PG cache, 
other than making it small and using the filesystem controls to force direct 
I/O.

2.5 cents..

-kg


-Original Message-
From: pgsql-performance-ow...@postgresql.org on behalf of Shaul Dar
Sent: Mon 5/25/2009 6:51 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Putting tables or indexes in SSD or RAM: avoiding double 
caching?
 
Hi,

I have seen many posts on using SSDs, and the ioDrive
(http://www.fusionio.com) in particular, to accelerate the performance
of PostgreSQL (or other DBMSs) -- e.g. this discussion:
http://groups.google.co.il/group/pgsql.performance/browse_thread/thread/1d6d7434246afd97?pli=1
I have also seen the suggestion to use RAM for the same purpose by creating
a tablespace on a RAM mount point:
http://magazine.redhat.com/2007/12/12/tip-from-an-rhce-memory-storage-on-postgresql/
Granted, these make the most sense when the whole database cannot fit into main
memory, or if we want to avoid cold DB response times (i.e. waiting for the
DB to warm up as stuff gets cached in memory).

My question is this: if we use either SSD or RAM tablespaces, I would
imagine PostgreSQL will be oblivious to this and would still cache the
tablespace elements that are on SSD or RAM in memory - right? Is there a
way to avoid that, i.e. to tell PostgreSQL NOT to cache tablespaces, or some
other granularity of the DB?

Thanks,

-- Shaul

*Dr. Shaul Dar*
Email: i...@shauldar.com
Web: www.shauldar.com



Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Alvaro Herrera
Tom Lane wrote:
 Łukasz Jagiełło lukasz.jagie...@gforces.pl writes:
  Autovacuum is working the whole time; shouldn't it run only when the db needs it?
 
 With 2000 databases to cycle through, autovac is going to be spending
 quite a lot of time just finding out whether it needs to do anything.
 I believe the interpretation of autovacuum_naptime is that it should
 examine each database that often, ie once a minute by default.  So
 it's got more than 30 databases per second to look through.

Note that this is correct in 8.1 and 8.2 but not 8.3 onwards.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 I believe the interpretation of autovacuum_naptime is that it should
 examine each database that often, ie once a minute by default.  So
 it's got more than 30 databases per second to look through.

 Note that this is correct in 8.1 and 8.2 but not 8.3 onwards.

Oh?  The current documentation still defines the variable thusly:

    Specifies the minimum delay between autovacuum runs on any given
    database. In each round the daemon examines the database and
    issues VACUUM and ANALYZE commands as needed for tables in that
    database.

I suppose the use of "minimum" means that this is not technically
incorrect, but it's sure not very helpful if there is some other
rule involved that causes it to not behave as I said.  (And if there
is some other rule, what is that?)  Please improve the docs.

regards, tom lane



Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Tom Lane wrote:
  I believe the interpretation of autovacuum_naptime is that it should
  examine each database that often, ie once a minute by default.  So
  it's got more than 30 databases per second to look through.
 
  Note that this is correct in 8.1 and 8.2 but not 8.3 onwards.
 
 Oh?  The current documentation still defines the variable thusly:
 
   Specifies the minimum delay between autovacuum runs on any given
   database. In each round the daemon examines the database and
   issues VACUUM and ANALYZE commands as needed for tables in that
   database.

Sorry, it's the other way around actually -- correct for 8.3 onwards,
wrong for 8.1 and 8.2.  In the earlier versions, it would do one run in
a chosen database, sleep during naptime, then do another run.
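
For readers trying to picture the difference, the two scheduling rules described above can be sketched as plain arithmetic in C. This is only an illustration, not PostgreSQL source: the function names are invented here, and it assumes each run takes negligible time and that naptime is given in whole seconds.

```c
#include <assert.h>

/*
 * Hypothetical helpers, not PostgreSQL code. They only restate the
 * scheduling rules described in this thread as arithmetic.
 */

/*
 * 8.1/8.2 rule: one run in one database, then sleep for naptime.
 * Returns the seconds needed to get around to every database once.
 */
long
full_cycle_secs_old(long naptime_secs, long ndatabases)
{
    assert(naptime_secs > 0 && ndatabases > 0);
    return naptime_secs * ndatabases;
}

/*
 * 8.3+ rule: every database is due once per naptime, so launches have
 * to be spaced naptime / ndatabases apart. Returns milliseconds.
 */
long
launch_interval_ms_new(long naptime_secs, long ndatabases)
{
    assert(naptime_secs > 0 && ndatabases > 0);
    return (naptime_secs * 1000L) / ndatabases;
}
```

With a naptime of one minute and 2000 databases, the old rule needs 120000 seconds (roughly 33 hours) to visit every database once, while the new rule implies a launch every 30 ms, which matches the "more than 30 databases per second" figure mentioned earlier in the thread.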

 I suppose the use of minimum means that this is not technically
 incorrect, but it's sure not very helpful if there is some other
 rule involved that causes it to not behave as I said.  (And if there
 is some other rule, what is that?)

The word "minimum" is there because it's possible that all workers are
busy with some other database(s).

 Please improve the docs.

I'll see about that.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Łukasz Jagiełło
On 26 May 2009 20:28, Tom Lane t...@sss.pgh.pa.us wrote:
 I believe the interpretation of autovacuum_naptime is that it should
 examine each database that often, ie once a minute by default.  So
 it's got more than 30 databases per second to look through.

 Note that this is correct in 8.1 and 8.2 but not 8.3 onwards.

 Oh?  The current documentation still defines the variable thusly:

        Specifies the minimum delay between autovacuum runs on any given
        database. In each round the daemon examines the database and
        issues VACUUM and ANALYZE commands as needed for tables in that
        database.

 I suppose the use of minimum means that this is not technically
 incorrect, but it's sure not very helpful if there is some other
 rule involved that causes it to not behave as I said.  (And if there
 is some other rule, what is that?)  Please improve the docs.

After changing autovacuum_naptime, PostgreSQL behaves like you wrote before.

-- 
Łukasz Jagiełło
System Administrator
G-Forces Web Management Polska sp. z o.o. (www.gforces.pl)

Ul. Kruczkowskiego 12, 80-288 Gdańsk
Company entered in the National Court Register (KRS) under no. 246596 by decision of the Gdańsk-Północ District Court



Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Sorry, it's the other way around actually -- correct for 8.3 onwards,
 wrong for 8.1 and 8.2.  In the earlier versions, it would do one run in
 a chosen database, sleep during naptime, then do another run.

 Tom Lane wrote:
 I suppose the use of minimum means that this is not technically
 incorrect, but it's sure not very helpful if there is some other
 rule involved that causes it to not behave as I said.  (And if there
 is some other rule, what is that?)

 The word minimum is there because it's possible that all workers are
 busy with some other database(s).

 Please improve the docs.

 I'll see about that.

Hmm, maybe we need to improve the code too.  This example suggests that
there needs to be some limit on the worker launch rate, even if there
are so many databases that that means we don't meet naptime exactly.

regards, tom lane



[PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Greg Smith
I keep falling into situations where it would be nice to host a server 
somewhere else.  Virtual host solutions and the mysterious "cloud" are no 
good for the ones I run into though, as disk performance is important for 
all the applications I have to deal with.


What I'd love to have is a way to rent a fairly serious piece of dedicated 
hardware, ideally with multiple (at least 4) hard drives in a RAID 
configuration and a battery-backed write cache.  The cache is negotiable. 
Linux would be preferred, FreeBSD or Solaris would also work; not Windows 
though (see "good DB performance").


Is anyone aware of a company that offers such a thing?

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Craig James

Greg Smith wrote:
What I'd love to have is a way to rent a fairly serious piece of 
dedicated hardware, ideally with multiple (at least 4) hard drives in a 
RAID configuration and a battery-backed write cache.  The cache is 
negotiable. Linux would be preferred, FreeBSD or Solaris would also 
work; not Windows though (see good DB performance).


We tried this with poor results.  Most of the co-location and server-farm 
places are set up with generic systems that are optimized for 
small-to-medium-sized web sites.  They use MySQL and are surprised to hear 
there's an alternative open-source DB.  They claim to be able to create custom 
configurations, but it's a lie.

The problem is that they run on thin profit margins, and their techs are mostly 
ignorant; they just follow scripts.  If something goes wrong, or they make an 
error, you can't get anything through their thick heads.  And you can't go down 
there and fix it yourself.

For example, we told them EXACTLY how to set up our system, but they decided 
that automatic monthly RPM OS updates couldn't hurt.  So on the first of the 
month, we came in in the morning to find that Linux had been updated to 
libraries that were incompatible with our own software, the system had 
automatically rebooted, and our web site was dead.  And there were many 
similar incidents.

We finally bought some nice Dell servers and found a co-location site that 
provides us all the infrastructure (reliable power, internet, cooling, 
security...), and we're in charge of the computers.  We've never looked back.

Craig



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Marcin Stępnicki
On Tue, May 26, 2009 at 11:51 PM, Greg Smith gsm...@gregsmith.com wrote:
 I keep falling into situations where it would be nice to host a server
 somewhere else.  Virtual host solutions and the mysterious cloud are no good
 for the ones I run into though, as disk performance is important for all the
 applications I have to deal with.

Perhaps you'll be satisfied with
http://www.ovh.co.uk/products/dedicated_list.xml ? Personally I have
only one machine there (SuperPlan Mini) - I asked them to set up
Proxmox (http://pve.proxmox.com/wiki/Main_Page ) for me and now I have
four OpenVZ Linux containers with different setups and services. So far
I couldn't be happier.

Regards,
Marcin



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Jerry Champlin
Depends on the level of facility you are looking for.  Peer1 (www.peer1.com)
will sell you just about whatever you need contained in a single box and I
believe their Atlanta facility and some others have a managed SAN option.
Since you want a customized solution, make sure you talk with one of their
solutions engineers.  Another good option in this range up to mid-enterprise
hosting solutions is Host My Site (www.hostmysite.com).  On the very high
end of the spectrum, gni (www.gni.com) seems to provide a good set of
infrastructure as a service (IAAS) solutions including SAN storage and very
high bandwidth - historically they have been very successful in the MPOG
world.  If you are interested, I can put you in touch with real people who
can help you at all three organizations.

Jerry Champlin|Absolute Performance Inc.


-Original Message-
From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Greg Smith
Sent: Tuesday, May 26, 2009 3:51 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Hosted servers with good DB disk performance?

I keep falling into situations where it would be nice to host a server 
somewhere else.  Virtual host solutions and the mysterious cloud are no 
good for the ones I run into though, as disk performance is important for 
all the applications I have to deal with.

What I'd love to have is a way to rent a fairly serious piece of dedicated 
hardware, ideally with multiple (at least 4) hard drives in a RAID 
configuration and a battery-backed write cache.  The cache is negotiable. 
Linux would be preferred, FreeBSD or Solaris would also work; not Windows 
though (see good DB performance).

Is anyone aware of a company that offers such a thing?

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Joshua D. Drake
On Tue, 2009-05-26 at 17:51 -0400, Greg Smith wrote:
 I keep falling into situations where it would be nice to host a server 
 somewhere else.  Virtual host solutions and the mysterious cloud are no 
 good for the ones I run into though, as disk performance is important for 
 all the applications I have to deal with.
 
 What I'd love to have is a way to rent a fairly serious piece of dedicated 
 hardware, ideally with multiple (at least 4) hard drives in a RAID 
 configuration and a battery-backed write cache.  The cache is negotiable. 
 Linux would be preferred, FreeBSD or Solaris would also work; not Windows 
 though (see good DB performance).
 
 Is anyone aware of a company that offers such a thing?

Sure, CMD will do it, and so will Rack Space and a host of others. If you
are willing to go with a VPS, SliceHost are decent folk. CMD doesn't rent
hardware; you would have to provide that. Rack Space does.

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdr...@jabber.postgresql.org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997




Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Alvaro Herrera
Tom Lane wrote:

 Hmm, maybe we need to improve the code too.  This example suggests that
 there needs to be some limit on the worker launch rate, even if there
 are so many databases that that means we don't meet naptime exactly.

We already have a 100ms lower bound on the sleep time (see
launcher_determine_sleep()).  Maybe that needs to be increased?

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 Hmm, maybe we need to improve the code too.  This example suggests that
 there needs to be some limit on the worker launch rate, even if there
 are so many databases that that means we don't meet naptime exactly.

 We already have a 100ms lower bound on the sleep time (see
 launcher_determine_sleep()).  Maybe that needs to be increased?

Maybe.  I hesitate to suggest a GUC variable ;-)

One thought is that I don't trust the code implementing the minimum
too much:

    /* 100ms is the smallest time we'll allow the launcher to sleep */
    if (nap->tv_sec <= 0 && nap->tv_usec <= 100000)
    {
        nap->tv_sec = 0;
        nap->tv_usec = 100000;  /* 100 ms */
    }

What would happen if tv_sec is negative and tv_usec is say 500000?
Maybe negative tv_sec is impossible here, but ...
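
To make the concern concrete, here is a minimal sketch in C of one way to write the clamp so that it does not depend on the sign of tv_sec. It is not the PostgreSQL code: clamp_nap and MIN_SLEEP_USEC are names invented for this illustration, and the approach (comparing the total in microseconds) is just one possible option.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* 100 ms expressed in microseconds; the name is made up for this sketch */
#define MIN_SLEEP_USEC 100000L

/*
 * Clamp *nap to at least 100 ms. The comparison is done on the total
 * number of microseconds, so a negative tv_sec paired with a large
 * positive tv_usec is caught as well.
 */
void
clamp_nap(struct timeval *nap)
{
    long long total_usec;

    assert(nap != NULL);

    total_usec = (long long) nap->tv_sec * 1000000LL
        + (long long) nap->tv_usec;

    if (total_usec < MIN_SLEEP_USEC)
    {
        nap->tv_sec = 0;
        nap->tv_usec = MIN_SLEEP_USEC;
    }
}
```

For example, tv_sec = -1 with tv_usec = 500000 gives a total of -500000 microseconds, so the sketch resets it to 100 ms. A test that looks at the two fields separately would let that value through whenever tv_usec is above 100000.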

regards, tom lane



Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Tom Lane wrote:
  Hmm, maybe we need to improve the code too.  This example suggests that
  there needs to be some limit on the worker launch rate, even if there
  are so many databases that that means we don't meet naptime exactly.
 
  We already have a 100ms lower bound on the sleep time (see
  launcher_determine_sleep()).  Maybe that needs to be increased?
 
 Maybe.  I hesitate to suggest a GUC variable ;-)

Heh :-)

 One thought is that I don't trust the code implementing the minimum
 too much:
 
   /* 100ms is the smallest time we'll allow the launcher to sleep */
   if (nap->tv_sec <= 0 && nap->tv_usec <= 100000)
   {
       nap->tv_sec = 0;
       nap->tv_usec = 100000;  /* 100 ms */
   }
 
 What would happen if tv_sec is negative and tv_usec is say 500000?
 Maybe negative tv_sec is impossible here, but ...

I don't think it's possible to get negative tv_sec here currently, but
perhaps you're right that we could make this code more future-proof.

However I think there's a bigger problem here, which is that if the user
has set naptime too low, i.e. to a value lower than
number-of-databases * 100ms, we'll be running the (expensive)
rebuild_database_list function on each iteration ... maybe we oughta put
a lower bound on naptime based on the number of databases to avoid this
problem.
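
The lower bound suggested above could look something like the following C sketch. It is a hypothetical helper written for illustration, not a patch: effective_naptime_ms and LAUNCHER_MIN_SLEEP_MS are made-up names, and the 100 ms figure is the launcher sleep floor quoted in this thread.

```c
#include <assert.h>

/* Launcher sleep floor discussed in the thread; the name is invented */
#define LAUNCHER_MIN_SLEEP_MS 100L

/*
 * Return the naptime (in ms) to actually use: the configured value,
 * raised if necessary so that naptime / ndatabases never drops below
 * the 100 ms sleep floor.
 */
long
effective_naptime_ms(long naptime_ms, long ndatabases)
{
    long floor_ms;

    assert(naptime_ms > 0 && ndatabases >= 0);

    floor_ms = ndatabases * LAUNCHER_MIN_SLEEP_MS;
    return (naptime_ms < floor_ms) ? floor_ms : naptime_ms;
}
```

For 2000 databases the bound works out to 200000 ms, so a 60-second naptime would be raised to 200 seconds. With 100 databases the bound is only 10 seconds and the 60-second setting is left alone.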

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Dave Page
On 5/26/09, Greg Smith gsm...@gregsmith.com wrote:
 I keep falling into situations where it would be nice to host a server
 somewhere else.  Virtual host solutions and the mysterious cloud are no
 good for the ones I run into though, as disk performance is important for
 all the applications I have to deal with.

 What I'd love to have is a way to rent a fairly serious piece of dedicated
 hardware, ideally with multiple (at least 4) hard drives in a RAID
 configuration and a battery-backed write cache.  The cache is negotiable.
 Linux would be preferred, FreeBSD or Solaris would also work; not Windows
 though (see good DB performance).

 Is anyone aware of a company that offers such a thing?

www.contegix.com offer just about the best support I've come across
and are familiar with Postgres. They offer RHEL (and windows) managed
servers on a variety of boxes. They're not a budget outfit though, but
that's reflected in the service.

-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Greg Smith

On Tue, 26 May 2009, Joshua D. Drake wrote:

CMD doesn't rent hardware you would have to provide that, Rack Space 
does.


Part of the idea was to avoid buying a stack of servers; if this were just 
a "where do I put the boxes at?" problem I'd have just asked you about it 
already.  I forgot to check Rack Space earlier; it looks like they have Dell 
servers with up to 8 drives and a RAID controller in them available. 
Let's just hope it's not one of the completely useless PERC models there; 
can anyone confirm Dell's PowerEdge R900 has one of the decent performing 
PERC6 controllers I've heard rumors of in it?


Craig, I share your concerns about outsourced hosting, but as the only 
custom application involved is one I build my own RPMs for, I'm not really 
concerned about the system getting screwed up software-wise.  The idea 
here is that I might rent an eval system to confirm performance is 
reasonable, and if it is then I'd be clear to get a bigger stack of them. 
Luckily there's a guy here who knows a bit about benchmarking for this 
sort of thing...


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Josh Berkus

Greg,


I keep falling into situations where it would be nice to host a server
somewhere else. Virtual host solutions and the mysterious cloud are no
good for the ones I run into though, as disk performance is important
for all the applications I have to deal with.


Joyent will guarantee you a certain amount of disk bandwidth.  As far as 
I know, they're the only hoster who does.



--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Scott Carey

On 5/26/09 6:17 PM, Greg Smith gsm...@gregsmith.com wrote:

 On Tue, 26 May 2009, Joshua D. Drake wrote:
 
 CMD doesn't rent hardware you would have to provide that, Rack Space
 does.
 
 Part of the idea was to avoid buying a stack of servers, if this were just
 a where do I put the boxes at? problem I'd have just asked you about it
 already.  I forgot to check Rack Space earlier, looks like they have Dell
 servers with up to 8 drives and a RAID controller in them available.
 Let's just hope it's not one of the completely useless PERC models there;
 can anyone confirm Dell's PowerEdge R900 has one of the decent performing
 PERC6 controllers I've heard rumors of in it?

Every managed hosting provider I've seen uses RAID controllers and support
through the hardware provider.  If it's Dell, it's 99% likely a PERC (OEM'd
LSI).  For HP, theirs (not sure who the OEM is); for Sun, theirs (OEM'd
Adaptec).

PERC6 in my testing was certainly better than PERC5, but it's still sub-par
in sequential transfer rate or scaling up past 6 or so drives in a volume.

I did go through the process of using a managed hosting provider and getting
a custom RAID card and storage arrays -- but that takes a lot of hand-holding
and time, and will most certainly cause setup delays and service issues when
things go wrong and you've got the black-sheep server.  Unless it's
absolutely business critical to get that last 10%-20% of performance, I would
go with whatever they have with no customization.

Most likely if you ask for a database setup, they'll give you 6 or 8 drives
in raid-5.  Most of what these guys do is set up LAMP cookie-cutters...

 
 Craig, I share your concerns about outsourced hosting, but as the only
 custom application involved is one I build my own RPMs for I'm not really
 concerned about the system getting screwed up software-wise.  The idea
 here is that I might rent an eval system to confirm performance is
 reasonable, and if it is then I'd be clear to get a bigger stack of them.
 Luckily there's a guy here who knows a bit about benchmarking for this
 sort of thing...
 
 --
 * Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
 
 




Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Scott Marlowe
On Tue, May 26, 2009 at 7:41 PM, Scott Carey sc...@richrelevance.com wrote:

 On 5/26/09 6:17 PM, Greg Smith gsm...@gregsmith.com wrote:

 On Tue, 26 May 2009, Joshua D. Drake wrote:

 CMD doesn't rent hardware you would have to provide that, Rack Space
 does.

 Part of the idea was to avoid buying a stack of servers, if this were just
 a where do I put the boxes at? problem I'd have just asked you about it
 already.  I forgot to check Rack Space earlier, looks like they have Dell
 servers with up to 8 drives and a RAID controller in them available.
 Let's just hope it's not one of the completely useless PERC models there;
 can anyone confirm Dell's PowerEdge R900 has one of the decent performing
 PERC6 controllers I've heard rumors of in it?

 Every managed hosting provider I've seen uses RAID controllers and support
 through the hardware provider.  If its Dell its 99% likely a PERC (OEM'd
 LSI).
 HP, theirs (not sure who the OEM is), Sun theirs (OEM'd Adaptec).

 PERC6 in my testing was certainly better than PERC5, but its still sub-par
 in sequential transfer rate or scaling up past 6 or so drives in a volume.

 I did go through the process of using a managed hosting provider and getting
 custom RAID card and storage arrays -- but that takes a lot of hand-holding
 and time, and will most certainly cause setup delays and service issues when
 things go wrong and you've got the black-sheep server.  Unless its
 absolutely business critical to get that last 10%-20% performance, I would
 go with whatever they have with no customization.

 Most likely if you ask for a database setup, they'll give you 6 or 8 drives
 in raid-5.  Most of what these guys do is set up LAMP cookie-cutters...


 Craig, I share your concerns about outsourced hosting, but as the only
 custom application involved is one I build my own RPMs for I'm not really
 concerned about the system getting screwed up software-wise.  The idea
 here is that I might rent an eval system to confirm performance is
 reasonable, and if it is then I'd be clear to get a bigger stack of them.
 Luckily there's a guy here who knows a bit about benchmarking for this
 sort of thing...

Yeah, the OP would be much better served ordering a server with an
Areca or Escalade / 3ware controller, set up and ready to go, shipped to
the hosting center, then sshing in and doing the rest, than letting a
hosted solution company try to compete.  You can get a nice 16x15K SAS
disk machine with an Areca controller, dual quad-core CPUs, and 16 to 32 gig
of RAM for $6000 to $8000 ready to go.  We've since repurposed our Dell /
PERC machines as file servers and left the real database server work
to our Aberdeen machines.  Trying to wring reasonable performance out
of most Dell servers is a testament to frustration.



Re: [PERFORM] Problems with autovacuum

2009-05-26 Thread Scott Marlowe
2009/5/26 Tom Lane t...@sss.pgh.pa.us:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 However I think there's a bigger problem here, which is that if the user
 has set naptime too low, i.e. to a value lower than
 number-of-databases * 100ms, we'll be running the (expensive)
 rebuild_database_list function on each iteration ... maybe we oughta put
 a lower bound on naptime based on the number of databases to avoid this
 problem.

 Bingo, that's surely exactly what was happening to the OP.  He had 2000
 databases and naptime at (I assume) the default; so he was rerunning
 rebuild_database_list every 100ms.

 So that recovery code path needs some more thought.  Maybe a lower bound
 on how often to do rebuild_database_list?  And/or don't set adl_next_worker
 to less than 100ms in the future to begin with?

I'd be happy with logging telling me when things are getting pathological.



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Scott Carey

On 5/26/09 6:52 PM, Scott Marlowe scott.marl...@gmail.com wrote:

 On Tue, May 26, 2009 at 7:41 PM, Scott Carey sc...@richrelevance.com wrote:
 
 On 5/26/09 6:17 PM, Greg Smith gsm...@gregsmith.com wrote:
 
 On Tue, 26 May 2009, Joshua D. Drake wrote:
 
 CMD doesn't rent hardware you would have to provide that, Rack Space
 does.
 
 Part of the idea was to avoid buying a stack of servers, if this were just
 a where do I put the boxes at? problem I'd have just asked you about it
 already.  I forgot to check Rack Space earlier, looks like they have Dell
 servers with up to 8 drives and a RAID controller in them available.
 Let's just hope it's not one of the completely useless PERC models there;
 can anyone confirm Dell's PowerEdge R900 has one of the decent performing
 PERC6 controllers I've heard rumors of in it?
 
 Every managed hosting provider I've seen uses RAID controllers and support
 through the hardware provider.  If its Dell its 99% likely a PERC (OEM'd
 LSI).
 HP, theirs (not sure who the OEM is), Sun theirs (OEM'd Adaptec).
 
 PERC6 in my testing was certainly better than PERC5, but its still sub-par
 in sequential transfer rate or scaling up past 6 or so drives in a volume.
 
 I did go through the process of using a managed hosting provider and getting
 custom RAID card and storage arrays -- but that takes a lot of hand-holding
 and time, and will most certainly cause setup delays and service issues when
 things go wrong and you've got the black-sheep server.  Unless its
 absolutely business critical to get that last 10%-20% performance, I would
 go with whatever they have with no customization.
 
 Most likely if you ask for a database setup, they'll give you 6 or 8 drives
 in raid-5.  Most of what these guys do is set up LAMP cookie-cutters...
 
 
 Craig, I share your concerns about outsourced hosting, but as the only
 custom application involved is one I build my own RPMs for I'm not really
 concerned about the system getting screwed up software-wise.  The idea
 here is that I might rent an eval system to confirm performance is
 reasonable, and if it is then I'd be clear to get a bigger stack of them.
 Luckily there's a guy here who knows a bit about benchmarking for this
 sort of thing...
 
 Yeah, the OP would be much better served ordering a server with an
 Areca or Escalade / 3ware controller setup and ready to go, shipped to
 the hosting center and sshing in and doing the rest than letting a
 hosted solution company try to compete.  You can get a nice 16x15K SAS
 disk machine with an Areca controller, dual QC cpus, and 16 to 32 gig
 ram for $6000 to $8000 ready to go.  We've since repurposed our Dell /
 PERC machines as file servers and left the real database server work
 to our aberdeen machines.  Trying to wring reasonable performance out
 of most Dell servers is a testament to frustration.
 

For a permanent server, yes.  But for a sort lease?  You have to go with
what is easily available for lease, or work out something with a provider
where they buy the HW from you and manage/lease it back (some do this, but
all I've ever heard of involved 12+ servers to do so and sign on for 1 or 2
years).

Expecting full I/O performance out of a DELL with a PERC is not really
possible, but maybe that's not as important as a certain pricing model or
the flexibility?  That is really an independent business decision.

I'll also add a caveat to the '3ware' above -- the last few I've used were
slower than the PERC (9650 series versus PERC6, 9550 versus PERC5  -- all
tests with 12 SATA drives raid 10).
I have no experience with the 3ware 9690 series (SAS) though -- those might
be just fine.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Scott Carey



On 5/26/09 7:27 PM, Scott Carey sc...@richrelevance.com wrote:
 
 For a permanent server, yes.  But for a sort lease?  You have to go with

Ahem ... 'short'  not 'sort'. 




Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Scott Marlowe
On Tue, May 26, 2009 at 8:27 PM, Scott Carey sc...@richrelevance.com wrote:

 Yeah, the OP would be much better served ordering a server with an
 Areca or Escalade / 3ware controller setup and ready to go, shipped to
 the hosting center and sshing in and doing the rest than letting a
 hosted solution company try to compete.  You can get a nice 16x15K SAS
 disk machine with an Areca controller, dual QC cpus, and 16 to 32 gig
 ram for $6000 to $8000 ready to go.  We've since repurposed our Dell /
 PERC machines as file servers and left the real database server work
 to our aberdeen machines.  Trying to wring reasonable performance out
 of most Dell servers is a testament to frustration.


 For a permanent server, yes.  But for a sort lease?  You have to go with
 what is easily available for lease, or work out something with a provider
 where they buy the HW from you and manage/lease it back (some do this, but
 all I've ever heard of involved 12+ servers to do so and sign on for 1 or 2
 years).

True, but given the low cost of a high drive count machine with spares
etc you can come away spending a lot less than by leasing.
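
To put rough numbers on the buy-versus-lease point, here is a minimal
sketch of the break-even arithmetic.  The $6000 to $8000 purchase prices
are the figures quoted above; the monthly colocation and lease fees are
made-up placeholders (not quotes from any provider), and the function
name is only for illustration.

```python
def breakeven_months(purchase_cost, colo_monthly, lease_monthly):
    """Months after which buying and colocating beats leasing.

    Returns None when leasing is not more expensive per month, in which
    case the purchase never pays for itself.
    """
    monthly_saving = lease_monthly - colo_monthly
    if monthly_saving <= 0:
        return None
    return purchase_cost / monthly_saving


# Purchase prices are from the thread; both monthly fees are hypothetical.
for purchase in (6000, 8000):
    months = breakeven_months(purchase, colo_monthly=200, lease_monthly=700)
    print(f"${purchase} server pays for itself after {months:.0f} months")
```

Plug in real quotes before drawing conclusions; the result is very
sensitive to the gap between the two monthly fees.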

 Expecting full I/O performance out of a DELL with a PERC is not really
 possible, but maybe that's not as important as a certain pricing model or
 the flexibility?  That is really an independent business decision.

True.  Plus if you only need 4 drives or something, you can do pretty
well with a Dell with the RAID controller turned to JBOD and letting
the linux kernel do the RAID work.

 I'll also add a caveat to the '3ware' above -- the last few I've used were
 slower than the PERC (9650 series versus PERC6, 9550 versus PERC5  -- all
 tests with 12 SATA drives raid 10).
 I have no experience with the 3ware 9690 series (SAS) though -- those might
 be just fine.

My experience is primarily with Areca 1100, 1200, and 1600 series
controllers, but others on the list have done well with 3ware
controllers.  We have an 8 port 11xx series areca card at work running
RAID-6 as a multipurpose server, and it's really quite fast and well
behaved for sequential throughput.  But the 16xx series cards stomp
the 11xx series in the ground for random IOPS.



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Greg Smith

On Tue, 26 May 2009, Scott Marlowe wrote:

Plus if you only need 4 drives or something, you can do pretty well with 
a Dell with the RAID controller turned to JBOD and letting the linux 
kernel do the RAID work.


I think most of the apps I'm considering would be OK with 4 drives and a 
useful write cache.  The usual hosted configurations are only 1 or 2 and 
no usable cache, which really limits what you can do with the server 
before you run into a disk bottleneck.  My rule of thumb is that any 
single core will be satisfied as long as you've got at least 4 disks to 
feed it, since it's hard for one process to use more than a couple of 
hundred MB/s for doing mostly sequential work.  Obviously random access is 
much easier to get disk-bound, where you have to throw a lot more disks at 
it.
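
The rule of thumb above can be sketched as simple arithmetic.  The
200 MB/s figure is the "couple of hundred MB/s" from the paragraph; the
per-disk numbers (60 MB/s sequential, 150 random IOPS) and the 20 MB/s
random-read target are assumptions picked for illustration, not
measurements.  The 8 kB page size is the PostgreSQL default block size.

```python
import math


def disks_needed(target_mb_s, per_disk_mb_s):
    """Smallest number of disks whose combined rate reaches the target."""
    return math.ceil(target_mb_s / per_disk_mb_s)


# Sequential case: one process consuming about 200 MB/s.
SEQ_MB_S_PER_DISK = 60          # assumed sequential rate of one disk
print("sequential:", disks_needed(200, SEQ_MB_S_PER_DISK), "disks")

# Random case: each I/O fetches one 8 kB page.
RANDOM_IOPS_PER_DISK = 150      # assumed random IOPS of one disk
PAGE_KB = 8                     # PostgreSQL default block size
random_mb_s_per_disk = RANDOM_IOPS_PER_DISK * PAGE_KB / 1024
print("random:", disks_needed(20, random_mb_s_per_disk), "disks")
```

Even a modest random-read target needs far more spindles than the
sequential case under these assumptions, which is the point about random
access being much easier to get disk-bound.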


It wouldn't surprise me to find it's impossible to get an optimal setup of 
8+ disks from any hosting provider.  Wasn't asking for great DB 
performance though, just good.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Ron Mayer
Greg Smith wrote:
 I keep falling into situations where it would be nice to host a server
 somewhere else.  Virtual host solutions and the mysterious cloud are no
 good for the ones I run into though, as disk performance is important
 for all the applications I have to deal with.

It's worth noting that some clouds are foggier than others.

On Amazon's you can improve your disk performance by setting up
software RAID over several of their virtual drives.   And since they
charge by GB, it doesn't cost you any more to do this than to set up
a smaller number of larger drives.
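
A minimal sketch of why striping is free under per-GB pricing.  The
price per GB-month below is a placeholder, not Amazon's actual rate, and
any per-request or instance charges are ignored.

```python
def monthly_storage_cost(volume_count, gb_per_volume, price_per_gb_month):
    """Storage cost when billing depends only on total provisioned GB."""
    return volume_count * gb_per_volume * price_per_gb_month


PRICE_PER_GB_MONTH = 0.10  # hypothetical rate, for illustration only

one_big = monthly_storage_cost(1, 400, PRICE_PER_GB_MONTH)
four_small = monthly_storage_cost(4, 100, PRICE_PER_GB_MONTH)

# Same total capacity, same bill; only the number of volumes differs.
print(one_big, four_small, one_big == four_small)
```

So the choice between one large volume and a software RAID0 over several
small ones comes down to performance and failure risk, not storage cost.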

Here's a blog showing Bonnie++ comparing various RAID levels
on Amazon's cloud - with a 4 disk RAID0 giving a nice
performance increase over a single virtual drive.
http://af-design.com/blog/2009/02/27/amazon-ec2-disk-performance/

Here's a guy who set up a 40TB RAID0 with 40 1TB virtual disks
on Amazon.
http://groups.google.com/group/ec2ubuntu/web/raid-0-on-ec2-ebs-volumes-elastic-block-store-using-mdadm
http://groups.google.com/group/ec2ubuntu/browse_thread/thread/d520ae145edf746

I might get around to trying some pgbench runs on amazon
in a week or so.   Any suggestions what would be most interesting?


 What I'd love to have is a way to rent a fairly serious piece of
 dedicated hardware, ideally with multiple (at least 4) hard drives in a
 RAID configuration and a battery-backed write cache.  The cache is
 negotiable. Linux would be preferred, FreeBSD or Solaris would also
 work; not Windows though (see good DB performance).
 
 Is anyone aware of a company that offers such a thing?
 




Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Erik Aronesty
On Tue, May 26, 2009 at 6:28 PM, Craig James craig_ja...@emolecules.com wrote:
 Greg Smith wrote:

 What I'd love to have is a way to rent a fairly serious piece of dedicated
 hardware, ideally with multiple (at least 4) hard drives in a RAID
 configuration and a battery-backed write cache.  The cache is negotiable.
 Linux would be preferred, FreeBSD or Solaris would also work; not Windows
 though (see good DB performance).

 We finally bought some nice Dell servers and found a co-location site that
 provides us all the infrastructure (reliable power, internet, cooling,
 security...), and we're in charge of the computers.  We've never looked
 back.

I ran this way on a Quad-processor Dell for many years, and then,
after selling the business and starting a new one, decided to keep my
DB on a remote-hosted machine.  I have a dual-core2 with hardware RAID
5 (I know, I know) and a private network interface to the other
servers (web, email, web-cache).

Just today when the DB server went down (after 2 years of reliable
service and 380 days of uptime) they gave me remote KVM access to
the machine.  Turns out I had messed up the fstab while fiddling with
the server, because I really don't know FreeBSD as well as Linux.

I think remote leased-hosting works fine as long as you have a
competent team on the other end and KVM over IP access.  Many
providers don't have that... and without it you can get stuck as you
describe.

I have used MANY providers over the years, at the peak with over 30
leased servers at 12 providers, and with many colocation situations as
well.  The only advantage with colocation I have seen is the
reduced expense if you keep it going for a few years on the same
box, which is a big advantage if it lets you buy a much more
powerful box to begin with.

Providers I prefer for high-end machines allow me to upgrade the
hardware with no monthly fees (marked-up cost of upgrade + time/labor
only), which keeps the cost down.



Re: [PERFORM] Hosted servers with good DB disk performance?

2009-05-26 Thread Alex Adriaanse

Greg Smith wrote:
What I'd love to have is a way to rent a fairly serious piece of 
dedicated hardware, ideally with multiple (at least 4) hard drives in 
a RAID configuration and a battery-backed write cache.  The cache is 
negotiable. Linux would be preferred, FreeBSD or Solaris would also 
work; not Windows though (see good DB performance).


Is anyone aware of a company that offers such a thing?

I've used http://softlayer.com/ in the past and highly recommend them.  
They sell a wide range of dedicated servers, including ones that handle 
up to 12 HDDs/SSDs, and servers with battery-backed RAID controllers 
(I've been told they use mostly Adaptec cards as well as some 3ware 
cards).  In addition, all their servers are connected to a private 
network you can VPN into, and all include IPMI for remote management 
when you can't SSH into your server.  They have a host of other 
features; click on the Services tab on their site to find out more.


Alex

