Re: [PERFORM] Unexpected pgbench result

2013-12-20 Thread Shaun Thomas

On 12/19/2013 04:06 PM, Dave Johansen wrote:


Right now, we're running a RAID 1 for pg_clog, pg_log and pg_xlog and
then a RAID 1+0 with 12 disks for the data. Would there be any benefit
to running a separate RAID 1+0 with a tablespace for the indexes?


Not really. PostgreSQL doesn't currently support parallel backend 
fetches, aggregation, or really anything. It's looking like 9.4 will get 
us a lot closer to that, but right now, everything is serial.


Serial or not, separate backends will have separate read concerns, and 
PostgreSQL 9.2 and above *do* support index only scans. So 
theoretically, you might actually see some benefit there. If it were me 
and I had spindles available, I would just increase the overall size of 
the pool. It's a lot easier than managing multiple tablespaces.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Unexpected pgbench result

2013-12-20 Thread Dave Johansen
On Fri, Dec 20, 2013 at 7:10 AM, Shaun Thomas stho...@optionshouse.comwrote:

 On 12/19/2013 04:06 PM, Dave Johansen wrote:

  Right now, we're running a RAID 1 for pg_clog, pg_log and pg_xlog and
 then a RAID 1+0 with 12 disks for the data. Would there be any benefit
 to running a separate RAID 1+0 with a tablespace for the indexes?


 Not really. PostgreSQL doesn't currently support parallel backend fetches,
 aggregation, or really anything. It's looking like 9.4 will get us a lot
 closer to that, but right now, everything is serial.

 Serial or not, separate backends will have separate read concerns, and
 PostgreSQL 9.2 and above *do* support index only scans. So theoretically,
 you might actually see some benefit there. If it were me and I had spindles
 available, I would just increase the overall size of the pool. It's a lot
 easier than managing multiple tablespaces.


Ok, that makes sense. Is there a benefit to having the WAL and logs on the
separate RAID 1? Or is just having them be part of the larger RAID 1+0 just
as good?


Re: [PERFORM] Unexpected pgbench result

2013-12-20 Thread Kevin Grittner
Dave Johansen davejohan...@gmail.com wrote:

 Is there a benefit to having the WAL and logs on the separate
 RAID 1? Or is just having them be part of the larger RAID 1+0
 just as good?

I once accidentally left the pg_xlog directory on the 40-spindle
RAID with most of the data instead of moving it.  Results with
graph here:

http://www.postgresql.org/message-id/4b71358e02250002f...@gw.wicourts.gov

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Unexpected pgbench result

2013-12-20 Thread Dave Johansen
On Fri, Dec 20, 2013 at 8:22 AM, Kevin Grittner kgri...@ymail.com wrote:

 Dave Johansen davejohan...@gmail.com wrote:

  Is there a benefit to having the WAL and logs on the separate
  RAID 1? Or is just having them be part of the larger RAID 1+0
  just as good?

 I once accidentally left the pg_xlog directory on the 40-spindle
 RAID with most of the data instead of moving it.  Results with
 graph here:


 http://www.postgresql.org/message-id/4b71358e02250002f...@gw.wicourts.gov


That's very helpful information. Thanks for sharing it,
Dave


[PERFORM] Unexpected pgbench result

2013-12-19 Thread Dave Johansen
I'm working on setting up a large database (or at least what I consider to
be a large one with several tables having 10-20 million records inserted
per day), and I've been using pgbench to verify that the hardware and
database are configured in an optimal manner.

When I run pgbench in SELECT only after doing -i -s 2000 I get what
appears to be good performance (60k-70k tps) but if I initialize a new
database with -i -s 4000 the tps drops to 4k-7k. Is this order of
magnitude drop expected? Or is there something wrong with my hardware or
database configuration that is causing this issue?

The one option that I've been considering is partitioning and I've been
asking about it here:
http://www.postgresql.org/message-id/CAAcYxUcb0NFfMDsMOCL5scNRrUL7=9hkxjz021jmqp0r7f5...@mail.gmail.com


Re: [PERFORM] Unexpected pgbench result

2013-12-19 Thread Shaun Thomas

On 12/19/2013 11:00 AM, Dave Johansen wrote:


When I run pgbench in SELECT only after doing -i -s 2000 I get what
appears to be good performance (60k-70k tps) but if I initialize a new
database with -i -s 4000 the tps drops to 4k-7k. Is this order of
magnitude drop expected? Or is there something wrong with my hardware or
database configuration that is causing this issue?


When you increase the size of the initialized pgbench tables, you 
increase the size on disk. My guess is that you doubled it so that the 
data no longer fits in memory. You can verify this yourself:


SELECT pg_size_pretty(sum(pg_database_size(oid))::bigint)
  from pg_database;

Any amount of memory you have that is smaller than that, will affect 
select performance. I can guarantee you will not get 60k-70k tps from 
anything short of an array of SSD devices or a PCIe NVRAM solution. Your 
'-s 2000' test was probably running mostly from memory, while the '-s 
4000' did not.


What you're seeing is the speed your records are being supplied from 
disk, plus whatever cache effects are there when records are read before 
they are flushed in favor of more recent data.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Unexpected pgbench result

2013-12-19 Thread Dave Johansen
On Thu, Dec 19, 2013 at 10:44 AM, Shaun Thomas stho...@optionshouse.comwrote:

 On 12/19/2013 11:00 AM, Dave Johansen wrote:

  When I run pgbench in SELECT only after doing -i -s 2000 I get what
 appears to be good performance (60k-70k tps) but if I initialize a new
 database with -i -s 4000 the tps drops to 4k-7k. Is this order of
 magnitude drop expected? Or is there something wrong with my hardware or
 database configuration that is causing this issue?


 When you increase the size of the initialized pgbench tables, you increase
 the size on disk. My guess is that you doubled it so that the data no
 longer fits in memory. You can verify this yourself:

 SELECT pg_size_pretty(sum(pg_database_size(oid))::bigint)
   from pg_database;

 Any amount of memory you have that is smaller than that, will affect
 select performance. I can guarantee you will not get 60k-70k tps from
 anything short of an array of SSD devices or a PCIe NVRAM solution. Your
 '-s 2000' test was probably running mostly from memory, while the '-s 4000'
 did not.

 What you're seeing is the speed your records are being supplied from disk,
 plus whatever cache effects are there when records are read before they are
 flushed in favor of more recent data.


Yep, that was exactly it and that definitely makes sense now that you point
it out.

Right now, we're running a RAID 1 for pg_clog, pg_log and pg_xlog and then
a RAID 1+0 with 12 disks for the data. Would there be any benefit to
running a separate RAID 1+0 with a tablespace for the indexes? Or is
reading the indexes and data a serial process where separating them like
that won't have any big benefit?

Thanks,
Dave