[PERFORM] The dreaded semwait on FreeBSD

2013-03-11 Thread Benjamin Krajmalnik
I have a very heavy OLTP application in the field.

We have two PostgreSQL databases (9.0.x)/FreeBSD 8.1/amd64 - one is a
listener which just stores data digests, the other is the actual
database.

The digests from the listener are processed by multi-threaded daemons
and inserted into the main database.

Every so often, we see the queues on the listener grow and get
significantly behind.

Checking the system metrics, we do not see any issue with memory, CPU,
or I/O (using top and iostat), so it would appear we are being hit by
contention.  When data insertion rates slow down, we can see a lot of
the PostgreSQL processes in a semwait state.

 

The data which is coming in from the daemons is inserted into a
temporary table, which fires off a trigger which in turn calls a
(massive) stored procedure that processes the data input.

We also have another daemon which runs in the background continuously
creating a materialized view of a dashboard.  If this daemon does not
run, the queues do not grow (or the rate of growth decreases if some
other heavy processing is going on such as a large report being
generated), so it appears this is one of the primary causes for the
contention which is causing the semwaits.

 

The data is inserted into 3 partitioned tables, each of which is fairly
wide.  The daemon which processes the dashboard looks at only 2 fields
(state and last value), using the Devices and Tests tables.  If I were
to create 2 more partitioned tables which hold only the columns in
question (as well as the columns necessary to associate the rows), would
this reduce the contention?

 

The partitioned tables are as such:

 

Devices ->  Tests -> Statistical info on tests

 

I was thinking of adding two more partitioned tables

 

Altdevices -> DashboardTests

 

That way, when we process the dashboard we do not touch any of the 3
primary tables, which are the ones which are constantly being pounded
on.
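
Something along these lines is what I have in mind (a sketch only; the column
names below are assumptions, not the actual schema):

-- Narrow copies of the Devices/Tests data holding just the fields the
-- dashboard daemon reads, so it never touches the hot, wide tables.
CREATE TABLE altdevices (
    deviceid bigint PRIMARY KEY,
    state    integer
);

CREATE TABLE dashboardtests (
    testid    bigint PRIMARY KEY,
    deviceid  bigint REFERENCES altdevices (deviceid),
    state     integer,
    lastvalue real
);

-- The ingest procedure would keep these in sync with a cheap update per row:
UPDATE dashboardtests SET state = 2, lastvalue = 98.6 WHERE testid = 12345;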

Looking at the stats on the server (Xact committed), we are processing
approximately 4000 transactions per second.
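
(Roughly, that rate comes from sampling the commit counter twice and dividing
by the elapsed time - a sketch:)

select now() as ts, sum(xact_commit) as commits from pg_stat_database;
-- run it again after, say, 60 seconds:
-- tps ~ (commits_2 - commits_1) / extract(epoch from ts_2 - ts_1)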

 

On another note, this setup is using streaming replication to a
secondary server which is used as a read only server for reporting.
Would accessing data from the secondary server somehow cause contention
on the primary server?   From the patterns of behavior, it would appear
so (when large reports are generated we are seeing some effect on the
data insertion rates).
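
For reference, this is the kind of thing I can check on the standby (a sketch;
9.0-era pg_stat_activity column names - later releases rename
procpid/current_query to pid/query) to see whether the slow insert windows
line up with long-running reporting queries there:

select procpid, now() - query_start as runtime, current_query
from pg_stat_activity
where current_query <> '<IDLE>'
order by runtime desc;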

 

Thanks in advance,

 

Benjamin

 



Re: [PERFORM] New server setup

2013-03-05 Thread Benjamin Krajmalnik
Set it to use session pooling.  I had a similar issue after moving one of the 
components of our app to transaction pooling, which introduced undesired 
behavior.


-Original Message-
From: pgsql-performance-ow...@postgresql.org 
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Niels Kristian 
Schjødt
Sent: Tuesday, March 05, 2013 10:12 AM
To: Kevin Grittner
Cc: Craig James; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] New server setup

Thanks, that was actually what I just ended up doing yesterday. Any suggestion 
how to tune pgbouncer?

BTW, I have just bumped into an issue that caused me to disable pgbouncer again.
My web application queries the database with a per-request search_path, because
I use schemas to provide country-based separation of my data (e.g. English,
German, Danish data in different schemas). I have pgbouncer set up for
transaction pooling (pool_mode = transaction) - however, some of my colleagues
complained that it sometimes didn't return data from the right schema set in
the search_path. You wouldn't happen to have any idea what is going wrong,
would you?

 pgbouncer.ini
[databases]
production =

[pgbouncer]
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
listen_addr = localhost
listen_port = 6432
unix_socket_dir = /var/run/postgresql
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
admin_users = postgres
pool_mode = transaction
server_reset_query = DISCARD ALL
max_client_conn = 500
default_pool_size = 20
reserve_pool_size = 5
reserve_pool_timeout = 10


Den 05/03/2013 kl. 17.34 skrev Kevin Grittner :

> Niels Kristian Schjødt  wrote:
> 
>> So my question is, should I also get something like pgpool2 setup at 
>> the same time? Is it, from your experience, likely to increase my 
>> throughput a lot more, if I had a connection pool of eg. 20 
>> connections, instead of 300 concurrent ones directly?
> 
> In my experience, it can make a big difference.  If you are just using 
> the pooler for this reason, and don't need any of the other features 
> of pgpool, I suggest pgbouncer.  It is a simpler, more lightweight 
> tool.
> 
> --
> Kevin Grittner
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL 
> Company



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] how much postgres can scale up?

2011-06-12 Thread Benjamin Krajmalnik
Greg's book is highly recommended, and in my opinion a "must" for anyone doing 
serious work with Postgres.

> -Original Message-
> From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
> ow...@postgresql.org] On Behalf Of Anibal David Acosta
> Sent: Friday, June 10, 2011 7:19 AM
> To: 'Craig Ringer'
> Cc: t...@fuzzy.cz; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] how much postgres can scale up?
> 
> Excellent.
> 
> Thanks I'll buy and read that book :)
> 
> 
> Thanks!
> 
> 
> 
> -Mensaje original-
> De: Craig Ringer [mailto:cr...@postnewspapers.com.au]
> Enviado el: viernes, 10 de junio de 2011 09:13 a.m.
> Para: Anibal David Acosta
> CC: t...@fuzzy.cz; pgsql-performance@postgresql.org
> Asunto: Re: [PERFORM] how much postgres can scale up?
> 
> On 06/10/2011 08:56 PM, Anibal David Acosta wrote:
> > The version is Postgres 9.0
> > Yes, I setup the postgres.conf according to instructions in the
> > http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
> >
> >
> > Cool, I will check this
> > http://wiki.postgresql.org/wiki/Logging_Difficult_Queries
> >
> > Looks like great starting point to find bottleneck
> >
> > But is it possible, under ideal conditions, for two connections to double
> > the quantity of transactions per second?
> 
> For two connections, if you have most of the data cached in RAM or you
> have lots of fast disks, then sure. For that matter, if they're
> synchronized scans of the same table then the second transaction might
> perform even faster than the first one!
> 
> There are increasing overheads with transaction synchronization, etc
> with number of connections, and they'll usually land up contending for
> system resources like RAM (for disk cache, work_mem, etc), disk I/O,
> and CPU time. So you won't generally get linear scaling with number of
> connections.
> 
> Greg Smith has done some excellent and detailed work on this. I highly
> recommend reading his writing, and you should consider buying his
> recent book "PostgreSQL 9.0 High Performance".
> 
> See also:
> 
> http://wiki.postgresql.org/wiki/Performance_Optimization
> 
> There have been lots of postgresql scaling benchmarks done over time,
> too. You'll find a lot of information if you look around the wiki and
> Google.
> 
> --
> Craig Ringer
> 
> 
> --
> Sent via pgsql-performance mailing list (pgsql-
> performa...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Unused indices

2011-02-23 Thread Benjamin Krajmalnik
I am trying to clean up our schema by removing any indices which are not
being used frequently or at all.

Using pgadmin, looking at the statistics for an index, I see various
pieces of information:

 

Index Scans, Index Tuples Read, Index Tuples Fetched, Index Blocks Read,
and Index Blocks Hit.

I have one index with the following statistics:

 

Index Scans           0
Index Tuples Read     0
Index Tuples Fetched  0
Index Blocks Read     834389
Index Blocks Hit      247283300
Index Size            1752 kB

 

 

Since there are no index scans, would it be safe to remove this one?
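
(For what it's worth, a query along these lines - a sketch against the 8.4+
statistics catalogs - lists the never-scanned, non-unique indexes in one shot
instead of checking them one at a time in pgadmin:)

select s.schemaname, s.relname as table_name, s.indexrelname as index_name,
       pg_size_pretty(pg_relation_size(s.indexrelid)) as index_size,
       s.idx_scan
from pg_stat_user_indexes s
join pg_index i on i.indexrelid = s.indexrelid
where s.idx_scan = 0
  and not i.indisunique
  and not i.indisprimary
order by pg_relation_size(s.indexrelid) desc;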

 



[PERFORM] Need some help analyzing some benchmarks

2011-02-06 Thread Benjamin Krajmalnik
Before I deploy some new servers, I figured I would do some
benchmarking.

Server is a Dual E5620, 96GB RAM, 16 x 450GB SAS(15K) drives.

Controller is an Areca 1680 with 2GB RAM and battery backup.

So far I have only run bonnie++ since each cycle is quite long (writing
192GB).

 

My data partition is 12 drives in RAID 1+0 (2.7TB) running UFS2.
vfs.read_max has been set to 32, and no other tuning has been done.

The file system is not mounted with noatime at this point.

Below are the results:

 

 

db1# bonnie++ -d /usr/local/pgsql -c 4 -n 2:1000:100:64 -u pgsql

Version  1.96   --Sequential Output-- --Sequential Input-
--Random-

Concurrency   4 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block--
--Seeks--

MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP
/sec %CP

db1.stackdump. 192G   860  99 213731  52 28518  45  1079  70 155479  34
49.9  12

Latency 10008us2385ms1190ms 457ms2152ms
231ms

Version  1.96   --Sequential Create-- Random
Create

db1.stackdump.local -Create-- --Read--- -Delete-- -Create-- --Read---
-Delete--

files:max:min/sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
/sec %CP

2:1000:100/6449  33   128  96   277  9757  39   130  90
275  97

Latency   660ms   13954us   13003us 904ms 334ms
13365us

 

Not having anything to compare it to, I do not know if these are decent
numbers or not - they are definitely slower than a similar setup which
was posted recently using  XFS on Linux, but I have not found anything
in FreeBSD using UFS2 to compare it  to.  What strikes me in particular
is that the write performance is higher than the read performance - I
would have intuitively expected it to be the other way around.

 

My log partition is a RAID1, same drives.  Performance follows:

 

Version  1.96   --Sequential Output-- --Sequential Input-
--Random-

Concurrency   4 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block--
--Seeks--

MachineSize K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP
/sec %CP

db1.stackdump. 192G   861  99 117023  28 20142  43   359  23 109719  24
419.5  12

Latency  9890us   13227ms8944ms3623ms2236ms
252ms

Version  1.96   --Sequential Create-- Random
Create

db1.stackdump.local -Create-- --Read--- -Delete-- -Create-- --Read---
-Delete--

files:max:min/sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
/sec %CP

2:1000:100/6424  16   121  93   276  9722  15   134  93
275  97

Latency  4070ms   14029us   13079us   15016ms 573ms
13369us

 

After seeing these results, I decided to download the areca cli and
check the actual setup

Info from the RAID controller follows:

 

CLI> sys info

The System Information

===

Main Processor : 1200MHz

CPU ICache Size: 32KB

CPU DCache Size: 32KB

CPU SCache Size: 512KB

System Memory  : 2048MB/533MHz/ECC

Firmware Version   : V1.48 2010-10-21

BOOT ROM Version   : V1.48 2010-01-04

Serial Number  : Y051CABVAR600825

Controller Name: ARC-1680

Current IP Address : 192.168.1.100

 

CLI> rsf info raid=2

Raid Set Information 

===

Raid Set Name: Raid Set # 001  

Member Disks : 12

Total Raw Capacity   : 5400.0GB

Free Raw Capacity: 0.0GB

Min Member Disk Size : 450.0GB

Raid Set State   : Normal

 

CLI> vsf info vol=2

Volume Set Information 

===

Volume Set Name : ARC-1680-VOL#001

Raid Set Name   : Raid Set # 001  

Volume Capacity : 2700.0GB

SCSI Ch/Id/Lun  : 00/00/01

Raid Level  : Raid1+0

Stripe Size : 8K

Member Disks: 12

Cache Mode  : Write Back

Tagged Queuing  : Enabled

Volume State: Normal

===

 

Having done this, I noticed that the stripe size is configured to 8K.

I am thinking the problem may be due to the stripe size.  I had asked
the vendor to set up the file system for these two arrays with 8K
blocks, and I believe they may have misunderstood my request and set the
stripe size to 8K.  I assume increasing the stripe size will improve the
performance.

What stripe sizes are you typically using?  I was planning on setting it
up with a 64K stripe size.

 

TIA,

 

Benjamin

 



Re: [PERFORM] Configuration for a new server.

2011-02-02 Thread Benjamin Krajmalnik


> See how buffers_backend is much larger than buffers_clean, even though
> maxwritten_clean is low?  That means the background writer isn't running often
> enough to keep up with cleaning things, even though it does a lot of work
> when it does kick in.  In your situation I'd normally do a first pass by
> cutting bgwriter_lru_maxpages to 1/4 of what it is now, cut bgwriter_delay to
> 1/4 as well (to 50ms), and then see how the proportions change.  You can
> probably cut the multiplier, too, yet still see more pages written by the
> cleaner.

> I recommend saving a snapshot of this data with a timestamp, i.e.:

> select now(),* from pg_stat_bgwriter;

> Anytime you make a change to one of the background writer or checkpoint timing
> parameters.  That way you have a new baseline to compare against.  These
> numbers aren't very useful with a single value, but once you get two of them
> with timestamps you can compute all sorts of fun statistics from the pair.

So, if I understand correctly, I should strive to increase buffers_clean
relative to buffers_backend?
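
(One way to keep those timestamped snapshots - a sketch; the bgwriter_snap
table name is made up here:)

create table bgwriter_snap as select now() as ts, * from pg_stat_bgwriter limit 0;

-- take a snapshot whenever a setting is changed
insert into bgwriter_snap select now(), * from pg_stat_bgwriter;

-- compare the two most recent snapshots: how much cleaning was done by the
-- background writer versus by the backends in that interval
select ts,
       buffers_clean   - lag(buffers_clean)   over (order by ts) as clean_delta,
       buffers_backend - lag(buffers_backend) over (order by ts) as backend_delta
from bgwriter_snap
order by ts desc
limit 2;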



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Configuration for a new server.

2011-02-01 Thread Benjamin Krajmalnik
There are approximately 50 tables which get updated with almost 100% of
records updated every 5 minutes - what is a good number of autovacuum
processes to have on these?  The current server I am replacing only has
3 of them, but I think I may gain a benefit from having more.


Watch pg_stat_user_tables and you can figure this out for your workload.
There are no generic answers in this area.

What in particular should I be looking at to help me decide?
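
For instance, is something like this (a rough sketch; these columns exist in
8.3 and later) the right thing to be watching - dead tuple counts and how
recently autovacuum last got to each table?

select relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
from pg_stat_user_tables
order by n_dead_tup desc
limit 20;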

 

 

Currently I have what I believe to be an aggressive bgwriter setting as
follows:

 

bgwriter_delay = 200ms          # 10-10000ms between rounds

bgwriter_lru_maxpages = 1000    # 0-1000 max buffers written/round

bgwriter_lru_multiplier = 10    # 0-10.0 multiplier on buffers scanned/round

 

Does this look right?


You'd probably be better off decreasing the delay rather than pushing up
the other two parameters.  It's easy to tell if you did it right or not;
just look at pg_stat_bgwriter.  If buffers_backend is high relative to
the others, that means the multiplier or delay is wrong.  Or if
maxwritten_clean is increasing fast, that means bgwriter_lru_maxpages is
too low.

checkpoints_timed = 261

checkpoints_req = 0

buffers_checkpoint = 49058438

buffers_clean = 3562421

maxwritten_clean = 243

buffers_backend = 11774254

buffers_alloc = 42816578
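
(Expressed as shares of total buffer writes - a rough sketch - the same
numbers are easier to track over time:)

select round(100.0 * buffers_checkpoint / (buffers_checkpoint + buffers_clean + buffers_backend), 1) as pct_checkpoint,
       round(100.0 * buffers_clean      / (buffers_checkpoint + buffers_clean + buffers_backend), 1) as pct_clean,
       round(100.0 * buffers_backend    / (buffers_checkpoint + buffers_clean + buffers_backend), 1) as pct_backend
from pg_stat_bgwriter;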



Re: [PERFORM] Configuration for a new server.

2011-02-01 Thread Benjamin Krajmalnik
Greg,

 

Thank you very much for your quick response.

The servers are using Areca 1600 series controllers with battery backup and 2GB 
cache.

I really enjoyed your book (actually, both of the books your company 
published).  Found them extremely helpful and they filled a lot of gaps in my 
still gappy knowledge :)

 

 

 

 

From: Greg Smith [mailto:g...@2ndquadrant.com] 
Sent: Tuesday, February 01, 2011 4:54 AM
To: Benjamin Krajmalnik
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Configuration for a new server.

 

Benjamin Krajmalnik wrote: 

 

I have a new set of servers coming in - Dual Xeon E5620's, 96GB RAM, 18 spindles 
(1 RAID1 for OS - SATA, 12 disk RAID10 for data - SAS, RAID-1 for logs - SAS, 2 
hot spares SAS). 


You didn't mention the RAID controller and its cache setup.  That's a critical 
piece to get write, err, right.  Presumably you've got a battery-backed RAID 
cache on your SAS controller.  Knowing that and what model it is (to make sure 
it's one of the ones that performs well) would be good info to pass along here.
  

Is the 25% RAM for shared memory still a good number to go with for this size 
server?


Several people have reported to me they see drop-offs in performance between 
8GB and 10GB for that setting.  I currently recommend limiting shared_buffers 
to 8GB until we have more data on why that is.  You suggested already having 
checkpoint issues, too; if that's true, you don't want to dedicate too much RAM 
to the database for that reason, too.




There are approximately 50 tables which get updated with almost 100% records 
updated every 5 minutes - what is a good number of autovacuum processes to have 
on these?  The current server I am replacing only has 3 of them but I think I 
may gain a benefit from having more.


Watch pg_stat_user_tables and you can figure this out for your workload.  There 
are no generic answers in this area.




Currently I have what I believe to be an aggressive bgwriter setting as follows:

 

bgwriter_delay = 200ms          # 10-10000ms between rounds

bgwriter_lru_maxpages = 1000    # 0-1000 max buffers written/round

bgwriter_lru_multiplier = 10    # 0-10.0 multiplier on buffers scanned/round

 

Does this look right?


You'd probably be better off decreasing the delay rather than pushing up the 
other two parameters.  It's easy to tell if you did it right or not; just look 
at pg_stat_bgwriter.  If buffers_backend is high relative to the others, that 
means the multiplier or delay is wrong.  Or if maxwritten_clean is increasing 
fast, that means bgwriter_lru_maxpages is too low.





These are values which I arrived to by playing with them to make sure that the 
end user performance did not suffer.  The checkpoints are taking about 8 
minutes to complete, but between checkpoints the disk i/o on the data partition 
is very minimal - when I had lower segments running a 15 minute timeout with a 
.9 completion target, the platform was fairly slow vis-à-vis the end user.


The completion target isn't the main driver here, the number of 
segments/timeout is.  When you space checkpoints out further, the actual amount 
of total I/O the server does decreases, both to the WAL and to the main 
database.  So I suspect your tweaking the target had little impact, and it's 
possible you might even get smoother performance if you put it back to a higher 
value again.




-- 
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


[PERFORM] Configuration for a new server.

2011-01-31 Thread Benjamin Krajmalnik
Scott,

I don't know if you received my private email, but just in case you did not, I 
am posting the information here.

 

I have a new set of servers coming in - Dual Xeon E5620's, 96GB RAM, 18 
spindles (1 RAID1 for OS - SATA, 12 disk RAID10 for data - SAS, RAID-1 for logs 
- SAS, 2 hot spares SAS).  They are replacing a single Dual Xeon E5406 with 
16GB RAM and 2x RAID1 - one for OS/Data, one for Logs.

Current server is using 3840MB of shared buffers.

 

It will be running FreeBSD 8.1 x64, PG 9.0.2, running streaming replication to 
a like server.

I have read the performance tuning book written by Greg Smith, and am using it 
as a guide to configure it for performance.

The main questions which I have are the following:

 

Is the 25% RAM for shared memory still a good number to go with for this size 
server?

There are approximately 50 tables which get updated with almost 100% of records 
updated every 5 minutes - what is a good number of autovacuum processes to have 
on these?  The current server I am replacing only has 3 of them, but I think I 
may gain a benefit from having more.

Currently I have what I believe to be an aggressive bgwriter setting as follows:

 

bgwriter_delay = 200ms          # 10-10000ms between rounds

bgwriter_lru_maxpages = 1000    # 0-1000 max buffers written/round

bgwriter_lru_multiplier = 10    # 0-10.0 multiplier on buffers scanned/round

 

Does this look right?

 

I have the following settings:

work_mem = 64MB # min 64kB

maintenance_work_mem = 128MB# min 1MB

 

And, of course, some of the most critical ones - the WAL settings.  Right now, 
in order to give the best performance to the end users given the size of the 
current box, I have what I consider a very suboptimal setting:

 

fsync = off # turns forced synchronization on or off

#synchronous_commit = on# immediate fsync at commit

#wal_sync_method = fsync# the default is the first option

# supported by the operating system:

#   open_datasync

#   fdatasync

#   fsync

#   fsync_writethrough

#   open_sync

full_page_writes = on   # recover from partial page writes

wal_buffers = 16MB

#wal_buffers = 1024KB   # min 32kB

# (change requires restart)

# wal_writer_delay = 100ms  # 1-10000 milliseconds



#commit_delay = 0   # range 0-100000, in microseconds

#commit_siblings = 5# range 1-1000

 

# - Checkpoints -

 

#checkpoint_segments = 128  # in logfile segments, min 1, 16MB each

checkpoint_segments = 1024

checkpoint_timeout = 60min  # range 30s-1h

#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0

checkpoint_completion_target = 0.1

checkpoint_warning = 45min  # 0 disables

 

These are values which I arrived to by playing with them to make sure that the 
end user performance did not suffer.  The checkpoints are taking about 8 
minutes to complete, but between checkpoints the disk i/o on the data partition 
is very minimal - when I had lower segments running a 15 minute timeout with a 
.9 completion target, the platform was fairly slow vis-à-vis the end user.

 

The above configuration is using PG 8.4.

 

Thanks in advance for any insight.



Re: [PERFORM] Hardware recommendations

2010-12-13 Thread Benjamin Krajmalnik


> -Original Message-
> From: Greg Smith [mailto:g...@2ndquadrant.com]
> Sent: Saturday, December 11, 2010 2:18 AM
> To: Benjamin Krajmalnik
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Hardware recommendations
> 
> 
> What sort of total read/write rates are you seeing when iostat is
> showing the system 85% busy?  That's a useful number to note as an
> estimate of just how random the workload is.
> 

I did a vacuum full of the highly bloated, constantly accessed tables,
which has improved the situation significantly.  I am not seeing over
75% busy right now, but these are some values for the high busy
presently:

71%  344 w/s  7644 kw/s
81%  392 w/s  8880 kw/s
79%  393 w/s  9526 kw/s
75%  443 w/s  10245 kw/s
80%  436 w/s  10157 kw/s
76%  392 w/s  8438 kw/s




> Have you increased checkpoint parameters like checkpoint_segments?  You
> need to avoid having checkpoints too often if you're going to try and
> use 4GB of memory for shared_buffers.
> 

Yes, I have it configured at 1024 checkpoint_segments, 5min timeout, 0.9
completion_target.
> 
> It's nice to put the logs onto a separate disk because it lets you
> measure exactly how much I/O is going to them, relative to the
> database.  It's not really necessary though; with 14 disks you'll be at
> the range where you can mix them together and things should still be
> fine.
> 

Thx.  I will place them in their own RAID1 (or mirror if I end up going
to ZFS)

> 
> > On the processor front, are there advantages to going to X series
> > processors as opposed to the E series (especially since I am I/O
> > bound)?  Is anyone running this type of hardware, especially on FreeBSD?
> > Any opinions, especially concerning the Areca controllers which they
> > use?
> >
> 
> It sounds like you should be saving your hardware dollars for more RAM
> and disks, not getting faster processors.  The Areca controllers are fast
> and pretty reliable under Linux.  I'm not aware of anyone using them for
> PostgreSQL in production on FreeBSD.  Aberdeen may have enough customers
> doing that to give you a good opinion on how stable that is likely to
> be; they're pretty straight as vendors go.  You'd want to make sure to
> stress test that hardware/software combo as early as possible
> regardless, it's generally a good idea and you wouldn't be running a
> really popular combination.
> 

Thx.  That was my overall plan - that's why I am opting for drives,
cache on the controller, and memory.

> --
> Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
> PostgreSQL Training, Services and Supportwww.2ndQuadrant.us
> "PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hardware recommendations

2010-12-08 Thread Benjamin Krajmalnik
John,

The platform is a network monitoring system, so we have quite a lot of 
inserts/updates (every data point has at least one record insert as well as at 
least 3 record updates).  The management GUI has a lot of selects.  We are 
refactoring the database to some degree to aid in the performance, since the 
performance degradations are correlated to the number of users viewing the 
system GUI.
My biggest concern with SSD drives is their life expectancy, as well as our 
need for relatively high capacity.  From a purely scalability perspective, this 
setup will need to support terabytes of data.  I suppose I could use 
tablespaces to keep the most accessed data on SSD drives and the rest on 
regular drives.
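
(A minimal sketch of the tablespace idea, with hypothetical paths and object
names:)

CREATE TABLESPACE ssd_fast LOCATION '/ssd/pgdata';
ALTER TABLE hot_metrics SET TABLESPACE ssd_fast;                -- move a hot table
CREATE INDEX hot_metrics_ts_idx ON hot_metrics (sample_time) TABLESPACE ssd_fast;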
As I stated, I am moving to RAID 10, and was just wondering if the logs should 
still be moved off to different spindles, or will leaving them on the RAID10 be 
fine and not affect performance.

> -Original Message-
> From: John W Strange [mailto:john.w.stra...@jpmchase.com]
> Sent: Wednesday, December 08, 2010 4:32 PM
> To: Benjamin Krajmalnik; pgsql-performance@postgresql.org
> Subject: RE: Hardware recommendations
> 
> Ben,
> 
> It would help if you could tell us a bit more about the read/write mix
> and transaction requirements. *IF* you are heavy writes I would suggest
> moving off the RAID1 configuration to a RAID10 setup.  I would highly
> suggest looking at SLC based solid state drives or if your budget has
> legs, look at fusionIO drives.
> 
> We currently have several setups with two FusionIO Duo cards that
> produce > 2GB second reads, and over 1GB/sec writes.  They are pricey
> but, long term cheaper for me than putting SAN in place that can meet
> that sort of performance.
> 
> It all really depends on your workload:
> 
> http://www.fusionio.com/products/iodrive/ - BEST in slot currently
> IMHO.
> http://www.intel.com/design/flash/nand/extreme/index.htm?wapkw=(X25-E)
> - not a bad alternative.
> 
> There are other SSD controllers on the market but I have experience
> with both so I can recommend both as well.
> 
> - John
> 
> 
> 
> -Original Message-
> From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-
> ow...@postgresql.org] On Behalf Of Benjamin Krajmalnik
> Sent: Wednesday, December 08, 2010 5:04 PM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Hardware recommendations
> 
> I need to build a new high performance server to replace our current
> production database server.
> The current server is a SuperMicro 1U with 2 RAID-1 containers (one for
> data, one for log, SAS - data is 600GB, Logs 144GB), 16GB of RAM,
> running 2 quad core processors (E5405 @ 2GHz), Adaptec 5405 Controller
> with BBU.  I am already having serious I/O bottlenecks with iostat -x
> showing extended periods where the disk subsystem on the data partition
> (the one with all the random i/o) at over 85% busy.  The system is
> running FreeBSD 7.2 amd64 and PostgreSQL 8.4.4 on amd64-portbld-
> freebsd7.2, compiled by GCC cc (GCC) 4.2.1 20070719  [FreeBSD], 64-bit.
> Currently I have about 4GB of shared memory allocated to PostgreSQL.
> Database is currently about 80GB, with about 60GB being in partitioned
> tables which get rotated nightly to purge old data (sort of like a
> circular buffer of historic data).
> 
> I was looking at one of the machines which Aberdeen has (the X438), and
> was planning  on something along the lines of 96GB RAM with 16 SAS
> drives (15K).  If I create a RAID 10 (stripe of mirrors), leaving 2 hot
> spares, should I still place the logs in a separate RAID-1 mirror, or
> can they be left on the same RAID-10 container?
> On the processor front, are there advantages to going to X series
> processors as opposed to the E series (especially since I am I/O
> bound)?  Is anyone running this type of hardware, specially on
> FreeBSD?  Any opinions, especially concerning the Areca controllers
> which they use?
> 
> The new box would ideally be built with the latest released version of
> FreeBSD, PG 9.x.  Also, is anyone running the 8.x series of FreeBSD
> with PG 9 in a high throughput production environment?  I will be
> upgrading one of our test servers in one week to this same
> configuration to test out, but just wanted to make sure there aren't
> any caveats others have experienced, especially as it pertains with the
> autovacuum not launching worker processes which I have experienced.
> 
> Best regards,
> 
> Benjamin
> 
> --
> Sent via pgsql-performance mailing list (pgsql-
> performa...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

[PERFORM] Hardware recommendations

2010-12-08 Thread Benjamin Krajmalnik
I need to build a new high performance server to replace our current production 
database server.
The current server is a SuperMicro 1U with 2 RAID-1 containers (one for data, 
one for log, SAS - data is 600GB, Logs 144GB), 16GB of RAM, running 2 quad core 
processors (E5405 @ 2GHz), Adaptec 5405 Controller with BBU.  I am already 
having serious I/O bottlenecks with iostat -x showing extended periods where 
the disk subsystem on the data partition (the one with all the random i/o) at 
over 85% busy.  The system is running FreeBSD 7.2 amd64 and PostgreSQL 8.4.4 on 
amd64-portbld-freebsd7.2, compiled by GCC cc (GCC) 4.2.1 20070719  [FreeBSD], 
64-bit.
Currently I have about 4GB of shared memory allocated to PostgreSQL.  Database 
is currently about 80GB, with about 60GB being in partitioned tables which get 
rotated nightly to purge old data (sort of like a circular buffer of historic 
data).

I was looking at one of the machines which Aberdeen has (the X438), and was 
planning  on something along the lines of 96GB RAM with 16 SAS drives (15K).  
If I create a RAID 10 (stripe of mirrors), leaving 2 hot spares, should I still 
place the logs in a separate RAID-1 mirror, or can they be left on the same 
RAID-10 container?
On the processor front, are there advantages to going to X series processors as 
opposed to the E series (especially since I am I/O bound)?  Is anyone running 
this type of hardware, especially on FreeBSD?  Any opinions, especially 
concerning the Areca controllers which they use?

The new box would ideally be built with the latest released version of FreeBSD, 
PG 9.x.  Also, is anyone running the 8.x series of FreeBSD with PG 9 in a high 
throughput production environment?  I will be upgrading one of our test servers 
in one week to this same configuration to test out, but just wanted to make 
sure there aren't any caveats others have experienced, especially as it 
pertains to the autovacuum not launching worker processes, which I have 
experienced.

Best regards,

Benjamin 

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Quesion on the use of indexes

2010-08-16 Thread Benjamin Krajmalnik
A little background - I have various multi-column indexes wherever I
have queries which restrict the output based on the values of 2
fields (for example, a client code and the date of a transaction).

Is there a performance gain using this approach as opposed to using 2
separate indexes, one on the first column and one on the second column?

 

The reason I am asking is that my coding convention goes back to the days
when I used ISAM tables, so those systems did not know how to use more
than a single index.

In some cases, I may have an index on (columna, columnb)  and one on
(columnb, columna) due to the data access patterns.  If there are no
performance gains in having these multi-part indexes, and performance
will be the same as having one index solely on columna and one solely on
columnb, then I can reduce the disk usage significantly in some cases.
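
For illustration, the two layouts I am comparing look roughly like this
(hypothetical table and column names):

-- one composite index covering both filter columns
CREATE INDEX tx_client_date_idx ON transactions (client_code, tx_date);

-- versus two single-column indexes, which the planner can combine with a
-- BitmapAnd at the cost of building and intersecting two bitmaps per query
CREATE INDEX tx_client_idx ON transactions (client_code);
CREATE INDEX tx_date_idx   ON transactions (tx_date);

explain analyze
select * from transactions
 where client_code = 'ACME' and tx_date >= date '2010-08-01';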



Re: [PERFORM] Question of using COPY on a table with triggers

2010-07-15 Thread Benjamin Krajmalnik
That is what I thought.
The trigger calls a 3,000-line stored procedure which does all of the 
calculations to aggregate data into 3 separate tables and then inserts the raw 
data point into a 4th table.


> -Original Message-
> From: Pierre C [mailto:li...@peufeu.com]
> Sent: Thursday, July 15, 2010 4:47 PM
> To: Benjamin Krajmalnik; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Question of using COPY on a table with triggers
> 
> > Essentially, we insert a set of columns into a table, and each row
> fires
> > a trigger function which calls a very large stored procedure
> 
> 
> For inserting lots of rows, COPY is much faster than INSERT because it
> parses data (a lot) faster and is more "data-stream-friendly". However
> the
> actual inserting into the tbale and trigger-calling has to be done for
> both.
> 
> If the trigger is a "very large stored procedure" it is very likely
> that
> executing it will take a lot more time than parsing & executing the
> INSERT. So, using COPY instead of INSERT will not gain you anything.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Question of using COPY on a table with triggers

2010-07-15 Thread Benjamin Krajmalnik
First of all, a little background.

We have a table which is used as a trigger table for entering and
processing data for a network monitoring system.

Essentially, we insert a set of columns into a table, and each row fires
a trigger function which calls a very large stored procedure which
aggregates data, etc.  At that point, the row is deleted from the temp
table.

Currently, records are transferred from the data collector as a series
of multi-row inserts.

Before going through the exercise of recoding, and given the fact that
each of these inserts fires off a trigger, will I see any noticeable
performance gain?

 

The table definition follows:

 

CREATE TABLE tbltmptests
(
  tmptestsysid bigserial NOT NULL,
  testhash character varying(32),
  testtime timestamp with time zone,
  statusid integer,
  replytxt text,
  replyval real,
  groupid integer,
  CONSTRAINT tbltmptests_pkey PRIMARY KEY (tmptestsysid)
)
WITH (
  OIDS=FALSE
);

ALTER TABLE tbltmptests OWNER TO postgres;

-- Trigger: tbltmptests_tr on tbltmptests

-- DROP TRIGGER tbltmptests_tr ON tbltmptests;

CREATE TRIGGER tbltmptests_tr
  AFTER INSERT
  ON tbltmptests
  FOR EACH ROW
  EXECUTE PROCEDURE fn_testtrigger();

 

 

Another question - is there anything special we need to do to handle the
primary key constraint field?

 

Now, on a related note and looking forward to the streaming replication
of v9, will this work with it, since we have multiple tables being
update by a trigger function?
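
(For comparison, a sketch of what the COPY form of the same load would look
like for the table above; the bigserial key is omitted so its default applies,
and the trigger still fires once per inserted row either way:)

COPY tbltmptests (testhash, testtime, statusid, replytxt, replyval, groupid)
FROM STDIN CSV;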



Re: [PERFORM] Question about partitioned query behavior

2010-07-02 Thread Benjamin Krajmalnik
In postgresql.conf, what are your settings for constraint_exclusion?

There are 3 settings - on, off, or partition.

Mine are set as follows:

 

constraint_exclusion = on# on, off, or partition

 

Under 8.4.4 I had it set to partition, but the behavior was not what I
expected so I set it back to "on" and only the applicable partitions get
processed.
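
A quick way to confirm the pruning is actually happening (a sketch using the
table names from the quoted question): with constraint exclusion enabled, the
Append node in the plan should list only the matching child table.

set constraint_exclusion = on;
explain select F1 from FACT_TABLE where day = 20100502;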

 

From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Ranga
Gopalan
Sent: Friday, July 02, 2010 9:29 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Question about partitioned query behavior

 

Hi,

My question is regarding ORDER BY / LIMIT query behavior when using
partitioning.

I have a large table (about 100 columns, several million rows)
partitioned by a column called day (which is the date stored as yyyymmdd
- say 20100502 for May 2nd 2010 etc.). Say the main table is called
FACT_TABLE and each child table is called FACT_TABLE_yyyymmdd (e.g.
FACT_TABLE_20100502, FACT_TABLE_20100503 etc.) and has an appropriate
CHECK constraint created on it to CHECK (day = yyyymmdd).

Postgres Version:  PostgreSQL 8.4.2 on x86_64-unknown-linux-gnu,
compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-10), 64-bit

The query pattern I am looking at is (I have tried to simplify the
column names for readability):

SELECT F1 from FACT_TABLE 
where day >= 20100502 and day <= 20100507  # selecting for a week
ORDER BY F2 desc
LIMIT 100


This is what is happening:

When I query from the specific day's (child) table, I get what I expect
- a descending Index scan and good performance.

# explain  select F1 from FACT_TABLE_20100502 where day = 20100502 order
by F2 desc limit 100;
 
                                   QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.00..4.81 rows=100 width=41)
   ->  Index Scan Backward using F2_20100502 on FACT_TABLE_20100502  (cost=0.00..90355.89 rows=1876985 width=41)
         Filter: (day = 20100502)



BUT:

When I do the same query against the parent table it is much slower -
two things seem to happen - one is that the descending scan of the index
is not done and secondly there seems to be a separate sort/limit at the
end - i.e. all data from all partitions is retrieved and then sorted and
limited - This seems to be much less efficient than doing a descending
scan on each partition and limiting the results and then combining and
reapplying the limit at the end.

explain  select F1 from FACT_TABLE where day = 20100502 order by F2 desc
limit 100;
 
                                         QUERY PLAN
---------------------------------------------------------------------------------------------------
 Limit  (cost=2084948.01..2084948.01 rows=100 width=41)
   ->  Sort  (cost=2084948.01..2084994.93 rows=1876986 width=41)
         Sort Key: public.FACT_TABLE.F2
         ->  Result  (cost=100.00..2084230.64 rows=1876986 width=41)
               ->  Append  (cost=100.00..2084230.64 rows=1876986 width=41)
                     ->  Seq Scan on FACT_TABLE  (cost=100.00..110.02 rows=1 width=186)
                           Filter: (day = 20100502)
                     ->  Seq Scan on FACT_TABLE_20100502 FACT_TABLE  (cost=100.00..1084220.62 rows=1876985 width=41)
                           Filter: (day = 20100502)
(9 rows)


Could anyone please explain why this is happening and what I can do to
get the query to perform well even when querying from the parent table?

Thanks,

Ranga











Re: [PERFORM] cpu bound postgresql setup.

2010-06-29 Thread Benjamin Krajmalnik
Bruce,
Unfortunately not.  The behavior I saw came in ebbs and flows.  On FreeBSD,
I was seeing a lot of kernel wait states in top, so every few minutes the
responsiveness of the db was pretty bad.  8.4.4/amd64 on FreeBSD 7.2.

> -Original Message-
> From: Bruce Momjian [mailto:br...@momjian.us]
> Sent: Monday, June 28, 2010 3:45 PM
> To: Benjamin Krajmalnik
> Cc: Rajesh Kumar Mallah; Kevin Grittner; pgsql-
> performa...@postgresql.org
> Subject: Re: [PERFORM] cpu bound postgresql setup.
> 
> Benjamin Krajmalnik wrote:
> > Rajesh,
> >
> > I had a similar situation a few weeks ago whereby performance all of
> a
> > sudden decreased.
> > The one tunable which resolved the problem in my case was increasing
> the
> > number of checkpoint segments.
> > After increasing them, everything went back to its normal state.
> 
> Did you get a db server log message suggesting increasing that
> setting? I hope so.
> 
> --
>   Bruce Momjian  http://momjian.us
>   EnterpriseDB http://enterprisedb.com
> 
>   + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cpu bound postgresql setup.

2010-06-24 Thread Benjamin Krajmalnik
Rajesh,

I had a similar situation a few weeks ago whereby performance all of a
sudden decreased.
The one tunable which resolved the problem in my case was increasing the
number of checkpoint segments.
After increasing them, everything went back to its normal state.
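
(A quick check for that situation - a sketch: if checkpoints_req dominates
checkpoints_timed, checkpoints are being forced by WAL consumption rather than
by the timeout and checkpoint_segments is too low; the server also logs a
"checkpoints are occurring too frequently" hint when that happens.)

select checkpoints_timed, checkpoints_req from pg_stat_bgwriter;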


> -Original Message-
> From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-
> ow...@postgresql.org] On Behalf Of Rajesh Kumar Mallah
> Sent: Thursday, June 24, 2010 11:27 AM
> To: Kevin Grittner
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] cpu bound postgresql setup.
> 
> >i do not remember well but there is a system view that (i think)
> >guides at what stage the marginal returns of increasing it
> >starts disappearing , i had set it a few years back.
> 
> Sorry the above comment was regarding setting shared_buffers
> not effective_cache_size.
> 
> 
> 
> On Thu, Jun 24, 2010 at 10:55 PM, Rajesh Kumar Mallah
>  wrote:
> > On Thu, Jun 24, 2010 at 8:57 PM, Kevin Grittner
> >  wrote:
> >> I'm not clear whether you still have a problem, or whether the
> >> changes you mention solved your issues.  I'll comment on potential
> >> issues that leap out at me.
> >
> > It shall require more observation to know if the "problem" is
solved.
> > my  "problem"  was high load average in the server . We find that
> > when ldavg is between 10-20 responses of applications were
acceptable
> > ldavg  > 40 makes things slower.
> >
> > What prompted me to post to list is that the server transitioned
from
> > being IO bound to CPU bound and 90% of syscalls being
> > lseek(XXX, 0, SEEK_END) = YYY
> >
> >>
> >> Rajesh Kumar Mallah  wrote:
> >>
> >>> 3. we use xfs  and our controller has BBU , we changed barriers=1
> >>> to barriers=0 as i learnt that having barriers=1 on xfs  and fsync
> >>> as the sync method, the advantage of BBU is lost unless barriers
> >>> is = 0 (correct me if my understanding is wrong)
> >>
> >> We use noatime,nobarrier in /etc/fstab.  I'm not sure where you're
> >> setting that, but if you have a controller with BBU, you want to
set
> >> it to whichever disables write barriers.
> >
> > as per suggestion in discussions on some other thread I set it
> > in /etc/fstab.
> >
> >>
> >>> max_connections = 300
> >>
> >> As I've previously mentioned, I would use a connection pool, in
> >> which case this wouldn't need to be that high.
> >
> > We do use connection pooling provided to mod_perl server
> > via Apache::DBI::Cache. If i reduce this i *get* "too many
> > connections from non-superuser ... "  error. Will pgpool - I/II
> > still applicable in this scenario ?
> >
> >
> >>
> >>> work_mem = 4GB
> >>
> >> That's pretty high.  That much memory can be used by each active
> >> connection, potentially for each of several parts of the active
> >> query on each connection.  You should probably set this much lower
> >> in postgresql.conf and boost it if necessary for individual
queries.
> >
> > hmmm.. it was 8GB for many months !
> >
> > i shall reduce it further, but will it not result in usage of too
> many
> > temp files
> > and saturate i/o?
> >
> >
> >
> >>
> >>> effective_cache_size = 18GB
> >>
> >> With 32GB RAM on the machine, I would probably set this higher --
> >> somewhere in the 24GB to 30GB range, unless you have specific
> >> reasons to believe otherwise.  It's not that critical, though.
> >
> > i do not remember well but there is a system view that (i think)
> > guides at what stage the marginal returns of increasing it
> > starts disappearing , i had set it a few years back.
> >
> >
> >>
> >>> add_missing_from = on
> >>
> >> Why?  There has been discussion of eliminating this option -- do
you
> >> have queries which rely on the non-standard syntax this enables?
> >
> > unfortunately yes.
> >
> >>
> >>> Also i would like to apologize that some of the discussions on
> >>> this problem inadvertently became private between me & kevin.
> >>
> >> Oops.  I failed to notice that.  Thanks for bringing it back to the
> >> list.  (It's definitely in your best interest to keep it in front
of
> >> all the other folks here, some of whom regularly catch things I
miss
> >> or get wrong.)
> >>
> >> If you still do have slow queries, please follow up with details.
> >
> >
> > I have now set log_min_duration_statement = 5000
> > and there are few queries that come to logs.
> >
> > please comment on the connection pooling aspect.
> >
> > Warm  Regards
> > Rajesh Kumar Mallah.
> >
> >>
> >> -Kevin
> >>
> >
> 
> --
> Sent via pgsql-performance mailing list (pgsql-
> performa...@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance