Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Michael Paquier
On Thu, Nov 14, 2013 at 10:44 PM, Alexander Farber
 wrote:
> sysctl.conf:
>
>kernel.shmmax=17179869184
>kernel.shmall=4194304
You do not need those settings in sysctl.conf since 9.3, as consumption
of System V shared memory has been greatly reduced with this commit:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b0fc0df9364d2d2d17c0162cf3b8b59f6cb09f67
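
For anyone who wants to verify this on their own box, a quick sanity check
(values and devices here are only illustrative) is to compare the kernel
limits with what the server actually allocates:

   # current System V limits (Linux)
   sysctl kernel.shmmax kernel.shmall
   # list running SysV segments; on 9.3 the PostgreSQL segment should be tiny,
   # since the main shared memory area is now mmap'd instead
   ipcs -m
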
Regards,
-- 
Michael


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Tomas Vondra
On 14 Listopad 2013, 16:09, Alexander Farber wrote:
> pgtune has produced the following for my server (the specs:
> http://www.hetzner.de/en/hosting/produkte_rootserver/px60ssd ):
>
> default_statistics_target = 50
> maintenance_work_mem = 1GB
> constraint_exclusion = on
> checkpoint_completion_target = 0.9
> effective_cache_size = 22GB
> work_mem = 192MB
> wal_buffers = 8MB
> checkpoint_segments = 16
> shared_buffers = 7680MB
> max_connections = 80
>
> Is it really okay? Isn't 22GB too high?

effective_cache_size is a hint about how much memory is available for the
filesystem cache, so that it can be considered when planning queries.
PostgreSQL relies on the filesystem cache, so this is needed to answer
questions like "how likely is it that this block is already in cache and
won't actually cause any I/O?"

It does not allocate anything. You do have 32GB of RAM in total, so 22GB
for caches seems about right unless you're running other memory-intensive
applications on the same box (leaving less memory available for the
filesystem cache).
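
Since it is only a planner setting, you can also experiment with it per
session without touching postgresql.conf; a minimal sketch (the query is
just a placeholder):

   psql <<'SQL'
   SHOW effective_cache_size;
   SET effective_cache_size = '22GB';  -- session-local experiment, allocates nothing
   EXPLAIN SELECT count(*) FROM pg_class;
   SQL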

Tomas



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Sergey Konoplev
On Thu, Nov 14, 2013 at 5:44 AM, Alexander Farber
 wrote:
> sysctl.conf:
>
>kernel.shmmax=17179869184
>kernel.shmall=4194304

You can also consult this document about kernel and OS settings:

https://github.com/grayhemp/pgcookbook/blob/master/database_server_configuration.md

-- 
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray...@gmail.com


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Zev Benjamin

On 11/14/2013 10:09 AM, Alexander Farber wrote:

> pgtune has produced the following for my server (the specs:
> http://www.hetzner.de/en/hosting/produkte_rootserver/px60ssd ):
>
> default_statistics_target = 50
> maintenance_work_mem = 1GB
> constraint_exclusion = on
> checkpoint_completion_target = 0.9
> effective_cache_size = 22GB
> work_mem = 192MB
> wal_buffers = 8MB
> checkpoint_segments = 16
> shared_buffers = 7680MB
> max_connections = 80
>
> Is it really okay? Isn't 22GB too high?
> And how does it know that max_connections = 80 is enough in my case? (I
> use pgbouncer).


It doesn't.  There is a static map between the "type" (the -T option) 
pgtune is using and the max_connections value it sets.  You should 
consider the output of pgtune as a guideline rather than "optimal settings."
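
If memory serves, the old pgtune script also lets you override that guess
on the command line; a rough sketch (flags as in the pgfoundry version, and
the paths are examples -- double-check against pgtune --help on your copy):

   pgtune -i /var/lib/pgsql/9.3/data/postgresql.conf \
          -o postgresql.conf.pgtune \
          -T Web -c 600

With pgbouncer in front, the pool size on the PostgreSQL side is what
really matters, not the 80 that pgtune picked.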



Zev



> Regards
> Alex
>
> On Thu, Nov 14, 2013 at 4:04 PM, Alexander Farber
> <alexander.far...@gmail.com> wrote:
>
>     And pgtune is 4 years old...

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Alexander Farber
pgtune has produced the following for my server (the specs:
http://www.hetzner.de/en/hosting/produkte_rootserver/px60ssd ):

default_statistics_target = 50
maintenance_work_mem = 1GB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 22GB
work_mem = 192MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 7680MB
max_connections = 80

Is it really okay? Isn't 22GB too high?
And how does it know that max_connections =80 is enough in my case? (I use
pgbouncer).

Regards
Alex






On Thu, Nov 14, 2013 at 4:04 PM, Alexander Farber <
alexander.far...@gmail.com> wrote:

> And pgtune is 4 years old...
>
>
>


Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Alexander Farber
And pgtune is 4 years old...


Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Alexander Farber
Hello, my problem with
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
is that it never mentions larger amounts of RAM,
so while reading it I always wonder how up-to-date it is...


Re: [GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Jayadevan M
Hi,

On Thu, Nov 14, 2013 at 7:14 PM, Alexander Farber <
alexander.far...@gmail.com> wrote:

> Hello,
>
> do these settings look okay for PostgreSQL 9.3 running on a CentOS
> 6.4 server with 32 GB RAM (serving Drupal 7 and a few custom PHP scripts)?
>
> postgresql.conf:
>
>shared_buffers = 4096MB
>work_mem = 32MB
>checkpoint_segments = 32
>log_min_duration_statement = 1
>
> sysctl.conf:
>
>kernel.shmmax=17179869184
>kernel.shmall=4194304
>
> pgbouncer.ini:
>
>listen_port = 6432
>unix_socket_dir = /tmp
>pool_mode = session
>server_reset_query = DISCARD ALL
>server_check_delay = 10
>max_client_conn = 600
>default_pool_size = 50
>
> I understand that nobody can tell me the optimal settings unless I
> provide the full source code to everything. And if I provide "the full
> source code", nobody will look at it anyway.
>
> So I am just asking whether the settings look okay or whether they will
> waste gigabytes of RAM.
>
> Thank you
> Alex
>
You may want to look at effective_cache_size also. Maybe quickly go through
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server and see what
http://pgfoundry.org/projects/pgtune/
says.
Jayadevan


[GENERAL] Tuning 9.3 for 32 GB RAM

2013-11-14 Thread Alexander Farber
Hello,

do these settings look okay for PostgreSQL 9.3 running on a CentOS
6.4 server with 32 GB RAM (serving Drupal 7 and a few custom PHP scripts)?

postgresql.conf:

   shared_buffers = 4096MB
   work_mem = 32MB
   checkpoint_segments = 32
   log_min_duration_statement = 1

sysctl.conf:

   kernel.shmmax=17179869184
   kernel.shmall=4194304

pgbouncer.ini:

   listen_port = 6432
   unix_socket_dir = /tmp
   pool_mode = session
   server_reset_query = DISCARD ALL
   server_check_delay = 10
   max_client_conn = 600
   default_pool_size = 50

I understand that nobody can tell me the optimal settings unless I
provide the full source code to everything. And if I provide "the full
source code", nobody will look at it anyway.

So I am just asking whether the settings look okay or whether they will
waste gigabytes of RAM.

Thank you
Alex


Re: [GENERAL] Tuning read ahead continued...

2013-05-17 Thread Ramsey Gurley
On May 16, 2013, at 5:56 PM, Ramsey Gurley wrote:

> Hi All,
>
> I tried bumping my read ahead up to 4096. Instead of having faster reads,
> it seems it actually slowed things down. In fact, most of the tuning
> suggestions I've tried have made little to no difference in the results I
> get from bonnie++.

I've run more tests with bonnie++. I'm beginning to wonder if there's
something wrong with my system or my setup. In every test I have run, Seq
Reads is faster with read ahead set to 256. If I increase read ahead to 4096
as suggested in PostgreSQL 9.0 High Performance, I get slower reads and
slower writes. Other settings I've made as suggested by the book:

   /dev/sdb1 / ext3 noatime,errors=remount-ro 0 1
   vm.swappiness=0
   vm.overcommit_memory=2
   echo 2 > /proc/sys/vm/dirty_ratio
   echo 1 > /proc/sys/vm/dirty_background_ratio

Here is 4096 read ahead:

Version 1.03e       --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
498088-db1.s 96280M           130123  24 103634  15           277467  14 652.4   1
498088-db1.smarthealth.com,96280M,,,130123,24,103634,15,,,277467,14,652.4,1,

And here is the default 256 read ahead:

Version 1.03e       --Sequential Output-- --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
498088-db1.s 96280M           160881  28 104868  17           286109  17 591.9   0
498088-db1.smarthealth.com,96280M,,,160881,28,104868,17,,,286109,17,591.9,0,

I also made some zcav plots. They are very flat on 256, which seems to
indicate some limiting factor, but they also appear to be consistently
*higher* than the 4096 values after about 70GB.  Does this look familiar to
anyone?
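
For reference, the read-ahead changes above are typically made with
blockdev; a minimal sketch (the device name is just an example, and the
setting does not survive a reboot on its own):

   # read-ahead is reported in 512-byte sectors
   blockdev --getra /dev/sdb
   # try the book's suggestion, then re-run bonnie++/zcav
   blockdev --setra 4096 /dev/sdb
   # and back to the distribution default for comparison
   blockdev --setra 256 /dev/sdb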

[GENERAL] Tuning read ahead continued...

2013-05-16 Thread Ramsey Gurley
Hi All,

I tried bumping my read ahead up to 4096. Instead of having faster reads, it 
seems it actually slowed things down. In fact, most of the tuning suggestions 
I've tried have made little to no difference in the results I get from 
bonnie++. I'll include a table of values in html. I'm wondering if these are 
normal values in my case; 4 disk RAID10 Linux ext3 146GB SAS 15K RPM Drive.



Benchmarks

4 disk RAID 10 ext3 Red Hat (bonnie++ chunk size 96280M for all runs)

Read    atime    swappi- over-   dirty  dirty     --Block Out--  --Rewrite--    --Block In--   --Random Seeks--
ahead            ness    commit  ratio  bg ratio  K/sec    %cpu  K/sec    %cpu  K/sec    %cpu  /sec     %cpu
192     noatime  0       2       2      1         163189   29    102625   17    288505   17    583.4    1
256     default  default default 40     10        160848   28    105445   17    288802   17    588.2    1
256     noatime  default default 40     10        157318   26    106947   17    289369   17    603.7    1
256     noatime  0       2       2      1         162137   29    104220   17    292700   17    609.3    1
1024    noatime  0       2       2      1         162908   29    98724    15    267720   13    665.0    0
4096    default  default default 40     10        160507   28    105678   15    277123   13    666.5    1
4096    noatime  default default 40     10        159806   27    106396   15    276359   13    525.1    1
8192    noatime  default default 40     10        122381   21    103858   15    270377   13    658.0    1


Thank you,

Ramsey
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning read ahead

2013-05-16 Thread Ramsey Gurley

On May 16, 2013, at 6:01 AM, Shaun Thomas wrote:

> On 05/15/2013 08:04 PM, Ramsey Gurley wrote:
> 
>> My question: Is that advice just for the database drive, or should I
>> increase read ahead on the OS/WAL disk as well?
> 
> Definitely the database drive, but it doesn't hurt to do both. It doesn't 
> mention it in the book, but if you have a Debian or Ubuntu system, you can 
> set it up to retain these settings through reboots very easily. The udev 
> system can be set with rules that can target whole ranges of devices. Here's 
> one we use:
> 
> * In a file named /etc/udev/rules.d/20-pg.rules
> 
> ACTION=="add|change", KERNEL=="sd[a-z]",ATTR{queue/read_ahead_kb}="4096"
> 
> Our systems are also NVRAM based, so we also throw in a NOOP access scheduler:
> 
> ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="noop"
> 
> There's really no reason to do it any other way if you have udev installed. 
> You *could* put blockdev calls in /etc/rc.local I suppose, but udev applies 
> rules at device detection, which can be beneficial.


Interesting point. I had not considered whether the setting would be maintained 
through reboots. I'll have to google for the appropriate settings on Red Hat.


>> I assume both. I should ask the same for noatime advice while I'm at
>> it.
> 
> You can probably get away with relatime, which is the default for most modern 
> systems these days.



I will probably go with noatime on the data drive then. I see where that would 
require lots of reads and should not be writing to the drive. In my mind, WAL 
should be read much less frequently. Maybe I am wrong about that :-)

Thank you,

Ramsey



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning read ahead

2013-05-16 Thread Shaun Thomas

On 05/15/2013 08:04 PM, Ramsey Gurley wrote:


My question: Is that advice just for the database drive, or should I
increase read ahead on the OS/WAL disk as well?


Definitely the database drive, but it doesn't hurt to do both. It 
doesn't mention it in the book, but if you have a Debian or Ubuntu 
system, you can set it up to retain these settings through reboots very 
easily. The udev system can be set with rules that can target whole 
ranges of devices. Here's one we use:


* In a file named /etc/udev/rules.d/20-pg.rules

ACTION=="add|change", KERNEL=="sd[a-z]",ATTR{queue/read_ahead_kb}="4096"

Our systems are also NVRAM based, so we also throw in a NOOP access 
scheduler:


ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="noop"

There's really no reason to do it any other way if you have udev 
installed. You *could* put blockdev calls in /etc/rc.local I suppose, 
but udev applies rules at device detection, which can be beneficial.
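
If you do go the udev route on Red Hat as well, the rules can usually be
applied and checked without a reboot; a quick sketch (device names are just
examples):

   udevadm control --reload-rules
   udevadm trigger --subsystem-match=block
   # verify: this value is in KB, so 4096 means 4 MB of read-ahead
   cat /sys/block/sda/queue/read_ahead_kb
   # the equivalent one-off change with blockdev uses 512-byte sectors
   blockdev --getra /dev/sda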



I assume both. I should ask the same for noatime advice while I'm at
it.


You can probably get away with relatime, which is the default for most 
modern systems these days.
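
For the data filesystem specifically, the usual place to pin that down is
the mount options; a sketch only, with example device and mount point
(adjust to your actual layout):

   # in /etc/fstab (example entry):
   #   /dev/sdb1   /var/lib/pgsql   ext3   noatime   0   2
   # or apply it immediately without a reboot:
   mount -o remount,noatime /var/lib/pgsql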


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning read ahead

2013-05-15 Thread Ramsey Gurley
Hi all,

I've just gotten into my new database server yesterday and I've started doing 
database setup and tuning.

I'm on a Rackspace Linux server with two raid arrays. Both are ext3. One is a 
two disk RAID1 I plan on using for WAL and OS, the other is a four disk RAID10 
I will use for the data. 

I read in Postgres 9.0 High Performance that one of the most important 
parameters I should tune is the device read-ahead. 

My question: Is that advice just for the database drive, or should I increase 
read ahead on the OS/WAL disk as well? I assume both. I should ask the same for 
noatime advice while I'm at it. Is it important to disable atime on the WAL as 
well as the data?

Thanks,

Ramsey

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning / performance questions

2012-11-06 Thread Craig Ringer
On 11/07/2012 01:29 PM, Bryan Montgomery wrote:
> Thanks for the insight. I know why some of the settings were set, but
> there are others that have been highlighted that now don't make sense.
> I'll strongly recommend adjusting those now.
Good. If you've been running with fsync=off all this time I hope you've
also been keeping good backups.

BTW, I should've sent you a link to
http://wiki.postgresql.org/wiki/Number_Of_Database_Connections in my
prior email, where I recommended connection pooling.

--
Craig Ringer


Re: [GENERAL] Tuning / performance questions

2012-11-06 Thread Bryan Montgomery
Thanks for the insight. I know why some of the settings were set, but there
are others that have been highlighted that now don't make sense. I'll
strongly recommend adjusting those now.

On Tue, Nov 6, 2012 at 8:54 PM, Craig Ringer  wrote:

>  Thanks for including your configuration and version; it makes things
> much easier.
>
> Reply follows inline.
>
>
> On 11/06/2012 09:04 PM, Bryan Montgomery wrote:
>
>  I'm wondering what general ways there are to monitor and improve
> performance? We look at pgadmin's server status but that only sees the
> function being run. Additionally, is there a good way to 'explain' a
> function? Or do you have to execute the function steps individually and
> explain those?
>
>
> See the auto_explain contrib module. It can explain statements within
> functions, as well as the functions themselves.
>
>  
> http://www.postgresql.org/docs/current/static/auto-explain.html
>
>
>   The server typically has up to 500 connections with a max of 750
> connections.
>
> Get a connection pooler. Urgently. See
> http://wiki.postgresql.org/wiki/PgBouncer . It is extremely unlikely that
> your server is running efficiently with that many concurrent connections
> actively working. Reducing it to (say) 100 and using transaction-level
> connection pooling may boost performance significantly.
>
>  work_mem = 512MB    # min 64kB
>
> That's really dangerous with your connection count. If many connections
> actually use that, you'll run out of RAM in a hurry and enter a nasty
> paging storm. If possible, reduce it, then raise it selectively in
> transactions where you know a high work_mem is needed.
>
>   fsync = off # turns forced synchronization
> on or off
>
>
> So you don't value your data and don't mind if you lose all of it,
> permanently and unrecoverably, if your server loses power or the host OS
> hard crashes?
>
> It's much safer to use `synchronous_commit = off` and a commit_delay. If
> that isn't enough, get fast-flushing storage like a good raid controller
> with a battery backed cache you can put in write-back mode, or some high
> quality SSDs with power-protected write caches.
>
>   full_page_writes = off  # recover from partial page
> writes
>
> As above: I hope your data isn't important to you.
>
> --
> Craig Ringer
>


Re: [GENERAL] Tuning / performance questions

2012-11-06 Thread Craig Ringer
Thanks for including your configuration and version; it makes things
much easier.

Reply follows inline.

On 11/06/2012 09:04 PM, Bryan Montgomery wrote:
> I'm wondering what general ways there are to monitor and improve
> performance? We look at pgadmin's server status but that only sees the
> function being run. Additionally, is there a good way to 'explain' a
> function? Or do you have to execute the function steps individually
> and explain those?

See the auto_explain contrib module. It can explain statements within
functions, as well as the functions themselves.

http://www.postgresql.org/docs/current/static/auto-explain.html


> The server typically has up to 500 connections with a max of 750
> connections.
>
Get a connection pooler. Urgently. See
http://wiki.postgresql.org/wiki/PgBouncer . It is extremely unlikely
that your server is running efficiently with that many concurrent
connections actively working. Reducing it to (say) 100 and using
transaction-level connection pooling may boost performance significantly.
>
> work_mem = 512MB    # min 64kB
>
That's really dangerous with your connection count. If many connections
actually use that, you'll run out of RAM in a hurry and enter a nasty
paging storm. If possible, reduce it, then raise it selectively in
transactions where you know a high work_mem is needed.
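
A minimal sketch of that pattern (the query is only a placeholder for
whatever report actually needs the big sort or hash):

   psql nrgdb <<'SQL'
   BEGIN;
   SET LOCAL work_mem = '512MB';   -- applies to this transaction only
   -- ... run the one heavyweight reporting query here ...
   COMMIT;
   SQL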

> fsync = off # turns forced synchronization
> on or off

So you don't value your data and don't mind if you lose all of it,
permanently and unrecoverably, if your server loses power or the host OS
hard crashes?

It's much safer to use `synchronous_commit = off` and a commit_delay. If
that isn't enough, get fast-flushing storage like a good raid controller
with a battery backed cache you can put in write-back mode, or some high
quality SSDs with power-protected write caches.

> full_page_writes = off  # recover from partial page writes
>
As above: I hope your data isn't important to you.

--
Craig Ringer


Re: [GENERAL] Tuning / performance questions

2012-11-06 Thread Kevin Grittner
Bryan Montgomery wrote:

> We have a system, that due to recent events is getting a lot
> heavier use. The application makes extensive use of functions.
> These functions would typically run sub-second but now can take
> several seconds.

> The server is running on suse 11.4 with 8 vcpu and 32Gb ram on a
> virtual machine.
> 
> Running pg_version returns 'PostgreSQL 9.1.0 on
> x86_64-unknown-linux-gnu, compiled by gcc (SUSE Linux) 4.5.1
> 20101208 [gcc-4_5-branch revision 167585], 64-bit' and select
> pg_size_pretty(pg_database_size('nrgdb')); returns 63 GB.

> The server typically has up to 500 connections with a max of 750
> connections.

> max_connections = 750 # (change requires restart)

> temp_buffers = 64MB # min 800kB

> work_mem = 512MB # min 64kB

750 * ((512 MB) + (64 MB)) = 421.87500 gigabytes

Once a connection uses memory for temp_buffers, it doesn't release it
for as long as that connection runs. Each connection can allocate
work_mem for each node of a query plan. Even though maximum usage can
be more than one allocation per connection, rule of thumb is to
assume just one. You are up to needing 422 GB + shared memory + OS
space (including some buffers and cache) + whatever else you run on
this VM. You have 32 GB. You will almost certainly have problems at
high load.

Try putting pgbouncer in front of the database, configured to use
transaction mode and accept 750 user connections while only keeping
20 or so database connections open.
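
A minimal pgbouncer.ini sketch of that shape (host, auth details and the
exact pool size are only placeholders to illustrate transaction pooling):

   [databases]
   nrgdb = host=127.0.0.1 port=5432 dbname=nrgdb

   [pgbouncer]
   listen_port = 6432
   auth_type = md5
   auth_file = /etc/pgbouncer/userlist.txt
   pool_mode = transaction
   max_client_conn = 750
   default_pool_size = 20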

> max_prepared_transactions = 250 # can be 0 or more

Are you really using two phase commit and a transaction manager?
(Don't confuse this setting with something related to prepared
*statements* -- prepared *transactions* are a separate issue.)  Even
if you are using prepared transactions, do you really expect your
transaction manager to let 250 transactions pile up in the database
between the first and second phase of commit?

BTW, you should be monitoring for old prepared transactions (the
pg_prepared_xacts view) to prevent problems with bloat.
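
Something like this, run from cron or your monitoring system, is usually
enough (the one-hour threshold is arbitrary):

   psql -d nrgdb -c "SELECT gid, prepared, owner, database
                       FROM pg_prepared_xacts
                      WHERE prepared < now() - interval '1 hour'
                      ORDER BY prepared;"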

> fsync = off
> full_page_writes = off

You didn't mention your backup scheme, but be prepared for the fact
that with these settings, if the VM (or its underlying OS or
hardware) fails, your database will be corrupted and you may have no
choice but to use your backup.

> commit_delay = 1000 # range 0-100000, in microseconds
>
> commit_siblings = 5 # range 1-1000

These settings are notoriously hard to configure from the default
without actually making things worse. Be very sure you know what you
are doing and have carefully benchmarked this against your real
workload; otherwise it is probably better to put these back to the
defaults.

There may be some other fine-tuning opportunities, but these issues
should be fixed first, and it would be best to have an actual query
that is performing poorly to try to tune some of the other settings.

-Kevin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning / performance questions

2012-11-06 Thread Bryan Montgomery
Hello experts! (and others like me).

We have a system, that due to recent events is getting a lot heavier use.
The application makes extensive use of functions. These functions would
typically run sub-second but now can take several seconds.

I'm wondering what general ways there are to monitor and improve
performance? We look at pgadmin's server status but that only sees the
function being run. Additionally, is there a good way to 'explain' a
function? Or do you have to execute the function steps individually and
explain those?

The server is running on suse 11.4 with 8 vcpu and 32Gb ram on a virtual
machine.

Running pg_version returns 'PostgreSQL 9.1.0 on x86_64-unknown-linux-gnu,
compiled by gcc (SUSE Linux) 4.5.1 20101208 [gcc-4_5-branch revision
167585], 64-bit' and select pg_size_pretty(pg_database_size('nrgdb'));
returns 63 GB.

The server typically has up to 500 connections with a max of 750
connections.

Below are the non-default values of our configuration file.

Any thoughts on what we should look at?

Thanks,

Bryan

listen_addresses = '*'          # what IP address(es) to listen on
max_connections = 750           # (change requires restart)
superuser_reserved_connections = 9   # (change requires restart)
shared_buffers = 8192MB         # min 128kB or max_connections*16kB
temp_buffers = 64MB             # min 800kB
max_prepared_transactions = 250 # can be 0 or more
work_mem = 512MB                # min 64kB
maintenance_work_mem = 1GB      # min 1MB
fsync = off                     # turns forced synchronization on or off
full_page_writes = off          # recover from partial page writes
wal_buffers = 16MB              # min 32kB
commit_delay = 1000             # range 0-100000, in microseconds
commit_siblings = 5             # range 1-1000
checkpoint_segments = 50        # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min       # range 30s-1h
checkpoint_warning = 1min       # 0 is off
effective_cache_size = 16GB
log_destination = 'stderr'      # Valid values are combinations of
logging_collector = on
log_directory = '/var/log/postgres'   # Directory where log files are written
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'   # Log file name pattern
log_rotation_age = 1d           # Automatic rotation of logfiles will
log_rotation_size = 20MB        # Automatic rotation of logfiles will
log_min_messages = info         # Values, in order of decreasing detail:
log_min_duration_statement = 1  # -1 is disabled, 0 logs all statements
log_line_prefix = '%t %p %u@%h: '     # Special values for Pgsi
log_statement = 'none'          # none, ddl, mod, all
log_duration = off
autovacuum = on                 # enable autovacuum subprocess?
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'     # locale for system error message
lc_monetary = 'en_US.UTF-8'     # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'      # locale for number formatting
lc_time = 'en_US.UTF-8'         # locale for time formatting


Re: [GENERAL] Tuning PostgreSQL for very large database

2011-11-06 Thread John R Pierce

On 11/06/11 8:51 AM, René Fournier wrote:
Just wondering what I can do to squeeze out more performance of my 
database application? Here's my configuration:



- Mac mini server

- Core i7 quad-core at 2GHz

- 16GB memory
- Dedicated fast SSD (two SSDs in the server)
- Mac OS X 10.7.2 (*not* using OS X Server)

- PostgreSQL 9.0.5
- PostGIS 1.5.3
- Tiger Geocoder 2010 database (from build scripts from
http://svn.osgeo.org/postgis/trunk/extras/tiger_geocoder/tiger_2010/)
- Database size: ~90GB

I should say, this box does more than PostgreSQL 
geocoding/reverse-geocoding, so reasonably only half of the memory 
should be allotted to PostgreSQL.


Coming from MySQL, I would normally play with the my.cnf, using 
my-huge.cnf as a start. But I'm new to PostgreSQL and PostGIS (w/ a 
big database), so I was wondering if anyone had suggestions on tuning 
parameters (also, which files, etc.) Thanks!


postgresql.conf in the postgres 'data' directory is the only postgresql 
file you should have to touch. you -will- also need to increase the 
OSX "kernel.shmmax" and 'kernel.shmall' parameters (I'd set these to 4 
gigabytes each, note that in most 'nix systems shmall is NOT in bytes), 
I can not help you do this as I only know how to do it on 
linux/solaris/aix...


in postgresql.conf, given what you've said above, and assuming your 
application uses relatively few concurrent connections (say, no more 
than a few dozen), I'd try something like...


shared_buffers = 1024MB
maintenance_work_mem = 512MB
work_mem = 128MB
effective_cache_size = 4096MB

if you expect 100s of concurrent connections, reduce work_mem accordingly.

I'm assuming your database workload is read-mostly, and that you're not 
going to be doing a high rate of transactional operations with 
updates/inserts.  if you /are/ getting into 100s/1000s of write 
transactions/second, then you'll want to watch your postgres logfiles 
and increase...


checkpoint_segments = **

such that ** is large enough that you no longer get any 
checkpoints-too-frequent warnings. On one heavy OLTP transaction server 
recently, I had to increase the default 3 to like 100 to get to a happy 
place.   Increasing wal_buffers is probably a good idea too in these 
cases, but I'm suspecting this doesn't apply to you.
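
A cheap way to tell whether you are in that situation is to watch the
server log for the standard warning; a sketch only (the log location
depends on your install and logging settings):

   grep -i "checkpoints are occurring too frequently" "$PGDATA"/pg_log/*.log
   # current setting, for reference
   psql -c "SHOW checkpoint_segments"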



--
john r pierceN 37, W 122
santa cruz ca mid-left coast



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning PostgreSQL for very large database

2011-11-06 Thread René Fournier
Just wondering what I can do to squeeze out more performance of my database 
application? Here's my configuration:

- Mac mini server
- Core i7 quad-core at 2GHz
- 16GB memory
- Dedicated fast SSD (two SSDs in the server)
- Mac OS X 10.7.2 (*not* using OS X Server)
- PostgreSQL 9.0.5
- PostGIS 1.5.3 
- Tiger Geocoder 2010 database (from build scripts from 
http://svn.osgeo.org/postgis/trunk/extras/tiger_geocoder/tiger_2010/)
- Database size: ~90GB

I should say, this box does more than PostgreSQL geocoding/reverse-geocoding, 
so reasonably only half of the memory should be allotted to PostgreSQL.

Coming from MySQL, I would normally play with the my.cnf, using my-huge.cnf as 
a start. But I'm new to PostgreSQL and PostGIS (w/ a big database), so I was 
wondering if anyone had suggestions on tuning parameters (also, which files, 
etc.) Thanks!

…Rene

Re: [GENERAL] Tuning Variables For PostgreSQL

2011-10-06 Thread Dickson S. Guedes
2011/10/6 Carlos Mennens :
> I read all the time that most DBA's are required or should tune their
> DBMS which obviously in my case would be PostgreSQL but I'm curious
> what exactly is involved when tuning a DBMS like PostgreSQL. What are
> some of the basic functions involved when tuning? Are there specific
> things I should tweak on a newly built server freshly installed with
> PostgreSQL? Can someone please clarify what exactly most people do
> when they "tune"? Lastly I'm sure this has been discussed but after a
> Google search, I can't find any updated info since 2009 so I would
> like to know specifically what file system you've found PostgreSQL to
> work on the best? What file system will give me the best performance
> and stability on disk? I'm currently running it on the Linux Ext4 file
> system and have not had any issues but I was wondering if there was
> anything out there more suited to perform better on PostgreSQL.

I suggest you read the wiki [1]. It contains a lot of useful information.

[1] http://wiki.postgresql.org/wiki/Performance_Optimization

-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning Variables For PostgreSQL

2011-10-06 Thread Carlos Mennens
I read all the time that most DBA's are required or should tune their
DBMS which obviously in my case would be PostgreSQL but I'm curious
what exactly is involved when tuning a DBMS like PostgreSQL. What are
some of the basic functions involved when tuning? Are there specific
things I should tweak on a newly built server freshly installed with
PostgreSQL? Can someone please clarify what exactly most people do
when they "tune"? Lastly I'm sure this has been discussed but after a
Google search, I can't find any updated info since 2009 so I would
like to know specifically what file system you've found PostgreSQL to
work on the best? What file system will give me the best performance
and stability on disk? I'm currently running it on the Linux Ext4 file
system and have not had any issues but I was wondering if there was
anything out there more suited to perform better on PostgreSQL.

Thanks!

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning for a tiny database

2011-06-23 Thread CSS
Steve Atkins wrote:
> On Jun 20, 2011, at 10:49 PM, CSS wrote:
> 
>> Hello,
>>
>> I couldn't find much information in the archives on this -- perhaps this
>> is a bit of a specialized need, but I was hoping for some input from
>> some experienced postgres admins.
>>
>> I'm moving some DNS servers from djbdns/tinydns to PowerDNS.  While it
>> supports many backends, postgresql seems like the best choice for us
>> since it's what is used elsewhere (in larger configurations).  As a bit
>> of background, PowerDNS does not query the db for every incoming DNS
>> query, it caches at various levels (both a "packet cache" and a db
>> cache), so its database needs are quite modest.
>>
>> Some raw numbers: We're only looking at a total of about six tables in
>> one db.  In total there are going to be well under 10,000 records in ALL
>> tables.  That might increase to at most 100,000 in the next few years.
>> Our raw DNS queries/second tops out around 50 qps over three distinct
>> servers.  Keeping in mind that PowerDNS is doing heavy caching, we
>> should never really see more than a few db queries per second.  There
>> will be one "master" pgsql db and three slaves using streaming replication.
>>
>> Now given the number of records and the frequency of queries, how should
>> I be tuning for such a small setup?  Ideally PowerDNS with its huge
>> amount of caching should get as much RAM and CPU as I can give it, but I
>> do want to ensure the tiny bit of data postgres has is stuck in physical
>> memory as well.
>>
>> What would you suggest for this scenario?
> 
> I'm doing pretty much the same thing. I installed postgresql 8.4.something and
> powerdns from the ubuntu repository on a small VM and hit go. It's running 
> just
> fine, at comparable query rates and maybe 5,000 records. At some point I'll
> tune it, just out of tidiness, but it's fine out of the box.

That's pretty much what I'm seeing.  As I mentioned in my last email in
this thread, the only other test case I need to poke around with is
hitting the powerdns server with a bunch of stuff it's not authoritative
for (forcing a lookup in the domains table) and non-existent records in
domains it is authoritative for (forcing a lookup in the records table).

> You'll also need replication, probably from a stealth master. I found bucardo
> very easy to set up for that, and it works nicely to replicate to multiple 
> public
> servers from a single master server inside the firewall.

Already there - I have a hidden master db server and then the three
public powerdns instances all slave from that.  I'm using the
"streaming" replication option in 9.0 and so far it seems to be working
well.  I need to look more closely at the whole log-shipping thing in
case a slave somehow gets cut off from the master, but in initial
testing, it looks really good.  My only problem was I initially was
putting some of these on older hosts and I ran into the 32 vs. 64 bit
problem in copying the data directory across.

Thanks for your input!

Charles

> Cheers,
>   Steve
> 
> 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning for a tiny database

2011-06-23 Thread CSS
Greg Smith wrote:
> On 06/21/2011 01:49 AM, CSS wrote:
>> Some raw numbers: We're only looking at a total of about six tables in
>> one db.  In total there are going to be well under 10,000 records in ALL
>> tables.  That might increase to at most 100,000 in the next few years.
>> Our raw DNS queries/second tops out around 50 qps over three distinct
>> servers.  Keeping in mind that PowerDNS is doing heavy caching, we
>> should never really see more than a few db queries per second.
> 
> I doubt you really need to do any tuning for this scenario.  I would set
> shared_buffers to a modest value--maybe 256MB--and stop further tuning
> until there's some evidence it's necessary.

Sounds good.  I'm at... wait for it... 8MB now.  This seems to work.
Since I originally posted this I've been benchmarking powerdns and I've
found that with it only hitting one core I can do about 50,000
queries/second.  During that time, Postgres is not even a blip.  I think
with the heavy caching powerdns does, it just doesn't really hit
postgres until it sees something in the cache has reached the max TTL.

In short, heavy DNS query traffic is not in turn causing heavy database
traffic.  This might change with someone pummeling it with queries for
domains or hosts that don't exist, but I believe it handles negative
caching of records as well.

> If presented with the same problem but with the much harder twist "I
> need to support >10,000 queries/second", I would recommend:
> 
> -Populate a prototype with a representative amount of data
> -Measure the database size
> -Set shared_buffers to that

Thank you.  I'm fairly green with Postgres, so this is very helpful.

> -Whenever the database is restarted, construct a series of queries that
> forces all the data used regularly into the database's cache
> -Use pg_buffercache to confirm what's in there is what you expect

I need to read up on that module, but it looks quite intriguing.

> Getting all the data into cache is sometimes harder than expected.  Some
> optimizations in PostgreSQL keep it from caching large amount of tables
> when you do a sequential scan of the contents, as one example that
> complicates things.  But if you get to where this is necessary, building
> such a tool isn't difficult, and there are some projects out there that
> address this particular need:  filling the cache back up with relevant
> data after restart.  This is the main one:
> 
> http://pgfoundry.org/projects/pgfincore/
> http://www.pgcon.org/2010/schedule/events/261.en.html

Thanks again, I appreciate all your input on what's currently looking
like a non-problem.  But there are other places I can certainly use this
info.

Charles


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning for a tiny database

2011-06-22 Thread Steve Atkins

On Jun 20, 2011, at 10:49 PM, CSS wrote:

> Hello,
> 
> I couldn't find much information in the archives on this -- perhaps this
> is a bit of a specialized need, but I was hoping for some input from
> some experienced postgres admins.
> 
> I'm moving some DNS servers from djbdns/tinydns to PowerDNS.  While it
> supports many backends, postgresql seems like the best choice for us
> since it's what is used elsewhere (in larger configurations).  As a bit
> of background, PowerDNS does not query the db for every incoming DNS
> query, it caches at various levels (both a "packet cache" and a db
> cache), so it's database needs are quite modest. 
> 
> Some raw numbers: We're only looking at a total of about six tables in
> one db.  In total there are going to be well under 10,000 records in ALL
> tables.  That might increase to at most 100,000 in the next few years.
> Our raw DNS queries/second tops out around 50 qps over three distinct
> servers.  Keeping in mind that PowerDNS is doing heavy caching, we
> should never really see more than a few db queries per second.  There
> will be one "master" pgsql db and three slaves using streaming replication.
> 
> Now given the number of records and the frequency of queries, how should
> I be tuning for such a small setup?  Ideally PowerDNS with its huge
> amount of caching should get as much RAM and CPU as I can give it, but I
> do want to ensure the tiny bit of data postgres has is stuck in physical
> memory as well.
> 
> What would you suggest for this scenario?

I'm doing pretty much the same thing. I installed postgresql 8.4.something and
powerdns from the ubuntu repository on a small VM and hit go. It's running just
fine, at comparable query rates and maybe 5,000 records. At some point I'll
tune it, just out of tidiness, but it's fine out of the box.

You'll also need replication, probably from a stealth master. I found bucardo
very easy to set up for that, and it works nicely to replicate to multiple 
public
servers from a single master server inside the firewall.

Cheers,
  Steve


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning for a tiny database

2011-06-21 Thread Greg Smith

On 06/21/2011 01:49 AM, CSS wrote:

Some raw numbers: We're only looking at a total of about six tables in
one db.  In total there are going to be well under 10,000 records in ALL
tables.  That might increase to at most 100,000 in the next few years.
Our raw DNS queries/second tops out around 50 qps over three distinct
servers.  Keeping in mind that PowerDNS is doing heavy caching, we
should never really see more than a few db queries per second.


I doubt you really need to do any tuning for this scenario.  I would set 
shared_buffers to a modest value--maybe 256MB--and stop further tuning 
until there's some evidence it's necessary.


If presented with the same problem but with the much harder twist "I 
need to support >10,000 queries/second", I would recommend:


-Populate a prototype with a representative amount of data
-Measure the database size
-Set shared_buffers to that
-Whenever the database is restarted, construct a series of queries that 
forces all the data used regularly into the database's cache

-Use pg_buffercache to confirm what's in there is what you expect

Getting all the data into cache is sometimes harder than expected.  Some 
optimizations in PostgreSQL keep it from caching large amount of tables 
when you do a sequential scan of the contents, as one example that 
complicates things.  But if you get to where this is necessary, building 
such a tool isn't difficult, and there are some projects out there that 
address this particular need:  filling the cache back up with relevant 
data after restart.  This is the main one:


http://pgfoundry.org/projects/pgfincore/
http://www.pgcon.org/2010/schedule/events/261.en.html
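
As for the pg_buffercache check above, the stock query from its
documentation is usually all you need; a sketch (database name is an
example; on 9.1+ you can CREATE EXTENSION pg_buffercache, on 9.0 you
install the contrib SQL script first):

   psql -d powerdns <<'SQL'
   -- top relations currently sitting in shared_buffers
   SELECT c.relname, count(*) AS buffers
     FROM pg_buffercache b
     JOIN pg_class c
       ON b.relfilenode = pg_relation_filenode(c.oid)
      AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                 WHERE datname = current_database()))
    GROUP BY c.relname
    ORDER BY 2 DESC
    LIMIT 10;
   SQL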

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning for a tiny database

2011-06-21 Thread Vincent Veyron
Le mardi 21 juin 2011 à 01:49 -0400, CSS a écrit :

> I couldn't find much information in the archives on this -- perhaps this
> is a bit of a specialized need, but I was hoping for some input from
> some experienced postgres admins.
> 
Hi,

I am not an experienced postgres admin, but I am an experienced reader
of this list, and from what I gather, the figures you quote are small
enough that you probably can work out of the box without doing anything,
unless DNS servers have special needs I am not aware of.

> I'm moving some DNS servers from djbdns/tinydns to PowerDNS.  While it
> supports many backends, postgresql seems like the best choice for us
> since it's what is used elsewhere (in larger configurations).  As a bit
> of background, PowerDNS does not query the db for every incoming DNS
> query, it caches at various levels (both a "packet cache" and a db
> cache), so it's database needs are quite modest.
> 
> Some raw numbers: We're only looking at a total of about six tables in
> one db.  In total there are going to be well under 10,000 records in ALL
> tables.  That might increase to at most 100,000 in the next few years.


Unless those DNS records are large (I guess not), the db should reside
entirely in memory, at least at the beginning. I am also guessing, but
you should validate it, that your tables, with roughly 1,500 records on
average, probably don't even need an index, as the engine normally does
sequential scans over small datasets.
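
You can confirm that directly with EXPLAIN; a sketch against the usual
PowerDNS schema (table and column names assumed from the gpgsql backend,
adjust as needed):

   psql -d powerdns -c "EXPLAIN ANALYZE
       SELECT content, ttl FROM records
        WHERE name = 'www.example.com' AND type = 'A';"
   # on a table this small the planner may well pick a Seq Scan even if an
   # index exists, because reading the handful of pages is cheaper than an
   # index probe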


-- 
Vincent Veyron
http://marica.fr/
Logiciel de gestion des sinistres et des contentieux pour le service juridique


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning for a tiny database

2011-06-20 Thread CSS
Hello,

I couldn't find much information in the archives on this -- perhaps this
is a bit of a specialized need, but I was hoping for some input from
some experienced postgres admins.

I'm moving some DNS servers from djbdns/tinydns to PowerDNS.  While it
supports many backends, postgresql seems like the best choice for us
since it's what is used elsewhere (in larger configurations).  As a bit
of background, PowerDNS does not query the db for every incoming DNS
query, it caches at various levels (both a "packet cache" and a db
cache), so its database needs are quite modest.

Some raw numbers: We're only looking at a total of about six tables in
one db.  In total there are going to be well under 10,000 records in ALL
tables.  That might increase to at most 100,000 in the next few years.
Our raw DNS queries/second tops out around 50 qps over three distinct
servers.  Keeping in mind that PowerDNS is doing heavy caching, we
should never really see more than a few db queries per second.  There
will be one "master" pgsql db and three slaves using streaming replication.

Now given the number of records and the frequency of queries, how should
I be tuning for such a small setup?  Ideally PowerDNS with its huge
amount of caching should get as much RAM and CPU as I can give it, but I
do want to ensure the tiny bit of data postgres has is stuck in physical
memory as well.

What would you suggest for this scenario?

Thanks,

Charles

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning for a tiny database

2011-06-04 Thread CSS
Hello,

I couldn't find much information in the archives on this -- perhaps this
is a bit of a specialized need, but I was hoping for some input from
some experienced pgsql admins.

I'm moving some DNS servers from djbdns/tinydns to PowerDNS.  While it
supports many backends, postgres seems like the best choice for us since
it's what is used elsewhere (in larger configurations).  As a bit of
background, PowerDNS does not query the db for every incoming DNS query,
it caches at various levels (both a "packet cache" and a db cache), so
its database needs are quite modest.

Some raw numbers: We're only looking at a total of about six tables in
one db.  In total there are going to be well under 10,000 records in ALL
tables.  That might increase to at most 100,000 in the next few years at
most.  Our raw DNS queries/second tops out around 50 qps over three
distinct servers.  Keeping in mind that PowerDNS is doing heavy caching,
we should never really see more than a few db queries per second.  There
will be one "master" pgsql db and two slaves using streaming replication.

Now given the number of records and the frequency of queries, how should
I be tuning for such a small setup?  Ideally PowerDNS with its huge
amount of caching should get as much RAM and CPU as I can give it, but I
do want to ensure the tiny bit of data postgres has is stuck in physical
memory as well.

What would you suggest for this scenario?

Thanks,

Charles

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning on ec2

2011-04-27 Thread Toby Corkindale

On 27/04/11 01:34, Joel Reymont wrote:

On Apr 26, 2011, at 4:31 PM, Scott Marlowe wrote:

What's your work_mem and max_connections set to?


I have the default settings, e.g. work_mem = 1MB and max_connections = 100.

I'm looking to process 400 requests per second, though. What should I use for 
the above?


Those two options don't really affect the number of requests per second, 
as I understand things. They're more about how many simultaneous 
requests you can have going.


Make sure you have sensible SQL for the queries you're making, and have 
some fast disks. If you're doing a lot of read-only queries, consider 
whether you can cache the results in your application rather than 
requesting them from the DB every time.


-Toby

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning on ec2

2011-04-26 Thread Scott Marlowe
On Tue, Apr 26, 2011 at 9:34 AM, Joel Reymont  wrote:
>
> On Apr 26, 2011, at 4:31 PM, Scott Marlowe wrote:
>
>> It's a reasonable start.  However, if you're consistently using less than
>> that in aggregate then lowering it is fine.
>
> Is there a way to tell if I consistently use less than that in aggregate?

Keep an eye on shr for postgres processes in top.   It should
eventually reach about the same size as your shared_buffers if you're
using it.  If it stays at 500M or something then you don't need as
much shared_buffers.

>> What's your work_mem and max_connections set to?
>
> I have the default settings, e.g. work_mem = 1MB and max_connections = 100.
>
> I'm looking to process 400 requests per second, though. What should I use for 
> the above?

Those settings are fine to start.  Look at connection pooling, as
having too many backends can cause the db to run slower.  Generally 2x
or so as many cores as you have is a good place to start with how many
connections to keep alive by the pooler.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning on ec2

2011-04-26 Thread Joel Reymont

On Apr 26, 2011, at 4:31 PM, Scott Marlowe wrote:

> It's a reasonable start.  However, if you're consistently using less than
> that in aggregate then lowering it is fine.

Is there a way to tell if I consistently use less than that in aggregate?

> What's your work_mem and max_connections set to?

I have the default settings, e.g. work_mem = 1MB and max_connections = 100.

I'm looking to process 400 requests per second, though. What should I use for 
the above?

Thanks, Joel

--
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
-++---
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
-++---




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning on ec2

2011-04-26 Thread Scott Marlowe
On Tue, Apr 26, 2011 at 8:59 AM, Joel Reymont  wrote:
> I'm running pgsql on an m1.large EC2 instance with 7.5gb available memory.
>
> The free command shows 7gb of free+cached. My understanding from the docs is
> that I should dedicate 1.75gb to shared_buffers (25%) and set
> effective_cache_size to 7gb.
>
> Is this correct? I'm running 64-bit Ubuntu 10.10, e.g.

It's a reasonable start.  However, if you're consistently using less than
that in aggregate then lowering it is fine.  What's your work_mem and
max_connections set to?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] tuning on ec2

2011-04-26 Thread Joel Reymont
I'm running pgsql on an m1.large EC2 instance with 7.5gb available memory. 

The free command shows 7gb of free+cached. My understanding from the docs is
that I should dedicate 1.75gb to shared_buffers (25%) and set effective_cache_size
to 7gb. 

Is this correct? I'm running 64-bit Ubuntu 10.10, e.g. 

Linux ... 2.6.35-28-virtual #50-Ubuntu SMP Fri Mar 18 19:16:26 UTC 2011 x86_64 
GNU/Linux

Thanks, Joel

--
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
-++---
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
-++---




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning postgresql writes to disk

2011-02-09 Thread Vick Khera
On Mon, Feb 7, 2011 at 7:43 PM, Vinubalaji Gopal wrote:

> Thank you. I will try to run some performance tests using the async
> commit option. Is there an easy way to find the lost transactions or
> does it have to be handled by the application?
>

By definition, your application has to be able to detect it. If the DB were
able to find them, they wouldn't be lost.

Personally, I turn off synchronous commit on a per-connection basis based on
what my application is doing. When I insert or update data, I need to ensure
it is there; when I move data around or do batch processing on it, it is
usually a restartable operation.
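
A per-session sketch of that pattern (table and column names are only
placeholders):

   psql -d mydb <<'SQL'
   -- batch/move job: losing the last few transactions on a crash is
   -- acceptable because the job can simply be re-run
   SET synchronous_commit = off;
   INSERT INTO reporting_copy SELECT * FROM live_table WHERE processed;
   SQL

For the inserts and updates that must survive a crash, just leave
synchronous_commit at its default on that connection.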


Re: [GENERAL] tuning postgresql writes to disk

2011-02-07 Thread Vinubalaji Gopal
Thank you. I will try to run some performance tests using the async
commit option. Is there an easy way to find the lost transactions or
does it have to be handled by the application?

On Mon, Feb 7, 2011 at 6:23 AM, Vick Khera  wrote:
> On Thu, Feb 3, 2011 at 7:08 PM, Vinubalaji Gopal  wrote:
>> already does this.  I looked at the WAL parameters and the new async
>> commit  but not sure if I am looking at the right place. Say i have 10
>> clients connecting and each client is inserting a record. I want to
>>
>
> You want the async commit.  If you can detect and re-execute "lost"
> transactions, it gives you the best of everything: disk I/O is deferred and
> transaction boundaries are honored, so you never have inconsistent data
> after crash recovery.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
Vinu

In a world without fences who needs Gates?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning postgresql writes to disk

2011-02-07 Thread Vick Khera
On Thu, Feb 3, 2011 at 7:08 PM, Vinubalaji Gopal  wrote:
> already does this.  I looked at the WAL parameters and the new async
> commit  but not sure if I am looking at the right place. Say i have 10
> clients connecting and each client is inserting a record. I want to
>

You want the async commit.  If you can detect and re-execute "lost"
transactions, it gives you the best of everything: disk I/O is deferred and
transaction boundaries are honored, so you never have inconsistent data
after crash recovery.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] tuning postgresql writes to disk

2011-02-03 Thread Vinubalaji Gopal
Hi,
  I was wondering if PostgreSQL can be tuned such that it writes to
disk every n seconds or until the buffer reaches a certain threshold,
to tune the performance. This is a special case where the application
wouldn't care if there is a data loss of 1 second or less. I would be
interested to know if there is some configuration parameter which
already does this.  I looked at the WAL parameters and the new async
commit  but not sure if I am looking at the right place. Say i have 10
clients connecting and each client is inserting a record. I want to
force Postgresql only to write to disk when all the 10 records have
been written to the buffer or after 1 second (configurable). My
question is can I partly  achieve this by setting wal_writer_delay =
1000 ms or is there some other parameter which will help me to achieve
the desired effect?


--
Vinu

In a world without fences who needs Gates?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-10 Thread tv
> Hi (sorry for the double posting, thought I'd used the wrong email
> address, but both have been posted anyway). As far as the db is concerned
> I'm just reading data and then writing it to a Lucene search index (which
> is outside of the database), but my laptop is just a test machine; I want
> to run the same code on production. Why would work_mem not be safe at 64MB
> if I have 2GB of memory - what have I got to be careful of?
>
> Paul

Well, the problem with work_mem is that it's 'per operation' so a query
may actually need several work_mem segments. And it's not just sorting, a
query containing a hash join, hash aggregate and a sort may consume up to
3x work_mem memory.

And if you have a lot of concurrent users running such queries, you may
easily run out of memory - in that case the feared OOM killer comes and
kills one of the processes (usually postmaster, which means the database
goes bottoms up). Not sure how OOM works on MacOS.

But as you said there will be single user running queries on the database,
you can set the work_mem limit pretty high. Depends on the queries though
- a complicated query may consume a lot of memory.
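
To make the "per operation" point concrete, a sketch (the table and column
names are made up) of how one query can claim several work_mem-sized chunks:

  SET work_mem = '64MB';

  -- a plan with a hash join, a hash aggregate and a sort below could use
  -- up to 64MB for each of them, i.e. roughly 3x work_mem for one query:
  EXPLAIN ANALYZE
  SELECT c.region, count(*) AS orders
  FROM customers c JOIN orders o ON o.customer_id = c.id
  GROUP BY c.region
  ORDER BY count(*) DESC;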

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-10 Thread Reid Thompson
On Thu, 2010-12-09 at 17:40 +, Paul Taylor wrote:
> what 
> have I got to be careful of. 

I think that was in reference to turning fsync off, not work_mem values.


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Scott Marlowe
On Thu, Dec 9, 2010 at 5:25 AM, Paul Taylor  wrote:
> Hi, Im using Postgres 8.3 on a Macbook Pro Labtop.
> I using the database with just one db connection to build a lucene search
> index from some of the data, and Im trying to improve performance. The key
> thing is that I'm only a single user but manipulating large amounts of data
> , i.e processing tables with upto 10 million rows in them, so I think want
> to configure Postgres so that it can create large temporary tables in memory
>
> I've tried changes various parameters such as shared_buffers, work_mem and
> checkpoint_segments but I don't really understand what they values are, and
> the documentation seems to be aimed towards configuring for multiple users,
> and my changes make things worse. For example my machine has 2GB of memory
> and I read if using as a dedicated server you should set shared memory to
> 40% of total memory, but when I increase to more than 30MB Postgres will not
> start complaining about my SHMMAX limit.

So you're pretty much batch processing.  Not Postgresql's strongest
point.  But we'll see what we can do.  Large shared buffers aren't
gonna help a lot here, since your OS will be caching files as well,
and you've only got one process running.  You do want a large enough
shared_buffer to hold everything you're working on at one point in
time, so getting it into the hundred or so megabyte range will likely
help.  After that you'll be stealing memory that could be used for OS
caching or work_mem, so don't go crazy, especially on a machine with
only 2 Gigs ram.  Note I just picked up 8 gigs of DDR3 ram for $99 on
Newegg, so if your MBP can handle more memory, now's the time to
splurge.

Crank up work_mem to something pretty big, in the 60 to 200 meg range.
Note that work_mem is PER sort, not total or per connection.  So if
your single user runs a query with three sorts, it could use 3x
work_mem.  Once it allocates too much memory your machine will start
swapping and slow to a crawl.  So don't overshoot.

Assuming you can recreate your db should things go horribly wrong, you
can turn off fsync.  Also crank up WAL segments to 32 or 64 or so.

Make sure accesstime updates are turned off for the file system.
(noatime mount option).
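
Pulled together, that advice might look roughly like this; only a sketch with
example values, for a database you can rebuild from scratch:

  -- postgresql.conf (restart required for shared_buffers):
  --   shared_buffers      = 200MB
  --   checkpoint_segments = 32
  --   fsync               = off   -- ONLY if losing the db on a crash is acceptable
  -- and mount the data filesystem with noatime

  -- per session:
  SET work_mem = '128MB';              -- per sort/hash step, single user only
  SET maintenance_work_mem = '512MB';  -- helps CREATE INDEX and VACUUM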

-- 
To understand recursion, one must first understand recursion.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Paul Taylor

On 09/12/2010 15:51, t...@fuzzy.cz wrote:

Hi, Im using Postgres 8.3 on a Macbook Pro Labtop.
I using the database with just one db connection to build a lucene
search index from some of the data, and Im trying to improve
performance. The key thing is that I'm only a single user but
manipulating large amounts of data , i.e processing tables with upto 10
million rows in them, so I think want to configure Postgres so that it
can create large temporary tables in memory

I'm not sure what exactly you mean by "manipulating data" - does that mean
reading or writing?

I'd definitely increase shared_buffers, work_mem (don't be afraid to set
work_mem to say 32MB or 64MB - this should be safe with a single
connection, although it depends on the queries). To improve writes,
increase checkpoint_segments etc.

If you really don't need extra safety - e.g. if you have all the data
backed up and just need to run some ad-hoc analysis (and it does not
matter if it crashes as you can recover it from backup), you can disable
fsync. This will make writes much faster, but it won't be safe in case of
crash.

DON'T DO THIS IF YOU NEED TO KEEP YOUR DATA SAFE!

But as you're running the app on your laptop, I guess you can live with
fsync=off. You'll lose the consistency but you'll get better performance.

Hi (sorry for the double posting; I thought I'd used the wrong email
address, but both have been posted anyway).
As far as the db is concerned I'm just reading data and then writing it
to a Lucene search index (which is outside of the database), but my
laptop is just a test machine; I want to run the same code on production.
Why would work_mem not be safe at 64MB if I have 2GB of memory - what
have I got to be careful of?


Paul

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Reid Thompson
On 12/09/2010 12:36 PM, Paul Taylor wrote:
> On 09/12/2010 15:12, Reid Thompson wrote:
>> On 12/09/2010 09:59 AM, Andy Colson wrote:
>>> On 12/9/2010 8:50 AM, Andy Colson wrote:
 On 12/9/2010 6:25 AM, Paul Taylor wrote:
 You need to bump up your SHMMAX is your OS.
>>> sorry: SHMMAX _in_ your OS.
>>>
>>> its an OS setting not a PG one.
>>>
>>> -Andy
>>>
>>>
>> scroll down to the section on OSX
>> http://developer.postgresql.org/pgdocs/postgres/kernel-resources.html
>>
> Thanks guys, but one think I dont get is why does setting shared_buffers to 
> 40mb break the kernel limit, I mean 40 mb doesnt sound
> like very much at all
> 
> Paul

It's not -- from the same page (near the top)
17.4.1. Shared Memory and Semaphores

Shared memory and semaphores are collectively referred to as "System V IPC" 
(together with message queues, which are not relevant
for PostgreSQL). Almost all modern operating systems provide these features, 
but many of them don't have them turned on or
sufficiently sized by default, especially as available RAM and the demands of 
database applications grow.

and/but most of these system defaults originated when system RAM availability 
was much smaller
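
For reference, those limits are raised in the OS, not in PostgreSQL; a sketch
for a shared_buffers setting in the hundreds of megabytes (the numbers are
examples only, shown as comments because they are OS settings, not SQL):

  -- Linux (shmmax in bytes, shmall in pages; persist in /etc/sysctl.conf):
  --   sysctl -w kernel.shmmax=1073741824
  --   sysctl -w kernel.shmall=262144
  -- Mac OS X (/etc/sysctl.conf, then reboot):
  --   kern.sysv.shmmax=1073741824
  --   kern.sysv.shmall=262144
  SHOW shared_buffers;   -- confirm what the server actually started with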

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Paul Taylor

On 09/12/2010 15:12, Reid Thompson wrote:

On 12/09/2010 09:59 AM, Andy Colson wrote:

On 12/9/2010 8:50 AM, Andy Colson wrote:

On 12/9/2010 6:25 AM, Paul Taylor wrote:
You need to bump up your SHMMAX is your OS.

sorry: SHMMAX _in_ your OS.

its an OS setting not a PG one.

-Andy



scroll down to the section on OSX
http://developer.postgresql.org/pgdocs/postgres/kernel-resources.html

Thanks guys, but one thing I don't get is why setting shared_buffers
to 40MB breaks the kernel limit; I mean, 40 MB doesn't sound like very much
at all


Paul

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread tv
> Hi, Im using Postgres 8.3 on a Macbook Pro Labtop.
> I using the database with just one db connection to build a lucene
> search index from some of the data, and Im trying to improve
> performance. The key thing is that I'm only a single user but
> manipulating large amounts of data , i.e processing tables with upto 10
> million rows in them, so I think want to configure Postgres so that it
> can create large temporary tables in memory

I'm not sure what exactly you mean by "manipulating data" - does that mean
reading or writing?

I'd definitely increase shared_buffers, work_mem (don't be afraid to set
work_mem to say 32MB or 64MB - this should be safe with a single
connection, although it depends on the queries). To improve writes,
increase checkpoint_segments etc.

If you really don't need extra safety - e.g. if you have all the data
backed up and just need to run some ad-hoc analysis (and it does not
matter if it crashes as you can recover it from backup), you can disable
fsync. This will make writes much faster, but it won't be safe in case of
crash.

DON'T DO THIS IF YOU NEED TO KEEP YOUR DATA SAFE!

But as you're running the app on your laptop, I guess you can live with
fsync=off. You'll lose the consistency but you'll get better performance.

> Ive tried changes various paramters such as shared_buffers, work_mem and
> checkpoint_segments but I don't really understand what they values are,
> and the documentation seems to be aimed towards configuring for multiple
> users, and my changes make things worse. For example my machine has 2GB
> of memory and I read if using as a dedicated server you should set
> shared memory to 40% of total memory, but when I increase to more than
> 30MB Postgres will not start complaining about my SHMMAX limit.

You're heading in the right direction I think, but you're hitting kernel
limits. A process can't allocate more shared memory (shared buffers) than
the SHMMAX limit allows, so you need to bump this up.

See this - http://www.postgresql.org/docs/9.0/static/kernel-resources.html
There's even a section for MacOS X (which is the OS you're running, I
guess).

Tomas


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Paul Taylor

Hi, I'm using Postgres 8.3 on a MacBook Pro laptop.
I'm using the database with just one db connection to build a Lucene
search index from some of the data, and I'm trying to improve
performance. The key thing is that I'm only a single user but
manipulating large amounts of data, i.e. processing tables with up to 10
million rows in them, so I think I want to configure Postgres so that it
can create large temporary tables in memory.


I've tried changing various parameters such as shared_buffers, work_mem
and checkpoint_segments, but I don't really understand what their values
mean, and the documentation seems to be aimed towards configuring for
multiple users, and my changes make things worse. For example, my machine
has 2GB of memory and I read that if using it as a dedicated server you
should set shared memory to 40% of total memory, but when I increase it
to more than 30MB Postgres will not start, complaining about my SHMMAX
limit.


Paul

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Reid Thompson
On 12/09/2010 09:59 AM, Andy Colson wrote:
> On 12/9/2010 8:50 AM, Andy Colson wrote:
>> On 12/9/2010 6:25 AM, Paul Taylor wrote:

>> You need to bump up your SHMMAX is your OS.
> 
> sorry: SHMMAX _in_ your OS.
> 
> its an OS setting not a PG one.
> 
> -Andy
> 
> 
scroll down to the section on OSX
http://developer.postgresql.org/pgdocs/postgres/kernel-resources.html

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Andy Colson

On 12/9/2010 8:50 AM, Andy Colson wrote:

On 12/9/2010 6:25 AM, Paul Taylor wrote:

Hi, Im using Postgres 8.3 on a Macbook Pro Labtop.
I using the database with just one db connection to build a lucene
search index from some of the data, and Im trying to improve
performance. The key thing is that I'm only a single user but
manipulating large amounts of data , i.e processing tables with upto 10
million rows in them, so I think want to configure Postgres so that it
can create large temporary tables in memory

I've tried changes various parameters such as shared_buffers, work_mem
and checkpoint_segments but I don't really understand what they values
are, and the documentation seems to be aimed towards configuring for
multiple users, and my changes make things worse. For example my machine
has 2GB of memory and I read if using as a dedicated server you should
set shared memory to 40% of total memory, but when I increase to more
than 30MB Postgres will not start complaining about my SHMMAX limit.

Paul



You need to bump up your SHMMAX is your OS.


sorry: SHMMAX _in_ your OS.

its an OS setting not a PG one.

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Andy Colson

On 12/9/2010 6:25 AM, Paul Taylor wrote:

Hi, Im using Postgres 8.3 on a Macbook Pro Labtop.
I using the database with just one db connection to build a lucene
search index from some of the data, and Im trying to improve
performance. The key thing is that I'm only a single user but
manipulating large amounts of data , i.e processing tables with upto 10
million rows in them, so I think want to configure Postgres so that it
can create large temporary tables in memory

I've tried changes various parameters such as shared_buffers, work_mem
and checkpoint_segments but I don't really understand what they values
are, and the documentation seems to be aimed towards configuring for
multiple users, and my changes make things worse. For example my machine
has 2GB of memory and I read if using as a dedicated server you should
set shared memory to 40% of total memory, but when I increase to more
than 30MB Postgres will not start complaining about my SHMMAX limit.

Paul



You need to bump up your SHMMAX is your OS.  I'm sure google knows how 
to do it.  (in linux use sysctl, so it may be similar in macos).


checkpoint_segments: I've bumped them up to 10, but only when inserting 
a huge amount of data, not sure how much it'll help otherwise.


shared_buffers: this is the big one.  Set it big, 1G maybe

work_mem: this is for temp work a query might need to do, like sorting, 
merging, etc.  A big value (100Meg or so) would be OK.  It's per user,
but since there is only one of you, splurge.


There is also an effective_cache_size (or something like that): it's the 
amount of memory PG can assume is being used for disk cache.  It's not 
something that'll be allocated.  So you have 2G, 1G for PG, 300Meg for 
the OS and other stuff, so 700Meg for effective_cache_size?


In Linux I use "free" to see how much is being used for disk cache, and 
set it to that.
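
Putting those numbers into config form for this 2GB machine, as a sketch only
(exact values depend on what "free" reports):

  -- postgresql.conf:
  --   shared_buffers       = 1GB     -- needs SHMMAX raised first
  --   work_mem             = 100MB   -- single user, so a big per-sort value is fine
  --   checkpoint_segments  = 10
  --   effective_cache_size = 700MB   -- a planner hint; nothing is allocated
  SHOW effective_cache_size;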



-Andy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning Postgres for single user manipulating large amounts of data

2010-12-09 Thread Paul Taylor

Hi, I'm using Postgres 8.3 on a MacBook Pro laptop.
I'm using the database with just one db connection to build a Lucene
search index from some of the data, and I'm trying to improve
performance. The key thing is that I'm only a single user but
manipulating large amounts of data, i.e. processing tables with up to 10
million rows in them, so I think I want to configure Postgres so that it
can create large temporary tables in memory.


I've tried changing various parameters such as shared_buffers, work_mem
and checkpoint_segments, but I don't really understand what their values
mean, and the documentation seems to be aimed towards configuring for
multiple users, and my changes make things worse. For example, my machine
has 2GB of memory and I read that if using it as a dedicated server you
should set shared memory to 40% of total memory, but when I increase it
to more than 30MB Postgres will not start, complaining about my SHMMAX
limit.


Paul

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-23 Thread Greg Smith

Ben Chobot wrote:


Hm, my shared_buffers is already 10GB, but I'm using about 80GB for filesystem 
cache. Would a larger shared_buffers make sense? I thought I read somewhere 
that 10GB is on the high end of the useful size for shared_buffers.


Yeah, I figured that out when I was analyzing your figures and thought 
I'd missed a decimal place when I first saw it.  The problem with huge 
increases in shared_buffer is that they can increase the amount of time 
it takes to allocate a new buffer.  If all you've got in there are lots 
of records with high usage counts, it can end up taking multiple "sweeps 
of the clock hand" over things to drop usage counts.  With >75% of your 
buffers already having a usage count of 4 or 5, you've already gone 
pretty far in the direction where that could happen.  With still around 
13% only have a 0 or 1 usage count I don't think it's too bad yet.


You're certainly not in well explored territory though.  If you were 
seeing large amounts of backend writes or buffers being allocated, maybe 
a larger shared_buffers would make sense.  From the snapshots of data 
you've provided, that doesn't seem to be the case though, so I wouldn't 
be too worried about it.  The only thing you could really do here is 
increase checkpoint_timeout - with this much data, having a checkpoint 
every 5 minutes is on the fast side, and I'd bet you could tolerate the 
disk space and additional crash recovery time in return for better 
average performance the rest of the time.
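
If you do raise it, it is a one-line change; a sketch with an example value:

  -- postgresql.conf (a reload is enough):
  --   checkpoint_timeout = 15min   -- default 5min; fewer checkpoints, but
  --                                -- more WAL to replay after a crash
  SELECT name, setting, unit FROM pg_settings
  WHERE name IN ('checkpoint_timeout', 'checkpoint_segments');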


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-23 Thread Ben Chobot
On Feb 22, 2010, at 6:47 PM, Greg Smith wrote:

> Ben Chobot wrote:
>> Is it reading it correctly to say that the bgwriter probably wouldn't help 
>> much, because a majority of the dirty pages appear to be popular?
> 
> Yes.  The background writer cleaner process only does something useful if 
> there are pages with low usage counts it can evict.  You would need to 
> increase shared_buffers significantly before it's likely that would happen.  
> Right now, 87% of your buffer cache has a usage count of 2 or higher, which 
> basically means it's filled with almost nothing but the working set of data 
> it never wants to evict unless it's for a checkpoint.

Hm, my shared_buffers is already 10GB, but I'm using about 80GB for filesystem 
cache. Would a larger shared_buffers make sense? I thought I read somewhere 
that 10GB is on the high end of the useful size for shared_buffers.
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-22 Thread Greg Smith

Ben Chobot wrote:
Is it reading it correctly to say that the bgwriter probably wouldn't 
help much, because a majority of the dirty pages appear to be popular?


Yes.  The background writer cleaner process only does something useful 
if there are pages with low usage counts it can evict.  You would need 
to increase shared_buffers significantly before it's likely that would 
happen.  Right now, 87% of your buffer cache has a usage count of 2 or 
higher, which basically means it's filled with almost nothing but the 
working set of data it never wants to evict unless it's for a checkpoint.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-22 Thread Ben Chobot
On Feb 17, 2010, at 6:38 PM, Greg Smith wrote:

> Ben Chobot wrote:
>> Is there a way to tell if I really am just keeping the same few pages dirty 
>> throughout every checkpoint? I wouldn't have expected that, but given our 
>> application I suppose it is possible.
> 
> You can install pg_buffercache and look at what's in the cache to check your 
> theory.  I have some sample queries that show neat things at 
> http://www.westnet.com/~gsmith/content/postgresql/bufcache.sh


This appears to be fairly typical:

# select count(*),isdirty,usagecount from pg_buffercache group by 
isdirty,usagecount order by usagecount desc,isdirty;
 count  | isdirty | usagecount 
---------+---------+------------
 670629 | f   |  5
  75766 | t   |  5
 237311 | f   |  4
   5372 | t   |  4
  74682 | f   |  3
 31 | t   |  3
  73786 | f   |  2
 18 | t   |  2
 104112 | f   |  1
 62 | t   |  1
  68951 | f   |  0
(11 rows)

Is it reading it correctly to say that the bgwriter probably wouldn't help 
much, because a majority of the dirty pages appear to be popular?

Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-17 Thread Greg Smith

Ben Chobot wrote:

Is there a way to tell if I really am just keeping the same few pages dirty 
throughout every checkpoint? I wouldn't have expected that, but given our 
application I suppose it is possible.


You can install pg_buffercache and look at what's in the cache to check 
your theory.  I have some sample queries that show neat things at 
http://www.westnet.com/~gsmith/content/postgresql/bufcache.sh
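
Along the same lines, a small sketch of a query against pg_buffercache (the
contrib module has to be installed in the database first) that summarizes how
dirty each usage-count bucket is:

  SELECT usagecount,
         count(*) AS buffers,
         sum(CASE WHEN isdirty THEN 1 ELSE 0 END) AS dirty,
         round(100.0 * count(*) / (SELECT count(*) FROM pg_buffercache), 1)
           AS pct_of_cache
  FROM pg_buffercache
  GROUP BY usagecount
  ORDER BY usagecount DESC NULLS LAST;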


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-17 Thread Ben Chobot
On Feb 17, 2010, at 3:41 PM, Greg Smith wrote:

> Ben Chobot wrote:
>> As I understand things, assuming I don't keep updating the same pages then 
>> buffers_backend should be a small percentage of buffers_alloc, and 
>> buffers_clean should be larger than it is compared to buffers_checkpoint. Is 
>> my understanding correct?
> 
> Sure; your buffers_clean is really low relative to the totals.  You should 
> take a snapshot now that you've fixed bgwriter_lru_maxpages, with a 
> timestamp, and then another sometime later to get really useful numbers.  A 
> diff only considering the current setup and with a time interval to go along 
> with it is much more useful than the aggregate numbers here (patch to make 
> that easier already in 9.0:  
> http://blog.2ndquadrant.com/en/2010/01/measuring-postgresql-checkpoin.html )  
> Keep taking regular snapshots with a timestamp:
> 
> select current_timestamp,* from pg_stat_bgwriter;
> 
> So you can compute a diff to measure what's changing as you go.
> 
> The only explanation I can offer is that your workload might be really 
> bursty.  The method used for estimating how much the cleaner should do is 
> most likely to break down when the load comes in narrow spikes.  The main way 
> to improve response in that situation is by decreasing the interval, so it 
> kicks in and does the "what's happened during the last  ms?" computations 
> more often.  Right now, a burst that lasts less than 200ms can be completely 
> missed, if the system was mostly idle before that.
> 
> You can try lowering bgwriter_delay and proportionally decreasing 
> bgwriter_lru_maxpages to make response time to burst workloads better.  In 
> your situation, I'd try make the writer wake up 4X as often, only do 1/4 as 
> much maximum work as it currently does each time, and doubling the multiplier 
> too; see if things move in the right direction, and maybe keep going from 
> there afterwards.

Thanks for the suggestions Greg. I'll monitor it closely over the next few 
days, but it doesn't really seem to have changed much so far. Is there a way to 
tell if I really am just keeping the same few pages dirty throughout every 
checkpoint? I wouldn't have expected that, but given our application I suppose 
it is possible.
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-17 Thread Greg Smith

Ben Chobot wrote:
As I understand things, assuming I don't keep updating the same pages 
then buffers_backend should be a small percentage of buffers_alloc, 
and buffers_clean should be larger than it is compared to 
buffers_checkpoint. Is my understanding correct?


Sure; your buffers_clean is really low relative to the totals.  You 
should take a snapshot now that you've fixed bgwriter_lru_maxpages, with 
a timestamp, and then another sometime later to get really useful 
numbers.  A diff only considering the current setup and with a time 
interval to go along with it is much more useful than the aggregate 
numbers here (patch to make that easier already in 9.0:  
http://blog.2ndquadrant.com/en/2010/01/measuring-postgresql-checkpoin.html 
)  Keep taking regular snapshots with a timestamp:


select current_timestamp,* from pg_stat_bgwriter;

So you can compute a diff to measure what's changing as you go.
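
One way to keep those samples around for diffing, sketched here (the snapshot
table name is arbitrary):

  CREATE TABLE bgwriter_snap AS
    SELECT current_timestamp AS taken, * FROM pg_stat_bgwriter;

  -- take another sample whenever you like:
  INSERT INTO bgwriter_snap
    SELECT current_timestamp, * FROM pg_stat_bgwriter;

  -- difference between the two most recent samples:
  SELECT max(taken) - min(taken)                           AS elapsed,
         max(buffers_checkpoint) - min(buffers_checkpoint) AS by_checkpoint,
         max(buffers_clean)      - min(buffers_clean)      AS by_cleaner,
         max(buffers_backend)    - min(buffers_backend)    AS by_backends
  FROM (SELECT * FROM bgwriter_snap ORDER BY taken DESC LIMIT 2) last_two;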

The only explanation I can offer is that your workload might be really 
bursty.  The method used for estimating how much the cleaner should do 
is most likely to break down when the load comes in narrow spikes.  The 
main way to improve response in that situation is by decreasing the 
interval, so it kicks in and does the "what's happened during the last 
 ms?" computations more often.  Right now, a burst that lasts less 
than 200ms can be completely missed, if the system was mostly idle 
before that.


You can try lowering bgwriter_delay and proportionally decreasing 
bgwriter_lru_maxpages to make response time to burst workloads better.  
In your situation, I'd try making the writer wake up 4X as often, only do 
1/4 as much maximum work as it currently does each time, and doubling 
the multiplier too; see if things move in the right direction, and maybe 
keep going from there afterwards.
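
In configuration terms that suggestion is roughly the following, starting from
the 200ms default delay and the bgwriter_lru_maxpages = 500 already in use; a
sketch to iterate from, not final values:

  -- postgresql.conf (a reload is enough):
  --   bgwriter_delay          = 50ms   -- wake up 4x as often
  --   bgwriter_lru_maxpages   = 125    -- 1/4 of the previous 500
  --   bgwriter_lru_multiplier = 4.0    -- double the 2.0 default
  SELECT name, setting FROM pg_settings WHERE name LIKE 'bgwriter%';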


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] tuning bgwriter in 8.4.2

2010-02-14 Thread Ben Chobot
On Feb 14, 2010, at 10:25 AM, Ben Chobot wrote:

> We recently upgraded to 8.4.2 and I'm trying to make sure our bgwriter is 
> working as well as it can. Based on:
> 
> # select * from pg_stat_bgwriter ;
>  checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc
> -------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
>                804 |               2 |           39171885 |         22562 |              211 |        24759656 |       4488627
> (1 row)
> 
> ...I'm not sure that it is, because as I understand things, assuming I don't 
> keep updating the same pages then buffers_backend should be a small 
> percentage of buffers_alloc, and buffers_clean should be larger than it is 
> compared to buffers_checkpoint. Is my understanding correct?
> 
> My checkpoints are spread exactly 5 minutes apart, and a typical checkpoint 
> log entry looks like:
> 
> checkpoint complete: wrote 48289 buffers (3.7%); 0 transaction log file(s) 
> added, 0 removed, 14 recycled; write=149.872 s, sync=0.378 s, total=150.256 s
> 
> The only bgwriter tunable we've adjusted so far is bgwriter_lru_maxpages = 
> 500, though we've also set checkpoint_segments = 768 (not that we need it 
> that high, but we have the space on the wal volume.) 

I should have added that those 211 maxwritten_clean entries came about before 
we set bgwriter_lru_maxpages to 500. And the 2 requested checkpoints came with 
the initial slony load.

[GENERAL] tuning bgwriter in 8.4.2

2010-02-14 Thread Ben Chobot
We recently upgraded to 8.4.2 and I'm trying to make sure our bgwriter is 
working as well as it can. Based on:

# select * from pg_stat_bgwriter ;
 checkpoints_timed | checkpoints_req | buffers_checkpoint | buffers_clean | maxwritten_clean | buffers_backend | buffers_alloc
-------------------+-----------------+--------------------+---------------+------------------+-----------------+---------------
               804 |               2 |           39171885 |         22562 |              211 |        24759656 |       4488627
(1 row)

...I'm not sure that it is, because as I understand things, assuming I don't 
keep updating the same pages then buffers_backend should be a small percentage 
of buffers_alloc, and buffers_clean should be larger than it is compared to 
buffers_checkpoint. Is my understanding correct?

My checkpoints are spread exactly 5 minutes apart, and a typical checkpoint log 
entry looks like:

checkpoint complete: wrote 48289 buffers (3.7%); 0 transaction log file(s) 
added, 0 removed, 14 recycled; write=149.872 s, sync=0.378 s, total=150.256 s

The only bgwriter tunable we've adjusted so far is bgwriter_lru_maxpages = 500, 
though we've also set checkpoint_segments = 768 (not that we need it that high, 
but we have the space on the wal volume.) 



Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-22 Thread Alex Thurlow
Just wanted to let everyone know that I set up partitioning and got my 
30 days of data in.  This sucker is FAST.  And it wasn't nearly as 
complicated to make happen as I thought it would be.  I didn't use the 
triggers or rules to make sure data goes in the right table on insert, 
as there's only one script that inserts, so I just generate the correct 
table name there.



Alex Thurlow
Blastro Networks

http://www.blastro.com
http://www.roxwel.com
http://www.yallwire.com


On 5/22/2009 9:56 AM, Vick Khera wrote:

On Thu, May 21, 2009 at 3:37 PM, Alex Thurlow  wrote:
   

I was hoping to not have to change all my code to automate the partitioning
table creation stuff, but if that's really the best way, I'll check it out.
Thanks for the advice.
 


About 18 months ago we split a large table with 300+ million rows
into 100 partitions.  The query speed was improved by at least 2
orders of magnitude.  Postgres is exceptionally good at dealing with
tables in the 10 million row range, and that's what we gave it.  Our
primary queries on the data were able to go directly to the right
partition, but using constraint exclusion was still nearly just as
fast.

It was totally worth the 10 days or so it took to set up, test (on a
replica!) and migrate the data.  In your case you could have a natural
migration by just adding the child tables and inserting your new data
there and deleting old data from your main table.  After 30 days, your
main table will be empty and you just truncate it, freeing up all the
space.

   


Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-22 Thread Vick Khera
On Thu, May 21, 2009 at 3:37 PM, Alex Thurlow  wrote:
> I was hoping to not have to change all my code to automate the partitioning
> table creation stuff, but if that's really the best way, I'll check it out.
> Thanks for the advice.

About 18 months ago we split a large table with 300+ million rows
into 100 partitions.  The query speed was improved by at least 2
orders of magnitude.  Postgres is exceptionally good at dealing with
tables in the 10 million row range, and that's what we gave it.  Our
primary queries on the data were able to go directly to the right
partition, but using constraint exclusion was still nearly just as
fast.

It was totally worth the 10 days or so it took to set up, test (on a
replica!) and migrate the data.  In your case you could have a natural
migration by just adding the child tables and inserting your new data
there and deleting old data from your main table.  After 30 days, your
main table will be empty and you just truncate it, freeing up all the
space.
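
A minimal sketch of that layout for the log table in this thread (the table
and column names are assumed; on 8.x partitioning is inheritance plus CHECK
constraints):

  -- one child per day; the parent log_lines eventually stays empty:
  CREATE TABLE log_lines_2009_05_21 (
      CHECK (log_date >= DATE '2009-05-21' AND log_date < DATE '2009-05-22')
  ) INHERITS (log_lines);

  CREATE INDEX log_lines_2009_05_21_date_idx
      ON log_lines_2009_05_21 (log_date);

  -- with constraint exclusion, a one-day query only touches one child:
  SET constraint_exclusion = on;
  SELECT log_date, count(*)
  FROM log_lines
  WHERE log_date = DATE '2009-05-21'
  GROUP BY log_date;

  -- retiring a whole day later is instant:
  DROP TABLE log_lines_2009_05_21;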

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Alex Thurlow
I was hoping to not have to change all my code to automate the 
partitioning table creation stuff, but if that's really the best way, 
I'll check it out.  Thanks for the advice.



Alex Thurlow
Blastro Networks

http://www.blastro.com
http://www.roxwel.com
http://www.yallwire.com


On 5/21/2009 2:24 PM, Vick Khera wrote:

On Thu, May 21, 2009 at 3:13 PM, Alex Thurlow  wrote:
   

I have a postgresql database that I'm using for logging of data. There's
basically one table where each row is a line from my log files.  It's
getting to a size where it's running very slow though.  There are about 10
million log lines per day and I keep 30 days of data in it. All the columns
 


Are you using partitioning on this table?  Your use case is literally
the exact example everyone uses to show how to do partitioning on
tables.

Since you mostly scan on date, this will speed up your queries significantly.

   


Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Scott Marlowe
On Thu, May 21, 2009 at 1:39 PM, Alex Thurlow  wrote:
> I should also specify how my inserts are happening I guess.  I'm actually
> doing the logs to flat files and then inserting them into the database on an
> hourly basis using COPY, so I don't need to worry as much about the log
> insert speed as I do the reporting.

Cool.  Then definitely look at partitioning, and also start running
explain analyze on your longer running queries.  You'll often find
some part of the plan that makes no sense (usually a difference
between estimated and actual returned rows is a clue).  You can earn a
lot of performance by tuning your queries in this way.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Scott Marlowe
On Thu, May 21, 2009 at 1:36 PM, Scott Marlowe  wrote:
>

Below, I meant with a logging / reporting database...

> With a logging database you're optimizing two often opposing actions.
> Lots of small inserts in a stream that HAVE to get processed and put
> in efficiently.  This is often accomplished with minimum
> shared_buffers and work_mem, because there's no need for the overhead
> of large shared_buffers and insert queries for logging dbs don't need
> much work_mem.
>
> With a reporting database you run queries that chew up tons of memory
> both shared_buffers and work_mem for efficient operation.
>

>> work_mem = 128MB
>
> Bigger than needed for logging, good for reporting.  You can probably
> just leave it.

Note that you can set work_mem per user, so have the reporting users
log in with a different user and you can crank this up a bit, say 512M
to 1G if you're only ever running 1 or 2 reports.  Careful about
running the machine out of memory, work_mem is a foot gun if you set
it too high and run a lot of queries at once.
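
That per-user split only has to be set up once; a sketch (the role names are
made up):

  -- role-level settings take effect at login:
  ALTER ROLE reporting SET work_mem = '512MB';
  ALTER ROLE loader    SET work_mem = '4MB';

  -- or just override it inside a reporting session:
  SET work_mem = '512MB';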

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Alex Thurlow
I should also specify how my inserts are happening I guess.  I'm 
actually doing the logs to flat files and then inserting them into the 
database on an hourly basis using COPY, so I don't need to worry as much 
about the log insert speed as I do the reporting.
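
For reference, an hourly load like that is typically one COPY per file; a
sketch with made-up table, delimiter and path:

  -- server-side COPY; the file must be readable by the postgres server process:
  COPY log_lines FROM '/var/log/app/2009-05-21.13.log' WITH DELIMITER E'\t';

  -- or stream it from the client with psql:
  -- \copy log_lines from '2009-05-21.13.log' with delimiter E'\t'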


On 5/21/2009 2:36 PM, Scott Marlowe wrote:

On Thu, May 21, 2009 at 1:13 PM, Alex Thurlow  wrote:
   

I have a postgresql database that I'm using for logging of data. There's
basically one table where each row is a line from my log files.  It's
getting to a size where it's running very slow though.  There are about 10
million log lines per day and I keep 30 days of data in it. All the columns
I filter on are indexed (mostly I just use date).
 


**DING DING DING**  you've just said the magic phrase that says that
partitioning would be a help.

   

  And I tend to pull one
day of data at a time with grouped counts by 1 or 2 other columns.  There
also tends to be only 1 or 2 of these large queries running at any given
time, so a lot of resources can be thrown at each one.

I'm wondering what my resource parameters should be for optimal speed of the
selects on this database, since I haven't seen a good example where someone
has done anything like this.
 


With a logging database you're optimizing two often opposing actions.
Lots of small inserts in a stream that HAVE to get processed and put
in efficiently.  This is often accomplished with minimum
shared_buffers and work_mem, because there's no need for the overhead
of large shared_buffers and insert queries for logging dbs don't need
much work_mem.

With a reporting database you run queries that chew up tons of memory
both shared_buffers and work_mem for efficient operation.

   

The machine is an 8 core opteron (I know I won't really use those, but Dell
threw in the 2nd proc for free) with 8 Gb RAM.  The database is on a RAID 10
JFS partition.
 


Yeah CPUs are cheap, might as well stock up on them.  A reporting
database can quickly go cpu bound if everything the users want to see
fits in memory.

   

This is what I have in postgresql.conf right now..

shared_buffers = 64MB
 


Small for reporting, just right for logging.  I'd try something bigger
but not insanely huge.  Let the OS do the caching of 90% of the data,
let the db cache a good sized working set.  256M to 1G is reasonable
based on benchmarks of your own queries.

   

work_mem = 128MB
 


Bigger than needed for logging, good for reporting.  You can probably
just leave it.

   

maintenance_work_mem = 256MB
max_fsm_pages = 614400
 


If you're not partitioning then this needs to be big enough to contain
1 days+ worth of dead rows.

Look at lowering your random_page_cost, and increasing default stats
target to 100 to 1000 depending on your data and explain analyze query
testing.

   



--
Alex Thurlow
Blastro Networks

http://www.blastro.com
http://www.roxwel.com
http://www.yallwire.com



Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Scott Marlowe
On Thu, May 21, 2009 at 1:13 PM, Alex Thurlow  wrote:
> I have a postgresql database that I'm using for logging of data. There's
> basically one table where each row is a line from my log files.  It's
> getting to a size where it's running very slow though.  There are about 10
> million log lines per day and I keep 30 days of data in it. All the columns
> I filter on are indexed (mostly I just use date).

**DING DING DING**  you've just said the magic phrase that says that
partitioning would be a help.

>  And I tend to pull one
> day of data at a time with grouped counts by 1 or 2 other columns.  There
> also tends to be only 1 or 2 of these large queries running at any given
> time, so a lot of resources can be thrown at each one.
>
> I'm wondering what my resource parameters should be for optimal speed of the
> selects on this database, since I haven't seen a good example where someone
> has done anything like this.

With a logging database you're optimizing two often opposing actions.
Lots of small inserts in a stream that HAVE to get processed and put
in efficiently.  This is often accomplished with minimal
shared_buffers and work_mem, because there's no need for the overhead
of large shared_buffers, and insert queries for logging dbs don't need
much work_mem.

With a reporting database you run queries that chew up tons of memory
both shared_buffers and work_mem for efficient operation.

> The machine is an 8 core opteron (I know I won't really use those, but Dell
> threw in the 2nd proc for free) with 8 Gb RAM.  The database is on a RAID 10
> JFS partition.

Yeah CPUs are cheap, might as well stock up on them.  A reporting
database can quickly go cpu bound if everything the users want to see
fits in memory.

> This is what I have in postgresql.conf right now..
>
> shared_buffers = 64MB

Small for reporting, just right for logging.  I'd try something bigger
but not insanely huge.  Let the OS do the caching of 90% of the data,
let the db cache a good sized working set.  256M to 1G is reasonable
based on benchmarks of your own queries.

> work_mem = 128MB

Bigger than needed for logging, good for reporting.  You can probably
just leave it.

> maintenance_work_mem = 256MB
> max_fsm_pages = 614400

If you're not partitioning then this needs to be big enough to contain
at least a day's worth of dead rows.

Look at lowering your random_page_cost, and increasing default stats
target to 100 to 1000 depending on your data and explain analyze query
testing.
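
Both of those can be tried without touching the config file; a sketch (the
table and column names are invented):

  -- per session, to see how plans change:
  SET random_page_cost = 2.0;

  -- raise statistics only on the columns you filter and group by:
  ALTER TABLE log_lines ALTER COLUMN log_date SET STATISTICS 200;
  ANALYZE log_lines;

  EXPLAIN ANALYZE
  SELECT status, count(*)
  FROM log_lines
  WHERE log_date = DATE '2009-05-20'
  GROUP BY status;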

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Alex Thurlow
I have a postgresql database that I'm using for logging of data.  
There's basically one table where each row is a line from my log files.  
It's getting to a size where it's running very slow though.  There are 
about 10 million log lines per day and I keep 30 days of data in it.  
All the columns I filter on are indexed (mostly I just use date).  And I 
tend to pull one day of data at a time with grouped counts by 1 or 2 
other columns.  There also tends to be only 1 or 2 of these large 
queries running at any given time, so a lot of resources can be thrown 
at each one.


I'm wondering what my resource parameters should be for optimal speed of 
the selects on this database, since I haven't seen a good example where 
someone has done anything like this.


The machine is an 8 core opteron (I know I won't really use those, but 
Dell threw in the 2nd proc for free) with 8 Gb RAM.  The database is on 
a RAID 10 JFS partition.


This is what I have in postgresql.conf right now..

shared_buffers = 64MB
work_mem = 128MB
maintenance_work_mem = 256MB
max_fsm_pages = 614400
max_fsm_relations = 1

Can anyone give me some insight as to what I should set these to or if 
there are others I should be using that I'm missing?



Thanks,
Alex


--
Alex Thurlow
Blastro Networks

http://www.blastro.com
http://www.roxwel.com
http://www.yallwire.com



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Vick Khera
On Thu, May 21, 2009 at 3:13 PM, Alex Thurlow  wrote:
> I have a postgresql database that I'm using for logging of data. There's
> basically one table where each row is a line from my log files.  It's
> getting to a size where it's running very slow though.  There are about 10
> million log lines per day and I keep 30 days of data in it. All the columns

Are you using partitioning on this table?  Your use case is literally
the exact example everyone uses to show how to do partitioning on
tables.

Since you mostly scan on date, this will speed up your queries significantly.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Tuning resource parameters for a logging database.

2009-05-21 Thread Alex Thurlow
I have a postgresql database that I'm using for logging of data. 
There's basically one table where each row is a line from my log files. 
 It's getting to a size where it's running very slow though.  There are 
about 10 million log lines per day and I keep 30 days of data in it. 
All the columns I filter on are indexed (mostly I just use date).  And I 
tend to pull one day of data at a time with grouped counts by 1 or 2 
other columns.  There also tends to be only 1 or 2 of these large 
queries running at any given time, so a lot of resources can be thrown 
at each one.


I'm wondering what my resource parameters should be for optimal speed of 
the selects on this database, since I haven't seen a good example where 
someone has done anything like this.


The machine is an 8 core opteron (I know I won't really use those, but 
Dell threw in the 2nd proc for free) with 8 Gb RAM.  The database is on 
a RAID 10 JFS partition.


This is what I have in postgresql.conf right now..

shared_buffers = 64MB
work_mem = 128MB
maintenance_work_mem = 256MB
max_fsm_pages = 614400
max_fsm_relations = 1

Can anyone give me some insight as to what I should set these to or if 
there are others I should be using that I'm missing?



Thanks,
Alex


--
Alex Thurlow
Blastro Networks

http://www.blastro.com
http://www.roxwel.com
http://www.yallwire.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Windows Client App Was [Re: [GENERAL] Tuning configuration]

2007-12-04 Thread Ow Mun Heng

On Tue, 2007-12-04 at 14:21 -0500, Gauthier, Dave wrote:
> Is there something like a freeware windows client app that does DBA
> stuff for a remote server?  Sort of like TOAD for Oracle?

pgadmin3?
and please don't hijack threads

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Tuning configuration

2007-12-04 Thread Andrew Sullivan
On Tue, Dec 04, 2007 at 02:21:43PM -0500, Gauthier, Dave wrote:
> Is there something like a freeware windows client app that does DBA
> stuff for a remote server?  Sort of like TOAD for Oracle?

TOAD either does or used to work for Postgres.

A


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Tuning configuration

2007-12-04 Thread Gauthier, Dave
Is there something like a freeware windows client app that does DBA
stuff for a remote server?  Sort of like TOAD for Oracle?

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Erik Jones
Sent: Tuesday, December 04, 2007 1:34 PM
To: Konrad Neuwirth
Cc: Postgresql
Subject: Re: [GENERAL] Tuning configuration

On Dec 4, 2007, at 11:17 AM, Konrad Neuwirth wrote:

> Hello,
>
> I have a hopefully simple question.  I've found documentation about
> the meanings of the various variables in postgres.conf -- if it comes
> to memory consumption and buffer size.  I've also found hints as to
> making them too large decreases performance.  But -- how can I measure
> how well the current settings fit? Are there tables to watch on how
> quickly things like temp tables get written out to disk vs. being held
> in memory?  We'd like to find an appropriate size for our database,
> but can't even yet say if we're in the right ballpark on a machine
> that has RAM to spare still.
>
> What can I look into?

What version of postgres are you using?  8.3 (currently in beta) has  
facilities for tracking temp file creation.

Erik Jones

Software Developer | Emma(r)
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 6: explain analyze is your friend

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Tuning configuration

2007-12-04 Thread Erik Jones

On Dec 4, 2007, at 11:17 AM, Konrad Neuwirth wrote:


Hello,

I have a hopefully simple question.  I've found documentation about
the meanings of the various variables in postgres.conf -- if it comes
to memory consumption and buffer size.  I've also found hints as to
making them too large decreases performance.  But -- how can I measure
how well the current settings fit? Are there tables to watch on how
quickly things like temp tables get written out to disk vs. being held
in memory?  We'd like to find an appropriate size for our database,
but can't even yet say if we're in the right ballpark on a machine
that has RAM to spare still.

What can I look into?


What version of postgres are you using?  8.3 (currently in beta) has  
facilities for tracking temp file creation.
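
For reference, the 8.3 knob in question is log_temp_files; a sketch (the
threshold is just an example, and setting it per session requires superuser):

  -- log every temporary file larger than 10MB
  -- (postgresql.conf, or as a superuser per session):
  SET log_temp_files = '10MB';

  -- a sort that overflows work_mem then shows up in the server log
  -- along with the size of the temp file it wrote:
  SELECT * FROM generate_series(1, 5000000) g ORDER BY random();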


Erik Jones

Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


[GENERAL] Tuning configuration

2007-12-04 Thread Konrad Neuwirth
Hello,

I have a hopefully simple question.  I've found documentation about
the meanings of the various variables in postgresql.conf when it comes
to memory consumption and buffer size.  I've also found hints that
making them too large decreases performance.  But how can I measure
how well the current settings fit?  Are there tables to watch to see how
quickly things like temp tables get written out to disk vs. being held
in memory?  We'd like to find appropriate sizes for our database, but
can't yet say whether we're in the right ballpark on a machine that
still has RAM to spare.

What can I look into?

Thank you,
 Konrad Neuwirth

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Tuning to speed select

2006-08-11 Thread Joshua D. Drake

Roman Neuhauser wrote:

# [EMAIL PROTECTED] / 2006-08-11 10:12:40 -0400:
I think my Dell Precision 650 has SATA on the motherboard. The boss says 
I can order one drive, so what should I get? How much faster is RAID 0+1 
than a single drive?


If you can order one drive, get a drive that is twice the size of what 
you have. Use one partition of that drive to create a raid 1. Use the 
other partition to push off nightly rsyncs :)


Sincerely,

Joshua D. Drake




You need 4 disks for 0+1 (or 1+0, also called 10).
 



--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Tuning to speed select

2006-08-11 Thread Roman Neuhauser
# [EMAIL PROTECTED] / 2006-08-11 10:12:40 -0400:
> I think my Dell Precision 650 has SATA on the motherboard. The boss says 
> I can order one drive, so what should I get? How much faster is RAID 0+1 
> than a single drive?

You need 4 disks for 0+1 (or 1+0, also called 10).
 
-- 
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE. http://bash.org/?255991

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Tuning to speed select

2006-08-11 Thread Merlin Moncure

On 8/11/06, Tom Laudeman <[EMAIL PROTECTED]> wrote:

Merlin,
The problem is disk. I've got a WD Caviar. hdparm says it does 44MB/sec
(I ran that in single user mode so there was nothing interfering). A WD
Caviar SE SATA in one of my servers at home gets 56MB/sec on a quiescent
system at runlevel 3. What kind of values does hdparm give for a SATA
Raptor?


I don't have one handy on a Linux box to test, but all Raptors are 10k
drives with 16 MB cache. They feel about twice as fast as 7200 RPM IDE
drives in general use.


I think my Dell Precision 650 has SATA on the motherboard. The boss says
I can order one drive, so what should I get? How much faster is RAID 0+1
than a single drive?


Depends on a lot of factors; a near-100% improvement is realistic even
with software RAID.  I would tell your boss that you could buy two 36GB
Raptors ($110 each) and do a simple RAID 0.  Just be aware that either
drive failing will take you out.  Or you could do RAID 1 for
redundancy.


Aside from size, I can't see much difference between these drives (WD
Raptors at NewEgg):
http://www.newegg.com/Product/ProductList.asp?DEPA=0&type=&Description=raptor&Submit=ENE&Ntk=all&N=0&minPrice=&maxPrice=&Go.x=0&Go.y=0


The retail parts are more expensive, as is the silly drive that you can
look into.  Buy the cheapest part at the size level you need.


CLUSTER certainly helped. Each of the following queries would have
returned roughly 50,000 records. Note that selecting a single record
from blast_result using an index is plenty fast ( ~ 50 ms), so my
primary concern is pulling back larger subsets of data.


Maybe. You may have had table bloat as well; CLUSTER does a full table
rebuild, like VACUUM FULL.


It appears that count(*) on a CLUSTERed table uses the index (as opposed
to the old way of doing a sequential scan). Count on the table after
CLUSTER appears to be a *lot* faster, maybe almost 100x. I know we
shouldn't count, but we've been too lazy to keep the record counts in
another table, and our customers occasionally want to know how many
records are in a certain subset.


No, AFAIK count(*) still scans the table (try EXPLAIN ANALYZE). You
just compacted and optimized the table for efficient sequential scans.
Are you vacuuming regularly?
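
If the occasional customer question only needs a ballpark figure, the
planner's own estimate avoids the scan entirely; a sketch:

  -- approximate row count, maintained by VACUUM/ANALYZE:
  SELECT reltuples::bigint AS approx_rows
  FROM pg_class
  WHERE relname = 'blast_result';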

merlin

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Tuning to speed select

2006-08-11 Thread Tom Laudeman

Merlin,
The problem is disk. I've got a WD Caviar. hdparm says it does 44MB/sec 
(I ran that in single user mode so there was nothing interfering). A WD 
Caviar SE SATA in one of my servers at home gets 56MB/sec on a quiescent 
system at runlevel 3. What kind of values does hdparm give for a SATA 
Raptor?


I think my Dell Precision 650 has SATA on the motherboard. The boss says 
I can order one drive, so what should I get? How much faster is RAID 0+1 
than a single drive?


Aside from size, I can't see much difference between these drives (WD 
Raptors at NewEgg):

http://www.newegg.com/Product/ProductList.asp?DEPA=0&type=&Description=raptor&Submit=ENE&Ntk=all&N=0&minPrice=&maxPrice=&Go.x=0&Go.y=0


CLUSTER certainly helped. Each of the following queries would have 
returned roughly 50,000 records. Note that selecting a single record 
from blast_result using an index is plenty fast ( ~ 50 ms), so my 
primary concern is pulling back larger subsets of data.
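
For reference, the clustering step itself is one command; a sketch (the index
name is assumed, and the second form is the pre-8.3 syntax):

  -- 8.3 and later:
  CLUSTER blast_result USING blast_result_si_fk_idx;
  -- 8.2 and earlier:
  -- CLUSTER blast_result_si_fk_idx ON blast_result;
  ANALYZE blast_result;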


It appears that count(*) on a CLUSTERed table uses the index (as opposed 
to the old way of doing a sequential scan). Count on the table after 
CLUSTER appears to be a *lot* faster, maybe almost 100x. I know we 
shouldn't count, but we've been too lazy to keep the record counts in 
another table, and our customers occasionally want to know how many 
records are in a certain subset.


Before CLUSTER:
explain analyze select * from blast_result where si_fk=11843253;
Total runtime: 16334.539 ms

explain analyze select * from blast_result where si_fk=11843248;
Total runtime: 31406.999 ms

explain analyze select * from blast_result where si_fk=11218929;
Total runtime: 15319.440 ms


After CLUSTER and vacuum analyze:
explain analyze select * from blast_result where si_fk=11843253;
Total runtime: 2343.893 ms

explain analyze select * from blast_result where si_fk=11843248;
Total runtime: 2158.395 ms

explain analyze select * from blast_result where si_fk=11218929;
Total runtime: 1880.586 ms

explain analyze select * from blast_result where si_fk=11843250;
Total runtime: 2085.253 ms


Thanks,
Tom





Are your data structures normalized?  Performance problems querying a
single giganto table are usually (but not necessarily in your case) a
sign of a poorly designed table structure.

Otherwise it's pretty clear you get the most bang for the buck with
hardware.  Consider upping RAM and/or buying better disks.  You could
buy a cheap SATA controller and 4 Raptors in a RAID 0+1 configuration
for <$1000 and you will feel like you have a supercomputer relative to
what you have now :)

merlin



--
Tom Laudeman
[EMAIL PROTECTED]
(434) 924-2456
http://www.people.virginia.edu/~twl8n/
http://laudeman.com/


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Tuning to speed select

2006-08-10 Thread Tom Lane
Michael Fuhr <[EMAIL PROTECTED]> writes:
> Excerpt from the 8.1 Release Notes:
>   * Allow nonconsecutive index columns to be used in a multicolumn
> index (Tom)

> For example, this allows an index on columns a,b,c to be used in
> a query with WHERE a = 4 and c = 10. 

> If you're querying only on c then an index on (a,b,c) will probably
> be slower than an index on (c), but if you have only (a,b,c) then
> the planner will consider using it.

Note that if your query doesn't have a constraint on the first index
column, it's unlikely that that index will get chosen, because the
entire index would have to be scanned.  Before 8.1 it was guaranteed
not to be chosen.

regards, tom lane
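
To make that concrete, a small sketch (table and index names invented for
illustration):

CREATE INDEX t_abc_idx ON t (a, b, c);

-- 8.1 and later can use t_abc_idx here even though b is unconstrained
EXPLAIN SELECT * FROM t WHERE a = 4 AND c = 10;

-- no constraint on the leading column a: the whole index would have to be
-- scanned, so a seqscan or a separate index on (c) will usually win
EXPLAIN SELECT * FROM t WHERE c = 10;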

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Tuning to speed select

2006-08-10 Thread Michael Fuhr
On Thu, Aug 10, 2006 at 11:00:00AM -0400, Tom Laudeman wrote:
> As far as I can tell (from running actual tests) Postgres will not use a 
> multi-column index when the SELECT is constraining on only one of the 
> columns in the index.

Excerpt from the 8.1 Release Notes:

  * Allow nonconsecutive index columns to be used in a multicolumn
index (Tom)

For example, this allows an index on columns a,b,c to be used in
a query with WHERE a = 4 and c = 10. 

If you're querying only on c then an index on (a,b,c) will probably
be slower than an index on (c), but if you have only (a,b,c) then
the planner will consider using it.

-- 
Michael Fuhr

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Tuning to speed select

2006-08-10 Thread Tom Laudeman




Reece,
We have some multi-column indexes. Speed of insert, update, and delete
are not an issue since this data is essentially static: write-once,
read-many.

As far as I can tell (from running actual tests) Postgres will not use
a multi-column index when the SELECT is constraining on only one of the
columns in the index. If I need a single column index, I create one. If
I need three columns in an index, I create a specific 3 column index.
Granted, my memory is fuzzy, and some of my testing was done on version
7.x and there may be improvements on version 8.x

-Tom


...

I found multi-column indexes and clustering to be extremely beneficial
in 7.4. I still use them in 8.1, but I haven't compared them
extensively with equivalent queries that use bitmap index scans. The
obvious downside of having more indexes is the additional time and
space overhead during insert, update, or delete.

...

-Reece

-- 
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0


-- 
Tom Laudeman
[EMAIL PROTECTED]
(434) 924-2456
http://www.people.virginia.edu/~twl8n/
http://laudeman.com/





Re: [GENERAL] Tuning to speed select

2006-08-10 Thread Merlin Moncure

On 8/9/06, Tom Laudeman <[EMAIL PROTECTED]> wrote:


 The speed of the query is (as Michael implies) limited to the rate at which
the disk can seek and read.  I have done experiments with views and cursors;
there was no improvement in speed. I've also tried only pulling back
primary keys in the hope that a smaller amount of data would more quickly be
read into memory. No speed increase. I have also raised all the usual memory
limits, with the expected results (slight speed improvements).



Are your data structures normalized?  Performance problems querying a
single giganto table are usually (but not necessarily in your case) a
sign of a poorly designed table structure.

Otherwise it's pretty clear you get the most bang for the buck with
hardware.  Consider upping RAM and/or buying better disks.  You could
buy a cheap SATA controller and 4 Raptors in a RAID 0+1 configuration for
<$1000 and you will feel like you have a supercomputer relative to what
you have now :)

merlin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Tuning to speed select

2006-08-10 Thread Alban Hertroys

Richard Broersma Jr wrote:

Views certainly help in managing complexity. They do nothing to improve
query-speed.

Querying a view gets rewritten to queries to the underlying tables on the
fly.
(as long as there are no materialized views, which are still on the TODO
list)


Would partial indexes on the most queried regions of the table help query
speed?


They would, as long as they can be used by the client application's
queries. If all the data is used often (partially or not), then there's
little point to partial indices, but that's a border case. In general,
they can make a major difference.


Also, make sure you add enough constraints to your query so that the 
partial index constraints match your query constraints, otherwise the 
index will not be used.
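
For instance, a sketch of a partial index and a query that can use it (the
index name and cutoff value are invented; blast_result is borrowed from
earlier in the thread):

-- index only the 'hot' region of the table
CREATE INDEX blast_hot_idx ON blast_result (si_fk)
    WHERE si_fk > 11000000;

-- usable: the planner can prove this constant condition implies the
-- index predicate; for less obvious conditions, repeat the predicate
-- in the query's WHERE clause so it matches
SELECT bs_fk FROM blast_result WHERE si_fk = 11843254;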


Another point to check: make sure you don't cause type casts over 
indexed columns. That might (under certain conditions) cause a type cast 
on the column value of every indexed row, which slows things down 
significantly. I believe this was solved in PostgreSQL 8.something.
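
An example of the kind of mismatch meant here, sketched with a hypothetical
bigint column (this is the pre-8.0 behaviour; 8.0 added cross-type index
comparisons):

-- id is bigint; the bare literal is int4, so an old planner would skip id_idx
SELECT * FROM t WHERE id = 11843254;

-- era-appropriate workarounds: quote the literal or cast it to the column type
SELECT * FROM t WHERE id = '11843254';
SELECT * FROM t WHERE id = 11843254::bigint;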


--
Alban Hertroys
[EMAIL PROTECTED]

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Reece Hart




On Wed, 2006-08-09 at 16:54 -0400, Tom Laudeman wrote:

Great suggestion. I've read about CLUSTER, but never had a chance to use it. The only problem is that this table with 9 million records has 5 or 6 indexes. It is hard to pick the most used, but I'll bet CLUSTER will make at least one of the queries run very fast, especially for an index with a small number of distinct values.


Tom-

I found multi-column indexes and clustering to be extremely beneficial in 7.4. I still use them in 8.1, but I haven't compared them extensively with equivalent queries that use bitmap index scans. The obvious downside of having more indexes is the additional time and space overhead during insert, update, or delete.

The approach I took to design multi-column indexes was to run explain on representative queries and look for seq scans. The seq scans indicate which columns /might/ be well-served by indexes. In 7.4, the order of indexed columns was important. (I saw something in the 8.1 release notes that made me think that this was no longer true, but I haven't verified that.) 
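
A sketch of that workflow (bs_score is a made-up second column, just to show the shape):

EXPLAIN ANALYZE
SELECT bs_fk FROM blast_result WHERE si_fk = 11843254 AND bs_score > 100;

-- if the plan shows a Seq Scan (or an index scan plus a Filter on the second
-- column) for a query you run a lot, a multi-column index is a candidate:
CREATE INDEX blast_sifk_score_idx ON blast_result (si_fk, bs_score);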

-Reece





-- 
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0








Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Michael Fuhr
On Wed, Aug 09, 2006 at 04:54:00PM -0400, Tom Laudeman wrote:
> I'll try CLUSTER (I'm looking forward to that test), but if we really 
> need speed, it will probably be necessary to create copies of the table, 
> or copy portions of the table elsewhere (essentially creating 
> materialized views, I suppose). I'm still trying to get my science 
> compatriot here to tell me which index he most wants to improve, then 
> I'll CLUSTER the table on that index.

If you enable statistics collection then you could use those
statistics to see which indexes are used the most.  Those indexes
might be good candidates for clustering.

http://www.postgresql.org/docs/8.1/interactive/monitoring-stats.html
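
A sketch of pulling those numbers, assuming the stats collector is on and
using the blast_result table from this thread:

SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'blast_result'
ORDER BY idx_scan DESC;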

-- 
Michael Fuhr

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Tom Laudeman




Michael,
Great suggestion. I've read about CLUSTER, but never had a chance to
use it. The only problem is that this table with 9 million records has
5 or 6 indexes. It is hard to pick the most used, but I'll bet CLUSTER
will make at least one of the queries run very fast, especially for an
index with a small number of distinct values.

The speed of the query is (as Michael implies) limited to the rate at
which the disk can seek and read.  I have done experiments with views
and cursors; there was no improvement in speed. I've also tried only
pulling back  primary keys in the hope that a smaller amount of data
would more quickly be read into memory. No speed increase. I have also
raised all the usual memory limits, with the expected results (slight
speed improvements). 

I'll try CLUSTER (I'm looking forward to that test), but if we really
need speed, it will probably be necessary to create copies of the
table, or copy portions of the table elsewhere (essentially creating
materialized views, I suppose). I'm still trying to get my science
compatriot here to tell me which index he most wants to improve, then
I'll CLUSTER the table on that index.

Thanks!
Tom

Michael Fuhr wrote:

  On Wed, Aug 09, 2006 at 03:46:38PM +0200, Martijn van Oosterhout wrote:
  
  
On Wed, Aug 09, 2006 at 09:19:31AM -0400, Tom Laudeman wrote:


  Is there a tuning parameter I can change to increase speed of selects? 
Clearly, there's already some buffering going on since selecting an 
indexed ~50,000 records takes 17 seconds on the first try, and only 0.5 
seconds on the second try (from pgsql).
  

Your OS is probably buffering, 1GB of RAM holds a lot of data. You can
try increasing the shared_buffers parameter, but if the delay is
getting data from the disk, that won't really help you.

  
  
If most of your queries use the same index then clustering on that
index might speed up initial (i.e., not-cached) queries by reducing
the number of disk pages that need to be read.  See the documentation
for more information.

http://www.postgresql.org/docs/8.1/interactive/sql-cluster.html

  


-- 
Tom Laudeman
[EMAIL PROTECTED]
(434) 924-2456
http://www.people.virginia.edu/~twl8n/
http://laudeman.com/





Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Scott Marlowe

On Wed, 2006-08-09 at 14:58, louis gonzales wrote:
> I'm not so sure about that, when you create a view on a table - at least
> with Oracle - which is a subset(the trivial or 'proper' subset is the
> entire table view) of the information on a table, when a select is
> issued against a table, Oracle at least, determines if there is a view
> already on a the table which potentially has a smaller amount of
> information to process - as long as the view contains the proper
> constraints that meet your 'select' criteria, the RDBMS engine will have
> fewer records to process - which I'd say, certainly constitutes a time
> benefit, in terms of 'performance gain.'
>
> Hence my reasoning behind determining IF there is a subset of the 'big
> table' that is frequented, I'd create a view on this, assuming
> postgresql does this too?  Maybe somebody else can answer that for the
> pgsql-general's general information?
>
> query-speed itself is going to be as fast/slow as your system is
> configured for, however my point was to shave some time off of a 1M+
> record table, but implementing views of 'frequently' visisted/hit
> records meeting the same specifications.

There are basically two ways to do views.  The simple way is to have a
view represent a query that gets run every time you call it.  The more
complex way is to "materialize" the view data, and put it into a new
table, and then update that table whenever the source table changes.

PostgreSQL has native support for the first type.  They're cheap and
easy, and work for most of the things people need views for (i.e. hiding
complexity).

PostgreSQL is extensible, and therefore you can institute the second
type (i.e. materialized views) on your own.  Thankfully, someone else
has already done most of the work for us, by the name of Jonathan
Gardner, and you can find his nifty guide by typing "materialized views
postgresql" into Google.

Gardner's materialized views support several update methods depending on
what you need from your mat views.  It's also a danged fine tutorial on
how to write some simple plpgsql functions.
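
A bare-bones, snapshot-style version of the second approach, just to show the
shape of it (table and column names borrow the blast_result example from this
thread; Gardner's write-up covers the incremental, trigger-driven variants):

-- one-time build of the summary table
CREATE TABLE blast_counts AS
    SELECT si_fk, count(*) AS n FROM blast_result GROUP BY si_fk;
CREATE INDEX blast_counts_si_fk_idx ON blast_counts (si_fk);

-- full refresh, e.g. from cron or after a bulk load
BEGIN;
TRUNCATE blast_counts;
INSERT INTO blast_counts
    SELECT si_fk, count(*) FROM blast_result GROUP BY si_fk;
COMMIT;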






Re: [GENERAL] Tuning to speed select

2006-08-09 Thread louis gonzales
I'm not so sure about that. When you create a view on a table - at least
with Oracle - it is a subset (the trivial or 'proper' subset being the
entire table) of the information in the table. When a select is issued
against the table, Oracle at least determines whether there is a view
already on the table which potentially has a smaller amount of
information to process - as long as the view contains the proper
constraints that meet your 'select' criteria, the RDBMS engine will have
fewer records to process - which, I'd say, certainly constitutes a time
benefit in terms of 'performance gain.'


Hence my reasoning behind determining IF there is a subset of the 'big
table' that is frequented: I'd create a view on it, assuming
PostgreSQL does this too?  Maybe somebody else can answer that for the
pgsql-general list's general information.


Query speed itself is going to be as fast/slow as your system is
configured for; however, my point was to shave some time off of a 1M+
record table by implementing views of 'frequently' visited/hit
records meeting the same specifications.


Harald Armin Massa wrote:


Louis,

Views certainly help in managing complexity. They do nothing to 
improve query-speed.


Querying a view gets rewritten to queries to the underlying tables on 
the fly.
(as long as there are no materialized views, which are still on the 
TODO list)


--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Let's set so double the killer delete select all. 




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Richard Broersma Jr
> Views certainly help in managing complexity. They do nothing to improve
> query-speed.
> 
> Querying a view gets rewritten to queries to the underlying tables on the
> fly.
> (as long as there are no materialized views, which are still on the TODO
> list)

Would partial indexes on the most queried regions of the table help query
speed?

Regards,

Richard Broersma Jr.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Harald Armin Massa
Louis,

Views certainly help in managing complexity. They do nothing to improve
query-speed.

Querying a view gets rewritten to queries to the underlying tables on the fly.
(as long as there are no materialized views, which are still on the TODO list)

-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Let's set so double the killer delete select all.


Re: [GENERAL] Tuning to speed select

2006-08-09 Thread louis gonzales
What about creating views on areas of the table that are queried often?  
I don't know if you have access or the ability to find what type of 
trends the table has, in terms of queries, but if you create some views 
on frequently visited information, this could also help.


Tom Laudeman wrote:


Hi,

I'm running PostgreSQL version 8 on a dual 2.4GHz Xeon with 1GB of RAM 
and an IDE hard drive. My big table has around 9 million records.


Is there a tuning parameter I can change to increase speed of selects? 
Clearly, there's already some buffering going on since selecting an 
indexed ~50,000 records takes 17 seconds on the first try, and only 
0.5 seconds on the second try (from pgsql).


cowpea=> explain analyze select bs_fk from blast_result where si_fk=11843254;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using si_fk_index on blast_result  (cost=0.00..22874.87 rows=58118 width=4) (actual time=112.249..17472.935 rows=50283 loops=1)
   Index Cond: (si_fk = 11843254)
 Total runtime: 17642.522 ms
(3 rows)

cowpea=> explain analyze select bs_fk from blast_result where si_fk=11843254;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using si_fk_index on blast_result  (cost=0.00..22874.87 rows=58118 width=4) (actual time=0.178..341.643 rows=50283 loops=1)
   Index Cond: (si_fk = 11843254)
 Total runtime: 505.011 ms
(3 rows)

cowpea=> 



Thanks,
Tom





---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Michael Fuhr
On Wed, Aug 09, 2006 at 03:46:38PM +0200, Martijn van Oosterhout wrote:
> On Wed, Aug 09, 2006 at 09:19:31AM -0400, Tom Laudeman wrote:
> > Is there a tuning parameter I can change to increase speed of selects? 
> > Clearly, there's already some buffering going on since selecting an 
> > indexed ~50,000 records takes 17 seconds on the first try, and only 0.5 
> > seconds on the second try (from pgsql).
> 
> Your OS is probably buffering, 1GB of RAM holds a lot of data. You can
> try increasing the shared_buffers parameter, but if the delay is
> getting data from the disk, that won't really help you.

If most of your queries use the same index then clustering on that
index might speed up initial (i.e., not-cached) queries by reducing
the number of disk pages that need to be read.  See the documentation
for more information.

http://www.postgresql.org/docs/8.1/interactive/sql-cluster.html

-- 
Michael Fuhr

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Tuning to speed select

2006-08-09 Thread Martijn van Oosterhout
On Wed, Aug 09, 2006 at 09:19:31AM -0400, Tom Laudeman wrote:
> Hi,
> 
> I'm running PostgreSQL version 8 on a dual 2.4GHz Xeon with 1GB of RAM 
> and an IDE hard drive. My big table has around 9 million records.
> 
> Is there a tuning parameter I can change to increase speed of selects? 
> Clearly, there's already some buffering going on since selecting an 
> indexed ~50,000 records takes 17 seconds on the first try, and only 0.5 
> seconds on the second try (from pgsql).

Your OS is probably buffering, 1GB of RAM holds a lot of data. You can
try increasing the shared_buffers parameter, but if the delay is
getting data from the disk, that won't really help you.
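
For what it's worth, a sketch of the relevant postgresql.conf lines on a 1GB
box of that era (purely illustrative values, not a recommendation; releases
before 8.2 take these as counts of 8kB pages rather than sizes):

shared_buffers = 16384          # ~128MB of shared buffer cache
effective_cache_size = 65536    # ~512MB; planner hint about what the OS is likely caching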

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




[GENERAL] Tuning to speed select

2006-08-09 Thread Tom Laudeman

Hi,

I'm running PostgreSQL version 8 on a dual 2.4GHz Xeon with 1GB of RAM 
and an IDE hard drive. My big table has around 9 million records.


Is there a tuning parameter I can change to increase speed of selects? 
Clearly, there's already some buffering going on since selecting an 
indexed ~50,000 records takes 17 seconds on the first try, and only 0.5 
seconds on the second try (from pgsql).


cowpea=> explain analyze select bs_fk from blast_result where si_fk=11843254;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using si_fk_index on blast_result  (cost=0.00..22874.87 rows=58118 width=4) (actual time=112.249..17472.935 rows=50283 loops=1)
   Index Cond: (si_fk = 11843254)
 Total runtime: 17642.522 ms
(3 rows)

cowpea=> explain analyze select bs_fk from blast_result where si_fk=11843254;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using si_fk_index on blast_result  (cost=0.00..22874.87 rows=58118 width=4) (actual time=0.178..341.643 rows=50283 loops=1)
   Index Cond: (si_fk = 11843254)
 Total runtime: 505.011 ms
(3 rows)

cowpea=>  




Thanks,
Tom


--
Tom Laudeman
[EMAIL PROTECTED]
(434) 924-2456
http://www.people.virginia.edu/~twl8n/
http://laudeman.com/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [GENERAL] Tuning queries inside a function

2005-05-03 Thread Richard Huxton
Mike Nolan wrote:
Maybe you could return a refcursor pointing to the EXPLAIN ANALYZE of
the query inside the function.
The raw materials exist to do this: if you know which elements of a
query will be replaced by plpgsql variables, you can duplicate the
results via
PREPARE foo(...) AS ...
EXPLAIN EXECUTE foo(...)
Certainly there is a lot more that we can and must do about making
it easier to debug and tune plpgsql functions.  But you can fix 'em
with a little determination even now...

If I know which elements of a query will be replaced by variables, I can
enter the query in psql, which I've done.  (I can always output the variables
to the log from inside the function.) 
Be aware that if you're pasting values in the place of the variables 
then PG can come up with a different plan.
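
A minimal sketch of the PREPARE trick described above (the parameter type,
query, and value stand in for whatever the plpgsql function actually does):

PREPARE foo(integer) AS
    SELECT bs_fk FROM blast_result WHERE si_fk = $1;

EXPLAIN EXECUTE foo(11843254);

DEALLOCATE foo;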

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [GENERAL] Tuning queries inside a function

2005-05-02 Thread Tom Lane
"Joshua D. Drake" <[EMAIL PROTECTED]> writes:
> Mike Nolan wrote:
>> That part I get, but I cannot seem to get an 'explain select' to return
>> the explain output inside a function.

> Oh interesting. Hmmm. Alvaro can you think of a way to execute the 
> result into a variable and return it as a notice?

I think it's done already, at least if you are using a recent release.
I note the following relevant items in the CVS log:

2005-04-05 14:05  tgl

* doc/src/sgml/plpgsql.sgml, src/pl/plpgsql/src/gram.y: Adjust
grammar for plpgsql's OPEN command so that a cursor can be OPENed
on non-SELECT commands such as EXPLAIN or SHOW (anything that
returns tuples is allowed).  This flexibility already existed for
bound cursors, but OPEN was artificially restricting what it would
take.  Per a gripe some months back.

2005-02-10 15:36  tgl

* src/backend/: executor/spi.c, tcop/pquery.c (REL8_0_STABLE),
executor/spi.c, tcop/pquery.c: Fix SPI cursor support to allow
scanning the results of utility commands that return tuples (such
as EXPLAIN).  Per gripe from Michael Fuhr.  Side effect: fix an old
bug that unintentionally disabled backward scans for all
SPI-created cursors.

(The latter is in 8.0.2 and up, the former only in CVS tip.)

This is relevant to plpgsql because both "FOR ... IN query" and plpgsql
cursors depend on SPI cursors.

regards, tom lane
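
So a refcursor-over-EXPLAIN function along the lines Mike asked about might
look like this sketch (function name and query invented for illustration;
needs a release that includes the OPEN-on-EXPLAIN change above):

CREATE OR REPLACE FUNCTION explain_blast(integer) RETURNS refcursor AS '
DECLARE
    c refcursor;
BEGIN
    OPEN c FOR EXPLAIN ANALYZE
        SELECT bs_fk FROM blast_result WHERE si_fk = $1;
    RETURN c;
END;
' LANGUAGE plpgsql;

-- usage sketch (the cursor only lives until the end of the transaction):
-- BEGIN;
-- SELECT explain_blast(11843254);   -- returns the generated cursor name
-- FETCH ALL IN "<that cursor name>";
-- COMMIT;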

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Tuning queries inside a function

2005-05-02 Thread Joshua D. Drake
Mike Nolan wrote:
Mike Nolan wrote:
select * from foo('bar','debug')

But how do I do that inside a pl/pgsql function?  'select into' doesn't
seem to work properly.

You would have to code it. For example:
IF $2 = ''debug'' THEN:

That part I get, but I cannot seem to get an 'explain select' to return
the explain output inside a function.
Oh interesting. Hmmm. Alvaro can you think of a way to execute the 
result into a variable and return it as a notice?
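
One way that should work on a sufficiently recent release (per the SPI/cursor
fixes discussed elsewhere in this thread) is to loop over the EXPLAIN output
inside the function and re-emit it as NOTICEs; a hedged sketch, with the
function shape and query invented for illustration:

CREATE OR REPLACE FUNCTION foo(text, text) RETURNS integer AS '
DECLARE
    r record;
BEGIN
    IF $2 = ''debug'' THEN
        FOR r IN EXECUTE ''EXPLAIN ANALYZE SELECT bs_fk FROM blast_result ''
                      || ''WHERE si_fk = 11843254'' LOOP
            RAISE NOTICE ''%'', r."QUERY PLAN";
        END LOOP;
    END IF;
    -- ... real work of the function goes here ...
    RETURN 0;
END;
' LANGUAGE plpgsql;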

Sincerely,
Joshua D. Drake

--
Mike Nolan

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedication Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [GENERAL] Tuning queries inside a function

2005-05-02 Thread Mike Nolan
> Mike Nolan wrote:
> >>select * from foo('bar','debug')
> > 
> > 
> > But how do I do that inside a pl/pgsql function?  'select into' doesn't
> > seem to work properly.
> 
> 
> You would have to code it. For example:
> 
> IF $2 = ''debug'' THEN:

That part I get, but I cannot seem to get an 'explain select' to return
the explain output inside a function.
--
Mike Nolan

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] Tuning queries inside a function

2005-05-02 Thread Joshua D. Drake
Mike Nolan wrote:
select * from foo('bar','debug')

But how do I do that inside a pl/pgsql function?  'select into' doesn't
seem to work properly.

You would have to code it. For example:
IF $2 = ''debug'' THEN:
 

I would have to check whether I'd be able to include a timestamp at the
beginning of each notice.

You can do that from the config file, but it only gives the time to the 
nearest second, which may not be a fine enough time interval.
--
Mike Nolan

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedication Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster

