RE: [PHP-DB] optimization (another tack)

Hunter, Ray Mon, 04 Mar 2002 08:06:47 -0800

Title: Message

Here is general information on postgres: http://www.ca.postgresql.org/docs/aw_pgsql_book/index.html

Here is an optimization file included: This was not written by me!!!

Let me know if you have questions...s

Ray Hunter

Firmware Engineer

ENTERASYS NETWORKS

-----Original Message-----
From: Aron Pilhofer [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 04, 2002 9:10 AM
To: Hunter, Ray
Subject: RE: [PHP-DB] optimization (another tack)

That would be great! Thanks.

[Aron Pilhofer]

-----Original Message-----
From: Hunter, Ray [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 04, 2002 11:04 AM
To: 'Aron Pilhofer'; [EMAIL PROTECTED]
Subject: RE: [PHP-DB] optimization (another tack)

If you are using php and a database you can add more memory to the script and optimize the database. I only use postgres databases for all my large data so I can let you know how to optimize postgres...

Ray Hunter
Firmware Engineer

ENTERASYS NETWORKS

-----Original Message-----
From: Aron Pilhofer [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 04, 2002 9:02 AM
To: [EMAIL PROTECTED]
Subject: [PHP-DB] optimization (another tack)

Let me try this again more generally. I am trying to optimize a function in PHP that handles very large result sets, which are transferred to arrays, and that does some extremely heavy lifting in terms of calculations on those arrays. By design, it iterates through each possible combination of two result sets, and does some calculations on those results. As you can imagine, the numbers get quite large, quite fast; sets of 500 by 1000 necessitate a half-million calculations.

So, short of rewriting this function in C, which I cannot do, are there any suggestions for optimizing. For example:

1) is there any advantage to caching an array as a local file?
2) the script pumps the results of the calculations into a new table.. would it be faster to dump it into a local file instead?

3) is there any advantage to executing the script as a CGI? (does that make sense? I don't know if I know the correct jargon here...)

Any other tips folks have for scripts that handle a lot of calculations would be greatly appreciated.

Thanks in advance.

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Optimizing Postgresql

Ericson Smith


Following Tim Perdue's excellent article on the comparison between MySQL and 
Postgresql, 
I decided to take a shot at installing and using this database. For most of our work I 
use MySQL 
and will continue to do so, because of its ease of use and unrivaled select query 
speed, 
and also because there is no point in trying to mess around with production systems 
that already work fine. 

But some new projects suffered greatly from MySQL's table locking feature when I 
needed 
to update data (which I do a lot). Here are my adventures in setting up a Postgresql 
database server. 

Our configuration for a dedicated Postgresql server was: 

Redhat 7.1 
Dual PIII 650Mhz System 
512MB RAM 
18Gig SCSI drive for the postgresql data partition 

Downloading and Installing
I downloaded and installed the 7.1.2 RPM's from http://postgres.org without any 
trouble. 
For a server installation, I only installed: postgresql-server and postgresql-7.1.2 
(base). 

I then started the server up and running by executing: 
/etc/init.d/postgresql start 

A small sized database was ported from MySQL (three tables totaling about 5000 
records). 
I created sufficient indexes for postgresql's optimizer to use, and modified our C 
application 
to use the postgresql C client interface for a small CGI program that would brutally 
query 
this table. This small CGI program receives thousands of queries per minute. 



Optimizing
One of the first things I noticed after turning on the CGI program, was that although 
queries 
were returned almost as fast as from the previous MySQL based system, the load on the 
server was much higher -- in fact almost 90-percent! Then I started to go down into 
the 
nitty-gritty of things. I had optimized MySQL before by greatly increasing cache and 
buffer 
sizes and by throwing more ram towards the problem. 
The single biggest thing that you have to do before running Postgresql, is to provide 
enough 
shared buffer space. Let me repeat: provide enough buffer space! Let's say you have 
about 
512MB of ram on a dedicated database server, then you need to turn over about 
75-percent 
of it to this shared buffer. Postgresql does best when it can load most or -- even 
better -- all 
of a table into its shared memory space. In our case, since our database was fairly 
small, 
I decided to allocate 128MB of RAM towards the shared buffer space. 

The file /var/lib/pgsql/data/postgresql.conf contains settings for the database 
server. 
Postgresql uses system shared memory as a buffer. On a Linux system, you can see how 
much shared memory was allocated by your system by running the command: 
cat /proc/sys/kernel/shmmax 

And to view shared memory use on the system: 
ipcs 

The result will be in bytes. By default RedHat 7.1 allocates 32MB of shared memory, 
hardly 
enough for postgresql. I increased this limit to 128MB by doing the command: 

cat 128000000 > /proc/sys/kernel/shmmax 

Be aware that once you reboot the server, this setting will disappear. You need to 
place this 
line in your postgresql startup file, or by editing the /etc/sysctl.conf file for a 
more permanent 
setting. 

Then in our postgresql.conf I set shared_buffers to 15200. Because Postgresql uses 8K 
segments, I made a calculation of 128000/8192 plus a 512K overhead. I also set our 
sort_mem
to 32168 (32Megs for a sort memory area). Since connection pooling was in effect, I 
set 
max_connections to 64. And fsync was also set to false. 

shared_buffers = 15200
sort_mem = 32168
max_connections=64
fsync=false

You can read the manual to tweak other settings, but I never had the need to do so. 
Note that if you set shared_buffers to more than what your shared memory limit is, 
postgresql will refuse to start. This confused us for a while, since no logging was 
taking place. 
You can tweak the startup file in /etc/init.d for the postmaster to write its output 
to a log file. 
Change the fragment from 

/postmaster start > /dev/null 2>

to 
/postmaster start > /var/lib/pgsql.log 2>

(or wherever you want to store the log.) 
Tailing the log file clearly explained what the problem was. 

All sorts of sexy debugging info will show up in this file, which includes SQL syntax 
errors, 
the output of EXPLAIN state, emts, connection problems, authentication attempts, and 
so forth. 

I restarted postgresql and brought our CGI online. Our jaws collectively dropped to 
the floor 
as postgresql literally flew as soon as it started to use the buffer. Server load by 
postgresql 
dropped to just under 10-percent. 

One hitch I found with an early version of the system was that it had to build up and 
tear 
down a postgresql connection with each request. This was intolerable, so I started to 
use 
the connection pooling features of the C library. Server load dropped another few 
notches 
with this option. With PHP you will want to use persistent connections 
(pg_pconnect instead of pg_connect) to fully take advantage of this effect. 



Indexes
I cannot emphasize enough the need to have proper indexing in postgresql. One early 
mistake 
that I made was to index BIGINT columns. The columns were indexed ok, but postgresql
 refused to make use them. After two days of tearing out my hair, it came to me that 
the 
architecture of the system was 32 bits. Could it be that postgresql refuses to make 
use of a 
64 bit (BIGINT) index? Changing the type to INTEGER quickly solved that problem. Maybe 
if I 
had one of those new-fangled 64 bit Itanium processors. 


Conclusion
There are many things that you can do with your SQL statements to also improve query 
response, but these are adequately covered in the interactive postgresql 
documentation. 


Ericson Smith is a web developer at http://did-it.com.;

-- 
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

RE: [PHP-DB] optimization (another tack)

Reply via email to