Ray Hunter
Firmware Engineer
ENTERASYS NETWORKS
-----Original Message-----
From: Aron Pilhofer [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 04, 2002 9:10 AM
To: Hunter, Ray
Subject: RE: [PHP-DB] optimization (another tack)That would be great! Thanks.
[Aron Pilhofer]-----Original Message-----
From: Hunter, Ray [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 04, 2002 11:04 AM
To: 'Aron Pilhofer'; [EMAIL PROTECTED]
Subject: RE: [PHP-DB] optimization (another tack)If you are using php and a database you can add more memory to the script and optimize the database. I only use postgres databases for all my large data so I can let you know how to optimize postgres...
Ray Hunter
Firmware EngineerENTERASYS NETWORKS
-----Original Message-----
From: Aron Pilhofer [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 04, 2002 9:02 AM
To: [EMAIL PROTECTED]
Subject: [PHP-DB] optimization (another tack)
Let me try this again more generally. I am trying to optimize a function in PHP that handles very large result sets, which are transferred to arrays, and that does some extremely heavy lifting in terms of calculations on those arrays. By design, it iterates through each possible combination of two result sets, and does some calculations on those results. As you can imagine, the numbers get quite large, quite fast; sets of 500 by 1000 necessitate a half-million calculations.
So, short of rewriting this function in C, which I cannot do, are there any suggestions for optimizing. For example:
1) is there any advantage to caching an array as a local file?
2) the script pumps the results of the calculations into a new table.. would it be faster to dump it into a local file instead?3) is there any advantage to executing the script as a CGI? (does that make sense? I don't know if I know the correct jargon here...)
Any other tips folks have for scripts that handle a lot of calculations would be greatly appreciated.
Thanks in advance.
--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Optimizing Postgresql Ericson Smith
Following Tim Perdue's excellent article on the comparison between MySQL and Postgresql, I decided to take a shot at installing and using this database. For most of our work I use MySQL and will continue to do so, because of its ease of use and unrivaled select query speed, and also because there is no point in trying to mess around with production systems that already work fine. But some new projects suffered greatly from MySQL's table locking feature when I needed to update data (which I do a lot). Here are my adventures in setting up a Postgresql database server. Our configuration for a dedicated Postgresql server was: Redhat 7.1 Dual PIII 650Mhz System 512MB RAM 18Gig SCSI drive for the postgresql data partition Downloading and Installing I downloaded and installed the 7.1.2 RPM's from http://postgres.org without any trouble. For a server installation, I only installed: postgresql-server and postgresql-7.1.2 (base). I then started the server up and running by executing: /etc/init.d/postgresql start A small sized database was ported from MySQL (three tables totaling about 5000 records). I created sufficient indexes for postgresql's optimizer to use, and modified our C application to use the postgresql C client interface for a small CGI program that would brutally query this table. This small CGI program receives thousands of queries per minute. Optimizing One of the first things I noticed after turning on the CGI program, was that although queries were returned almost as fast as from the previous MySQL based system, the load on the server was much higher -- in fact almost 90-percent! Then I started to go down into the nitty-gritty of things. I had optimized MySQL before by greatly increasing cache and buffer sizes and by throwing more ram towards the problem. The single biggest thing that you have to do before running Postgresql, is to provide enough shared buffer space. Let me repeat: provide enough buffer space! Let's say you have about 512MB of ram on a dedicated database server, then you need to turn over about 75-percent of it to this shared buffer. Postgresql does best when it can load most or -- even better -- all of a table into its shared memory space. In our case, since our database was fairly small, I decided to allocate 128MB of RAM towards the shared buffer space. The file /var/lib/pgsql/data/postgresql.conf contains settings for the database server. Postgresql uses system shared memory as a buffer. On a Linux system, you can see how much shared memory was allocated by your system by running the command: cat /proc/sys/kernel/shmmax And to view shared memory use on the system: ipcs The result will be in bytes. By default RedHat 7.1 allocates 32MB of shared memory, hardly enough for postgresql. I increased this limit to 128MB by doing the command: cat 128000000 > /proc/sys/kernel/shmmax Be aware that once you reboot the server, this setting will disappear. You need to place this line in your postgresql startup file, or by editing the /etc/sysctl.conf file for a more permanent setting. Then in our postgresql.conf I set shared_buffers to 15200. Because Postgresql uses 8K segments, I made a calculation of 128000/8192 plus a 512K overhead. I also set our sort_mem to 32168 (32Megs for a sort memory area). Since connection pooling was in effect, I set max_connections to 64. And fsync was also set to false. shared_buffers = 15200 sort_mem = 32168 max_connections=64 fsync=false You can read the manual to tweak other settings, but I never had the need to do so. Note that if you set shared_buffers to more than what your shared memory limit is, postgresql will refuse to start. This confused us for a while, since no logging was taking place. You can tweak the startup file in /etc/init.d for the postmaster to write its output to a log file. Change the fragment from /postmaster start > /dev/null 2> to /postmaster start > /var/lib/pgsql.log 2> (or wherever you want to store the log.) Tailing the log file clearly explained what the problem was. All sorts of sexy debugging info will show up in this file, which includes SQL syntax errors, the output of EXPLAIN state, emts, connection problems, authentication attempts, and so forth. I restarted postgresql and brought our CGI online. Our jaws collectively dropped to the floor as postgresql literally flew as soon as it started to use the buffer. Server load by postgresql dropped to just under 10-percent. One hitch I found with an early version of the system was that it had to build up and tear down a postgresql connection with each request. This was intolerable, so I started to use the connection pooling features of the C library. Server load dropped another few notches with this option. With PHP you will want to use persistent connections (pg_pconnect instead of pg_connect) to fully take advantage of this effect. Indexes I cannot emphasize enough the need to have proper indexing in postgresql. One early mistake that I made was to index BIGINT columns. The columns were indexed ok, but postgresql refused to make use them. After two days of tearing out my hair, it came to me that the architecture of the system was 32 bits. Could it be that postgresql refuses to make use of a 64 bit (BIGINT) index? Changing the type to INTEGER quickly solved that problem. Maybe if I had one of those new-fangled 64 bit Itanium processors. Conclusion There are many things that you can do with your SQL statements to also improve query response, but these are adequately covered in the interactive postgresql documentation. Ericson Smith is a web developer at http://did-it.com.;
-- PHP Database Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php