[GENERAL] optimizing postgres
Hello all - I'm working on a postgres project after coming from a MySQL background ( no flames, please :). We are importing fairly large xml datasets ( 10-20 MB of xml files per 'project', currently 5 projects) into the database for querying. We are using PHP to create a web interface where users can create projects and upload their files. We have a parsing routine that gets the xml data into the database. The parsing has turned out to be pretty intense. It takes about 10-20 minutes for any project. When we are parsing data, it really slows down the site's response. I tested serving static webpages from apache, endless loops in php , but the choke point seems to be doing any other query on postgres when constructing a php page during parsing. As an example, the original designer specified separate tables for each project. Since they were all the same structure, I suggested combining them into a single table with a project_id column, but he said it would take too long to query. I was suspicious, but I went with his design anyway. It turned out he was right for our current set up. When I needed to empty the project table to re-parse data, doing a cascading delete could take up to 10 minutes! I cut re-parsing time in half by just dropping the table and creating a new one. Which was an okay thing to do, since the table only belonged to one project anyway. But I hate to think how long it would have taken to do a delete, cascading to child tables, if all the project data was in a single table. Since I'm not an expert in Postgres database design, I'm assuming I've done something sub-optimal. Are there some common techniques for tuning postgres performance? Do we need beefier hardware? Or is it a problem with how PHP or apache pools connections to the database? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] table disk space usage query?
Hello all - I was looking for a way to find out how much disk space each table is using. I stumbled upon this page ( http://www.ffnn.nl/pages/articles/linux/postgresql-tips-and-tricks.php ) which gave me a query to show the number of disk pages per object. Given that a page is 8kb, I added these calculated columns to the query: SELECT relname, reltuples, relpages, relpages * 8 AS relpagesKB, (relpages * 8 )/1024 AS relpagesMB FROM pg_class ORDER BY relpages DESC ; Is this correct? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] persistent db connections in PHP
This seems to be a problem with PHP, or at least my set up. I'm writing pages in basically the same way. Each page has an include at the top that gets you a database session. The function, either pg_connect() or mysql_connect(), is supposed to either create a new connection, or return your existing one. So after I have a connection, I can navigate to other pages, reload or post to the current one, trigger the x_connect(), and get the session I created earlier. In my Mysql site, if I create temporary tables, I still have access to them after I have traversed a mysql_connect. So it looks like PHP is giving me the connection I had when I created the temp tables. However, with this new Postgres site, I don't have access to my temp tables after I've traversed another pg_connect. So PHP is either creating a new connection, or giving me another session, not the one which I created my tables in. Steve ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] persistent db connections in PHP
Hello all! I'm working on a PHP site using Postgres as a back-end. I have an include at the top of each page that runs the pg_connect function. I'm implementing some temporary tables, which I understand are destroyed automatically at the end of the session. It seems to me that when I navigate to a new page, or re-load my existing page, I lose my temporary tables. Is this because I am creating a new session at the beginning of each page with my pg_connect include? Also, are temporary tables globally visible, or only within that session? I mean, can one user see another user's temporary tables? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] using subselects
Hello everyone - I'm moving from MySQL to Postgres and I am in a situation where I am trying to use subselects properly. I have a table of projects, users, and user_projects. The table user_projects creates a many-to-many relationship between users and projects. I'm creating a select list on a web form for adding a new user to a project. I want to select users from the user table, who aren't already listed in the join table under that project. Here's my select: SELECT * FROM users WHERE user_id <> $current_user_id AND user_id <> ( SELECT user_id FROM user_projects WHERE project_id = $project_id ) This query returns no rows, even on projects that have no records in the user_projects table! I am certain that I am not the $current_user_id. If I run this query: SELECT * FROM users WHERE user_id <> $current_user_id I get all the user records besides myself. What am I doing wrong? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
[GENERAL] collision in serial numbers after INSERT?
Hello all - I'm working on a site with PHP and Postgres, coming from a MySQL background. I was looking for an equivalent to the mysql_insert_id() function, and a site recommended this: function postg_insert_id($tablename, $fieldname) { global connection_id; $result=pg_exec($connection_id, "SELECT last_value FROM ${tablename}_ ${fieldname}_seq"); $seq_array=pg_fetch_row($result, 0); return $seq_array[0]; } It relies on pg's sequencing ability. However, I wondered, if I were in an environment where there were many concurrent inserts, would it be possible that I didn't get the serial number of the insert that *I* just did? That if I do an insert, and several inserts happen after mine, wouldn't I get the id of the latest row, which is several inserts after mine? I don't think this would be a problem in our environment, but I am just wondering. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/