[GENERAL] Performance difference between 7.2 and 7.3
Hi all: I have here a table with the following schema:

                            Table "todocinetv"
   Column    |            Type             |       Modifiers
-------------+-----------------------------+----------------------
 id          | integer                     | not null default '0'
 datestamp   | timestamp without time zone | not null
 thread      | integer                     | not null default '0'
 parent      | integer                     | not null default '0'
 author      | character(37)               | not null default ''
 subject     | character(255)              | not null default ''
 email       | character(200)              | not null default ''
 attachment  | character(64)               | default ''
 host        | character(50)               | not null default ''
 email_reply | character(1)                | not null default 'N'
 approved    | character(1)                | not null default 'N'
 msgid       | character(100)              | not null default ''
 modifystamp | integer                     | not null default '0'
 userid      | integer                     | not null default '0'
Indexes: todocinetv_approved, todocinetv_author, todocinetv_datestamp,
         todocinetv_modifystamp, todocinetv_msgid, todocinetv_parent,
         todocinetv_subject, todocinetv_thread, todocinetv_userid,
         todocinetvpri_key

(It's actually a table created by the discussion board application Phorum, version 3.3.) This table has about 28,000 rows and runs under Postgres 7.2.3 on Red Hat 8.0, on a 2.4 GHz Pentium 4 with 512 MB of RAM.

The problem I'm having is that, when you access the main page of the discussion board, it takes forever to show you the list of posts. The query that Phorum uses for doing so is:

phorum=# explain
phorum-# SELECT thread, modifystamp, count(id) AS tcount, datetime(modifystamp) AS latest, max(id) AS maxid FROM todocinetv WHERE approved='Y' GROUP BY thread, modifystamp ORDER BY modifystamp desc, thread desc LIMIT 30;
NOTICE:  QUERY PLAN:

Limit  (cost=40354.79..40354.79 rows=30 width=12)
  ->  Sort  (cost=40354.79..40354.79 rows=2879 width=12)
        ->  Aggregate  (cost=39901.43..40189.35 rows=2879 width=12)
              ->  Group  (cost=39901.43..40045.39 rows=28792 width=12)
                    ->  Sort  (cost=39901.43..39901.43 rows=28792 width=12)
                          ->  Seq Scan on todocinetv  (cost=0.00..37768.90 rows=28792 width=12)

This query takes up to 3 minutes to execute.
I have tried to strip it down to its most vanilla form (without count(id) and such), and it's still almost as slow:

phorum=# explain
phorum-# SELECT thread, modifystamp, datetime(modifystamp) AS latest FROM todocinetv WHERE approved='Y' ORDER BY modifystamp desc, thread desc LIMIT 30;
NOTICE:  QUERY PLAN:

Limit  (cost=39901.43..39901.43 rows=30 width=8)
  ->  Sort  (cost=39901.43..39901.43 rows=28792 width=8)
        ->  Seq Scan on todocinetv  (cost=0.00..37768.90 rows=28792 width=8)

But here is the weird thing: I dump the table and load it into another machine running Postgres 7.3.2 (Celeron 1.7 GHz, 512 MB of memory), and the query takes only 2 or 3 seconds to execute, even though the query plan is almost the same:

provphorum=# explain
provphorum-# SELECT thread, modifystamp, count(id) AS tcount, modifystamp AS latest, max(id) AS maxid FROM todocinetv WHERE approved='Y' GROUP BY thread, modifystamp ORDER BY modifystamp desc, thread desc LIMIT 30;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=5765.92..5765.99 rows=30 width=12)
   ->  Sort  (cost=5765.92..5772.96 rows=2817 width=12)
         Sort Key: modifystamp, thread
         ->  Aggregate  (cost=5252.34..5604.49 rows=2817 width=12)
               ->  Group  (cost=5252.34..5463.63 rows=28172 width=12)
                     ->  Sort  (cost=5252.34..5322.77 rows=28172 width=12)
                           Sort Key: thread, modifystamp
                           ->  Seq Scan on todocinetv  (cost=0.00..3170.15 rows=28172 width=12)
                                 Filter: (approved = 'Y'::bpchar)
(9 rows)

(I took out the datetime() function, since 7.3 didn't accept it and I didn't think it was relevant to the performance problem. Am I wrong?)

So my question is: what causes such a big difference (3 minutes vs. 3 seconds)? Does the version difference (7.2 vs. 7.3) account for all of it, or should I start looking at other factors? As I said, both machines are roughly equivalent hardware-wise, and as for shared buffers, the faster machine actually has fewer of them (the 7.3 machine has shared_buffers = 768, while the 7.2 one has shared_buffers = 1024).

Paulo Jan.
DDnet.

---(end of broadcast)---
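Editor's note: one detail in the plans above is worth flagging. The 7.2 Seq Scan is costed at 37768.90 while the 7.3 one is 3170.15, for essentially the same row count (~28,000). A cost gap that large for the same data usually points at dead-tuple bloat in the 7.2 table rather than at the planner itself. A hedged sketch of what one might try on the 7.2 machine (this is an assumption, not something confirmed in the thread; the index name is made up):

```sql
-- Reclaim dead space and refresh planner statistics (VACUUM FULL
-- rewrites the table compactly; available from Postgres 7.2 on):
VACUUM FULL VERBOSE todocinetv;
ANALYZE todocinetv;

-- A composite index matching the ORDER BY clause may also help the
-- LIMIT 30 query avoid sorting the whole table (hypothetical name):
CREATE INDEX todocinetv_mod_thread ON todocinetv (modifystamp, thread);
```

If VACUUM FULL reports a large number of removed tuple versions, that alone would explain the 3-minute runtime on otherwise comparable hardware.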
[GENERAL] Database design question: ugliness or referential integrity?
Hi all: Let's say I'm designing a database (Postgres 7.3) with a list of all email accounts on a certain server:

CREATE TABLE emails (
    clienteid INT4,
    direccion VARCHAR(512) PRIMARY KEY,
    login     VARCHAR(128) NOT NULL,
    password  VARCHAR(128),
    dominio   VARCHAR(256)
);

The PHBs want to have a log of when an email account was added, which technician did it, when it was deleted, when we had to reset its password, etc.:

CREATE TABLE emails_log (
    direccion VARCHAR(512) REFERENCES emails,
    fecha     DATE,
    autor     VARCHAR(32),
    texto     VARCHAR(1024)
);

texto would be a free-form text field explaining what has been done. Now, suppose that an email account is deleted, and six months later another user requests it and we add it again. Do we want to keep an audit trail for the old version of that account? The PHBs say yes, which means we can't use the email address as the primary key. Fine: we add an id column to the emails table, make it the primary key, and point the foreign key in emails_log to that column. But now we have two options, and here is my question:

-In emails, the direccion column needs to be unique... but only for the active email addresses (there can be 5, 10, or 20 dead addresses called [EMAIL PROTECTED], but only one alive at any given moment). We could add an active boolean column to emails and write a custom constraint to check this condition, but I find it ugly (and I saw similar objections when another user came up with a similar problem some time ago)...

-...Or we could create a table called dead_emails and add to it the email addresses that we delete (using an ON DELETE trigger, perhaps). Basically, store the deleted email accounts in another table... but then we lose the referential integrity check in emails_log.

The question is: what would you do? (I don't really like the idea of creating yet another dead_emails_log table pointing to dead_emails; I find it almost as ugly as the first option.)

Paulo Jan.
DDnet.
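Editor's note: the "custom constraint" the poster finds ugly can in fact be expressed declaratively. Postgres supports partial UNIQUE indexes, which enforce uniqueness only over rows matching a WHERE clause. A sketch of the reworked schema under that approach (the id and active columns and index name are illustrative, not from the original post):

```sql
CREATE TABLE emails (
    id        SERIAL PRIMARY KEY,
    clienteid INT4,
    direccion VARCHAR(512) NOT NULL,
    login     VARCHAR(128) NOT NULL,
    password  VARCHAR(128),
    dominio   VARCHAR(256),
    active    BOOLEAN NOT NULL DEFAULT TRUE
);

-- Only one *live* row per address; any number of dead ones may repeat:
CREATE UNIQUE INDEX emails_direccion_viva ON emails (direccion) WHERE active;

-- The log keeps referential integrity via the surrogate key, so the
-- audit trail survives deletion-and-recreation of an address:
CREATE TABLE emails_log (
    email_id INTEGER REFERENCES emails (id),
    fecha    DATE,
    autor    VARCHAR(32),
    texto    VARCHAR(1024)
);
```

This keeps everything in one table, so no dead_emails table or ON DELETE trigger is needed; "deleting" an account becomes UPDATE emails SET active = FALSE.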
[GENERAL] Tuple too big when the tuple is not that big...
Hi all: I have a problem here, using Postgres 6.5.3 on Red Hat Linux 6.0. I have a table where, each time I do a "vacuum analyze", the database complains: "ERROR: Tuple is too big: size 10460"... and the problem is that, as far as I know, there isn't any record that goes beyond the 8K limit.

Some background: the table in question was initially created with a "text" field, and it gave us endless problems (crashes, core dumps, etc.). After searching the archives and finding a number of people warning against using the "text" type (especially in the 6.x series), I dumped the table contents (with COPY) and recreated it using "varchar(8088)" instead. When importing the data back, Postgres didn't say anything, and I assume that if there had been any field bigger than 8K it would have complained. BUT... right after importing the data into the brand-new table, I try a "vacuum analyze" again and it does the same thing.

Some other facts:
-"Vacuum" works fine. It's just "vacuum analyze" that gives problems.
-The table doesn't have any indices.
-Every time I try a "\d (table)", Postgres dumps core with "backend closed the channel unexpectedly".

Any ideas? (Aside from upgrading to 7.x; we can't do that for now.) Do you need any other information?

Paulo Jan.
DDnet.
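Editor's note: two points help make sense of this. First, the pre-7.1 8K limit applies to the whole tuple (all columns plus the row header), not to one field, so a varchar(8088) value plus the other columns can legitimately exceed it. Second, "vacuum" succeeding while "vacuum analyze" fails is consistent with the statistics step: analyze writes the widest observed column values into pg_statistic, and that derived tuple can itself blow the limit. A hedged sketch for checking the data (the table and column names "la_tabla" and "cuerpo" are placeholders; substitute the real ones):

```sql
-- List the widest values in the suspect varchar column; anything close
-- to 8K, combined with the other columns, can push a tuple past the
-- page-size limit on 6.5:
SELECT length(cuerpo) AS chars
FROM   la_tabla
ORDER  BY length(cuerpo) DESC
LIMIT  5;
```

If the widest values sit near 8000, trimming them (or splitting the text across rows) should let vacuum analyze complete.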
Re: [GENERAL] Tuple is too big
Steve Wolfe wrote:

    After moving a database to a new machine, I tried a vacuum analyze, and get "ERROR: Tuple is too big: size 8180, max size 8140". I know that there's a limit on the tuple size and all of that - my question is: how do I fix it? Vacuum analyze barfs, as does trying "\d tablename". I tried it on both machines, and the same thing. I suppose that I could write a parser to go through the pg_dump and find offending fields, but I'm hoping that there's a way for PostgreSQL to fix it. If there isn't a way for it to fix data that it's created, that's scary. :) It is postgresql 6.5.3.

Me too. It's been happening for the last few weeks with a database that didn't have any problems before. By experimenting, I've observed that this behaviour disappears when I remove a certain table that a co-worker created; the problem is that that table doesn't have any tuple bigger than the max. supported size. Looking at the data stored in it, I don't see anything bigger than 8000 bytes (more or less) either. We are using that table (and others) to store texts, in a field defined as varchar(8000). I suppose that if somebody had tried to insert a text bigger than that, the database would have refused with an error... Just in case: is there any character that, when inserted, makes the tuple grow beyond the maximum size while still technically taking just one byte? (Some of the inserted texts were FrontPage-generated HTML, and had all kinds of tabs, carriage returns and such.)

Paulo Jan.
DDnet.
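Editor's note: on the closing question, tabs and carriage returns are ordinary single-byte characters and do not inflate the stored size. What can make one "character" occupy several bytes is a multibyte server encoding; in a plain single-byte build, character count and byte count coincide. A hedged way to check, assuming the char_length/octet_length functions are available on the installation in question:

```sql
-- If the server encoding is multibyte, a non-ASCII character such as
-- the n-tilde can occupy more than one byte; in a single-byte build
-- the two counts below are equal:
SELECT char_length('año') AS chars, octet_length('año') AS bytes;
```

The likelier culprit for "tuple too big" with no oversized row in sight is the one noted in the previous post: vacuum analyze builds a pg_statistic tuple out of the widest column values, and that combined tuple can exceed the 6.5 page limit even when every user row fits.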
Re: [GENERAL] existing but not-existing database.
    psql todojamon
    FATAL 1:  Database 'todojamon' does not exist.
    We know this because the directory '/usr/local/pgsql/data/base/todojamon' does not exist.
    You can create a database with the SQL command CREATE DATABASE.
    To see what databases exist, look at the subdirectories of '/usr/local/pgsql/data/base/'.
    Connection to database 'todojamon' failed.

    But if I do psql template1 [inside postgres]:

    template1=> create database todojamon;
    ERROR:  createdb: database 'todojamon' already exists

    How can I delete the todojamon database?

You should have deleted the database with "destroydb". Now you can edit the pg_database table "by hand"; it is the system table where the available databases are stored. Use standard SQL commands like "SELECT * FROM pg_database", etc., to delete the entry for your database.

Paulo Jan.
DDnet.
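Editor's note: the manual cleanup described above, as a concrete sketch. This edits a system catalog directly, so it should be done as the Postgres superuser, ideally with a fresh backup, and only after confirming the row really is an orphan (catalog entry present, data directory gone):

```sql
-- Connected to template1 as the superuser:
SELECT datname FROM pg_database WHERE datname = 'todojamon';

-- Remove the orphaned catalog entry so the name can be reused:
DELETE FROM pg_database WHERE datname = 'todojamon';
```

After the DELETE, createdb todojamon should succeed again, since the name no longer collides with the stale entry.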
[GENERAL] pg_operator corrupted... Any ideas?
Hi all: This is the third time I'm writing to the list asking for help; the first two times nobody bothered to answer (admittedly, they were questions answered in the documentation, but nobody even bothered to say "RTFM"), so let's see if this time I have a bit more luck...

I have a database where the pg_operator table has become corrupted. No matter what I do in it, it says "cannot find attribute 8 of relation pg_operator". On other occasions, when one of the pg_* tables went bad, I tried to dig into it and fix it manually, but this time it's different because I can't even issue any SQL commands; no matter what I do, it gives me the same error. I have even tried to copy the pg_operator table from another database by hand, with no success. Any ideas?

(BTW, I tried searching the mailing list archives at http://www.postgresql.org, and it said: "An error occured! connectDB() -- connect() failed: Operation timed out Is the postmaster running (with -i) at 'pgsql.tht.net' and accepting connections on TCP/IP port '5432'?")

Paulo Jan.
DDnet.
[GENERAL] Recovering data from binary files?
Hi all: Well, after my earlier post regarding pg_operator being corrupted, I received a mail from one of the Postgres developers (I think) suggesting that one of the system indexes (pg_attribute_relid_attnum_index) was corrupted, so I gave it a try and copied said file from the template1 database to the one I have to recover... and now it gives me another error: "cannot find attribute 1 of relation pg_user". Like before, I can't even do a \d to see the existing tables in the database, much less insert/update anything in the pg_* tables to fix it, and after struggling with it the whole morning, I have pretty much given up on that path.

I am now trying to:

1) Re-create the database on another machine.
2) Copy the binary files that contain the tables (minus the pg_* ones) to the other server, into the place where the new database has been created ($PGDATA/base/$dbname).
3) Try to access the tables from there using psql.

The problem now is that, although I can view the tables correctly, they all come up as empty. All the SELECTs I try on them return nothing. And the strange thing is that pg_class.reltuples says that one such table has 5478 tuples, which is correct.

Any other ideas? Is there any other way to access the binary files and extract the information in them as text? My manager is starting to talk about migrating to SQL Server, and frankly, at this point I'm finding fewer and fewer arguments to counter him...

Paulo Jan.
DDnet.
[GENERAL] Miracle
    Oh, it seems almost the same as pg_upgrade. However, you should probably copy pg_log and pg_variable to $PGDATA, as pg_upgrade does. In addition, if the postmaster is running, you should shut it down before doing step 2). Regards.

Well. It worked!!! The database was successfully recovered. Thanks a lot.

Paulo Jan.
DDnet.
[GENERAL] System tables
Hi all: Is there any document that describes the system tables of Postgres (the pg_* ones), their function, their purpose, the relations among them, etc., and how one can manipulate them to recover from errors?

I am asking this because I've just experienced one of those awful database corruption cases, where one connects to an existing database and gets "Couldn't find any tables, sequences or indices!" when trying to list the tables. Okay, let's do a "SELECT * FROM pg_class"... but it doesn't show any of the existing tables. No problem, we'll just dump the database and restore it from the "pg_dump" done by the nightly backup cron job. "destroydb $dbname; createdb $dbname"... but you can't; Postgres tells you that the "database already exists". You do a "psql -l" and a "SELECT * FROM pg_database" while connected to "template1", and neither of them shows said database, but you still can't create it, nor connect to it, because then it tells you that the database doesn't exist... In the end, I had to delete the Postgres installation directory completely, do an initdb, and restore the databases by hand.

All of the above makes me think that, despite what I have said about pg_class and pg_database, there's something I'm missing about the inner workings of Postgres, and I'd really like to know what it is. I have read the FAQ, the user's guide and the administrator's guide (the place where I expected to find said information), and have searched the list archives, and have come up with nothing (or at least, nothing comprehensive; there are mentions scattered here and there to "look at this pg_foo table, which holds the information about blah blah...").

Sorry for the rant-like nature of this post, but in any case... can anybody help? And while we're at it, what do people here do in cases like the one I have described above?

Paulo Jan.
DDnet.
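Editor's note: the catalog queries the post gestures at, written out as concrete sketches. These catalog names have been stable across Postgres versions, so they should apply to the era in question:

```sql
-- All databases the cluster knows about:
SELECT datname FROM pg_database;

-- All relations, by kind ('r' = table, 'i' = index, 'S' = sequence):
SELECT relname, relkind FROM pg_class WHERE relkind IN ('r', 'i', 'S');

-- User tables only, excluding the system catalogs:
SELECT relname
FROM   pg_class
WHERE  relkind = 'r' AND relname NOT LIKE 'pg_%';
```

The "exists but doesn't exist" symptom in the post typically means the on-disk state (data directory) and the catalog state (pg_database row) have diverged; comparing the query output against the contents of $PGDATA/base/ shows which side is stale.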
[GENERAL] Cannot find attribute of relation...
Hi all: The error: "cannot find attribute 3 of relation [table name]". What does it usually mean? How can one fix it, aside from dumping the table and creating it again? If the above is solved with RTFM, can anybody give some pointers?

Thanks in advance,

Paulo Jan.
DDnet.
[GENERAL] Creating numeric type by hand?
Hi all: I have Postgres 6.5.3 installed on Linux... well, sort of. The problem is that the installation in question must be screwed up somewhere, because I can't find the "numeric" type in it (while I can find it in another clean 6.5.3 install), and now I need it for a project. My question is: what do I have to do to create the type in question by hand? I suppose I have to connect to the "template1" database and edit the pg_* tables... but which ones?

Of course, a possible solution would be to dump everything and reinstall Postgres from scratch, but I can't even do a pg_dumpall on it; it tells me:

Can not create pgdump_oid table.  Explanation from backend:
'ERROR:  Relation 'pgdump_oid' already exists'.
pg_dump failed on [db_name], exiting

(If anybody can explain possible causes of this error, I'd be grateful too.)

Paulo Jan.
DDnet.
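Editor's note on the pg_dumpall error: pg_dump of that era created a helper table named pgdump_oid while dumping (to record the maximum OID) and dropped it when finished. The error quoted above suggests a leftover copy from an earlier dump that died partway through. A hedged cleanup sketch (inspect before dropping, to be sure it really is the helper table and not something a user created):

```sql
-- See what is in it first; the genuine pg_dump helper holds only an OID marker:
SELECT * FROM pgdump_oid;

-- If it is the stale helper table, remove it so pg_dump can recreate it:
DROP TABLE pgdump_oid;
```

Re-creating the numeric type by hand is a different matter: it would mean inserting matched rows into pg_type and pg_proc, which is far riskier than the dump/initdb/reload route once pg_dumpall works again.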