[GENERAL] protect a database
Hello. Is there any way to hide database structure (at least functions and triggers) from a superuser/administrator? Regards, Zlatko
Re: [GENERAL] protect a database
Zlatko Matić wrote: Hello. Is there any way to hide database structure (at least functions and triggers) from a superuser/administrator? No. Otherwise they can't be an administrator. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] why postgresql over other RDBMS
Surely such a use case could, and more to the point *should*, be met using PITR? Regards, - Naz. Alvaro Herrera wrote: A.M. wrote: On May 24, 2007, at 14:29 , Wiebe Cazemier wrote: On Thursday 24 May 2007 17:30, Alexander Staubo wrote: [2] Nobody else has this, I believe, except possibly Ingres and NonStop SQL. This means you can do a begin transaction, then issue create table, alter table, etc. ad nauseam, and in the meantime concurrent transactions will just work. Beautiful for atomically upgrading a production server. Oracle, of course, commits after each DDL statement. If this is such a rare feature, I'm very glad we chose postgresql. I use it all the time, and wouldn't know what to do without it. We circumvented Ruby on Rails' migrations, and just implemented them in SQL. Writing migrations is a breeze this way, and you don't have to hassle with atomicity, or the pain when you discover the migration doesn't work on the production server. Indeed. Wouldn't it be a cool feature to persist transaction state across connections, so that a new connection could get access to a sub-transaction state? That way, you could make your schema changes and test them with any number of test clients (which designate the state to connect with) and then you would commit when everything works. Unfortunately, the postgresql architecture wouldn't lend itself well to this. Still, it seems like a basic extension of the notion of sub-transactions. Hmm, doesn't this Just Work with two-phase commit?
[GENERAL] redirecting output of pg_dump
Hello all, I want to take a backup from one server and save it to another machine's hard drive. The backup will be taken through a shell script attached to a cron job, something like: pg_dump -d postgres -U postgres -f (IP address of other machine and path to save the file on that machine). Is there any way? With regards ashish...
Re: [GENERAL] redirecting output of pg_dump
On Wed, Jul 18, 2007 at 03:39:01PM +0530, Ashish Karalkar wrote: I want to take backup from one server and save it to another machine hard drive. The backup will be taken through a shell script attached to a cron job. something like: pg_dump -d postgres -U postgres -f IP address of other machine and path to save the file on that machine Can you use ssh? For example: ssh -x nirmalvihar.info pg_dump -F c ppx > `date +%Y%m%d` -- Make April 15 just another day, visit http://fairtax.org
Re: [GENERAL] redirecting output of pg_dump
On Wed, 18.07.2007 at 15:39:01 +0530, Ashish Karalkar wrote the following: Hello all, I want to take backup from one server and save it to another machine hard drive. The backup will be taken through a shell script attached to a cron job. something like: pg_dump -d postgres -U postgres -f IP address of other machine and path to save the file on that machine is there any way? Yes, several. For instance: - you are on the remote machine (which should store the backup): ssh remote pg_dump ... > backup.sql - you have pg_dump installed on the backup machine: pg_dump -h remote ... > backup.sql - you are on the server: pg_dump ... | ssh backup_server 'cat - > backup.sql' (all untested, but should work) Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
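Andreas's third variant (dump on the server, push to the backup host) can be wrapped into a small cron-ready script. This is only a sketch: the host name `backup_server`, the backup directory, and the database name are hypothetical placeholders, not anything given in the thread.

```shell
#!/bin/sh
# Sketch of the "you are on the server" variant: dump locally, compress,
# stream the dump over ssh, and write the file on the backup host.
# BACKUP_HOST and BACKUP_DIR are hypothetical placeholders.
DB=postgres
BACKUP_HOST=backup_server
BACKUP_DIR=/var/backups/pgsql
STAMP=$(date +%Y%m%d)                 # date-stamped filename for cron runs
FILE="$BACKUP_DIR/$DB-$STAMP.sql.gz"

CMD="pg_dump -U postgres $DB | gzip | ssh $BACKUP_HOST 'cat > $FILE'"

# DRY_RUN=1 (the default here) just prints the command, which is handy
# when debugging the cron entry; set DRY_RUN=0 to actually run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CMD"
else
    eval "$CMD"
fi
```

A cron entry would then call the script on a schedule, e.g. `15 2 * * * /usr/local/bin/pg_backup.sh` (path hypothetical); error handling and rotation of old dumps are left out of the sketch.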
Re: [GENERAL] unconvertable characters
I fixed my data, but I did it manually. It seems like there were hidden characters, which may actually be the 0xc2 (which should not have been there). The data must have been pasted in somehow, but when I copied the value and pasted it back in (or ran an update statement; I tried both) with the same value, including the character that looked like it was wrong, it worked fine. Sim Zacks wrote: Michael, I have been manually debugging and each symbol is different, though they each give the same error code. For example, in one it was a pound sign, though when I did an update and put in the pound sign it worked. Another time it was the degree symbol. I'm going to look at iconv as that sounds like the best possibility. Sim Michael Fuhr wrote: On Mon, Jul 16, 2007 at 04:20:22PM +0300, Sim Zacks wrote: My 8.0.1 database is using ISO_8859_8 encoding. When I select specific fields I get a warning: WARNING: ignoring unconvertible ISO_8859_8 character 0x00c2 Did any of the data originate on Windows? Might the data be in Windows-1255 or some encoding other than ISO-8859-8? In Windows-1255, 0xc2 represents U+05B2 HEBREW POINT HATAF PATAH -- does that character seem correct in the context of the data? http://en.wikipedia.org/wiki/Windows-1255 I now want to upgrade my database to 8.2.4 and change the encoding to UTF-8. When the restore is done, I get the following errors: pg_restore: restoring data for table manufacturers_old pg_restore: [archiver (db)] Error from TOC entry 4836; 0 9479397 TABLE DATA manufacturers postgres pg_restore: [archiver (db)] COPY failed: ERROR: character 0xc2 of encoding ISO_8859_8 has no equivalent in UTF8 CONTEXT: COPY manufacturers_old, line 331 And no data is put into the table. Is there a function I can use to replace the unconvertible characters with blanks?
If the data is in an encoding other than ISO-8859-8 then you could redirect the output of pg_restore to a file or pipe it through a filter and change the SET client_encoding line to whatever the encoding really is. For example, if the data is Windows-1255 then you'd use the following: SET client_encoding TO win1255; Another possibility would be to use a command like iconv to convert the data to UTF-8 and strip unconvertible characters; on many systems you could do that with iconv -f iso8859-8 -t utf-8 -c. If you convert to UTF-8 then you'd need to change client_encoding accordingly.
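The iconv suggestion can be tried out from the shell before touching the real dump. A small illustration (the byte values come from the thread; this assumes a GNU-style `iconv` that accepts `-c`):

```shell
# 0xE0 is a valid ISO-8859-8 byte (HEBREW LETTER ALEF); converting it to
# UTF-8 yields the two bytes 0xD7 0x90.
printf '\xe0' | iconv -f ISO8859-8 -t UTF-8 | od -An -tx1

# 0xC2, the byte from the error message, has no mapping in ISO-8859-8.
# Without -c the conversion aborts; with -c the byte is silently dropped
# (removed entirely, not replaced with a blank). The || true guards the
# exit status, since some iconv builds report dropped bytes as a failure.
printf 'ab\xc2cd' | iconv -f ISO8859-8 -t UTF-8 -c || true
```

For a whole dump the same idea is `iconv -f iso8859-8 -t utf-8 -c < dump.sql > dump.utf8.sql`, after which the `SET client_encoding` line should be edited to match the new encoding.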
Re: [GENERAL] redirecting output of pg_dump
Hello All, I need one more bit of help... I have already connected two systems (one is the server, with IP address 172.16.84.76; the other is a client, 172.16.85.243), and this connection is working fine. Now I want to connect one more system to that server (172.16.84.76). The new system's IP address is 172.16.82.247. I inserted the server IP address (172.16.84.76/32) in the pg_hba.conf file on the new system; likewise, I inserted the IP address (172.16.82.247/32) in the pg_hba.conf file and postgres.conf file on the server system. But while connecting through pgAdmin III, the new system shows the error: Could not resolve hostname 172.16.84.76/32 (the error comes on the new system only). I think I made some minor mistake. Kindly give me the solution. (The first one (172.16.85.243) is working fine.) Thanks Regards Jayakumar M 09963432764
[GENERAL] it works, but is it legal to separate schema/table/field references using spaces
Hi, Just something I noticed to 'simply work': db=# select avg(pg_catalog. pg_stats . avg_width) from pg_stats; avg - 10.6654945054945055 (1 row) It seems that all whitespace between schema/table/field references is ignored? Not saying this is a bad thing per se; it just surprised me and made me wonder whether this is intended to work like this and/or is legal per the SQL specs ;) db=# select version(); version PostgreSQL 8.2.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.3 (1 row) -- Best, Frank.
[GENERAL] REQUEST: option to auto-generate new sequences with CREATE TABLE (LIKE)
Hi, as the subject reads: I searched the docs for a way to instruct postgres to create new sequences when copying tables containing serial columns, but the resulting serial fields in the new tables reference the original sequence. Yes, there are workarounds, but having an option to make postgres automatically generate a new sequence for every serial field would be splendid. I hope you will consider this feature for one of the future versions of Postgres. Thanks, Nico
[GENERAL] Can't SELECT from (INSERT ... RETURNING)
I thought I could use the output of INSERT...RETURNING as a set of tuples for a subquery, but it seems that's not the case: nb1=# select * from (insert into m(a) values(112) returning a); ERROR: syntax error at or near "into" LINE 1: select * from (insert into m(a) values(112) returni... ^ Is this a bug, or is it not even supposed to work in theory? Such a feature would be extremely useful to have. P.S. I know it's non-portable, but this is not a problem. Thanks, Nico
Re: [GENERAL] REQUEST: option to auto-generate new sequences with CREATE TABLE (LIKE)
On Wed, Jul 18, 2007 at 03:10:30PM +0200, Nico Sabbi wrote: Hi, as the subject reads: I searched the docs for a way to instruct postgres to create new sequences when copying tables containing serial columns, but the resulting serial fields in the new tables reference the original sequence. That's the right behavior. You should be using pg_get_serial_sequence('your_table','your_column') to get the sequence name anyhow. :) Cheers, David. -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
[GENERAL] Again about varchar()
Hi all: I have tables with one or several varchar(n) columns (utf8, n=200), and I believe those tables' row length will not exceed the page length. Will it help performance if I turn off TOAST for those 'short' varchar() columns by issuing “ALTER TABLE foo ALTER COLUMN bar SET STORAGE PLAIN”? Regards Ian
Re: [GENERAL] Again about varchar()
李彦 Ian Li [EMAIL PROTECTED] writes: I have tables with one or several varchar(n) columns (utf8, n=200), and I believe those tables' row length will not exceed the page length. Will it help performance if I turn off TOAST for those 'short' varchar() columns by issuing “ALTER TABLE foo ALTER COLUMN bar SET STORAGE PLAIN”? No. regards, tom lane
Re: [GENERAL] [PERFORM] Parrallel query execution for UNION ALL Queries
On 7/18/07, Benjamin Arai [EMAIL PROTECTED] wrote: But I want to parallelize searches if possible to reduce the performance loss of having multiple tables. PostgreSQL does not support parallel query. Parallel query on top of PostgreSQL is provided by ExtenDB and PGPool-II. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation | fax: 732.331.1301 33 Wood Ave S, 3rd Floor | [EMAIL PROTECTED] Iselin, New Jersey 08830 | http://www.enterprisedb.com/
[GENERAL] Parrallel query execution for UNION ALL Queries
Hi, If I have a query such as: SELECT * FROM (SELECT * FROM A) UNION ALL (SELECT * FROM B) WHERE blah='food'; Assuming tables A and B both have the same attributes and the data between the tables is not partitioned in any special way, does PostgreSQL execute WHERE blah='food' on both tables simultaneously, or what? If not, is there a way to execute the query on both in parallel, then aggregate the results? To give some context, I have a very large amount of new data being loaded each week. Currently I am partitioning the data into a new table every month, which is working great from an indexing standpoint. But I want to parallelize searches if possible to reduce the performance loss of having multiple tables. Benjamin
Re: [GENERAL] [PERFORM] Parrallel query execution for UNION ALL Queries
On 7/18/07, Benjamin Arai [EMAIL PROTECTED] wrote: Hi, If I have a query such as: SELECT * FROM (SELECT * FROM A) UNION ALL (SELECT * FROM B) WHERE blah='food'; Assuming tables A and B both have the same attributes and the data between the tables is not partitioned in any special way, does PostgreSQL execute WHERE blah='food' on both tables simultaneously, or what? If not, is there a way to execute the query on both in parallel, then aggregate the results? To give some context, I have a very large amount of new data being loaded each week. Currently I am partitioning the data into a new table every month, which is working great from an indexing standpoint. But I want to parallelize searches if possible to reduce the performance loss of having multiple tables. Most of the time, the real issue would be the I/O throughput for such queries, not the CPU capability. If you have only one disk for your data storage, you're likely to get WORSE performance if you have two queries running at once, since the heads would be going back and forth from one data set to the other. EnterpriseDB, a commercially enhanced version of PostgreSQL, can do query parallelization, but it comes at a cost, and that cost is making sure you have enough spindles / I/O bandwidth that you won't actually be slowing your system down.
Re: [GENERAL] multirow insert
On Saturday 14 July 2007, Tom Allison wrote: On Jul 13, 2007, at 2:11 PM, A. Kretschmer wrote: On Fri, 13.07.2007 at 18:50:26 +0200, Zlatko Matić wrote the following: When using the multirow INSERT INTO...VALUES command, are all rows inserted in a batch, or row by row? Within one transaction, yes. Trust me... It's MUCH faster than trying to do each insert. Or read this article about the case: http://www.depesz.com/index.php/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/ Regards, -- dim
Re: [GENERAL] createing indexes on large tables and int8
On 07/17/07 17:12, [EMAIL PROTECTED] wrote: On Tuesday 17 July 2007 17:47:01 Tom Lane wrote: [EMAIL PROTECTED] writes: i think i got it fixed as i saw that i pushed my maintenance_work_mem too high. It was higher than physical ram :-( Ooops, that will definitely cause problems. yes it did! I ran it again. And now it takes 10 minutes per index instead of 10 hours (still 8.1). maybe something postgres should complain about if setting maintenance_work_mem too high. Unless it does some really OS-specific calls, *can* PostgreSQL know how much *physical* RAM is in a box? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
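For what it's worth, on Linux the kernel does publish physical RAM in /proc/meminfo, so a warning script is possible there even if a portable in-server check is not. A Linux-only sketch (the 2 GB figure below is purely an illustrative setting, not from the thread):

```shell
# Physical RAM in kB as reported by the kernel (Linux-specific; other
# platforms need sysctl hw.physmem or similar, which is exactly the
# OS-specific-calls problem Ron is pointing at).
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "physical RAM: ${mem_kb} kB"

# A sanity check one could run before raising maintenance_work_mem;
# 2097152 kB (2 GB) is a hypothetical example value.
setting_kb=2097152
if [ "$setting_kb" -gt "$mem_kb" ]; then
    echo "warning: maintenance_work_mem exceeds physical RAM" >&2
fi
```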
Re: [GENERAL] Update of table lags execution of statement by 1 minute?
This happens in the same session. I have an application that had been encountering this issue periodically, and I have rigged it to send me an email whenever it happens. I'll get that email, fire up my client, and try to update the row manually. Right after my update (autocommit is on, but it makes no difference if I do a manual commit), I'll issue a select statement on the same row, and it returns with the value from before the update. I'll keep watching that row, and eventually it will show the right value in that column. Once again, looking at the query logs, there are no conflicting updates happening. I'm not sure the corrupted index issue is it. After updating, the attribute shows up as the "old" value with selects on different columns, ie: UPDATE mytable SET myattribute=1 WHERE id=14; COMMIT; SELECT * from mytable WHERE myattribute=0 would include that row (id=14). So it isn't just a single corrupted index, if that is indeed the issue. Thanks for your help, Erik Peterson On 7/17/07 10:54 AM, Tom Lane [EMAIL PROTECTED] wrote: Erik Peterson [EMAIL PROTECTED] writes: I'm having this issue where once or twice per day (out of ~100,000 queries) the table doesn't reflect a committed update immediately. Usually when this problem occurs the update takes 1-3 minutes to be reflected in SELECT queries. Occasionally, it has taken more than 10 minutes. The session could go something like this: UPDATE mytable SET myattribute=1 WHERE id=14; COMMIT; SELECT myattribute FROM mytable WHERE id=14; (Query returns myattribute with a value of 0) (Wait 5 minutes) SELECT myattribute FROM mytable WHERE id=14; (Query returns myattribute with a value of 1) To be blunt, I don't believe it. I can think of bugs by which a commit might be lost entirely, but there is no mechanism that would make it good five minutes later.
I think you've misdiagnosed your problem somehow --- either you're not really committing where you think you are, or the observing query is using an old snapshot (maybe you are running it in a serializable transaction?) A somewhat more credible theory would revolve around corrupted indexes. If there's a corrupted index on id in the above example, a query might sometimes find one version of a row and sometimes find another; although any given search would be deterministic, apparently-unrelated changes in the index contents could change which one is found. Have you actually been able to reproduce a problem as sketched above in a single session --- ie, the update and the contradictory observations all done by the same backend? Or is this a representation of things that are happening in different sessions? What else is going on meanwhile? regards, tom lane
[GENERAL] monthly tally of new memberships
I'm trying to create a select statement that will show me the number of new memberships of an organisation by date (first of each month). The member table has a date column to reflect when the member was inserted. So far, I've gotten as far as: SELECT applied AS date_applied, count(id) AS applications FROM member WHERE applied = applied GROUP BY applied ORDER BY date_applied ASC; date_applied | applications 2006-05-21 | 1 2006-05-22 | 1 2006-05-23 | 2 2006-05-24 | 14 2006-05-25 | 5 etc. This returns the new memberships for each day, ignoring days without any. What I'd like to do though, is to select only the 1st of each month, summing the new memberships for that month, eg: month | applications 2006-05-01 | 57 2006-06-01 | 36 2006-07-01 | 72 etc. I've been fiddling with this since yesterday and am getting no closer, it seems. I know how to do this if I pass in a particular month to select from, but not an aggregate for the entire month. Nothing I've tried is working, but this seems as if it should be quite simple. I'll bet it's obvious, isn't it? :-\ brian
Re: [GENERAL] Sylph-Searcher 1.0.0 released
On 7/18/07, Tatsuo Ishii [EMAIL PROTECTED] wrote: Hi, Hi, We are pleased to announce that Sylph-Searcher 1.0.0 is released. Sylph-Searcher is a client program for searching mail boxes. Supported mail box format is MH style, i.e. 1 mail = 1 file. Sylph-Searcher uses tsearch2 for full text search, thus searching is very fast. For example, ~300,000 mails or a ~1.7GB DB is comfortably searched by Sylph-Searcher. Sylph-Searcher runs on UNIX/Linux/Windows. Sylph-Searcher can be downloaded from: http://sylpheed.sraoss.jp/en/download.html#searcher Can you give a brief explanation of how sylpheed and sylph-searcher correlate? Sylpheed doesn't seem to need postgres, sylph-searcher does ... ? enjoy, -- Tatsuo Ishii Cheers, Andrej
Re: [GENERAL] it works, but is it legal to separate schema/table/field references using spaces
Frank van Vugt wrote: db=# select avg(pg_catalog . pg_stats . avg_width) from pg_stats; It seems that all whitespace between schema/table/field references is ignored? Sure, this is perfectly valid per SQL and what not. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [GENERAL] monthly tally of new memberships
On Jul 18, 2007, at 13:29 , brian wrote: This returns the new memberships for each day, ignoring days without any. What I'd like to do though, is to select only the 1st of each month, summing the new memberships for that month, eg: month | applications 2006-05-01 | 57 2006-06-01 | 36 2006-07-01 | 72 Try something like this: SELECT date_trunc('month', applied)::date AS date_applied , count(id) AS applications FROM member GROUP BY applied ORDER BY date_applied ASC; Note I removed the WHERE applied = applied, as this is just an identity. Michael Glaesemann grzm seespotcode net
Re: [GENERAL] [PERFORM] Parrallel query execution for UNION ALL Queries
On Wed, Jul 18, 2007 at 11:30:48AM -0500, Scott Marlowe wrote: EnterpriseDB, a commercially enhanced version of PostgreSQL, can do query parallelization, but it comes at a cost, and that cost is making sure you have enough spindles / I/O bandwidth that you won't actually be slowing your system down. I think you're thinking of ExtenDB. :) -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
[GENERAL] DBI/DBD::Pg and transactions
Hopefully the original post went astray and this isn't a duplicate. I don't see it in the archive, so I'll assume I have a bogus SMTP setup at work for my home account. I've tried the Pg docs, the DBI book, and Google searches but haven't found anything useful on this topic. I realize this is more a DBI question, so if there is a better place to ask please point me towards it. Without lots of details, here is what I'm trying to determine. Will a $sth->execute() of a SELECT ... FOR UPDATE statement allow me to update the selected records using $uth->execute( ... ) if $sth and $uth are both created/prepared from the same database handle $dbh? Thanks, Rod --
Re: [GENERAL] monthly tally of new memberships
brian wrote: I'm trying to create a select statement that will show me the number of new memberships of an organisation by date (first of each month). The member table has a date column to reflect when the member was inserted. So far, I've gotten as far as: SELECT applied AS date_applied, count(id) AS applications FROM member WHERE applied = applied GROUP BY applied ORDER BY date_applied ASC; Try this instead: select to_char(applied, 'yyyy-mm') as month_applied, count(id) as applications from member group by to_char(applied, 'yyyy-mm') order by 1 asc; Your WHERE condition seems superfluous, unless you're using that to remove any records where applied is NULL. If that's the case, it would be much more readable and intuitive to use where applied is not null. -Jon -- Senior Systems Developer Media Matters for America http://mediamatters.org/
Re: [GENERAL] monthly tally of new memberships
Michael Glaesemann wrote: On Jul 18, 2007, at 13:29 , brian wrote: This returns the new memberships for each day, ignoring days without any. What I'd like to do though, is to select only the 1st of each month, summing the new memberships for that month, eg: month | applications 2006-05-01 | 57 2006-06-01 | 36 2006-07-01 | 72 Try something like this: SELECT date_trunc('month', applied)::date AS date_applied , count(id) AS applications FROM member GROUP BY applied ORDER BY date_applied ASC; Note I removed the WHERE applied = applied, as this is just an identity. Thanks, but that isn't it. I've tried that exact query, actually. The problem with that is it doesn't give me one row for the entire month. Instead, I get one row for each day there was a new membership, only the date_applied column has been changed to the 1st of that particular month, eg: 2006-02-01 | 1 2006-02-01 | 1 2006-02-01 | 7 2006-03-01 | 1 2006-03-01 | 3 2006-03-01 | 1 2006-03-01 | 3 2006-03-01 | 1 2006-03-01 | 2 What I'd like to be able to do is to count all of the new member IDs that have been inserted during a particular month and return that sum along with the date for the 1st (ie '2006-03-22'). The thing is, I feel certain that I've done something very similar before. brian
Re: [GENERAL] monthly tally of new memberships
brian wrote: Michael Glaesemann wrote: On Jul 18, 2007, at 13:29 , brian wrote: This returns the new memberships for each day, ignoring days without any. What I'd like to do though, is to select only the 1st of each month, summing the new memberships for that month, eg: month | applications 2006-05-01 | 57 2006-06-01 | 36 2006-07-01 | 72 Try something like this: SELECT date_trunc('month', applied)::date AS date_applied , count(id) AS applications FROM member GROUP BY applied ORDER BY date_applied ASC; Note I removed the WHERE applied = applied, as this is just an identity. Thanks, but that isn't it. I've tried that exact query, actually. The problem with that is it doesn't give me one row for the entire month. Instead, I get one row for each day there was a new membership, only the date_applied column has been changed to the 1st of that particular month, eg: Hum, you should be grouping by date_applied (also known as "group by 1", because you can't use column aliases in GROUP BY) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [GENERAL] DBI/DBD::Pg and transactions
Will a $sth->execute() of a SELECT ... FOR UPDATE statement allow me to update the selected records using $uth->execute( ... ) if $sth and $uth are both created/prepared from the same database handle $dbh? You are allowed to update either way, but the lock created by the first execute will be in place for the second, as long as those statement handles are from the same $dbh (database handle). In general, connecting via DBI always gives you one connection unless you explicitly create a second one by calling DBI->connect. -- Greg Sabino Mullane [EMAIL PROTECTED] End Point Corporation PGP Key: 0x14964AC8 200707181533 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [GENERAL] monthly tally of new memberships
Alvaro Herrera wrote: brian wrote: Michael Glaesemann wrote: On Jul 18, 2007, at 13:29 , brian wrote: This returns the new memberships for each day, ignoring days without any. What I'd like to do though, is to select only the 1st of each month, summing the new memberships for that month, eg: month | applications 2006-05-01 | 57 2006-06-01 | 36 2006-07-01 | 72 Try something like this: SELECT date_trunc('month', applied)::date AS date_applied , count(id) AS applications FROM member GROUP BY applied ORDER BY date_applied ASC; Note I removed the WHERE applied = applied, as this is just an identity. Thanks, but that isn't it. I've tried that exact query, actually. The problem with that is it doesn't give me one row for the entire month. Instead, I get one row for each day there was a new membership, only the date_applied column has been changed to the 1st of that particular month, eg: Hum, you should be grouping by date_applied (also known as "group by 1", because you can't use column aliases in GROUP BY) Right, that works, also.
I compared this to Jon Sime's suggestion: test=# EXPLAIN ANALYZE SELECT date_trunc('month', applied)::date AS date_applied, count(id) AS applications FROM member GROUP BY 1 ORDER BY date_applied ASC; QUERY PLAN --- Sort (cost=140.76..141.26 rows=200 width=8) (actual time=17.590..17.622 rows=18 loops=1) Sort Key: (date_trunc('month'::text, (applied)::timestamp with time zone))::date -> HashAggregate (cost=129.12..133.12 rows=200 width=8) (actual time=17.478..17.523 rows=18 loops=1) -> Seq Scan on member (cost=0.00..123.76 rows=1072 width=8) (actual time=0.035..10.684 rows=1072 loops=1) Total runtime: 17.733 ms (5 rows) test=# EXPLAIN ANALYZE SELECT to_char(applied, 'yyyy-mm') AS month_applied, count(id) AS applications FROM member GROUP BY to_char(applied, 'yyyy-mm') ORDER BY 1 ASC; QUERY PLAN - Sort (cost=137.58..138.08 rows=200 width=8) (actual time=13.415..13.458 rows=18 loops=1) Sort Key: to_char((applied)::timestamp with time zone, 'yyyy-mm'::text) -> HashAggregate (cost=126.44..129.94 rows=200 width=8) (actual time=13.273..13.314 rows=18 loops=1) -> Seq Scan on member (cost=0.00..121.08 rows=1072 width=8) (actual time=0.042..10.525 rows=1072 loops=1) Total runtime: 13.564 ms (5 rows) But, getting back to your comment, I see that this (grouping by the alias) also works: test=# EXPLAIN ANALYZE SELECT to_char(applied, 'yyyy-mm') AS month_applied, count(id) AS applications FROM member GROUP BY month_applied ORDER BY 1 ASC; QUERY PLAN - Sort (cost=137.58..138.08 rows=200 width=8) (actual time=44.329..44.363 rows=18 loops=1) Sort Key: to_char((applied)::timestamp with time zone, 'yyyy-mm'::text) -> HashAggregate (cost=126.44..129.94 rows=200 width=8) (actual time=44.190..44.229 rows=18 loops=1) -> Seq Scan on member (cost=0.00..121.08 rows=1072 width=8) (actual time=0.042..41.242 rows=1072 loops=1) Total runtime: 44.477 ms (5 rows) But it seems to take longer; that's the cost of keeping the query tidy. brian
[GENERAL] Need help optimizing this query
I've got a query that's taking forever (as will be obvious when you see it and the explain output). I can't figure out what indexes to add to make this run faster. I'd appreciate any help. Pat SELECT SUM(CASE WHEN (hit IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_hits_console, SUM(CASE WHEN (hit IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_hits_remote, SUM(CASE WHEN (played IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_played_console, SUM(CASE WHEN (played IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_played_remote, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_downloaded_console, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_downloaded_remote, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS TRUE) THEN assets.size ELSE 0 END) as download_size_console, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS FALSE) THEN assets.size ELSE 0 END) as download_size_remote, videos.id, videos.title, videos.guid FROM video_views, assets, videos WHERE videos.company_id=1 AND video_views.video_id=videos.id AND video_views.asset_id=assets.id GROUP BY videos.id, videos.title, videos.guid ORDER BY count_hits_remote DESC LIMIT 100 Limit (cost=139735.51..139735.68 rows=69 width=64) -> Sort (cost=139735.51..139735.68 rows=69 width=64) Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND (video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END) -> HashAggregate (cost=139731.33..139733.40 rows=69 width=64) -> Hash Join (cost=1164.79..138880.04 rows=30956 width=64) Hash Cond: (video_views.asset_id = assets.id) -> Hash Join (cost=324.39..137343.13 rows=30956 width=60) Hash Cond: (video_views.video_id = videos.id) -> Seq Scan on video_views (cost=0.00..114500.13 rows=5922413 width=12) -> Hash (cost=323.52..323.52 rows=69 width=52) -> Bitmap Heap Scan on videos (cost=64.90..323.52 rows=69 width=52) Recheck Cond: (company_id = 1) -> Bitmap Index Scan on complete_videos_without_deleted_at (cost=0.00..64.88 rows=69 width=0) Index Cond: (company_id = 1) -> Hash (cost=645.18..645.18 rows=15618 width=12) -> Seq Scan on assets (cost=0.00..645.18 rows=15618 width=12)
Re: [GENERAL] Sylph-Searcher 1.0.0 released
On 7/18/07, Tatsuo Ishii [EMAIL PROTECTED] wrote: Hi, Hi, We are pleased to announce that Sylph-Searcher 1.0.0 is released. Sylph-Searcher is a client program for searching mail boxes. Supported mail box format is MH style, i.e. 1 mail = 1 file. Sylph-Searcher uses tsearch2 for full text search, thus searching is very fast. For example, ~300,000 mails or a ~1.7GB DB is comfortably searched by Sylph-Searcher. Sylph-Searcher runs on UNIX/Linux/Windows. Sylph-Searcher can be downloaded from: http://sylpheed.sraoss.jp/en/download.html#searcher Can you give a brief explanation of how sylpheed and sylph-searcher correlate? Sylpheed doesn't seem to need postgres, sylph-searcher does ... ? Yes, sylpheed does not need PostgreSQL. sylph-searcher is an independent application and actually consists of two programs: syldbimport and sylph-searcher. syldbimport reads the given mail files and stores them into a PostgreSQL database, indexing with tsearch2. sylph-searcher does full-text searching of the database and shows the results. Please let me know if you have further questions. -- Tatsuo Ishii SRA OSS, Inc. Japan ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Need help optimizing this query
On 7/18/07, Pat Maddox [EMAIL PROTECTED] wrote: I've got a query that's taking forever (as will be obvious when you see it and the explain output). I can't figure out what indexes to add to make this run faster. I'd appreciate any help. Pat SELECT SUM(CASE WHEN (hit IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_hits_console, SUM(CASE WHEN (hit IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_hits_remote, SUM(CASE WHEN (played IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_played_console, SUM(CASE WHEN (played IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_played_remote, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_downloaded_console, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_downloaded_remote, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS TRUE) THEN assets.size ELSE 0 END) as download_size_console, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS FALSE) THEN assets.size ELSE 0 END) as download_size_remote, videos.id, videos.title, videos.guid FROM video_views, assets, videos WHERE videos.company_id=1 AND video_views.video_id=videos.id AND video_views.asset_id=assets.id GROUP BY videos.id, videos.title, videos.guid ORDER BY count_hits_remote DESC LIMIT 100 Limit (cost=139735.51..139735.68 rows=69 width=64) - Sort (cost=139735.51..139735.68 rows=69 width=64) Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND (video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END) - HashAggregate (cost=139731.33..139733.40 rows=69 width=64) - Hash Join (cost=1164.79..138880.04 rows=30956 width=64) Hash Cond: (video_views.asset_id = assets.id) - Hash Join (cost=324.39..137343.13 rows=30956 width=60) Hash Cond: (video_views.video_id = videos.id) - Seq Scan on video_views (cost=0.00..114500.13 rows=5922413 width=12) - Hash (cost=323.52..323.52 rows=69 width=52) - Bitmap Heap Scan on videos (cost=64.90..323.52 
rows=69 width=52) Recheck Cond: (company_id = 1) -> Bitmap Index Scan on complete_videos_without_deleted_at (cost=0.00..64.88 rows=69 width=0) Index Cond: (company_id = 1) -> Hash (cost=645.18..645.18 rows=15618 width=12) -> Seq Scan on assets (cost=0.00..645.18 rows=15618 width=12)

Here are the indexes I already have on the table:

    video_views_pkey PRIMARY KEY, btree (id)
    index_video_views_on_asset_id btree (asset_id)
    index_video_views_on_video_id btree (video_id)
    index_video_views_on_video_id_and_asset_id_and_created_at btree (video_id, created_at, asset_id)

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] Feature request: Per database search_path
As far as I know, currently one can set the search path globally, or on a per-role basis. I was wondering if it would be possible to have a per-database search_path. I believe this would be not only convenient, but would add flexibility. What got me thinking of this was that I manually set a search_path, connected to another database, and when I came back to the original database the search_path was gone. At first I thought perhaps search_path could be set by database and that changing databases was blanking my change. After looking at the help for ALTER DATABASE I noticed that there is no way to set the search_path by database. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Need help optimizing this query
On 7/18/07, Pat Maddox [EMAIL PROTECTED] wrote: I've got a query that's taking forever (as will be obvious when you see it and the explain output). I can't figure out what indexes to add to make this run faster. I'd appreciate any help. I'm curious why it's choosing to use hash joins rather than taking advantage of the indexes you have on the foreign key columns. What are the table definitions? Are hit, logged_in, played, downloaded all columns of video_views? Have you ANALYZEd these tables? What is the EXPLAIN ANALYZE output for this query? You only provided the EXPLAIN output, which doesn't compare the planner's estimates with the actual execution. I found it quite difficult to read your query. I reformatted it and also used some SQL functions to abstract away the CASE expressions. These SQL functions will probably be inlined, so there should be very little overhead. If you have a boolean column, you don't need to test IS TRUE or IS FALSE: you can just use the value itself. I also find it helpful to separate the join conditions (in the JOIN clause) from the restrictions (the WHERE clause), which I've done below.
CREATE OR REPLACE FUNCTION ternary(BOOLEAN, INTEGER, INTEGER)
RETURNS INTEGER LANGUAGE SQL AS
$_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$;

CREATE OR REPLACE FUNCTION value_when(BOOLEAN, INTEGER)
RETURNS INTEGER LANGUAGE SQL AS
$_$SELECT ternary($1,$2,0)$_$;

CREATE OR REPLACE FUNCTION one_when(BOOLEAN)
RETURNS INTEGER LANGUAGE SQL AS
$_$SELECT value_when($1,1)$_$;

SELECT SUM (one_when(hit AND logged_in)) AS count_hits_console
     , SUM (one_when(hit AND NOT logged_in)) AS count_hits_remote
     , SUM (one_when(played AND logged_in)) AS count_played_console
     , SUM (one_when(played AND NOT logged_in)) AS count_played_remote
     , SUM (one_when(downloaded AND logged_in)) AS count_downloaded_console
     , SUM (one_when(downloaded AND NOT logged_in)) AS count_downloaded_remote
     , SUM (value_when((played OR downloaded) AND logged_in, assets.size)) AS download_size_console
     , SUM (value_when((played OR downloaded) AND NOT logged_in, assets.size)) AS download_size_remote
     , videos.id
     , videos.title
     , videos.guid
FROM video_views
JOIN assets ON (video_views.video_id=videos.id)
JOIN videos ON (video_views.asset_id=assets.id)
WHERE videos.company_id=1
GROUP BY videos.id
       , videos.title
       , videos.guid
ORDER BY count_hits_remote DESC
LIMIT 100

Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Sylph-Searcher 1.0.0 released
On 7/19/07, Tatsuo Ishii [EMAIL PROTECTED] wrote: Yes, sylpheed does not need PostgreSQL. sylph-searcher is an independent application and actually is consisted of two programs: syldbimport and sylph-searcher. syldbimport reads given mail files and stores them into PostgreSQL database, indexing with tsearch2. sylph-searcher does full text searching the database and shows the result. Thanks - that clarifies things. Do you have plans to make an optional integration of Sylpheed w/ postgres in the long run, so one wouldn't have to do the import step before being able to query the mail? Cheers, Andrej ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Need help optimizing this query
On 7/18/07, Michael Glaesemann [EMAIL PROTECTED] wrote: On 7/18/07, Pat Maddox [EMAIL PROTECTED] wrote: I've got a query that's taking forever (as will be obvious when you see it and the explain output). I can't figure out what indexes to add to make this run faster. I'd appreciate any help. I'm curious why it's choosing to use hash joins rather than taking advantage of the indexes you have on the foreign key columns. What are the table definitions? Are hit, logged_in, played, downloaded all columns of videos_views? Have you ANALYZEd these tables? What is the EXPLAIN ANALYZE output for this query? You only provided the EXPLAIN output, which doesn't compare the plan with the actual query cost. I found it quite difficult to read you query. I reformatted it and also used some SQL functions to abstract away the CASE expressions. These SQL functions will probably be inlined so there should be very little overhead. If you have a boolean column, you don't need to test IS TRUE or IS FALSE: you can just use the value itself. I also find it helpful to separate the join conditions (in the JOIN clause) from the restrictions (the WHERE clause), which I've done below. 
CREATE OR REPLACE FUNCTION ternary(BOOLEAN, INTEGER, INTEGER) RETURNS INTEGER LANGUAGE SQL AS $_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$; CREATE OR REPLACE FUNCTION value_when(BOOLEAN, INTEGER) RETURNS INTEGER LANGUAGE SQL AS $_$SELECT ternary($1,$2,0)$_$; CREATE OR REPLACE FUNCTION one_when(BOOLEAN) RETURNS INTEGER LANGUAGE SQL as $_$SELECT value_when($1,1)$_$; SELECT SUM (one_when(hit AND logged_in)) AS count_hits_console , SUM (one_when(hit AND NOT logged_in)) AS count_hits_remote , SUM (one_when(played AND logged_in)) AS count_played_console , SUM (one_when(played AND NOT logged_in)) AS count_played_remote , SUM (one_when(downloaded AND logged_in)) AS count_downloaded_console , SUM (one_when(downloaded AND NOT logged_in)) AS count_downloaded_remote , SUM (value_when((played OR downloaded) AND logged_in, assets.size)) as download_size_console , SUM (value_when((played OR downloaded) AND NOT logged_in), assets.size) as download_size_remote , videos.id , videos.title , videos.guid FROM video_views JOIN assets ON (video_views.video_id=videos.id) JOIN videos ON (video_views.asset_id=assets.id) WHERE videos.company_id=1 GROUP BY videos.id , videos.title , videos.guid ORDER BY count_hits_remote DESC LIMIT 100 Michael Glaesemann grzm seespotcode net Michael, I tried your SQL but it didn't work - it was missing the videos table in the FROM clause. But when I add it, I get the error: ERROR: invalid reference to FROM-clause entry for table video_views LINE 20: JOIN assets ON (video_views.video_id=videos.id) ^ HINT: There is an entry for table video_views, but it cannot be referenced from this part of the query. Not really sure what that means. 
Here are the table definitions:

twistage_development=# \d video_views
                                 Table "public.video_views"
   Column   |            Type             |                        Modifiers
------------+-----------------------------+----------------------------------------------------------
 id         | integer                     | not null default nextval('video_views_id_seq'::regclass)
 video_id   | integer                     |
 created_at | timestamp without time zone |
 asset_id   | integer                     |
 played     | boolean                     | default false
 downloaded | boolean                     | default false
 hit        | boolean                     | default false
 logged_in  | boolean                     | default false
Indexes:
    "video_views_pkey" PRIMARY KEY, btree (id)
    "index_video_views_on_asset_id" btree (asset_id)
    "index_video_views_on_video_id" btree (video_id)
    "index_video_views_on_video_id_and_asset_id_and_created_at" btree (video_id, created_at, asset_id)

twistage_development=# \d videos
                                  Table "public.videos"
        Column         |            Type             |                 Modifiers
-----------------------+-----------------------------+---------------------------------------------------
 id                    | integer                     | not null default nextval('videos_id_seq'::regclass)
 title                 | character varying(255)      |
 duration              | double precision            |
 description           | text                        |
 status                | character varying(255)      |
 user_id               | integer                     |
 created_at            | timestamp without time zone |
 upload_finished       | boolean                     | default false
 publisher_name        | character varying(255)      |
 company_id            | integer                     |
 available_for_display | boolean                     | default true
 guid                  | character varying(255)      |
Re: [GENERAL] Need help optimizing this query
On Jul 18, 2007, at 16:01 , Michael Glaesemann wrote: CREATE OR REPLACE FUNCTION one_when(BOOLEAN) RETURNS INTEGER LANGUAGE SQL as $_$SELECT value_when($1,1)$_$; I forgot to add that you can cast booleans to integers, so one_when(expr) is equivalent to expr::int:

# SELECT (true and false)::int, (true or false)::int;
 int4 | int4
------+------
    0 |    1
(1 row)

Of course, this rewriting shouldn't affect the performance at all: it should just make it easier for you to read, which does have some value. Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 6: explain analyze is your friend
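As a concrete illustration of that cast (a sketch against the video_views table shown elsewhere in the thread, not code from the original messages), the counting aggregates could be written without CASE expressions or helper functions at all:

```sql
-- Boolean expressions cast straight to integer; SUM then counts the
-- rows where the expression is true. NULL booleans are skipped by SUM,
-- which matches the CASE version's per-row contribution of 0 (though an
-- all-NULL group would yield NULL rather than 0).
SELECT SUM((hit AND logged_in)::int)        AS count_hits_console,
       SUM((hit AND NOT logged_in)::int)    AS count_hits_remote,
       SUM((played AND logged_in)::int)     AS count_played_console,
       SUM((played AND NOT logged_in)::int) AS count_played_remote
FROM video_views;
```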
[GENERAL] Will partial index creation use existing index?
Does PostgreSQL use an existing index, if possible, when creating a partial index? By way of background, we have some nightly bulk processing that includes a couple of 15-30 million row tables. Most of the processing is only looking at prior-day data (up to ~200,000 rows) and for efficiency requires several indexes. Except for this one process, the indexes are useless and I'd rather not constantly maintain them. There is an index on the timestamp column, so I have considered creating the indexes on a temporary basis with something like: create index foo_bar on foo (bar) where timestamp_col > current_date - interval '1 day'; (Yes this is simplified, I am aware of the Daylight Saving Time off-by-an-hour implications.) It seems that creating this partial index would be more efficient if the existing index were used, but explain create index... just gives me an error and the query seems to run way too long to be processing only the one-day data. For comparison, on a relatively large 225,000-row day I can create temporary table ondeay... on the same criteria and create 10 indexes and analyze the table in well under 10 seconds, which is way faster than creating even a single partial index on the full table. Cheers, Steve ---(end of broadcast)--- TIP 6: explain analyze is your friend
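A sketch of the build-use-drop cycle described above (illustrative names, not from the original message). One wrinkle worth noting: PostgreSQL only allows immutable expressions in a partial index predicate, and CURRENT_DATE is stable rather than immutable, so the cutoff has to be interpolated as a literal by whatever script drives the nightly job:

```sql
-- Build a partial index covering only yesterday's rows; the driving
-- script substitutes the concrete date for the literal below, since
-- CURRENT_DATE itself is rejected in an index predicate.
CREATE INDEX foo_bar_recent ON foo (bar)
    WHERE timestamp_col > '2007-07-17';

ANALYZE foo;

-- ... run the prior-day processing; its queries must repeat a predicate
-- the planner can prove implies the index predicate, or the partial
-- index won't be considered ...

DROP INDEX foo_bar_recent;
```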
Re: [GENERAL] Update of table lags execution of statement by 1 minute?
Erik Peterson [EMAIL PROTECTED] writes: This happens in the same session. I have an application that had been encountering this issue periodically, and I have rigged it to send me an email whenever it happens. I'll get that email, fire up my client, and try to update the row manually. Right after my update (autocommit is on, but it makes no difference if I do a manual commit), I'll issue a select statement on the same row, and it returns with the value from before the update. I'll keep watching that row, and eventually it will show the right value in that column. Once again, looking at the query logs, there are no conflicting updates happening. You'll have to describe in much more detail what you're doing. Send the actual session from your terminal of this happening, for example. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Need help optimizing this query
On Jul 18, 2007, at 16:12 , Pat Maddox wrote: ERROR: invalid reference to FROM-clause entry for table video_views LINE 20: JOIN assets ON (video_views.video_id=videos.id) ^ HINT: There is an entry for table video_views, but it cannot be referenced from this part of the query. It's because I mismatched the JOIN clauses during my copy-and-paste :( On 7/18/07, Michael Glaesemann [EMAIL PROTECTED] wrote: FROM video_views JOIN assets ON (video_views.video_id=videos.id) JOIN videos ON (video_views.asset_id=assets.id) This should be

FROM video_views
JOIN assets ON (video_views.asset_id=assets.id)
JOIN videos ON (video_views.video_id=videos.id)

Do you have the EXPLAIN ANALYZE output of the query? Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] DBI/DBD::Pg and transactions
Greg Sabino Mullane wrote: Will a $sth->execute() of a SELECT ... FOR UPDATE statement allow me to update the selected records using $uth->execute( ... ) if $sth and $uth are both created/prepared from the same database handle $dbh? You are allowed to update either way, but the lock created by the first execute will be in place for the second, as long as those statement handles are from the same $dbh (database handle). So how would I update rows (tuples) selected by the first execute of a SELECT ... FOR UPDATE with the second? *The rest of the story:* This script will be run via cron. It will query a table for orders that are ready. It may take longer than the cron interval to process the orders, and some of the orders may be left as ready, i.e. not done. If an order gets done I need to update its row (tuple) as done. In the meantime, if the script gets triggered again and the first instance isn't finished, the second needs to not be able to select those records already being handled. Which then, suddenly like a smack on top of the head, makes me think of how to select, using the same SELECT ... FOR UPDATE query, only those non-locked records. Man, this is getting deeper every time I look at it. Rod -- In general, connecting via DBI always gives you one connection unless you explicitly create a second one by calling DBI->connect. - -- Greg Sabino Mullane [EMAIL PROTECTED] End Point Corporation PGP Key: 0x14964AC8 200707181533 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] DBI/DBD::Pg and transactions
On Wednesday 18 July 2007 14:29, Roderick A. Anderson [EMAIL PROTECTED] wrote: In the mean time if the script gets triggered again and the first instance isn't finished the second needs to not be able to select those records already being handled. select for update won't do that. It will sit waiting for locks on the same rows the first process is handling. -- Remember when computers were frustrating because they did exactly what you told them to? That actually seems sort of quaint now. --J.D. Baldwin ---(end of broadcast)--- TIP 6: explain analyze is your friend
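One common workaround for the problem described above (a sketch, not from the thread; the table and column names are illustrative): have each cron run claim its rows with a plain UPDATE before processing, so a second instance never sees them as ready. Under READ COMMITTED, a concurrent claiming UPDATE blocks on the already-claimed rows and then re-checks the WHERE clause once the first transaction commits, so it skips them; keeping the claiming transaction short keeps that wait brief.

```sql
-- Claim all currently-ready orders for this run, tagged with an
-- identifier for this instance (e.g. the backend PID).
BEGIN;
UPDATE orders
   SET status = 'processing',
       claimed_by = pg_backend_pid()
 WHERE status = 'ready';
COMMIT;

-- ... process the claimed rows outside the claiming transaction ...

-- Then mark this instance's rows as finished.
UPDATE orders
   SET status = 'done'
 WHERE status = 'processing'
   AND claimed_by = pg_backend_pid();
```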
Re: [GENERAL] Need help optimizing this query
On 7/18/07, Michael Glaesemann [EMAIL PROTECTED] wrote: On Jul 18, 2007, at 16:12 , Pat Maddox wrote: ERROR: invalid reference to FROM-clause entry for table video_views LINE 20: JOIN assets ON (video_views.video_id=videos.id) ^ HINT: There is an entry for table video_views, but it cannot be referenced from this part of the query. It's because I mismatched the JOIN clauses during my copy-and-paste :( On 7/18/07, Michael Glaesemann [EMAIL PROTECTED] wrote: FROM video_views JOIN assets ON (video_views.video_id=videos.id) JOIN videos ON (video_views.asset_id=assets.id) This should be FROM video_views JOIN assets ON (video_views.asset_id=assets.id) JOIN videos ON (video_views.video_id=videos.id) Do you have the EXPLAIN ANALYE output of the query? Michael Glaesemann grzm seespotcode net For some reason the functions you wrote are giving me trouble (there's a BIGINT involved, I tried changing the functions around but kept having issues). So here's the full query, hopefully formatted better: SELECT SUM(CASE WHEN (hit IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_hits_console, SUM(CASE WHEN (hit IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_hits_remote, SUM(CASE WHEN (played IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_played_console, SUM(CASE WHEN (played IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_played_remote, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_downloaded_console, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_downloaded_remote, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS TRUE) THEN assets.size ELSE 0 END) as download_size_console, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS FALSE) THEN assets.size ELSE 0 END) AS download_size_remote, videos.id, videos.title, videos.guid FROM video_views JOIN assets ON (video_views.asset_id=assets.id) JOIN videos on 
(video_views.video_id=videos.id)
WHERE videos.company_id=1
GROUP BY videos.id, videos.title, videos.guid
ORDER BY count_hits_remote DESC
LIMIT 100

and here's the EXPLAIN ANALYZE output:

Limit  (cost=127072.90..127073.12 rows=87 width=64) (actual time=2636.560..2636.567 rows=20 loops=1)
  ->  Sort  (cost=127072.90..127073.12 rows=87 width=64) (actual time=2636.558..2636.562 rows=20 loops=1)
        Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND (video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END)
        ->  HashAggregate  (cost=127067.49..127070.10 rows=87 width=64) (actual time=2636.481..2636.506 rows=20 loops=1)
              ->  Hash Join  (cost=880.96..125995.46 rows=38983 width=64) (actual time=24.904..2635.719 rows=122 loops=1)
                    Hash Cond: (video_views.asset_id = assets.id)
                    ->  Hash Join  (cost=195.96..124433.01 rows=39009 width=60) (actual time=8.327..2618.982 rows=122 loops=1)
                          Hash Cond: (video_views.video_id = videos.id)
                          ->  Seq Scan on video_views  (cost=0.00..101352.70 rows=5998470 width=12) (actual time=0.031..1410.231 rows=5998341 loops=1)
                          ->  Hash  (cost=194.87..194.87 rows=87 width=52) (actual time=1.001..1.001 rows=90 loops=1)
                                ->  Bitmap Heap Scan on videos  (cost=4.93..194.87 rows=87 width=52) (actual time=0.111..0.840 rows=90 loops=1)
                                      Recheck Cond: (company_id = 1)
                                      ->  Bitmap Index Scan on index_videos_on_company_id  (cost=0.00..4.90 rows=87 width=0) (actual time=0.079..0.079 rows=90 loops=1)
                                            Index Cond: (company_id = 1)
                    ->  Hash  (cost=487.78..487.78 rows=15778 width=12) (actual time=16.527..16.527 rows=15778 loops=1)
                          ->  Seq Scan on assets  (cost=0.00..487.78 rows=15778 width=12) (actual time=0.023..9.601 rows=15778 loops=1)
Total runtime: 2637.043 ms
(17 rows)

That one runs reasonably fine, because there are only 20 videos being returned and a handful of video views associated with them. In the real query there are about 1k videos and a couple million views. That took about 80 minutes to run, according to logs. Pat ---(end of broadcast)--- TIP 4: Have you searched our list archives?
http://archives.postgresql.org/
Re: [GENERAL] DBI/DBD::Pg and transactions
Alan Hodgson wrote: On Wednesday 18 July 2007 14:29, Roderick A. Anderson [EMAIL PROTECTED] wrote: In the mean time if the script gets triggered again and the first instance isn't finished the second needs to not be able to select those records already being handled. select for update won't do that. It will sit waiting for locks on the same rows the first process is handling. Let's see how do I put it ... Da[r|m]n! But I now remember reading something to this effect in one of my PG books or the on line docs. Some other method is now in order. Thanks to all for the thoughts and ideas. I'll post my solution ... when I figure one out! Rod -- ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Feature request: Per database search_path
Francisco Reyes wrote: As far as I know, currently one can set the search path globally, or on a per role bases. I was wondering if it could be possible to have a per database search_path. I believe this would be not only convenient, but will add flexibility. ALTER DATABASE leia SET search_path = public,lookups; Seems to work for me on 8.2 - you'll need to disconnect and reconnect to see it take place though. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Need help optimizing this query
On 7/18/07, Pat Maddox [EMAIL PROTECTED] wrote: On 7/18/07, Michael Glaesemann [EMAIL PROTECTED] wrote: On Jul 18, 2007, at 16:12 , Pat Maddox wrote: ERROR: invalid reference to FROM-clause entry for table video_views LINE 20: JOIN assets ON (video_views.video_id=videos.id) ^ HINT: There is an entry for table video_views, but it cannot be referenced from this part of the query. It's because I mismatched the JOIN clauses during my copy-and-paste :( On 7/18/07, Michael Glaesemann [EMAIL PROTECTED] wrote: FROM video_views JOIN assets ON (video_views.video_id=videos.id) JOIN videos ON (video_views.asset_id=assets.id) This should be FROM video_views JOIN assets ON (video_views.asset_id=assets.id) JOIN videos ON (video_views.video_id=videos.id) Do you have the EXPLAIN ANALYE output of the query? Michael Glaesemann grzm seespotcode net For some reason the functions you wrote are giving me trouble (there's a BIGINT involved, I tried changing the functions around but kept having issues). So here's the full query, hopefully formatted better: SELECT SUM(CASE WHEN (hit IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_hits_console, SUM(CASE WHEN (hit IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_hits_remote, SUM(CASE WHEN (played IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_played_console, SUM(CASE WHEN (played IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_played_remote, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0 END) AS count_downloaded_console, SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0 END) AS count_downloaded_remote, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS TRUE) THEN assets.size ELSE 0 END) as download_size_console, SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND logged_in IS FALSE) THEN assets.size ELSE 0 END) AS download_size_remote, videos.id, videos.title, videos.guid FROM video_views JOIN assets ON 
(video_views.asset_id=assets.id) JOIN videos on (video_views.video_id=videos.id) WHERE videos.company_id=1 GROUP BY videos.id, videos.title, videos.guid ORDER BY count_hits_remote DESC LIMIT 100 and here's the EXPLAIN ANALYZE output: Limit (cost=127072.90..127073.12 rows=87 width=64) (actual time=2636.560..2636.567 rows=20 loops=1) - Sort (cost=127072.90..127073.12 rows=87 width=64) (actual time=2636.558..2636.562 rows=20 loops=1) Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND (video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END) - HashAggregate (cost=127067.49..127070.10 rows=87 width=64) (actual time=2636.481..2636.506 rows=20 loops=1) - Hash Join (cost=880.96..125995.46 rows=38983 width=64) (actual time=24.904..2635.719 rows=122 loops=1) Hash Cond: (video_views.asset_id = assets.id) - Hash Join (cost=195.96..124433.01 rows=39009 width=60) (actual time=8.327..2618.982 rows=122 loops=1) Hash Cond: (video_views.video_id = videos.id) - Seq Scan on video_views (cost=0.00..101352.70 rows=5998470 width=12) (actual time=0.031..1410.231 rows=5998341 loops=1) - Hash (cost=194.87..194.87 rows=87 width=52) (actual time=1.001..1.001 rows=90 loops=1) - Bitmap Heap Scan on videos (cost=4.93..194.87 rows=87 width=52) (actual time=0.111..0.840 rows=90 loops=1) Recheck Cond: (company_id = 1) - Bitmap Index Scan on index_videos_on_company_id (cost=0.00..4.90 rows=87 width=0) (actual time=0.079..0.079 rows=90 loops=1) Index Cond: (company_id = 1) - Hash (cost=487.78..487.78 rows=15778 width=12) (actual time=16.527..16.527 rows=15778 loops=1) - Seq Scan on assets (cost=0.00..487.78 rows=15778 width=12) (actual time=0.023..9.601 rows=15778 loops=1) Total runtime: 2637.043 ms (17 rows) That one runs reasonably fine, because there are only 20 videos being returned and a handful of video views associated with them. In the real query there are about 1k videos and a couple million views. That took about 80 minutes to run, according to logs. 
Pat Okay so it doesn't normally take 80 minutes to run. Something funky just happened and it took that long once. It usually takes between 90-100 seconds. My coworker told me it takes 80 minutes but it appears that's an anomaly. One thing we were wondering is if all the aggregate calculations might be slowing it down, and if it might be faster to do six separate queries. The real problem there is sorting and merging the data sets. Merging isn't tough, but making sure that all queries obey the desired sort order is. What
Re: [GENERAL] Need help optimizing this query
On Jul 18, 2007, at 16:48 , Pat Maddox wrote: For some reason the functions you wrote are giving me trouble (there's a BIGINT involved, I tried changing the functions around but kept having issues). You might try these, if you're interested.

CREATE OR REPLACE FUNCTION ternary(BOOLEAN, BIGINT, BIGINT)
RETURNS BIGINT LANGUAGE SQL AS
$_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$;

CREATE OR REPLACE FUNCTION value_when(BOOLEAN, BIGINT)
RETURNS BIGINT LANGUAGE SQL AS
$_$SELECT ternary($1,$2,0)$_$;

CREATE OR REPLACE FUNCTION one_when(BOOLEAN)
RETURNS BIGINT LANGUAGE SQL AS
$_$SELECT value_when($1,1)$_$;

So here's the full query, hopefully formatted better: I'm still curious about why the planner is choosing a hash join over using the indexes on the foreign keys, but that might be because the tables are relatively small. That one runs reasonably fine, because there are only 20 videos being returned and a handful of video views associated with them. In the real query there are about 1k videos and a couple million views. That took about 80 minutes to run, according to logs. The planner will choose different plans based on, among other things, what it estimates the size of the result to be, so while looking at a small example query might seem like a way to go about looking at what's going on, it's most likely not going to give you an accurate representation of the situation. Are you looking at two different systems (e.g., a development system versus a production system) or just choosing a smaller query on the same system? If you can't run the query on your production system, you may want to take a dump of the production system and set it up on another box. Even with a couple million rows in the video_views table, PostgreSQL shouldn't really blink too much, as long as the server is tuned properly, the hardware is adequate, and the database statistics are up to date.
Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [GENERAL] Feature request: Per database search_path
Francisco Reyes [EMAIL PROTECTED] writes: As far as I know, currently one can set the search path globally, or on a per role bases. I was wondering if it could be possible to have a per database search_path. ALTER DATABASE SET search_path = ... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Update of table lags execution of statement by 1 minute?
Erik Peterson [EMAIL PROTECTED] writes: I'm not sure the corrupted index issue is it. After updating, the attribute shows up as the "old" value with selects on different columns, i.e.: UPDATE mytable SET myattribute=1 WHERE id=14; COMMIT; SELECT * from mytable WHERE myattribute=0 would include that row (id=14). So it isn't just a single corrupted index, if that is indeed the issue. Hm. I still think there's something going on in your application that you're missing. While you are watching the row to see it update, would you select the system columns ctid,xmin,xmax as well as the data? That might provide some clue what's really happening. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Need help optimizing this query
On 7/18/07, Michael Glaesemann [EMAIL PROTECTED] wrote: On Jul 18, 2007, at 16:48 , Pat Maddox wrote: For some reason the functions you wrote are giving me trouble (there's a BIGINT involved, I tried changing the functions around but kept having issues). You might try these, if you're interested. CREATE OR REPLACE FUNCTION ternary(BOOLEAN, BIGINT, BIGINT) RETURNS BIGINT LANGUAGE SQL AS $_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$; CREATE OR REPLACE FUNCTION value_when(BOOLEAN, BIGINT) RETURNS BIGINT LANGUAGE SQL AS $_$SELECT ternary($1,$2,0)$_$; CREATE OR REPLACE FUNCTION one_when(BOOLEAN) RETURNS BIGINT LANGUAGE SQL as $_$SELECT value_when($1,1)$_$; So here's the full query, hopefully formatted better: I'm still curious about why the planner is choosing a hash join over using the indexes on the foreign keys, but that might be because the tables are relatively small. That one runs reasonably fine, because there are only 20 videos being returned and a handful of video views associated with them. In the real query there are about 1k videos and a couple million views. That took about 80 minutes to run, according to logs. The planner will choose different plans based on, among other things, what it estimates the size of the result to be, so while looking at a small example query might seem like a way to go about looking at what's going on, it's most likely not going to give you an accurate representation of the situation. Are you looking at two different systems (e.g., a development system versus a production system) or just choosing a smaller query on the same system? If you can't run the query on your production system, you may want to take a dump of the production system and set it up on another box. Even with a couple million rows in the video_views table, PostgreSQL shouldn't really blink too much, as long as the server is tuned properly, the hardware is adequate, and the database statistics are up to date. 
Michael Glaesemann grzm seespotcode net Sorry, I mentioned that it took 90 seconds to run the query but I didn't show that EXPLAIN ANALYZE output. Here it is, same query just with a different company_id:

Limit (cost=879283.07..879283.32 rows=100 width=64) (actual time=92486.858..92486.891 rows=100 loops=1)
  ->  Sort (cost=879283.07..879297.15 rows=5632 width=64) (actual time=92486.856..92486.867 rows=100 loops=1)
        Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND (video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END)
        ->  GroupAggregate (cost=803054.95..878932.21 rows=5632 width=64) (actual time=67145.471..92484.408 rows=730 loops=1)
              ->  Sort (cost=803054.95..809363.98 rows=2523610 width=64) (actual time=67076.407..75441.274 rows=5799447 loops=1)
                    Sort Key: videos.id, videos.title, videos.guid
                    ->  Hash Join (cost=1220.63..237115.16 rows=2523610 width=64) (actual time=31.230..11507.406 rows=5799447 loops=1)
                          Hash Cond: (video_views.asset_id = assets.id)
                          ->  Hash Join (cost=535.62..179627.88 rows=2525294 width=60) (actual time=13.286..7621.950 rows=5799447 loops=1)
                                Hash Cond: (video_views.video_id = videos.id)
                                ->  Seq Scan on video_views (cost=0.00..101352.70 rows=5998470 width=12) (actual time=0.023..2840.718 rows=5998341 loops=1)
                                ->  Hash (cost=465.23..465.23 rows=5632 width=52) (actual time=13.216..13.216 rows=5712 loops=1)
                                      ->  Seq Scan on videos (cost=0.00..465.23 rows=5632 width=52) (actual time=0.038..9.060 rows=5712 loops=1)
                                            Filter: (company_id = 11)
                          ->  Hash (cost=487.78..487.78 rows=15778 width=12) (actual time=17.876..17.876 rows=15778 loops=1)
                                ->  Seq Scan on assets (cost=0.00..487.78 rows=15778 width=12) (actual time=0.032..10.880 rows=15778 loops=1)
Total runtime: 92548.006 ms
(17 rows)

---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[GENERAL] IN clause performance
Basic query optimization question: does Postgres process x IN (y1, y2) as fast as (x = y1 OR x = y2) in a function?
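No answer appears in the thread, but roughly: recent releases normalize a short IN list into an equivalent array test, and both spellings can use an index on x. A sketch with a hypothetical table t and integer column x:

```sql
EXPLAIN SELECT * FROM t WHERE x IN (1, 2);
-- typically planned with a condition like: x = ANY ('{1,2}'::integer[])

EXPLAIN SELECT * FROM t WHERE x = 1 OR x = 2;
-- typically planned as two index conditions combined (e.g. a BitmapOr)
```

Inside a SQL or PL/pgSQL function the same planner handles the statement, so the relative cost should be the same as in a top-level query.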
Re: [GENERAL] Sylph-Searcher 1.0.0 released
On 7/19/07, Tatsuo Ishii [EMAIL PROTECTED] wrote: Yes, sylpheed does not need PostgreSQL. sylph-searcher is an independent application and actually consists of two programs: syldbimport and sylph-searcher. syldbimport reads given mail files and stores them into a PostgreSQL database, indexing with tsearch2. sylph-searcher does full-text searches on the database and shows the results. Thanks - that clarifies things. Do you have plans to make an optional integration of Sylpheed w/ postgres in the long run, so one wouldn't have to do the import step before being able to query the mail? Yes, we have been discussing that. I will pass your opinion on to the sylph-searcher developer. -- Tatsuo Ishii SRA OSS, Inc. Japan ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Feature request: Per database search_path
Richard Huxton writes: ALTER DATABASE leia SET search_path = public,lookups; Seems to work for me on 8.2 - you'll need to disconnect and reconnect to see it take place though. Hmm.. I must have typed something wrong when I tried.. For the archives.. If a user has a search path, that takes precedence over the database search_path Is there a way to unset a role's search_path? I had set one role with a particular search path. Now want to take that off so the user can get the database's search_path setting. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Need help optimizing this query
On Jul 18, 2007, at 17:34 , Pat Maddox wrote: - Sort (cost=803054.95..809363.98 rows=2523610 width=64) (actual time=67076.407..75441.274 rows=5799447 loops=1) Sort Key: videos.id, videos.title, videos.guid If I'm reading this right, it looks like a majority of the time (about two-thirds) is spent in this sort step. This is a naive guess, but perhaps you may want to increase your work_mem. What's your current work_mem setting? Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 6: explain analyze is your friend
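Michael's suggestion can be tried per-session before touching postgresql.conf. A sketch (the value is only an illustration and should be sized against available RAM, since each sort or hash in a query can use up to work_mem):

```sql
SHOW work_mem;            -- the 8.2-era default is quite small (1MB)
SET work_mem = '128MB';   -- session-local; illustrative value only
-- Re-run EXPLAIN ANALYZE afterwards to see whether the big sort now fits in memory.
```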
[GENERAL] Can someone shed some light on following error: pg_tblspc/16763/16764/PG_VERSION is missing
Hi! Some users in our group have reported running into the following error occasionally. They couldn't pinpoint a specific set of actions that led to this error, but don't believe they did anything catastrophic as this error suggests. Any ideas or explanations will be appreciated. FATAL: pg_tblspc/16763/16764 is not a valid data directory DETAIL: File pg_tblspc/16763/16764/PG_VERSION is missing. Thanks, Brijesh ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Can't SELECT from (INSERT ... RETURNING)
On 7/18/07, Nico Sabbi [EMAIL PROTECTED] wrote: I thought I could use the output of INSERT...RETURNING as a set of tuples for a subquery, but it seems that's not the case: nb1=# select * from (insert into m(a) values(112) returning a); ERROR: syntax error at or near "into" LINE 1: select * from (insert into m(a) values(112) returni... ^ Is this a bug, or is it not even supposed to work in theory? Such a feature would be extremely useful to have. It's on the TODO. It's a frequently requested feature. It's more complicated than it looks on the surface. merlin ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
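A historical note for the archive: this TODO item was later implemented; PostgreSQL 9.1 added data-modifying common table expressions, which support exactly the query Nico wanted, against the same table m from the question:

```sql
WITH ins AS (
    INSERT INTO m(a) VALUES (112)
    RETURNING a
)
SELECT * FROM ins;
```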
[GENERAL] tsearch2 on postgres8.2.4
Hi all, has anybody successfully created a Spanish dictionary using Gendict? ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Ordering by a complex field
I have one varchar field. I'd like to order so that records where field='2' come first, then '1', then '9', then anything but '0', then '0'. Is there any way to do this in a standard ORDER BY clause (that is, without writing a new SQL function)?
Re: [GENERAL] Ordering by a complex field
On Jul 18, 2007, at 20:12 , Robert James wrote: I'd like to order so that records where field='2' come first, then '1', then '9', then anything but '0', then '0'. Is there any way to do this in a standard ORDER BY clause (that is, without writing a new SQL function)? # create table whatever (a text primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "whatever_pkey" for table "whatever" CREATE TABLE # insert into whatever (a) select a::text from generate_series(0,20) as g(a); INSERT 0 21 # SELECT a FROM whatever ORDER BY a = '2' DESC, a = '1' DESC, a = '9' DESC, a <> '0' DESC; a: 2 1 9 5 6 7 8 10 11 12 13 14 15 16 17 18 19 20 3 4 0 (21 rows) Michael Glaesemann grzm seespotcode net ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
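Michael's boolean ORDER BY trick works because false sorts before true, so DESC puts the matching rows first. The same ordering can be sketched in Python (a stable sort on a tuple of booleans; the data mirrors the generate_series example):

```python
def sort_key(a):
    # Mirrors ORDER BY a = '2' DESC, a = '1' DESC, a = '9' DESC, a <> '0' DESC:
    # False sorts before True in Python, so matches (False here) come first.
    return (a != '2', a != '1', a != '9', a == '0')

values = [str(n) for n in range(21)]   # '0' .. '20'
ordered = sorted(values, key=sort_key)
print(ordered[0], ordered[1], ordered[2], ordered[-1])  # prints: 2 1 9 0
```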
[GENERAL] Database design : international postal address
Hello guys, I am currently designing a database which has several tables (e.g. a Customer table) that include address information such as street address, city, state, country code, and zip code, plus phone number information in each record. We need to make the schema for these tables sufficient to accommodate entries for international customers. I figured someone else reading this may have developed a database with similar international concerns and have suggestions that would help us avoid any pitfalls as we move forward. Does a standard for international addresses exist for database design? For representation? I want it to be rock solid. Any suggestions/help is appreciated! Thanks!! ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
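There is no single SQL standard for international addresses; the Universal Postal Union publishes per-country formats, and a common pragmatic compromise is to keep the address lines free-form and normalize only the country. A hedged sketch (table and column names are hypothetical):

```sql
CREATE TABLE address (
    address_id   serial PRIMARY KEY,
    line1        varchar(100) NOT NULL,  -- street / building, free-form
    line2        varchar(100),
    locality     varchar(80),            -- city, town, village
    region       varchar(80),            -- state, province, prefecture
    postal_code  varchar(20),            -- text: not all countries use numeric codes
    country_code char(2) NOT NULL        -- ISO 3166-1 alpha-2
);
```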
[GENERAL] Query performance strangeness..
Hi Folks, I'm new to this list, but I've been using postgresql for a few years. In general I've been able to figure things out by reading various docs. I've hit something now that I haven't been able to sort out at all. It may be that there's some document that explains all this... if so, please point it out! I have a view called ISpacePersonRoles that joins my 'people' table with spaces and roles:

CCOC=# \d ISpacePersonRoles
        View "public.ispacepersonroles"
      Column      |         Type          | Modifiers
------------------+-----------------------+-----------
 lc3key           | text                  |
 personid         | character varying(40) |
 last             | character varying(80) |
 first            | character varying(40) |
 middle           | character varying(15) |
 status           | character varying(30) |
 intermentspaceid | character varying(30) |
 role             | character varying(30) |
View definition:
 SELECT lower_concat3(p.last, p.first, (p.middle::text || p.personid::text)::character varying) AS lc3key, p.personid, p.last, p.first, p.middle, p.status, isp.objectid AS intermentspaceid, isr.name AS role
   FROM people p
   LEFT JOIN intermentspacepersons isp ON p.personid::text = isp.personid::text
   LEFT JOIN intermentspaceroles isr ON isp.roleid::text = isr.intermentspaceroleid::text;

I also have a function get_cem_for_directBurial(personid) that is expensive to call, but it's also indexed, so I hoped that the index would normally be used (essentially as a cache). It returns a 'cemetery code' so I can search for folks buried in a particular cemetery. (The cemetery code was added to a different table after the 'people' table was more or less frozen... I'd like to keep it that way if possible.) Sometimes I need to search for rows from the view that satisfy certain criteria, sorted in some specific order. Here's where the trouble starts. In the view I compute something I call 'lc3key', defined as: lower_concat3(p.last, p.first, (p.middle::text || p.personid::text)::character varying) where 'lower_concat3' just returns a lower case version of the three strings concatenated together.
The string is basically lastname, firstname, middle and personid (to guarantee uniqueness). It seems like most of the time sorting by last, first, middle should be the same as sorting by lc3key (all of these things are indexed, BTW). So here goes:

CCOC=# explain analyze select * from ISpacePersonRoles where ('STJ' = get_cem_for_directBurial(personid) AND lc3key = lower_concat3('Jones', '', '') and (status = 'D' or role = 'burial') and status <> 'R' and status <> 'F') order by lc3key asc limit 100;

QUERY PLAN
Limit (cost=1.22..1361.55 rows=100 width=62) (actual time=2.172..90.077 rows=100 loops=1)
  ->  Nested Loop Left Join (cost=1.22..1157163.90 rows=85065 width=62) (actual time=2.167..89.682 rows=100 loops=1)
        Join Filter: (("outer".roleid)::text = ("inner".intermentspaceroleid)::text)
        Filter: ((("outer".status)::text = 'D'::text) OR (("inner".name)::text = 'burial'::text))
        ->  Nested Loop Left Join (cost=0.00..1109951.60 rows=85065 width=61) (actual time=0.436..54.552 rows=374 loops=1)
              ->  Index Scan using idx_people_lower_concat3_last on people p (cost=0.00..130784.91 rows=43872 width=40) (actual time=0.366..47.016 rows=171 loops=1)
                    Index Cond: (lower_concat3(last, first, (((middle)::text || (personid)::text))::character varying) = 'jonesA'::text)
                    Filter: (('STJ'::text = get_cem_for_directburial(personid)) AND ((status)::text <> 'R'::text) AND ((status)::text <> 'F'::text))
              ->  Index Scan using idx_intermentspacepersons_pers on intermentspacepersons isp (cost=0.00..22.24 rows=6 width=33) (actual time=0.024..0.030 rows=2 loops=171)
                    Index Cond: (("outer".personid)::text = (isp.personid)::text)
        ->  Materialize (cost=1.22..1.42 rows=20 width=19) (actual time=0.002..0.042 rows=20 loops=374)
              ->  Seq Scan on intermentspaceroles isr (cost=0.00..1.20 rows=20 width=19) (actual time=0.005..0.060 rows=20 loops=1)
Total runtime: 90.395 ms

OK... not too bad. If I do the same query, but ask for 'HCC' rather than 'STJ', just a different cemetery code, I get 91 seconds... about 1000 times longer!
Limit (cost=0.00..10191.16 rows=100 width=62) (actual time=8.909..91584.430 rows=100 loops=1) - Nested Loop Left Join (cost=0.00..150013.78 rows=1472 width=62) (actual time=8.905..91583.951 rows=100 loops=1) Join Filter: ((outer.roleid)::text =
Re: [GENERAL] Query performance strangeness..
Steve Spicklemire wrote: I also have a function get_cem_for_directBurial(personid) that is expensive to call, but it's also indexed, so I hoped that the index would normally be used (essentially as a cache). It returns a 'cemetery code' so I can search for folks buried in a particular cemetery. (The cemetery code was added to a different table after the 'people' table was more or less frozen... I'd like to keep it that way if possible.) How is this function defined? Is it marked Immutable or similar? The body might be interesting too. Sometimes I need to search for rows from the view that satisfy certain criteria, sorted in some specific order. Here's where the trouble starts. In the view I compute something I call 'lc3key', defined as: lower_concat3(p.last, p.first, (p.middle::text || p.personid::text)::character varying) where 'lower_concat3' just returns a lower case version of three strings all concatenated together. The string is basically lastname, firstname, middle and personid (to guarantee uniqueness). It seems like most of the time sorting by last, first, middle should be the same as sorting by lc3key (all of these things are indexed BTW). So here goes: Definitions for the three tables and their indexes would be nice to check against too. - Index Scan using idx_people_lower_concat3_last on people p (cost=0.00..130784.91 rows=43872 width=40) (actual time=0.366..47.016 rows=171 loops=1) Index Cond: (lower_concat3(last, first, (((middle)::text || (personid)::text))::character varying) = 'jonesA'::text) Filter: (('STJ'::text = get_cem_for_directburial(personid)) AND ((status)::text <> 'R'::text) AND ((status)::text <> 'F'::text)) OK.. not too bad. If I do the same query... but ask for 'HCC' rather than 'STJ', just a different cemetery code, I get 91 seconds... about 1000 times longer!
- Index Scan using idx_people_lower_concat3_last on people p (cost=0.00..130784.91 rows=759 width=40) (actual time=8.722..91396.606 rows=256 loops=1) Index Cond: (lower_concat3(last, first, (((middle)::text || (personid)::text))::character varying) = 'jonesA'::text) Filter: (('HCC'::text = get_cem_for_directburial(personid)) AND ((status)::text <> 'R'::text) AND ((status)::text <> 'F'::text)) In this case, look at the actual times. There are two possibilities: 1. The first query had its data/indexes in cache whereas the second didn't. Run each three times in a row and see if the times stay roughly constant. 2. Calls to get_cem_for_directburial() can vary widely in their execution time. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
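Richard's question about how the function is declared matters more than it may look: an expression index can only be built on an IMMUTABLE function, and if get_cem_for_directBurial actually reads other tables (as the description suggests), labeling it IMMUTABLE turns the index into a cache that can silently go stale. A hypothetical sketch of the pattern, not the poster's actual definition:

```sql
CREATE OR REPLACE FUNCTION get_cem_for_directburial(pid character varying)
RETURNS text AS $$
    -- hypothetical lookup; table and column names are invented for illustration
    SELECT b.cemetery_code FROM burials b WHERE b.personid = $1
$$ LANGUAGE sql IMMUTABLE;

-- If the underlying data changes, an expression index built on this
-- function must be rebuilt by hand:
-- REINDEX INDEX idx_people_cem;  -- index name hypothetical
```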
Re: [GENERAL] Feature request: Per database search_path
Francisco Reyes wrote: Is there a way to unset a role's search_path? I had set one role with a particular search path. Now want to take that off so the user can get the database's search_path setting. ALTER ROLE ... RESET search_path; -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
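Putting Richard's two answers from this thread together (role and database names hypothetical):

```sql
ALTER ROLE myuser RESET search_path;                    -- drop the per-role override
ALTER DATABASE mydb SET search_path = public, lookups;  -- database-wide default
-- After reconnecting as myuser, SHOW search_path; should report the database default.
```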
[GENERAL] Unexpected crash
Hello all, I installed the latest version from RPMs and everything's OK, except when I connect to a db with psql and press shift+return the backend crashes with a segmentation fault! I guess the problem is with my installation, but I don't know how to debug it. It's not a very disconcerting thing per se, but I wonder what other surprises might be hiding behind the scenes. test2= select version(); version -- PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-58) (1 row) relevant log line: Jul 19 08:33:36 mihin postgres[24584]: [2-1] LOG: unexpected EOF on client connection Regards MP
Re: [GENERAL] Unexpected crash
Mikko Partio [EMAIL PROTECTED] writes: I installed the latest version from rpms and everythings ok, except when I connect to a db with psql and press shift+return the backend crashes with Segmentation fault! This is not a backend crash, you are SIGQUIT-ing your psql session. Check your terminal settings, because that's not the usual key combination for SIGQUIT. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Unexpected crash
Mikko Partio wrote: Hello all, I installed the latest version from RPMs and everything's OK, except when I connect to a db with psql and press shift+return the backend crashes with a segmentation fault! I guess the problem is with my installation, but I don't know how to debug it. It's not a very disconcerting thing per se, but I wonder what other surprises might be hiding behind the scenes. Hmm. Very odd. test2= select version(); version -- PostgreSQL 8.2.4 on i686-redhat-linux-gnu, compiled by GCC gcc (GCC) 3.2.3 20030502 (Red Hat Linux 3.2.3-58) (1 row) OK. This is RH Enterprise Server, then? relevant log line: Jul 19 08:33:36 mihin postgres[24584]: [2-1] LOG: unexpected EOF on client connection That's not a backend crash, that's just saying the client disconnected unexpectedly. That implies a client (psql) crash. Is there a server log-line saying you have a sig-11 crash? What's puzzling me is why shift+return is different from just plain return (which presumably works). I'd suspect readline or similar. Try something like rpm -q --requires postgresql-client (you'll need to check the details, haven't used rpm much recently) to see what packages psql is depending on. Then just check they look OK for your installation. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: explain analyze is your friend