Re: [GENERAL] Using Postgres to store genetic data
On Fri, Jul 10, 2009 at 4:02 PM, Steve Crawford wrote:
> Peter Hunsberger wrote:
>> We're looking at potentially using Postgres to store a variety of molecular
>> and genetic data. At this point I have a bunch of general questions...
>
> I don't know enough about your area of expertise to know if this is useful,
> but I'd look at the Unison project to see what they are doing. There are also
> videos/slides up from a recent San Francisco PUG meeting at UCSF that might
> be of interest:
>
> http://www.vimeo.com/3732938
> http://www.slideshare.net/linuxpoet/unison-ucsf-sfpug

Looks like it may be somewhat applicable to my first question. Thanks.

My area of expertise is application architecture and design, focusing on clinical systems for medical research for the last 7 years or so and on medicine in general for the last 15. Adding the genomic and molecular world to the mix is a new initiative. At the moment I've got enough domain knowledge to be dangerous...

--
Peter Hunsberger

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
[GENERAL]
Hello-

I am working on an e-commerce system that has different lists of products which contain many of the same products, at different prices. When a user searches for a certain set of part numbers, I would like the resulting products (and prices) to come from one of the lists, according to the list's priority. Each user can have a different set of lists and priorities.

Table: product_lists

 id  | name                       | priority | user_id
-----+----------------------------+----------+---------
   5 | General List of Products   |        2 |      23
   3 | Different List of Products |        3 |      23
 150 | Customer-Specific Products |        1 |      23

Table: products

 product_list_id | part_number | price
-----------------+-------------+--------
               3 | 92298A      | 123.38
               5 | 92298A      | 111.04
               3 | C39207      |  78.38
             150 | C39207      |  67.93

Below is a simplified example of the structure of the query I am working with. I realize that in this case I could re-factor all of this into one statement, but each sub-query in the real case has a more complex set of joins that determines the price. The pricing joins vary from one sub-query to the next, so a collection of sub-queries seemed to be a logical solution. Some part numbers are found in only one of the lists, while other part numbers are repeated across lists at different prices. This is what I would *like* to say:

SELECT DISTINCT ON (part_number) *
FROM (
    SELECT product_list_id, part_number, price, priority
    FROM products, product_lists
    WHERE product_list_id = product_lists.id
      AND product_list_id = 150
      AND (part_number = '92298A' OR part_number = 'C39207' OR part_number = ...)
    UNION ALL
    SELECT product_list_id, part_number, price, priority
    FROM products, product_lists
    WHERE product_list_id = product_lists.id
      AND product_list_id = 5
      AND (part_number = '92298A' OR part_number = 'C39207' OR part_number = ...)
    UNION ALL
    SELECT product_list_id, part_number, price, priority
    FROM products, product_lists
    WHERE product_list_id = product_lists.id
      AND product_list_id = 3
      AND (part_number = '92298A' OR part_number = 'C39207' OR part_number = ...)
) AS filter_duplicates
ORDER BY priority, part_number

I need to ORDER BY priority so that, in the case of duplicates, the product from the desired list is returned first. The purpose of DISTINCT ON is then to filter out any duplicate part numbers that have a lesser priority. But the above statement fails because the DISTINCT ON expression must match the leftmost ORDER BY expression. However, inserting the priority into the DISTINCT ON expression means that all of the resulting tuples are unique, even though the part_number is the same.

If anyone could suggest a solution or alternative approach, I would greatly appreciate it.

Thank you,
Clark
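One common way around this constraint - a sketch only, not tested against the schema above - is to move the DISTINCT ON into an inner query whose ORDER BY puts part_number first and priority second, so the leftmost ORDER BY expression matches DISTINCT ON, and then re-sort the surviving rows in an outer query:

```sql
-- Sketch: DISTINCT ON keeps the first row per part_number, and the inner
-- ORDER BY makes that first row the one with the lowest (best) priority.
SELECT *
FROM (
    SELECT DISTINCT ON (part_number)
           product_list_id, part_number, price, priority
    FROM ( ... the UNION ALL of the three sub-queries ... ) AS all_lists
    ORDER BY part_number, priority
) AS best_price
ORDER BY part_number;  -- or whatever final ordering is wanted
```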
Re: [GENERAL] Idle in transaction help
We have a java web page that will give us the stack trace of all the running JDBC connections inside our system. The problem is that we currently have no way of relating those stack traces back to a PID so that the programmers can get the stack trace of the hung database connection. We use JDBC connection pooling, so there's no way to be sure which stack trace goes with which PID. I gave the developers the postgres call to get the backend PID through the JDBC connection a few days ago, but they don't have the resources to get the additional call built into their programs for up to 1-2 months. I'm working on the business side to get priorities changed, but it hasn't happened yet, mostly because I've got Xymon watching for those conditions so I can correct them before we get calls into the helpdesk.

Sorry, I'm rambling. Anyway, I'm trying to attack it from the database side, since I am not a programmer and can't help with that part. I can do simple CGIs with bash, but I don't know Java or C, or even Perl yet for that matter. Since you guys are the experts, I'm looking for any way to attack this problem from the database side. The tips I've gotten about the JDBC driver and commits are helpful in that they give our programmers things to watch out for that we didn't realize, and I'm open to any suggestions from the list about how I can help attack this. Since I'm ultimately responsible for database performance and I don't like being reduced to sitting on the sidelines, I'm trying to see what, if anything, my skills can contribute.

As for patting you on the head, I was being sincere. And trying not to come off sounding like a cocky SOB.
:-)

Thanks,
Scot Kreienkamp

-----Original Message-----
From: Scott Marlowe [mailto:scott.marl...@gmail.com]
Sent: Friday, July 10, 2009 7:02 PM
To: Scot Kreienkamp
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Idle in transaction help

On Fri, Jul 10, 2009 at 4:40 PM, Scot Kreienkamp wrote:
> Thanks scott, but I wrote a cgi to combine all of the process info and allow
> me to kill errant queries. So I know how to track down the pid. Thanks for
> trying to help though. :-)

So, what are you looking for, a stack trace dump from java to look through maybe? (The one that kill -1 or whatever generates? It's been a few years.) That'll usually give you the context to find out which thread is where.

P.S. No need to pat me on the head like the doggie. :)
Re: [GENERAL] singletons per row in table AND locking response
On Tue, Jul 07, 2009 at 06:45:36PM -0700, Dennis Gearon wrote:
> When locking is involved, does a transaction wait for access to a row
> or table, or does it just fail back to the calling code? Would it be
> up to my PHP code to keep hammering for access to a row/table, or could
> a user defined function do that?

It defaults to waiting for the lock to be released, but will fail immediately if you tell it to. It sounds as though you want to be using row-level locks; they allow more concurrency where possible, but table-level locks are also available. PG also makes a distinction between "share" locks (multiple transactions can hold a share lock on any table or row) and "update" locks (an update lock locks out share locks and other update locks).

--
Sam  http://samason.me.uk/
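Concretely, the wait-vs-fail choice shows up in the locking clause itself. A minimal sketch (the table and row are hypothetical):

```sql
BEGIN;
-- Default behaviour: block until any conflicting lock on the row is released.
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;

-- Fail-fast behaviour: raise an error immediately instead of waiting,
-- so the calling PHP code can decide whether to retry.
SELECT * FROM accounts WHERE id = 1 FOR UPDATE NOWAIT;

-- A share lock: several transactions may hold it on the same row at once.
SELECT * FROM accounts WHERE id = 1 FOR SHARE;
COMMIT;
```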
Re: [Fwd: Re: [GENERAL] How to trace client sql requests?]
On Jul 10, 2009, at 1:31 PM, James B. Byrne wrote:
> On Fri, July 10, 2009 16:10, hubert depesz lubaczewski wrote:
>> truncate. but first simple question - did you commit the inserts?
>
> But if it were done with truncate then I would see truncate in the
> log file, yes?
>
> Second, I am working with PG through an ORM called ActiveRecord, part
> of the Rails framework. I do not see a COMMIT anywhere in the log.
> Should I?
>
> This is a test run using the Rails test, actually cucumber, environment.
> I know that they do some things differently with DB connections in this
> environment but I believe that this is limited to transactions and
> rollbacks. The thing is that this problem only arises when testing the
> script inside the test harness. In production it runs just fine.
> Further, if I list the contents of the table from a call in the test
> harness immediately prior to executing the script under test then the
> data is there.

I'm not sure about cucumber, but I'm fairly certain that the Rails test runs don't commit their transactions. That means that if the test that is not seeing the data is being run with a different db connection from the one that inserted the data, then it won't see it.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k
Re: [GENERAL] change location of postmaster.pid file?
Hi. Thanks for the quick and definitive answers to my questions. The information you provided will save me from wasting time and energy trying to see how far I could get otherwise.

Thanks very much,
Janet

Tom Lane wrote:
> Janet Jacobsen writes:
>> Is it possible to create a database cluster on a machine that
>> has write access to the shared file system, shut down the
>> Postgres server on that machine, and then start up the
>> Postgres server on the machine that cannot write to the
>> shared file system, and thereafter, *only query* the database.
>
> No. The pid file is only the first and smallest problem you'd run into
> with a read-only database filesystem.
>
> regards, tom lane

> On Fri, Jul 10, 2009 at 11:17 PM, Janet Jacobsen wrote:
>> Is it possible to create a database cluster on a machine that
>> has write access to the shared file system, shut down the
>> Postgres server on that machine, and then start up the
>> Postgres server on the machine that cannot write to the
>> shared file system, and thereafter, *only query* the database.
>
> Postgres isn't really designed to work this way. It expects to have
> write access and will occasionally still write stuff to disk even for
> read-only queries.
>
> It won't work even a little bit before 8.3. For 8.3 or later you could
> maybe make it work using vacuum freeze but there's no facility to
> verify that it's really frozen everything and you'll still be taken by
> surprise by queries which try to use temporary space for large sorts
> or commands which start transactions that you didn't realize were
> necessary.
>
> --
> greg
> http://mit.edu/~gsstark/resume.pdf
Re: [GENERAL] Idle in transaction help
On Jul 10, 2009, at 3:34 PM, Scott Marlowe wrote:
> Assuming that tracking down the process that's connected might help,
> you can use pg_stat_activity to find the port that the client is
> connecting from, then on the client machine, use lsof to hunt down the
> process that is connecting via that port. For instance, I connect from
> my laptop with two connections. In one I do a begin; and in the other
> I look it up like so:
>
> select * from pg_stat_activity where current_query ilike
> '%idle%trans%' and current_query not ilike 'select%';
>
>  datid | datname  | procpid | usesysid | usename  |     current_query     | waiting |          xact_start           |          query_start          |         backend_start         | client_addr  | client_port
> -------+----------+---------+----------+----------+-----------------------+---------+-------------------------------+-------------------------------+-------------------------------+--------------+-------------
>  11511 | postgres |   24893 |    16413 | smarlowe | <IDLE> in transaction | f       | 2009-07-10 16:20:15.056385-06 | 2009-07-10 16:20:15.056385-06 | 2009-07-10 15:27:48.944738-06 | 192.168.0.74 |       48727
>
> The client port is 48727. Now, on my laptop I can do:
>
> sudo lsof | grep 48727
>
> and I have this line in there:
>
> psql  27964  smarlowe  3u  IPv4  1114765  TCP steamboat:48727->192.168.0.247:postgresql (ESTABLISHED)

Just a little tidbit for that: you can have lsof tell you what's got that port open directly, no need for grep:

lsof -i tcp:48727

That way you keep the column headers in the output.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k
Re: [GENERAL] Idle in transaction help
On Fri, Jul 10, 2009 at 4:40 PM, Scot Kreienkamp wrote:
> Thanks scott, but I wrote a cgi to combine all of the process info and allow
> me to kill errant queries. So I know how to track down the pid. Thanks for
> trying to help though. :-)

So, what are you looking for, a stack trace dump from java to look through maybe? (The one that kill -1 or whatever generates? It's been a few years.) That'll usually give you the context to find out which thread is where.

P.S. No need to pat me on the head like the doggie. :)
Re: [GENERAL] BR/
On Fri, July 10, 2009 18:13, Alvaro Herrera wrote:
>
> Is it using a different PG connection than the one doing the
> insert? In that case, it won't see the new row until the
> inserting transaction commits.

That is almost certainly the exact problem. I will check and determine if this is so, but it seems to me unavoidable that launching a new shell for the script under test will cause another, different, connection to be used. If this proves to be the case then I will report back. If not, then no doubt you will hear from me as well. Thank you for illuminating this for me.

> BTW it seems necessary to clarify that LOCATION lines correspond
> to the LOG/NOTICE/WARNING/ERROR line immediately _above_ it, not
> the one below.

So noted, with thanks.

Regards,

--
***  E-Mail is NOT a SECURE channel  ***
James B. Byrne                 mailto:byrn...@harte-lyne.ca
Harte & Lyne Limited           http://www.harte-lyne.ca
9 Brockley Drive               vox: +1 905 561 1241
Hamilton, Ontario              fax: +1 905 561 0757
Canada  L8E 3C3
Re: [GENERAL] BR/
On Fri, Jul 10, 2009 at 4:53 PM, James B. Byrne wrote:
> On Fri, July 10, 2009 18:48, Scott Marlowe wrote:
>> On Fri, Jul 10, 2009 at 2:13 PM, James B. Byrne wrote:
>>>
>>> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133)
>>> hll_theheart_db_admin : LOCATION: exec_simple_query,
>>> postgres.c:1105
>>> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133)
>>> hll_theheart_db_admin : LOG: 0: duration: 0.782 ms statement:
>>> SELECT * FROM "currencies"
>>>
>>> The client program that receives this result reports that there are
>>> no rows returned. So where did they go?
>>
>> Maybe there were no rows to return??
>
> Clearly there are no rows. That is the problem. The question
> begging an answer is: where are the rows added in the immediately
> previous INSERTs?

Trigger or rule put them somewhere else or ignored them? Hard to say without \d currencies
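To illustrate the kind of thing \d would reveal, here is a sketch (the table and rule names are invented for the example) of how a DO INSTEAD rule can make INSERTs silently vanish:

```sql
CREATE TABLE currencies (code text, name text);

-- A rule like this swallows every insert without raising an error:
CREATE RULE currencies_ignore AS
    ON INSERT TO currencies DO INSTEAD NOTHING;

INSERT INTO currencies VALUES ('CAD', 'Canadian Dollar');
SELECT count(*) FROM currencies;  -- the insert left no row behind
```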
Re: [GENERAL] BR/
On Fri, July 10, 2009 18:48, Scott Marlowe wrote:
> On Fri, Jul 10, 2009 at 2:13 PM, James B. Byrne wrote:
>>
>> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133)
>> hll_theheart_db_admin : LOCATION: exec_simple_query,
>> postgres.c:1105
>> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133)
>> hll_theheart_db_admin : LOG: 0: duration: 0.782 ms statement:
>> SELECT * FROM "currencies"
>>
>> The client program that receives this result reports that there are
>> no rows returned. So where did they go?
>
> Maybe there were no rows to return??

Clearly there are no rows. That is the problem. The question begging an answer is: where are the rows added in the immediately previous INSERTs?

--
***  E-Mail is NOT a SECURE channel  ***
James B. Byrne                 mailto:byrn...@harte-lyne.ca
Harte & Lyne Limited           http://www.harte-lyne.ca
9 Brockley Drive               vox: +1 905 561 1241
Hamilton, Ontario              fax: +1 905 561 0757
Canada  L8E 3C3
Re: [GENERAL] change location of postmaster.pid file?
Greg Stark wrote:
> It won't work even a little bit before 8.3. For 8.3 or later you could
> maybe make it work using vacuum freeze but there's no facility to
> verify that it's really frozen everything and you'll still be taken by
> surprise by queries which try to use temporary space for large sorts
> or commands which start transactions that you didn't realize were
> necessary.

or pg_stats or ..
Re: [GENERAL] BR/
On Fri, Jul 10, 2009 at 2:13 PM, James B. Byrne wrote:
>
> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133)
> hll_theheart_db_admin : LOCATION: exec_simple_query,
> postgres.c:1105
> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133)
> hll_theheart_db_admin : LOG: 0: duration: 0.782 ms statement:
> SELECT * FROM "currencies"
>
> The client program that receives this result reports that there are
> no rows returned. So where did they go?

Maybe there were no rows to return??
Re: [GENERAL] change location of postmaster.pid file?
On Fri, Jul 10, 2009 at 11:17 PM, Janet Jacobsen wrote:
> Is it possible to create a database cluster on a machine that
> has write access to the shared file system, shut down the
> Postgres server on that machine, and then start up the
> Postgres server on the machine that cannot write to the
> shared file system, and thereafter, *only query* the database.

Postgres isn't really designed to work this way. It expects to have write access and will occasionally still write stuff to disk even for read-only queries.

It won't work even a little bit before 8.3. For 8.3 or later you could maybe make it work using vacuum freeze, but there's no facility to verify that it's really frozen everything, and you'll still be taken by surprise by queries which try to use temporary space for large sorts or commands which start transactions that you didn't realize were necessary.

--
greg
http://mit.edu/~gsstark/resume.pdf
Re: [GENERAL] change location of postmaster.pid file?
Janet Jacobsen writes:
> Is it possible to create a database cluster on a machine that
> has write access to the shared file system, shut down the
> Postgres server on that machine, and then start up the
> Postgres server on the machine that cannot write to the
> shared file system, and thereafter, *only query* the database.

No. The pid file is only the first and smallest problem you'd run into with a read-only database filesystem.

regards, tom lane
Re: [GENERAL] Idle in transaction help
Thanks scott, but I wrote a cgi to combine all of the process info and allow me to kill errant queries. So I know how to track down the pid. Thanks for trying to help though. :-)

- Original Message -
From: Scott Marlowe
To: Scot Kreienkamp
Cc: pgsql-general@postgresql.org
Sent: Fri Jul 10 18:34:14 2009
Subject: Re: [GENERAL] Idle in transaction help

On Fri, Jul 10, 2009 at 2:05 PM, Scot Kreienkamp wrote:
> Hi everyone,
>
> I need some help with tracking down idle in transaction problems. We have a
> custom application that is leaving queries in idle in transaction status for
> unknown reasons. The developers are working on ways to track it down, but
> right now the options on their end are limited and it will be at least 1-2
> months until they have something working. I am trying to track down the
> problem from the PG end in the meantime. Is there any way to tell what
> query is hanging in idle in transaction status? Or what the current or
> previous query was/is, since idle in transaction doesn't tell me anything?
> I'm kind of at a loss on what if anything I can do from the database end to
> help (read push) the programmers to find and fix this problem.
>
> My first priority is helping them find and fix the problem if I can. My
> second priority is finding an automated way to deal with the idle in
> transactions as they are locking tables and rows, causing other transactions
> to hang also. None of the timeouts appear to affect idle in transactions.

Assuming that tracking down the process that's connected might help, you can use pg_stat_activity to find the port that the client is connecting from, then on the client machine, use lsof to hunt down the process that is connecting via that port. For instance, I connect from my laptop with two connections. In one I do a begin; and in the other I look it up like so:

select * from pg_stat_activity where current_query ilike '%idle%trans%' and current_query not ilike 'select%';

 datid | datname  | procpid | usesysid | usename  |     current_query     | waiting |          xact_start           |          query_start          |         backend_start         | client_addr  | client_port
-------+----------+---------+----------+----------+-----------------------+---------+-------------------------------+-------------------------------+-------------------------------+--------------+-------------
 11511 | postgres |   24893 |    16413 | smarlowe | <IDLE> in transaction | f       | 2009-07-10 16:20:15.056385-06 | 2009-07-10 16:20:15.056385-06 | 2009-07-10 15:27:48.944738-06 | 192.168.0.74 |       48727

The client port is 48727. Now, on my laptop I can do:

sudo lsof | grep 48727

and I have this line in there:

psql  27964  smarlowe  3u  IPv4  1114765  TCP steamboat:48727->192.168.0.247:postgresql (ESTABLISHED)

Note that 27964 is the pid of the psql command that's connected to the server. Hope that helps a little.
Re: [GENERAL] Idle in transaction help
On Fri, Jul 10, 2009 at 2:05 PM, Scot Kreienkamp wrote:
> Hi everyone,
>
> I need some help with tracking down idle in transaction problems. We have a
> custom application that is leaving queries in idle in transaction status for
> unknown reasons. The developers are working on ways to track it down, but
> right now the options on their end are limited and it will be at least 1-2
> months until they have something working. I am trying to track down the
> problem from the PG end in the meantime. Is there any way to tell what
> query is hanging in idle in transaction status? Or what the current or
> previous query was/is, since idle in transaction doesn't tell me anything?
> I'm kind of at a loss on what if anything I can do from the database end to
> help (read push) the programmers to find and fix this problem.
>
> My first priority is helping them find and fix the problem if I can. My
> second priority is finding an automated way to deal with the idle in
> transactions as they are locking tables and rows, causing other transactions
> to hang also. None of the timeouts appear to affect idle in transactions.

Assuming that tracking down the process that's connected might help, you can use pg_stat_activity to find the port that the client is connecting from, then on the client machine, use lsof to hunt down the process that is connecting via that port. For instance, I connect from my laptop with two connections. In one I do a begin; and in the other I look it up like so:

select * from pg_stat_activity where current_query ilike '%idle%trans%' and current_query not ilike 'select%';

 datid | datname  | procpid | usesysid | usename  |     current_query     | waiting |          xact_start           |          query_start          |         backend_start         | client_addr  | client_port
-------+----------+---------+----------+----------+-----------------------+---------+-------------------------------+-------------------------------+-------------------------------+--------------+-------------
 11511 | postgres |   24893 |    16413 | smarlowe | <IDLE> in transaction | f       | 2009-07-10 16:20:15.056385-06 | 2009-07-10 16:20:15.056385-06 | 2009-07-10 15:27:48.944738-06 | 192.168.0.74 |       48727

The client port is 48727. Now, on my laptop I can do:

sudo lsof | grep 48727

and I have this line in there:

psql  27964  smarlowe  3u  IPv4  1114765  TCP steamboat:48727->192.168.0.247:postgresql (ESTABLISHED)

Note that 27964 is the pid of the psql command that's connected to the server. Hope that helps a little.
[GENERAL] change location of postmaster.pid file?
Hi. We are looking into the possibility of running a Postgres server on an underutilized machine. This machine has very little local disk space, so we would have to create the data directory on a shared file system. The underutilized machine was set up so that it can *only read* from the shared file system, i.e., it cannot write to the shared file system.

Is it possible to create a database cluster on a machine that has write access to the shared file system, shut down the Postgres server on that machine, and then start up the Postgres server on the machine that cannot write to the shared file system, and thereafter, *only query* the database?

Since Postgres writes the postmaster.pid file to the data directory (which would be on the shared file system), the answer would appear to be no, since the 'underutilized' machine cannot write any files to the shared file system.

Would it be possible to write the postmaster.pid file to the local file system on the 'underutilized' machine even though the data directory is on the shared file system? I realize that this seems like a bad idea - given that the purpose of the postmaster.pid file, as I understand it, is to prevent more than one postmaster running in a data directory - but I wanted to ask whether this is a possibility.

Even if it were possible to write the postmaster.pid to the local file system on the 'underutilized' machine, does Postgres write other temporary files even if only SELECT statements are being executed against the database? And where does it write those files - in subdirectories of the data directory?

Thank you,
Janet
Re: [GENERAL] BR/
James B. Byrne wrote:
> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133)
> hll_theheart_db_admin : LOG: 0: duration: 0.782 ms statement:
> SELECT * FROM "currencies"
>
> The client program that receives this result reports that there are
> no rows returned. So where did they go?

Is it using a different PG connection than the one doing the insert? In that case, it won't see the new row until the inserting transaction commits.

BTW it seems necessary to clarify that LOCATION lines correspond to the LOG/NOTICE/WARNING/ERROR line immediately _above_ it, not the one below. So if you see this:

LOG: foo bar
LOCATION: somewhere line N
ERROR: baz qux
LOCATION: another line

you know what to make of it, and it's not this:

LOCATION: somewhere line N
ERROR: baz qux

--
Alvaro Herrera                 http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [GENERAL] psql language
Hiroshi Saito wrote:
> Is the state where you wish this?
>
> example
> C:\Program Files\PostgreSQL\8.4\bin>psql -p 5433 postgres postgres
> psql (8.4.0)
> "help" でヘルプを表示します.   (Japanese for: Type "help" for help.)
>
> C:\Program Files\PostgreSQL\8.4\bin>set LANG=C
>
> C:\Program Files\PostgreSQL\8.4\bin>psql -p 5433 postgres postgres
> psql (8.4.0)
> Type "help" for help.
>
> Have I missed something?

Saito-san,

No, you missed nothing :-) That is exactly what I wanted. I thought I tried setting LANG before I posted, but I must have done something wrong, because it works fine now.

ありがとうございます! (Thank you very much!)
Re: [GENERAL] Idle in transaction help
Scot Kreienkamp wrote:
> It is Java. I asked our programmers to check on the JDBC version as I had
> seen that on the list previously. It is using postgresql-8.2-504. Is that
> one of the problem versions? I had thought it was new enough that it would
> not be subject to that problem.

Well, the current JDBC for 8.2 is build 510. See http://jdbc.postgresql.org/download.html

It does appear the idle-in-transaction bug I'm remembering is pretty old... it was fixed in version 8.0-dev302 (2004-06-15), heh. Your version is from 2006-12-01. I would still consider upgrading; there have been a lot of fixes and enhancements between 504 and 510. See http://jdbc.postgresql.org/changes.html for a revision history.

Remember, if you have autocommit OFF, then even read-only (select-only) connections need commits, or they stay in transaction.
Re: [GENERAL] is autovacuum recommended?
On Fri, Jul 10, 2009 at 2:47 PM, Willy-Bas Loos wrote:
> Hi,
>
> Thanks for your answers!
> I'm using 8.1 and 8.2 on windows2003 servers, and it's true that i could
> probably configure them much better.

Note that support for 8.1 on windows is gone, as it is no longer considered supportable due to design / engineering issues. Upgrading them all to 8.3 or 8.4 is probably a good idea, since they are better at autovacuum and such than 8.2 and before.

> We've recently moved to brand new dedicated database servers with pg8.3 on
> debian in 2 projects and it has been much easier to configure these
> correctly. There I don't encounter the problems that i described.

Smart move. The amount of effort needed to learn debian or any other linux distro is usually less than the amount of ongoing effort to keep a production pg server happy on windows. Also, 64 bit pgsql on 64 bit unix/linux is capable of better scaling and handling more memory.

> The thing is that the whole concept of autovacuum is not feeling right.
> Per design, the vacuum is likely to kick off when i am doing something big.

That assumes that autovacuum always runs in some mode that must interfere with db operation. If you set the autovacuum_vacuum_cost_delay, autovacuum_vacuum_cost_limit, autovacuum_vacuum_scale_factor, and autovacuum_vacuum_threshold parameters properly, as well as the free space map settings large enough to hold all your dead tuples, then autovacuum should not cause a lot of issues, unless your machine is already IO bound. And if it's already IO bound and dragging butt, then the problem isn't autovacuum, but a machine without enough IO bandwidth to do its job well.

> And when i am doing something big, a vacuum is the last thing i'd wish for.

I don't even notice when it kicks in on my servers.

> I'd wish for a vacuum when the database is doing nothing at all, but the
> autovacuum will NEVER kick off in such a moment.

Again, if the delay and such are set right, then autovac will use so little IO as to be unnoticeable.

> That's why i feel better scheduling the vacuum at times at which i know
> things will be generally quiet.

For some very large tables on slow IO machines, it makes sense to remove them from the purview of autovac, I agree. Take a look at the pg_autovacuum table; it's pretty easy to see how it works.

> To be honest, i am a bit surprised that all 3 reactions recommend using
> autovacuum, even if it means i have to buy a new server for this purpose.

Well, autovacuum is more primitive in the versions you're running, and more likely to get in the way. 8.3, and to a greater extent 8.4, remove a lot of these issues.

> I was thinking that autovacuum was just a mechanism to ensure that postgres
> works well out of the box, but that it would be recommended to schedule your
> own vacuum tailored to your specific needs.
> I agree though, that it is a tough tailoring job and that the autovacuum
> must be doing a better job than i am. It just fires at the wrong time.

And that's kind of the point, that it's better to have a db that runs a little slow than one that explodes in sheets of flame.

> Just a thought (to think positively..): wouldn't it be possible to let the
> autovacuum wait until the load goes down, or until the end of the
> transaction that triggered the autovacuum?

You could set up a cron job that updated the pg_autovacuum table at certain times to accomplish this. I'm not sure how easy it would be to program autovac to do the same thing. You could certainly set the cost delay higher than normal (like 20 or 40 ms) for some tables so that autovac didn't get in the way, but then you run the risk of it never keeping up, and on 8.1 with only one thread to autovac, that could be bad.

Definitely consider upgrading pg versions on your windows machines.
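The cost-based settings mentioned above live in postgresql.conf. A sketch for an 8.1/8.2-era server - the values here are illustrative starting points, not tuned recommendations:

```
# Throttle autovacuum so it trickles IO instead of bursting:
autovacuum = on
autovacuum_vacuum_cost_delay = 20     # sleep 20 ms each time cost_limit is reached
autovacuum_vacuum_cost_limit = 200    # work units between sleeps
autovacuum_vacuum_scale_factor = 0.2  # vacuum when ~20% of a table is dead tuples
autovacuum_vacuum_threshold = 500     # ...plus this many rows, for small tables
max_fsm_pages = 2000000               # free space map large enough to track dead tuples
```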
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Using Postgres to store genetic data
Peter Hunsberger wrote: We're looking at potentially using Postgres to store a variety of molecular and genetic data. At this point I have a bunch of general questions... I don't know enough about your area of expertise to know if this is useful, but I'd look at the Unison project to see what they are doing. There are also videos/slides up from a recent San Francisco PUG meeting at UCSF that might be of interest: http://www.vimeo.com/3732938 http://www.slideshare.net/linuxpoet/unison-ucsf-sfpug
Re: [GENERAL] is autovacuum recommended?
Hi, Thanks for your answers! I'm using 8.1 and 8.2 on windows2003 servers, and it's true that I could probably configure them much better. We've recently moved to brand new dedicated database servers with pg8.3 on debian in 2 projects and it has been much easier to configure these correctly. There I don't encounter the problems that I described. The thing is that the whole concept of autovacuum is not feeling right. Per design, the vacuum is likely to kick off when I am doing something big. And when I am doing something big, a vacuum is the last thing I'd wish for. I'd wish for a vacuum when the database is doing nothing at all, but the autovacuum will NEVER kick off in such a moment. That's why I feel better scheduling the vacuum at times at which I know things will be generally quiet. To be honest, I am a bit surprised that all 3 reactions recommend using autovacuum, even if it means I have to buy a new server for this purpose. I was thinking that autovacuum was just a mechanism to ensure that postgres works well out of the box, but that it would be recommended to schedule your own vacuum tailored to your specific needs. I agree though, that it is a tough tailoring job and that the autovacuum must be doing a better job than I am. It just fires at the wrong time. Just a thought (to think positively..): wouldn't it be possible to let the autovacuum wait until the load goes down, or until the end of the transaction that triggered the autovacuum? Cheers, WBL
Re: [GENERAL] Inserted data is disappearing
On Fri, July 10, 2009 16:20, Bill Moran wrote: > > > Also, look for a BEGIN statement that is never COMMITed. If > the client starts a transaction, INSERTs a bunch of stuff, then > disconnects without issuing a COMMIT, Postgres will rollback > the transaction, thus it will be as if the data was never > inserted. > There is one ROLLBACK statement, but it occurs after all of the problems have evidenced themselves and not before. I believe this to be the Rails test harness unrolling the transaction that it wraps all test runs in. There is one BEGIN. This is located close to the very top of the run log, which seems congruent with the one ROLLBACK just before the very end. Evidently, all this test processing takes place within a single, never completed, transaction. -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:byrn...@harte-lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
Re: [GENERAL] Idle in transaction help
Hi John, It is Java. I asked our programmers to check on the JDBC version as I had seen that on the list previously. It is using postgresql-8.2-504. Is that one of the problem versions? I had thought it was new enough that it would not be subject to that problem. The unexplained part is why are there locks acquired, sometimes on the row level, prior to the connection going to idle in transaction status? That makes me think it's not the JDBC driver. Thanks, Scot Kreienkamp -Original Message- From: John R Pierce [mailto:pie...@hogranch.com] Sent: Friday, July 10, 2009 4:21 PM To: Scot Kreienkamp Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] Idle in transaction help Scot Kreienkamp wrote: > > Hi everyone, > > I need some help with tracking down idle in transaction problems. We > have a custom application that is leaving queries in idle in > transaction status for unknown reasons. The developers are working on > ways to track it down, but right now the options on their end are > limited and it will be at least 1-2 months until they have something > working. I am trying to track down the problem from the PG end in the > meantime. Is there any way to tell what query is hanging in idle in > transaction status? Or what the current or previous query was/is, > since idle in transaction doesn't tell me anything? I'm kind of at a > loss on what if anything I can do from the database end to help (read > push) the programmers to find and fix this problem. > There is no active query, that's why it's idle. They did a "BEGIN" to start a transaction, then left the connection idle. Is this software, by any chance, Java based? Older versions of the Postgres JDBC module had a nasty habit of doing this, as JDBC autogenerates the BEGIN if it's not in autocommit mode. The older version would generate the BEGIN immediately after a COMMIT or ROLLBACK to prepare for the next transaction, and if the app simply stopped using the connection, it was left IDLE IN TRANSACTION.
The updated version postpones the BEGIN until you issue your first query. If you enable statement logging and set up a log prefix to show the Process ID (and I usually prefix with a timestamp, database name and other useful stuff), then you can grep the logs for the PID of the IDLE IN TRANSACTION process. Note logging all statements is pretty CPU and disk intensive, so it will likely impact your system performance and should only be done for debug purposes.
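A sketch of the logging setup described above (postgresql.conf; the prefix is just one reasonable example, not the only sensible one):

```
# temporary debugging settings -- revert when done
log_line_prefix = '%t %d %p %u '   # timestamp, database, process ID, user
log_statement = 'all'              # log every statement (CPU/disk heavy)
```

With that in place, grepping the log for the PID of the stuck backend shows the statements it ran before going idle in transaction.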
[Fwd: Re: [GENERAL] How to trace client sql requests?]
On Fri, July 10, 2009 16:10, hubert depesz lubaczewski wrote: > > truncate. but first simple question - did you commit the inserts? > But if it were done with truncate then I would see truncate in the log file, yes? Second, I am working with PG through an ORM called ActiveRecord, part of the Rails framework. I do not see a COMMIT anywhere in the log. Should I? This is a test run using the Rails test, actually cucumber, environment. I know that they do some things differently with DB connections in this environment but I believe that this is limited to transactions and rollbacks. The thing is that this problem only arises when testing the script inside the test harness. In production it runs just fine. Further, if I list the contents of the table from a call in the test harness immediately prior to executing the script under test then the data is there. I do not know what is going on. -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:byrn...@harte-lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
Re: [GENERAL] Idle in transaction help
Scot Kreienkamp wrote: Hi everyone, I need some help with tracking down idle in transaction problems. We have a custom application that is leaving queries in idle in transaction status for unknown reasons. The developers are working on ways to track it down, but right now the options on their end are limited and it will be at least 1-2 months until they have something working. I am trying to track down the problem from the PG end in the meantime. Is there any way to tell what query is hanging in idle in transaction status? Or what the current or previous query was/is, since idle in transaction doesn't tell me anything? I'm kind of at a loss on what if anything I can do from the database end to help (read push) the programmers to find and fix this problem. There is no active query, that's why it's idle. They did a "BEGIN" to start a transaction, then left the connection idle. Is this software, by any chance, Java based? Older versions of the Postgres JDBC module had a nasty habit of doing this, as JDBC autogenerates the BEGIN if it's not in autocommit mode. The older version would generate the BEGIN immediately after a COMMIT or ROLLBACK to prepare for the next transaction, and if the app simply stopped using the connection, it was left IDLE IN TRANSACTION. The updated version postpones the BEGIN until you issue your first query. If you enable statement logging and set up a log prefix to show the Process ID (and I usually prefix with a timestamp, database name and other useful stuff), then you can grep the logs for the PID of the IDLE IN TRANSACTION process. Note logging all statements is pretty CPU and disk intensive, so it will likely impact your system performance and should only be done for debug purposes.
Re: [GENERAL] Inserted data is disappearing
In response to "James B. Byrne" : > > This is a portion of the log for the most recent run that exhibits > the problem: > > ... > 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) > hll_theheart_db_admin : LOG: 0: duration: 0.446 ms statement: > INSERT INTO "currencies" ("is_invoicable", "is_payable", > "changed_by", "created_by", "premium_factor", "discount_factor", > "effective_from", "currency_name", "superseded_after", "changed_at", > "currency_code", "created_at") VALUES('f', 'f', E'not available', > E'not available', 0.9, 1.1, '1785-07-06 04:56:02.00', E'United > States Dollar', NULL, '2009-07-10 19:59:17', E'USD', '2009-07-10 > 19:59:17.634473') > > That seems a valid insert. > > 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) > hll_theheart_db_admin : LOCATION: exec_simple_query, > postgres.c:1105 > 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) > hll_theheart_db_admin : LOG: 0: duration: 0.172 ms statement: > SELECT currval('currencies_id_seq') > 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) > hll_theheart_db_admin : LOCATION: exec_simple_query, > postgres.c:1105 > 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) > hll_theheart_db_admin : LOG: 0: duration: 0.067 ms statement: > RELEASE SAVEPOINT active_record_1 > > This seems ok but the absence of proof does not ensure the absence > of error. If the insert failed would I see this fact reflected in a > log entry? Unless you have some really bizarre config in your postgresql.conf, then a failure of that insert would result in a logged error message. Why not just intentionally try an invalid insert statement to be sure that it logs. 
> 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) > hll_theheart_db_admin : LOCATION: exec_simple_query, > postgres.c:1105 > 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) > hll_theheart_db_admin : LOG: 0: duration: 0.782 ms statement: > SELECT * FROM "currencies" > > The client program that receives this result reports that there are > no rows returned. So where did they go? What happens between the INSERT and the SELECT? Are there DELETE, TRUNCATE, or ROLLBACK statements? Also, look for a BEGIN statement that is never COMMITed. If the client starts a transaction, INSERTs a bunch of stuff, then disconnects without issuing a COMMIT, Postgres will rollback the transaction, thus it will be as if the data was never inserted. HTH. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/
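Both suggestions above are easy to try from psql. A sketch (column lists abbreviated from the log in this thread, so treat them as illustrative):

```sql
-- 1. Confirm that errors reach the log: this should produce an ERROR line there.
INSERT INTO currencies (no_such_column) VALUES (1);

-- 2. Reproduce the never-committed-transaction effect:
BEGIN;
INSERT INTO currencies (currency_code, currency_name)
VALUES (E'XXX', E'Test Currency');
SELECT count(*) FROM currencies;  -- the row is visible here, inside the transaction
-- Now disconnect (\q) without COMMIT: the insert is rolled back,
-- and a fresh session will not see the row.
```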
[GENERAL] Inserted data is disappearing
I am sorry for this but I do not know how else to communicate what is apparently happening: This is a portion of the log for the most recent run that exhibits the problem: ... 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) hll_theheart_db_admin : LOG: 0: duration: 0.446 ms statement: INSERT INTO "currencies" ("is_invoicable", "is_payable", "changed_by", "created_by", "premium_factor", "discount_factor", "effective_from", "currency_name", "superseded_after", "changed_at", "currency_code", "created_at") VALUES('f', 'f', E'not available', E'not available', 0.9, 1.1, '1785-07-06 04:56:02.00', E'United States Dollar', NULL, '2009-07-10 19:59:17', E'USD', '2009-07-10 19:59:17.634473') That seems a valid insert. 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) hll_theheart_db_admin : LOG: 0: duration: 0.172 ms statement: SELECT currval('currencies_id_seq') 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) hll_theheart_db_admin : LOG: 0: duration: 0.067 ms statement: RELEASE SAVEPOINT active_record_1 This seems ok but the absence of proof does not ensure the absence of error. If the insert failed would I see this fact reflected in a log entry? 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:59:17 EDT hll_theheart_test 216.185.71.24(49133) hll_theheart_db_admin : LOG: 0: duration: 0.782 ms statement: SELECT * FROM "currencies" The client program that receives this result reports that there are no rows returned. So where did they go? -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:byrn...@harte-lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
[GENERAL] Using Postgres to store genetic data
We're looking at potentially using Postgres to store a variety of molecular and genetic data. At this point I have a bunch of general questions which I can take to other lists if someone can tell me where they would be most appropriate: 1) are there groups or individuals already doing this that have things they can share? I've seen the BLASTgres work and it needs rework to compile against 8.3, but it might be potentially useful for some of our forms of data; 2) if we do decide to implement our own data types, where's the best place for discussion about such efforts? 3) Can anyone talk about what it would take to add columnar indexes to Postgres? Where would be the best place for discussion about what this would take? I can go into more details here if appropriate and if that will help... -- Peter Hunsberger
Re: [GENERAL] How to trace client sql requests?
On Fri, Jul 10, 2009 at 03:45:35PM -0400, James B. Byrne wrote: > I believe that this is what I want to examine. Is there a server > side technique that I can use which will tell me what data this > statement returned or if it found nothing? not really, sorry. > In any case, I see the INSERTS and I can find NO DELETES at all. Is > there any other way to remove some or all data from a table? truncate. but first simple question - did you commit the inserts? Best regards, depesz -- Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/ jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
[GENERAL] Idle in transaction help
Hi everyone, I need some help with tracking down idle in transaction problems. We have a custom application that is leaving queries in idle in transaction status for unknown reasons. The developers are working on ways to track it down, but right now the options on their end are limited and it will be at least 1-2 months until they have something working. I am trying to track down the problem from the PG end in the meantime. Is there any way to tell what query is hanging in idle in transaction status? Or what the current or previous query was/is, since idle in transaction doesn't tell me anything? I'm kind of at a loss on what if anything I can do from the database end to help (read push) the programmers to find and fix this problem. My first priority is helping them find and fix the problem if I can. My second priority is finding an automated way to deal with the idle in transactions as they are locking tables and rows, causing other transactions to hang also. None of the timeouts appear to affect idle in transactions. I'm on PG 8.3.7. Upgrading to 8.4 won't be an option for several months, enough time for the developers to test and develop against 8.4. Thanks for any help you can give me. Scot Kreienkamp
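One partial answer from the stock monitoring views: on 8.3, pg_stat_activity will not show the last query of an idle transaction, but it will show which backends are stuck and for how long. A sketch (column names as in the 8.3-era view):

```sql
-- Sessions sitting in an open transaction with no running query, oldest first
SELECT procpid, usename, client_addr,
       now() - xact_start AS txn_age
  FROM pg_stat_activity
 WHERE current_query = '<IDLE> in transaction'
 ORDER BY xact_start;
```

The procpid value can then be matched against the server log (with PID logging enabled) to see what the session did before it went idle.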
Re: [GENERAL] How to trace client sql requests?
On Fri, July 10, 2009 14:58, hubert depesz lubaczewski wrote: > You can enable by database: > > alter database x set log_min_duration_statement = 0; Many, many thanks. Now of course I need more help... The situation is that data inserted into the DB is not being found on a subsequent select and I am unaware of any deletes being done. So, I am hoping to find where the data is going or why the select is not working. This is the critical insert: ... 2009-07-10 15:13:00 EDT hll_theheart_test 216.185.71.24(57637) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:13:00 EDT hll_theheart_test 216.185.71.24(57637) hll_theheart_db_admin : LOG: 0: duration: 1.366 ms statement: INSERT INTO "currencies" ("is_invoicable", "is_payable", "changed_by", "created_by", "premium_factor", "discount_factor", "effective_from", "currency_name", "superseded_after", "changed_at", "currency_code", "created_at") VALUES('f', 'f', E'not available', E'not available', 1.0, 1.0, '1858-01-01 04:56:02.00', E'Canadian Dollar', NULL, '2009-07-10 19:13:00', E'CAD', '2009-07-10 19:13:00.151885') 2009-07-10 15:13:00 EDT hll_theheart_test 216.185.71.24(57637) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:13:00 EDT hll_theheart_test 216.185.71.24(57637) hll_theheart_db_admin : LOG: 0: duration: 0.379 ms statement: SELECT currval('currencies_id_seq') 2009-07-10 15:13:00 EDT hll_theheart_test 216.185.71.24(57637) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:13:00 EDT hll_theheart_test 216.185.71.24(57637) hll_theheart_db_admin : LOG: 0: duration: 0.073 ms statement: RELEASE SAVEPOINT active_record_1 ... This seems to have worked. Would the log show if it did not? 
Then I see a bunch of these: 2009-07-10 15:13:02 EDT hll_theheart_test 216.185.71.24(57638) hll_theheart_db_admin : LOG: 0: duration: 0.082 ms statement: SET client_min_messages TO 'notice' 2009-07-10 15:13:02 EDT hll_theheart_test 216.185.71.24(57638) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:13:02 EDT hll_theheart_test 216.185.71.24(57638) hll_theheart_db_admin : LOG: 0: duration: 6.155 ms statement: SELECT a.attname, format_type(a.atttypid, a.atttypmod), d.adsrc, a.attnotnull 2009-07-10 15:13:02 EDT hll_theheart_test 216.185.71.24(57638) hll_theheart_db_admin : LOCATION: exec_simple_query, postgres.c:1105 2009-07-10 15:13:02 EDT hll_theheart_test 216.185.71.24(57638) hll_theheart_db_admin : LOG: 0: duration: 1.285 ms statement: SELECT a.attname, format_type(a.atttypid, a.atttypmod), d.adsrc, a.attnotnull and finally, I get a long list of these: 2009-07-10 15:13:02 EDT hll_theheart_test 216.185.71.24(57638) hll_theheart_db_admin : LOG: 0: duration: 1.779 ms statement: SELECT * FROM "currencies" WHERE ("currencies"."currency_code" = E'CAD') LIMIT 1 ... I believe that this is what I want to examine. Is there a server side technique that I can use which will tell me what data this statement returned or if it found nothing? In any case, I see the INSERTS and I can find NO DELETES at all. Is there any other way to remove some or all data from a table? -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:byrn...@harte-lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
Re: [GENERAL] Checkpoint Tuning Question
> Hm, I'm not sure I believe any of that except the last bit, seeing that > he's got plenty of excess CPU capability. But the last bit fits with > the wimpy-I/O problem, and it also offers something we could test. > Dan, please see what happens when you vary the wal_buffers setting. > (Note you need a postmaster restart to change that.) > > regards, tom lane > Ok, I tried a few different values - 32kb, 64kb, 512kb, 2MB and 10MB. I'm not seeing any highly noticeable change in behaviour with any setting - it wasn't a scientific test, but I seem to have about the same size hiccup with each setting. The hiccup may be slightly shorter with the 10MB setting, but barely, if it is. Thanks, Dan
Re: [GENERAL] How to trace client sql requests?
On Fri, Jul 10, 2009 at 01:38:57PM -0400, James B. Byrne wrote: > I have a situation with a Rails project where test data is > mysteriously "disappearing" in the middle of a test run. I would > like to see the exact SQL of all client requests issued against a > single table during a fixed time span. > How can I best accomplish this in PostgreSQL? > #client_min_messages = notice > #log_min_messages = notice > #log_min_duration_statement = -1 > ... > #log_duration = off > > Which of these, if any, should I alter; and to what? Am I I prefer to set log_min_duration_statement to 0. It will log all queries and their running time. > constrained to system wide logging or can this be enabled by > database? You can enable by database: alter database x set log_min_duration_statement = 0; Best regards, depesz -- Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/ jid/gtalk: dep...@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
Re: [GENERAL] Overhead of union versus union all
On Fri, 2009-07-10 at 18:47 +0100, Greg Stark wrote: > > -- foo has a primary key > Well no, it's equivalent to SELECT DISTINCT * FROM foo; I think you missed that "foo" has a primary key. Regards, Jeff Davis
Re: [GENERAL] Overhead of union versus union all
On Fri, Jul 10, 2009 at 11:47 AM, Greg Stark wrote: > On Fri, Jul 10, 2009 at 6:37 PM, Jeff Davis wrote: >> >> -- foo has a primary key >> SELECT * FROM foo UNION SELECT * FROM foo; >> >> That's logically equivalent to: >> >> SELECT * FROM foo; >> >> But postgresql will add a sort anyway. > > > Well no, it's equivalent to SELECT DISTINCT * FROM foo; And honestly, I'd rather see development effort go into making complex queries run faster (the things like bitmap indexes on disk etc) rather than optimising things that I can optimise myself.
Re: [GENERAL] XML import with DTD
Post a snippet of the xml and xpath you are trying to use. Scott - Original Message - From: "Roy Walter" To: pgsql-general@postgresql.org Sent: Friday, July 10, 2009 7:49:00 AM GMT -08:00 US/Canada Pacific Subject: [GENERAL] XML import with DTD Hi I'm trying to use the XPath functionality of Postgres. I can populate a text field (unparsed) with XML data but as far as I can see the xpath() function [now] only works on the xml data type. When I try to populate an xml field with XML data containing a DTD, however, the parser chokes. If I strip the DTD, the parser chokes on undefined entities which are defined in the DTD. (I switched the app from MySQL to Postgres because while MySQL works, it returns matches in undelimited form, which is next to useless if, for example, you return multiple attributes from a node.) Does anyone know of a solution to this problem? Windows 2000 Server Postgres 8.4 Regards Roy Walter
Re: [GENERAL] Overhead of union versus union all
On Fri, Jul 10, 2009 at 6:37 PM, Jeff Davis wrote: > > -- foo has a primary key > SELECT * FROM foo UNION SELECT * FROM foo; > > That's logically equivalent to: > > SELECT * FROM foo; > > But postgresql will add a sort anyway. Well no, it's equivalent to SELECT DISTINCT * FROM foo; -- greg http://mit.edu/~gsstark/resume.pdf
[GENERAL] How to trace client sql requests?
I have a situation with a Rails project where test data is mysteriously "disappearing" in the middle of a test run. I would like to see the exact SQL of all client requests issued against a single table during a fixed time span. How can I best accomplish this in PostgreSQL? #client_min_messages = notice #log_min_messages = notice #log_min_duration_statement = -1 ... #log_duration = off Which of these, if any, should I alter; and to what? Am I constrained to system wide logging or can this be enabled by database? Regards, -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:byrn...@harte-lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
Re: [GENERAL] REINDEX "is not a btree"
On Jul 10, 2009, at 6:47 AM, Vanessa Lopez wrote: I discovered the table that was causing the error, delete it and create it again (I miss some data but at least everything else is working now) Yes, for the backup we copy everything we had under /data (the directory containing "base", "global", and so on ... we do backups every day from the server), and then we restore the whole /data directory at once ... but it did not solve the problem .. Given the problems you've had, I strongly suggest you take a pg_dump of the database, restore that dump, and use the restored copy. I bet there's probably other problems lurking in your database. -- Decibel!, aka Jim C. Nasby, Database Architect deci...@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Re: [GENERAL] Overhead of union versus union all
On Fri, 2009-07-10 at 14:22 +0100, Simon Riggs wrote: > I mean it seems possible to prove that the distinct removal step is not > necessary, by proving that the various sub-queries are already disjoint. > It's a common manual optimization, so automating it seems a reasonable > future goal. There are even simpler cases that postgresql can't optimize. Consider: -- foo has a primary key SELECT * FROM foo UNION SELECT * FROM foo; That's logically equivalent to: SELECT * FROM foo; But postgresql will add a sort anyway. There are lots of optimizations along these lines. They seem obscure, but these optimizations become much more useful when using views or complex queries where the same table appears multiple times. For instance, if you have two views that are projections of the same table and you join the views together, you can optimize away the join in some cases and just scan the original table. I think a lot of these optimizations depend on knowing which tables (or subqueries) are relations in the relational theory sense; i.e. unordered, distinct, and have no NULLs in the relevant attributes. Regards, Jeff Davis
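Jeff's example is easy to reproduce; a sketch (the exact plan shapes vary by version and settings, so the comments describe only what one would typically expect):

```sql
CREATE TEMP TABLE foo (id int PRIMARY KEY);

-- Logically this returns exactly the rows of foo (the pk rules out
-- duplicates within each arm, and UNION removes the cross-arm copies),
-- but the planner still emits a duplicate-removal step:
EXPLAIN SELECT * FROM foo UNION SELECT * FROM foo;

-- UNION ALL skips that step entirely: typically just an Append of two scans.
EXPLAIN SELECT * FROM foo UNION ALL SELECT * FROM foo;
```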
Re: [GENERAL] SELECT DISTINCT very slow
On Fri, 2009-07-10 at 01:36 +0100, Greg Stark wrote: > Arguably the missing feature here is skip-scans where we scan the > index but only pull out one record for each distinct value. I'm not > sure there's anything particularly stopping Postgres from being able > to do them, but it might be a lot of code for a narrow use case. Hypothetically, would something like a "sort distinct" operator be of any use? I wonder how much work it would save if the sort could save steps by weeding out duplicate tuples while sorting. That might make sort into a better plan in cases where we don't have a good estimate of the distinct values. Regards, Jeff Davis
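On 8.4, the skip scan Greg describes can be emulated by hand with a recursive CTE that issues one index probe per distinct value (the "loose indexscan" trick); a sketch with hypothetical table and column names, assuming an index on tbl(col):

```sql
-- One ORDER BY ... LIMIT 1 index probe per distinct value of col
WITH RECURSIVE skip AS (
    (SELECT col FROM tbl ORDER BY col LIMIT 1)
  UNION ALL
    SELECT (SELECT col FROM tbl WHERE col > skip.col ORDER BY col LIMIT 1)
      FROM skip
     WHERE skip.col IS NOT NULL
)
SELECT col FROM skip WHERE col IS NOT NULL;
```

For tables with many rows but few distinct values, this can beat a full-scan SELECT DISTINCT by orders of magnitude, since the work is proportional to the number of distinct values rather than the number of rows.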
Re: [GENERAL] UNION question
t == t...@sss.pgh.pa.us writes: t> Brandon Metcalf writes: t> > I tried moving the last group of WHERE, GROUP BY, and ORDER BY before t> > the UNION with the query it belongs to, but that results in a t> > different syntax error. t> I think that's probably what you want to do. What you're missing is t> you need parentheses to put an ORDER BY into an arm of a UNION: t> (SELECT ... ORDER BY ...) UNION SELECT ... t> Otherwise it wants to consider the ORDER BY as applying to the UNION t> output. Indeed. It was the fact that I was trying to move the ORDER BY along with WHERE and GROUP BY that was giving me grief. Thanks. -- Brandon
Re: [GENERAL] UNION question
M == matthew.hart...@krcc.on.ca writes: M> > > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- M> > > ow...@postgresql.org] On Behalf Of Brandon Metcalf M> > > Sent: Friday, July 10, 2009 12:16 PM M> > M> > Change it to this: M> Sorry, I forgot that you need to split the GROUP BY clause as well in a M> similar manner to the WHERE clause. And unless you have duplicate rows M> to eliminate, use UNION ALL rather than UNION for a speed increase. Thanks. Got it to work. -- Brandon
Re: [GENERAL] UNION question
Brandon Metcalf writes: > I tried moving the last group of WHERE, GROUP BY, and ORDER BY before > the UNION with the query it belongs to, but that results in a > different syntax error. I think that's probably what you want to do. What you're missing is you need parentheses to put an ORDER BY into an arm of a UNION: (SELECT ... ORDER BY ...) UNION SELECT ... Otherwise it wants to consider the ORDER BY as applying to the UNION output. regards, tom lane
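Concretely, the two shapes Tom contrasts (hypothetical table names):

```sql
-- ORDER BY inside one arm needs its own parentheses:
(SELECT name FROM t1 ORDER BY name)
UNION
SELECT name FROM t2;

-- Without them, a trailing ORDER BY applies to the whole UNION result:
SELECT name FROM t1
UNION
SELECT name FROM t2
ORDER BY name;
```

Note that ordering an individual arm is rarely useful on its own (UNION may reorder rows anyway); it matters mainly in combination with LIMIT inside the parenthesized arm.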
Re: [GENERAL] UNION question
> > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Brandon Metcalf
> > Sent: Friday, July 10, 2009 12:16 PM
>
> Change it to this:

Sorry, I forgot that you need to split the GROUP BY clause as well, in a similar manner to the WHERE clause. And unless you have duplicate rows to eliminate, use UNION ALL rather than UNION for a speed increase.

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
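Putting Matthew's two tips together — each arm carries its own WHERE and GROUP BY, and UNION ALL skips duplicate elimination — the general shape is (a hedged sketch with hypothetical table and column names, not the poster's actual query):

```sql
-- Hypothetical tables a(key, val) and b(key, val); the point is only
-- the placement of the clauses, not the schema.
SELECT key, sum(val) AS total
  FROM a
 WHERE key = 1          -- each arm gets its own WHERE ...
 GROUP BY key           -- ... and its own GROUP BY
UNION ALL               -- no duplicates to remove, so skip the uniquify step
SELECT key, sum(val) AS total
  FROM b
 WHERE key = 2
 GROUP BY key;
```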
Re: [GENERAL] UNION question
> -----Original Message-----
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Brandon Metcalf
> Sent: Friday, July 10, 2009 12:16 PM

Change it to this:

> SELECT t.name AS machine_type_name,
>        j.workorder,
>        round(sum(EXTRACT(epoch FROM (j.clockout - j.clockin))/3600/w.quantity_made)::numeric, 2) AS avgtime,
>        NULL AS employees
>   FROM jobclock j
>   JOIN employee e     ON e.employee_id = j.employee_id
>   JOIN machine m      ON m.machine_id = j.machine_id
>   JOIN machine_type t ON t.machine_type_id = m.machine_type_id
>   JOIN workorder w    ON w.workorder = j.workorder
>   JOIN part p         ON p.part_id = w.part_id
>  WHERE p.part_id = 379
>  UNION
> SELECT t.name AS machine_type_name,
>        NULL AS workorder,
>        h.time AS avgtime,
>        employees
>   FROM part_time_historical h
>   JOIN machine_type t ON t.machine_type_id = h.machine_type_id
>  WHERE h.part_id = 379 AND h.machine_type_id = 1
>  GROUP BY t.name, j.workorder
>  ORDER BY avgtime

Matthew Hartman
Programmer/Analyst
Information Management, ICP
Kingston General Hospital

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] UNION question
Is the following even possible? I keep getting a syntax error at the last WHERE:

  ERROR:  syntax error at or near "WHERE"
  LINE 20:  WHERE p.part_id=379 AND t.machine_type_id=1

The SQL is

  SELECT t.name AS machine_type_name,
         j.workorder,
         round(sum(EXTRACT(epoch FROM (j.clockout - j.clockin))/3600/w.quantity_made)::numeric, 2) AS avgtime,
         NULL AS employees
    FROM jobclock j
    JOIN employee e     ON e.employee_id = j.employee_id
    JOIN machine m      ON m.machine_id = j.machine_id
    JOIN machine_type t ON t.machine_type_id = m.machine_type_id
    JOIN workorder w    ON w.workorder = j.workorder
    JOIN part p         ON p.part_id = w.part_id
   UNION
  SELECT t.name AS machine_type_name,
         NULL AS workorder,
         h.time AS avgtime,
         employees
    FROM part_time_historical h
    JOIN machine_type t ON t.machine_type_id = h.machine_type_id
   WHERE h.part_id = 379 AND h.machine_type_id = 1
   WHERE p.part_id = 379 AND t.machine_type_id = 1
   GROUP BY t.name, j.workorder
   ORDER BY avgtime

I tried moving the last group of WHERE, GROUP BY, and ORDER BY before the UNION with the query it belongs to, but that results in a different syntax error. I'm basically looking to concatenate these two results.

-- Brandon

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Checkpoint Tuning Question
Simon Riggs writes: > I think its a traffic jam. > After checkpoint in XLogInsert(), we discover that we now have to backup > a block that we didn't think so previously. So we have to drop the lock > and then re-access WALInsertLock. So every backend has to go through the > queue twice the first time it tries to write WAL immediately after a > checkpoint. Also, suddenly, every block needs to be copied to WAL, so > the CRC checks make each lock holder take longer than normal, so the > whole queue begins to backup. Then, because of wal_buffers being small > we find that the increased volume of WAL being written causes > WALInsertLock to be held across I/O. Hm, I'm not sure I believe any of that except the last bit, seeing that he's got plenty of excess CPU capability. But the last bit fits with the wimpy-I/O problem, and it also offers something we could test. Dan, please see what happens when you vary the wal_buffers setting. (Note you need a postmaster restart to change that.) regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
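For anyone following along, wal_buffers is set in postgresql.conf; a sketch of the experiment Tom is suggesting (the value shown is illustrative only, not a recommendation):

```
# postgresql.conf -- illustrative value, not a recommendation;
# changing it requires a postmaster restart.
wal_buffers = 1MB        # default at the time was 64kB
```

After the restart, `SHOW wal_buffers;` in psql confirms the active value.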
Re: [GENERAL] Checkpoint Tuning Question
On Fri, 2009-07-10 at 10:27 -0400, Tom Lane wrote: > Simon Riggs writes: > > ISTM more likely to be a problem with checkpointing clog or subtrans. > > That would block everybody and the scale of the problem is about right. > > That's what I had been thinking too, but the log_checkpoint output > conclusively disproves it: those steps are taking less than 20msec. OK, I was looking at total -write, not total - write - sync. I think its a traffic jam. After checkpoint in XLogInsert(), we discover that we now have to backup a block that we didn't think so previously. So we have to drop the lock and then re-access WALInsertLock. So every backend has to go through the queue twice the first time it tries to write WAL immediately after a checkpoint. Also, suddenly, every block needs to be copied to WAL, so the CRC checks make each lock holder take longer than normal, so the whole queue begins to backup. Then, because of wal_buffers being small we find that the increased volume of WAL being written causes WALInsertLock to be held across I/O. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] REINDEX "is not a btree"
On Friday 10 July 2009, Vanessa Lopez wrote: > What do you mean by we can't simply take a filesystem copy of a > running database? :-O ... How should we then do the backups (so next > time I will not have the same problem again) ? There is extensive documentation on how to do backups. For filesystem backups, see PITR. You might also want to examine all your backup strategies - most running applications are not happy about being backed up without taking special steps to ensure data consistency. -- Anyone who believes exponential growth can go on forever in a finite world, is either a madman or an economist. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] REINDEX "is not a btree"
Vanessa Lopez writes: > What do you mean by we can't simply take a filesystem copy of a > running database? :-O ... How should we then do the backups (so next > time I will not have the same problem again) ? Read the fine manual ... http://www.postgresql.org/docs/8.3/static/backup.html Section 24.2 explains the pitfalls of trying to use a filesystem-level backup. It is possible to do, but you have to be very very careful to get a consistent snapshot. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
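The "very very careful" procedure the manual describes is the base-backup protocol; roughly the following — a sketch only, and the backup chapter remains the authoritative reference:

```sql
-- In psql, before copying the data directory (server keeps running):
SELECT pg_start_backup('nightly-backup');

-- ...now copy the whole data directory with tar/rsync at the OS level...

-- When the copy finishes:
SELECT pg_stop_backup();

-- The copy is only restorable together with the WAL segments archived
-- between the two calls (archive_command must be configured).
```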
Re: [GENERAL] SPI_ERROR_CONNECT within pl/pgsql, PG 8.4
Marek Lewczuk writes: > I have made an upgrade to PG 8.4 and following error was thrown during > execution of some pl/pgsql function: > ERROR: XX000: SPI_connect failed: SPI_ERROR_CONNECT Really? Could we see a self-contained example? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] XML import with DTD
Hi,

I'm trying to use the XPath functionality of Postgres. I can populate a text field with unparsed XML data, but as far as I can see the xpath() function now only works on the xml data type. When I try to populate an xml field with XML data containing a DTD, however, the parser chokes. If I strip the DTD, the parser chokes on undefined entities which are defined in the DTD.

(I switched the app from MySQL to Postgres because, while MySQL works, it returns matches in undelimited form, which is next to useless if, for example, you return multiple attributes from a node.)

Does anyone know of a solution to this problem?

Windows 2000 Server
Postgres 8.4

Regards
Roy Walter
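One client-side workaround (my suggestion, not something from the thread) is to drop the DOCTYPE and expand the entities it defines before the document ever reaches the xml column. A rough Python sketch, handling only simple internal-subset ENTITY declarations:

```python
import re

def strip_dtd(xml_text):
    """Remove the DOCTYPE declaration, expanding any entities it defines.

    A rough client-side workaround for loading DTD-bearing documents into
    an xml column: collect <!ENTITY name "value"> definitions, delete the
    <!DOCTYPE ...> block, then substitute each &name; reference.
    """
    entities = dict(re.findall(r'<!ENTITY\s+(\w+)\s+"([^"]*)"', xml_text))
    # Drop the whole <!DOCTYPE ...> block (with or without internal subset).
    xml_text = re.sub(r'<!DOCTYPE[^\[>]*(\[[^\]]*\])?\s*>', '', xml_text, count=1)
    # Replace each &name; with its definition.
    for name, value in entities.items():
        xml_text = xml_text.replace('&%s;' % name, value)
    return xml_text
```

The cleaned string can then be inserted into an xml column (or passed through xmlparse) without the parser seeing the DTD. External DTD files and parameter entities would need a real XML toolkit.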
[GENERAL] SPI_ERROR_CONNECT within pl/pgsql, PG 8.4
Hello, I have made an upgrade to PG 8.4 and following error was thrown during execution of some pl/pgsql function: ERROR: XX000: SPI_connect failed: SPI_ERROR_CONNECT CONTEXT: PL/pgSQL function "price_aftertrigger" line 30 at IF SQL statement "update price set validFrom = $1 , validTo = $2 where id = $3 [ $4 ][1]::Integer" PL/pgSQL function "price_rebuildpricelistvalidity" line 54 at SQL statement SQL statement "SELECT price_rebuildPriceListValidity( $1 )" PL/pgSQL function "price_aftertrigger" line 54 at PERFORM LOCATION: plpgsql_call_handler, pl_handler.c:77 As I said it was thrown within PG8.4, but when it was executed within PG8.3.5 there were no problems at all. Second execution of same statement didn't throw an error too. Can anyone explain what it means. Best regards, ML -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] REINDEX "is not a btree"
Hello,

Thanks for all your answers! I found the table that was causing the error, deleted it and created it again (I lost some data, but at least everything else is working now).

Yes, for the backup we copy everything we had under /data (the directory containing "base", "global", and so on; we do backups every day from the server), and then we restore the whole /data directory at once ... but it did not solve the problem.

What do you mean by "we can't simply take a filesystem copy of a running database"? :-O ... How should we then do the backups (so next time I will not have the same problem again)?

Millions of thanks again!
Vanessa

On 10 Jul 2009, at 04:06, decibel wrote:

> On Jul 4, 2009, at 8:06 AM, Craig Ringer wrote:
>> On Fri, 2009-07-03 at 15:00 +0100, Vanessa Lopez wrote:
>>> I don't know much about Postgres, I have no clue what else I can do. Please, please, any help is very much appreciated. I have lots of databases and months of work in Postgres (also lots of backups for the data in /data).
>>
>> When you say "in /data", do you mean the directory that contains the directories "pg_xlog", "base", "global", "pg_clog", etc.? Did you back up and restore the WHOLE data directory at once? Or did you restore only parts of it? And how exactly did you make the backups? You can't simply take a filesystem copy of a running database; that won't work.
>
> -- Decibel!, aka Jim C. Nasby, Database Architect deci...@decibel.org
> Give your computer some brain candy! www.distributed.net Team #1828

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Database storage
On Fri, Jul 10, 2009 at 8:43 AM, John R Pierce wrote:
> nabble.30.miller_2...@spamgourmet.com wrote:
>> The database server is a quad core machine, so it sounds as though software RAID should work fine for the present setup. However, it sounds as though I should put some money into a hardware RAID controller if the database becomes more active. I had assumed RAID-5 would be fine, but please let me know if there is another RAID level more appropriate for this implementation. Thanks for the valuable insight!
>
> raid-5 performs very poorly on random small block writes, which is the majority of what databases do. raid10 is the preferred raid for databases.
>
> btw: re earlier discussion of raid controllers vs software... I'm surprised no one mentioned that a 'real' raid controller with battery-backed writeback cache can hugely speed up committed 8kbyte block random writes, which are quite often the big bottleneck in a transactional database.

Given that the OP's usage pattern was bulk imports and reporting queries, it didn't seem very important.

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Database storage
In response to nabble.30.miller_2...@spamgourmet.com: > > On Thu, Jul 9, 2009 at 7:29 PM, Greg Stark wrote: > >> On Fri, Jul 10, 2009 at 1:28 AM, Scott Marlowe > >> wrote: > >>> > >>> $750 is about what a decent RAID controller would cost you, but again > >>> it's likely that given your bulk import scenario, you're probably ok > >>> without one. In this instance, you're probably best off with software > >>> RAID than a cheap RAID card which will cost extra and probably be > >>> slower than linux software RAID. > ... > >> The main advantage of hardware raid is the error handling. When you > >> get low level errors or pull a drive a lot of consumer level > >> controllers and their drivers don't respond very well and have long > >> timeouts or keep retrying tragically unaware that the software raid > >> would be able to handle recoverying. A good server-class RAID > >> controller should handle those situations without breaking a sweat. > > > Definitely a big plus of a quality HW controller, and one of the > > reasons I don't scrimp on the HW controllers I put in our 24/7 > > servers. OTOH, if you can afford a bit of downtime to handle > > failures, linux software RAID works pretty well, and since quad core > > CPUs are now pretty much the standard, it's ok if parity calculation > > uses up a bit of one core for lower performing servers like the > > reporting server the OP was talking about. > > The database server is a quad core machine, so it sounds as though > software RAID should work fine for the present setup. However, it > sounds as though I should put some money into a hardware RAID > controller if the database becomes more active. I had assumed RAID-5 > would be fine, but please let me know if there is another RAID level > more appropriate for this implementation. Thanks for the valuable > insight! RAID 10 is pretty much the fastest RAID level for disk IO. 
Every pair of disks you add to a RAID-10 array makes the array faster (assuming you don't hit any controller bottlenecks) Another advantage of hardware RAID controllers (that I'm surprised nobody has mentioned) is battery-backed cache. Using said cache, you can configure the controller to lie about fsyncs, which make them essentially free from PostgreSQL's standpoint. Since the cache is backed by a battery, your concerns about data loss in the event of power failure are much less. The cache doesn't usually increase the overall throughput of the system, but it usually improves peak load performance by deferring writes until things are calmer. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Database storage
nabble.30.miller_2...@spamgourmet.com wrote:
> The database server is a quad core machine, so it sounds as though software RAID should work fine for the present setup. However, it sounds as though I should put some money into a hardware RAID controller if the database becomes more active. I had assumed RAID-5 would be fine, but please let me know if there is another RAID level more appropriate for this implementation. Thanks for the valuable insight!

raid-5 performs very poorly on random small block writes, which is the majority of what databases do. raid10 is the preferred raid for databases.

btw: re earlier discussion of raid controllers vs software... I'm surprised no one mentioned that a 'real' raid controller with battery-backed writeback cache can hugely speed up committed 8kbyte block random writes, which are quite often the big bottleneck in a transactional database.

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Database storage
> On Thu, Jul 9, 2009 at 7:29 PM, Greg Stark wrote: >> On Fri, Jul 10, 2009 at 1:28 AM, Scott Marlowe >> wrote: >>> >>> $750 is about what a decent RAID controller would cost you, but again >>> it's likely that given your bulk import scenario, you're probably ok >>> without one. In this instance, you're probably best off with software >>> RAID than a cheap RAID card which will cost extra and probably be >>> slower than linux software RAID. ... >> The main advantage of hardware raid is the error handling. When you >> get low level errors or pull a drive a lot of consumer level >> controllers and their drivers don't respond very well and have long >> timeouts or keep retrying tragically unaware that the software raid >> would be able to handle recoverying. A good server-class RAID >> controller should handle those situations without breaking a sweat. > Definitely a big plus of a quality HW controller, and one of the > reasons I don't scrimp on the HW controllers I put in our 24/7 > servers. OTOH, if you can afford a bit of downtime to handle > failures, linux software RAID works pretty well, and since quad core > CPUs are now pretty much the standard, it's ok if parity calculation > uses up a bit of one core for lower performing servers like the > reporting server the OP was talking about. The database server is a quad core machine, so it sounds as though software RAID should work fine for the present setup. However, it sounds as though I should put some money into a hardware RAID controller if the database becomes more active. I had assumed RAID-5 would be fine, but please let me know if there is another RAID level more appropriate for this implementation. Thanks for the valuable insight! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Checkpoint Tuning Question
Simon Riggs writes: > ISTM more likely to be a problem with checkpointing clog or subtrans. > That would block everybody and the scale of the problem is about right. That's what I had been thinking too, but the log_checkpoint output conclusively disproves it: those steps are taking less than 20msec. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] psql language
Hi Stuart-san. Is this the behavior you are looking for? For example:

  C:\Program Files\PostgreSQL\8.4\bin>psql -p 5433 postgres postgres
  psql (8.4.0)
  "help" でヘルプを表示します.

  C:\Program Files\PostgreSQL\8.4\bin>set LANG=C
  C:\Program Files\PostgreSQL\8.4\bin>psql -p 5433 postgres postgres
  psql (8.4.0)
  Type "help" for help.

Have I missed something?

Regards,
Hiroshi Saito

- Original Message - From: "Stuart McGraw" To: Sent: Friday, July 10, 2009 11:00 AM Subject: [GENERAL] psql language

Hello, I just installed pg-8.4 on Windows XP but ran into some unexpected problems. I am working on some tools to aid English-speaking learners of Japanese. This of course requires me to regularly display and enter Japanese text on my machine, so I have the Regional setting "Language for non-unicode programs" set to Japanese although the locale language is English. This allows me to work with both English and Japanese text in the windows console (cmd.exe) just fine. Psql also worked fine until 8.4. With 8.4 though, psql presents messages in Japanese. Since I am still a beginner at Japanese myself, this is a problem. FWIW, I tried doing "SET LANGUAGE en_US" and the same with every other locale-related variable ("LC_ALL", "LANG", "LC_MESSAGES", etc.) I could think of before running psql, but with no effect. How can I tell psql (and any other command line tools) to use English messages?

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Overhead of union versus union all
On Fri, 2009-07-10 at 09:46 -0400, Bruce Momjian wrote: > Simon Riggs wrote: > > or a query like this > > > > Select '1', ... > > ... > > union > > Select status, ... > > ... > > where status != '1'; > > ; > > > > then it is clear that we could automatically prove that the the distinct > > step is redundant and so we could either hash or sort. This is the same > > as replacing the UNION with UNION ALL. > > In the last example, how do you know that status != '1' produces unique > output? You don't. I was assuming that you could already prove that each subquery was distinct in itself. It's one for the TODO, that's all. I see it often, but I'm not planning to work on the code for this myself. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Overhead of union versus union all
Simon Riggs wrote:
> or a query like this
>
>   Select '1', ...
>   ...
>   union
>   Select status, ...
>   ...
>   where status != '1';
>
> then it is clear that we could automatically prove that the distinct
> step is redundant and so we could either hash or sort. This is the same
> as replacing the UNION with UNION ALL.

In the last example, how do you know that status != '1' produces unique output? I assumed UNION gave distinct output for the entire result, not just removing duplicates between the two UNION branches; that's how Postgres behaves now:

  test=> SELECT 1 UNION (SELECT 2 UNION ALL SELECT 2);
   ?column?
  ----------
          1
          2
  (2 rows)

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Overhead of union versus union all
On Fri, 2009-07-10 at 09:28 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> > > > I think it should be possible to use predtest theorem proving to discard
> > > > the sort/hash step in cases where we can prove the sets are disjoint.
> > > > Often there are top-level quals that can be compared in the WHERE
> > > > clauses of the sub-queries, so a shallow search could be quite
> > > > profitable in allowing us to rewrite a UNION into a UNION ALL.
> > >
> > > I assume we would still need the distinct removal step; we just avoid
> > > the sort/hash.
> >
> > I mean it seems possible to prove that the distinct removal step is not
> > necessary, by proving that the various sub-queries are already disjoint.
> > It's a common manual optimization, so automating it seems a reasonable
> > future goal.
>
> I am confused what sub-queries produce _distinct_ output. I know there
> are some that produce _ordered_ output.

None, that was not my point. If you have a query like this

  Select ..., status, ...
  ...
  where status = '1'
  union
  Select ..., status, ...
  ...
  where status = '2';

or a query like this

  Select '1', ...
  ...
  union
  Select '2', ...
  ...
  ;

or a query like this

  Select '1', ...
  ...
  union
  Select status, ...
  ...
  where status != '1';

then it is clear that we could automatically prove that the distinct step is redundant and so we could either hash or sort. This is the same as replacing the UNION with UNION ALL.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
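A concrete before/after for the manual rewrite being discussed (hypothetical table; the rewrite also assumes each arm is duplicate-free on its own, e.g. because id is unique):

```sql
-- Before: UNION forces a sort or hash to de-duplicate across arms.
SELECT id, status FROM orders WHERE status = '1'
UNION
SELECT id, status FROM orders WHERE status = '2';

-- After: the predicates are disjoint, so no row can appear in both arms
-- and UNION ALL returns the same result without the uniquify step.
SELECT id, status FROM orders WHERE status = '1'
UNION ALL
SELECT id, status FROM orders WHERE status = '2';
```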
Re: [GENERAL] Overhead of union versus union all
Simon Riggs wrote:
> On Fri, 2009-07-10 at 08:59 -0400, Bruce Momjian wrote:
> > > I think it should be possible to use predtest theorem proving to discard
> > > the sort/hash step in cases where we can prove the sets are disjoint.
> > > Often there are top-level quals that can be compared in the WHERE
> > > clauses of the sub-queries, so a shallow search could be quite
> > > profitable in allowing us to rewrite a UNION into a UNION ALL.
> >
> > I assume we would still need the distinct removal step; we just avoid
> > the sort/hash.
>
> I mean it seems possible to prove that the distinct removal step is not
> necessary, by proving that the various sub-queries are already disjoint.
> It's a common manual optimization, so automating it seems a reasonable
> future goal.

I am confused what sub-queries produce _distinct_ output. I know there are some that produce _ordered_ output.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Overhead of union versus union all
On Fri, 2009-07-10 at 08:59 -0400, Bruce Momjian wrote:
> > I think it should be possible to use predtest theorem proving to discard
> > the sort/hash step in cases where we can prove the sets are disjoint.
> > Often there are top-level quals that can be compared in the WHERE
> > clauses of the sub-queries, so a shallow search could be quite
> > profitable in allowing us to rewrite a UNION into a UNION ALL.
>
> I assume we would still need the distinct removal step; we just avoid
> the sort/hash.

I mean it seems possible to prove that the distinct removal step is not necessary, by proving that the various sub-queries are already disjoint. It's a common manual optimization, so automating it seems a reasonable future goal.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] PostgreSQL and Poker
On 2009-07-08, Massa, Harald Armin wrote: > a quite interesting read. > > http://www.codingthewheel.com/archives/stranger-than-fiction-story-online-poker-tracker-postgresql > > > especially as an explanation of the growing number of questions from > Windows-Users of PostgreSQL > > And ... for a tag line: "PostgreSQL. Thousends bet on it." I liked this bit: . "Product Manager: So wait, let me get this straight. You want us to . force our users, who are some of the laziest people on the planet, to . install a full-fledged relational database management system??? On . their home computer??? Like what, they're going to become DBAs? And . you're calling that a feature? Well, why stop there? Why not just ship . them our source code directly and force them to compile it on the . COMMAND LINE? Every user is a programmer, right? Well? ARE YOU OUT OF . YOUR F--KING MIND??" We do that! We were using the non-interactive MSI installer, those who know what's going on seem pleased to see a real database, those who don't, trust us :) it started life as a linux-only application but then someone ported PG and GTK to windows, and NSIS to linux. now we bulld windows installer CDs on linux. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Best practices to WorkFlow design?
Hi, I will be developing a WorkFlow Application, but I don't know the best practices on how to design a WorkFlow on a Database. Can you give me some clues? Books, links on the Internet, etc... Best Regards, André.
Re: [GENERAL] Overhead of union versus union all
Simon Riggs wrote: > > On Thu, 2009-07-09 at 20:41 -0600, Scott Marlowe wrote: > > On Thu, Jul 9, 2009 at 7:58 PM, Bruce Momjian wrote: > > > Scott Bailey wrote: > > >> Alvaro Herrera wrote: > > >> > Tim Keitt wrote: > > >> >> I am combining query results that I know are disjoint. I'm wondering > > >> >> how much overhead there is in calling union versus union all. (Just > > >> >> curious really; I can't see a reason not to use union all.) > > >> > > > >> > UNION needs to uniquify the output, for which it plasters an additional > > >> > sort step, whereas UNION ALL does not need to uniquify its output and > > >> > thus it can avoid the sort step. Using UNION ALL is recommended > > >> > wherever possible. > > >> > > > >> I think I read somewhere that as of 8.4 it no longer required the sort > > >> step, due to the improvements in hashing. Here it is > > >> > > >> http://wiki.postgresql.org/wiki/WhatsNew84#Performance > > > > > > Oh, yea, hashing is used in some cases rather than sort. I assume sort > > > is still used if the hash exceeds workmem size. > > > > The important point being that it's still more expensive than a plain > > union all thought, right? > > I think it should be possible to use predtest theorem proving to discard > the sort/hash step in cases where we can prove the sets are disjoint. > Often there are top-level quals that can be compared in the WHERE > clauses of the sub-queries, so a shallow search could be quite > profitable in allowing us to rewrite a UNION into a UNION ALL. I assume we would still need the distinct removal step; we just avoid the sort/hash. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SELECT DISTINCT very slow
On Fri, Jul 10, 2009 at 1:41 PM, Ben Harper wrote: > > Unfortunately I can't use GROUP BY, because what I'm really doing is > SELECT DISTINCT ON(unique_field) id FROM table; You could do that using GROUP BY if you define a first() aggregate. In this case that would just be SELECT first(id) AS id from (select * from table ORDER BY unique_field, ...) GROUP BY unique_field. In cases with more fields it gets tiresome fast. In this case 8.4 won't actually help you. It only uses hash aggregates for DISTINCT not DISTINCT ON. > I'm not familiar with the Postgres internals, but in my own DB system > that I have written, I do the skip-scanning thing, and for my system > it was a really trivial optimization to code. Well things get tricky quickly when you have to deal with concurrent inserts and potentially page splits from other transactions. Also consider how hard it is to prove that the query falls into this category of queries. > Inside a GIS application, the user wants to categorize the display of > some information based on, in this case, the suburb name. > He clicks a button that says "Add All Unique Categories". This is a > very common operation in this domain. That doesn't look like what this query is doing to me. It's taking one exemplar from each suburb based on some other constraint (the minimum of whatever your order by key is) and taking the id of that data point. If the order by key doesn't specify a specific data point then it's a non-deterministic record. -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
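The first() aggregate Greg refers to is not built in; the commonly circulated hand-rolled definition looks like this (shown as a sketch — any ordering must come from the subquery, as in his example):

```sql
-- State function just keeps the first value it was given; with a STRICT
-- sfunc and no initcond, the first non-null input becomes the state.
CREATE OR REPLACE FUNCTION first_agg(anyelement, anyelement)
RETURNS anyelement
LANGUAGE sql IMMUTABLE STRICT AS
$$ SELECT $1 $$;

CREATE AGGREGATE first (anyelement) (
    sfunc = first_agg,
    stype = anyelement
);

-- Usage per Greg's suggestion:
-- SELECT first(id) AS id
--   FROM (SELECT * FROM table ORDER BY unique_field, ...) AS s
--  GROUP BY unique_field;
```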
Re: [GENERAL] SELECT DISTINCT very slow
Thanks for all the feedback. Using GROUP BY is indeed much faster (about 1 second). Unfortunately I can't use GROUP BY, because what I'm really doing is SELECT DISTINCT ON(unique_field) id FROM table; I'm not familiar with the Postgres internals, but in my own DB system that I have written, I do the skip-scanning thing, and for my system it was a really trivial optimization to code. I know, I'm always free to submit a patch, and hopefully someday I will, if it hasn't already been done by then. I can't comment on whether this skip-scan optimization is general enough to warrant the lines of code, but I might as well explain my use case: Inside a GIS application, the user wants to categorize the display of some information based on, in this case, the suburb name. He clicks a button that says "Add All Unique Categories". This is a very common operation in this domain. Again, thanks for all the feedback. I'll upgrade to 8.4 soon. Ben Harper On Fri, Jul 10, 2009 at 2:50 AM, Tom Lane wrote: > Greg Stark writes: >> Not really. The OP doesn't say how wide the record rows are but unless >> they're very wide it wouldn't pay to use an index for this even if you >> didn't have to access the heap also. It's going to be faster to scan >> the whole heap and either sort or use a hash. Currently there aren't >> many cases where a btree with 6,000 copies of 111 distinct keys is >> going to be useful. > > It was 600,000 not 6,000 ... so a skip-scan might be worth the trouble, > but as you say we haven't done it. > > In any case I think the real issue is that the OP is probably using a > pre-8.4 release which will always do SELECT DISTINCT via sort-and-unique. > Hash aggregation would be a whole lot faster for these numbers, even > if not exactly instantaneous. He could update to 8.4, or go over to > using GROUP BY as was recommended upthread. 
>
> regards, tom lane
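The non-determinism concern raised earlier in the thread (the ORDER BY key must pin down a specific row, or the returned id is arbitrary) can be addressed directly in the DISTINCT ON form. A hedged sketch, with the table and column names assumed for illustration:

```sql
-- Pick one deterministic exemplar id per suburb: including id in the
-- ORDER BY makes the sort order fully specify which row wins.
SELECT DISTINCT ON (suburb_name) id, suburb_name
FROM places
ORDER BY suburb_name, id;
```

Without the trailing `id` in the ORDER BY, any row from each suburb could be returned, and the result may differ from run to run.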
Re: [GENERAL] ubuntu packages for 8.4
On Fri, Jul 10, 2009 at 9:22 PM, Stuart Bishop wrote:
> On Fri, Jul 10, 2009 at 8:28 AM, Tim Uckun wrote:
>> I don't see any ubuntu packages for 8.4 in the default repositories.
>>
>> Does anybody know if they will be upgrading the postgresql package to
>> 8.4 or creating a new package for it.
>
> The postgresql-8.4 packages arrived in 9.10 (Karmic) about two hours
> ago. https://launchpad.net/ubuntu/+source/postgresql-8.4
>
> Assuming you are not running the 9.10 prerelease, you may want to
> request a backport to whatever Ubuntu release you are running -
> https://help.ubuntu.com/community/UbuntuBackport. This should be no
> drama.
>
> Major postgresql releases always get a new package. This allows you to
> have multiple major versions installed and running simultaneously.

Awesome. Thanks a lot for the information.
Re: [GENERAL] Checkpoint Tuning Question
On Wed, 2009-07-08 at 18:22 -0400, Tom Lane wrote:
> As Greg commented upthread, we seem to be getting forced to the
> conclusion that the initial buffer scan in BufferSync() is somehow
> causing this. There are a couple of things it'd be useful to try
> here:

Not sure why you're forced to that conclusion? ISTM more likely to be a
problem with checkpointing clog or subtrans. That would block everybody, and
the scale of the problem is about right.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
Re: [GENERAL] ubuntu packages for 8.4
On Fri, Jul 10, 2009 at 8:28 AM, Tim Uckun wrote:
> I don't see any ubuntu packages for 8.4 in the default repositories.
>
> Does anybody know if they will be upgrading the postgresql package to
> 8.4 or creating a new package for it.

The postgresql-8.4 packages arrived in 9.10 (Karmic) about two hours ago.
https://launchpad.net/ubuntu/+source/postgresql-8.4

Assuming you are not running the 9.10 prerelease, you may want to request a
backport to whatever Ubuntu release you are running -
https://help.ubuntu.com/community/UbuntuBackport. This should be no drama.

Major postgresql releases always get a new package. This allows you to have
multiple major versions installed and running simultaneously.

--
Stuart Bishop
http://www.stuartbishop.net/
Re: [GENERAL] Overhead of union versus union all
On Thu, 2009-07-09 at 20:41 -0600, Scott Marlowe wrote:
> On Thu, Jul 9, 2009 at 7:58 PM, Bruce Momjian wrote:
>> Scott Bailey wrote:
>>> Alvaro Herrera wrote:
>>>> Tim Keitt wrote:
>>>>> I am combining query results that I know are disjoint. I'm wondering
>>>>> how much overhead there is in calling union versus union all. (Just
>>>>> curious really; I can't see a reason not to use union all.)
>>>>
>>>> UNION needs to uniquify the output, for which it plasters an additional
>>>> sort step, whereas UNION ALL does not need to uniquify its output and
>>>> thus it can avoid the sort step. Using UNION ALL is recommended
>>>> wherever possible.
>>>
>>> I think I read somewhere that as of 8.4 it no longer required the sort
>>> step, due to the improvements in hashing. Here it is:
>>>
>>> http://wiki.postgresql.org/wiki/WhatsNew84#Performance
>>
>> Oh, yeah, hashing is used in some cases rather than sort. I assume sort
>> is still used if the hash exceeds work_mem size.
>
> The important point being that it's still more expensive than a plain
> union all though, right?

I think it should be possible to use predtest theorem proving to discard the
sort/hash step in cases where we can prove the sets are disjoint. Often there
are top-level quals that can be compared in the WHERE clauses of the
sub-queries, so a shallow search could be quite profitable in allowing us to
rewrite a UNION into a UNION ALL.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Training, Services and Support
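The disjoint-branch situation being discussed can be illustrated with a small sketch; the table and predicate names here are invented for the example, not taken from the thread:

```sql
-- The WHERE clauses are mutually exclusive, so the branches cannot
-- overlap. UNION would still add a sort/hash de-duplication step;
-- UNION ALL simply concatenates the two result sets.
SELECT id, amount FROM payments WHERE status = 'pending'
UNION ALL
SELECT id, amount FROM payments WHERE status = 'settled';
```

This is the manual version of the rewrite proposed above: when the planner (or the query author) can prove the branches disjoint, UNION can safely be replaced with UNION ALL.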
Re: [GENERAL] Performance problem with low correlation data
> > testinsert contains t values between '2009-08-01' and '2009-08-09', and
> > ne_id from 1 to 2. But only 800 out of 2 ne_id have to be read; there's
> > no need for a table scan!
> >
> > I guess this is a reflection of the poor "correlation" on ne_id; but, as
> > I said, I don't really think ne_id is so badly correlated.
> >
> > In fact, doing a "select ne_id, t from testinsert limit 10" I can see
> > that data is laid out pretty much by "ne_id, t", grouped by day (that
> > is, same ne_id for one day, then next ne_id and so on until next day).
> >
> > How is the "correlation" calculated? Can someone explain to me why,
> > after the procedure above, correlation is so low???
>
> Did you run ANALYZE after the procedure above?

Yes I did; the correlation on that column stays low. Of course, I didn't
expect a correlation = 1, since data is laid out (pretty much) like this:

(ne_id1) (t1 day1)
(ne_id1) (t2 day1)
...
(ne_id1) (tn day1)
(ne_id2) (t1 day1)
(ne_id2) (t2 day1)
...
(ne_id2) (tn day1)
...
(pretty much all the ne_ids)
(ne_id1) (t1 day2)
(ne_id1) (t2 day2)
...
(ne_id1) (tn day2)
(ne_id2) (t1 day2)
(ne_id2) (t2 day2)
...
(ne_id2) (tn day2)
...
and so on

So ne_id is not strictly incrementing, but it is pretty much the same
(sequentially) for a whole day...
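The correlation figure under discussion is the planner statistic that ANALYZE stores in the pg_stats system view, and it can be inspected directly. The query below uses the table and column names from the thread; pg_stats is a standard PostgreSQL view:

```sql
-- correlation ranges from -1 to 1; values near 0 mean the column's
-- logical (sort) order and the physical heap order disagree, which is
-- exactly the layout described above: ne_id restarts from 1 each day.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'testinsert'
  AND attname IN ('ne_id', 't');
```

Because the statistic measures correlation across the whole table, a column that is well-ordered within each day but cycles back to low values every day will still score near zero.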