Re: [GENERAL] performance issues on windows with 8.3.0?
On Thu, Feb 14, 2008 at 7:56 PM, Dan Armbrust [EMAIL PROTECTED] wrote:
> On Thu, Feb 14, 2008 at 1:31 PM, Dave Page [EMAIL PROTECTED] wrote:
>> You must have enabled the debugger when you installed (or didn't
>> disable it). You can turn it back off in postgresql.conf if you like -
>> there may be a little overhead.
>
> I see this in the postgresql.conf file:
>
>     shared_preload_libraries = '$libdir/plugins/plugin_debugger.dll'  # (change requires restart)

Yup, just set it blank and restart.

> I didn't turn that on - perhaps I missed it in the installer, or it
> defaults to on in the installer. I commented it out, and now my
> performance on 8.3 on Windows is in line with what I am seeing with
> 8.2.6 on Windows. That is a lot of overhead when a connection is made.
> (I realize I shouldn't be making connections this often - but I wonder
> if the overhead is only at connection time, or if there is other
> overhead as well.)

I'm surprised there's so much overhead, but not that there is some. Any
runtime overhead will be in pl/pgsql functions, so if you're not using
any, you won't see any difference once connected.

However, the fact that it keeps appearing implies you're using lots of
new (short-lived?) connections. That's particularly expensive on Windows
- consider a connection pooler, or if you're using something like PHP,
persistent connections.

> You're right, my connections are dropping off left and right. My
> performance on Windows is about 75% slower than the performance on
> Linux, and it appears that I'm currently dropping connections on both
> Windows and Linux. Does that fall in line with your expectation that
> creating connections on Windows is particularly expensive?

I couldn't quote a figure, but I'm not surprised it's noticeably slower.

> I'll go figure out why on earth my connections are getting killed and
> recreated by the pooling layers.

Sounds like a good plan :-)

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Oracle-compatible database company
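For reference, the fix being described amounts to blanking the setting in postgresql.conf (a sketch; the library path is the one quoted above, and the plugin can be re-enabled later the same way):

    # postgresql.conf - disable the PL debugger plugin (change requires restart)
    shared_preload_libraries = ''    # was '$libdir/plugins/plugin_debugger.dll'

After restarting, the change can be verified from psql:

    SHOW shared_preload_libraries;   -- expect an empty string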
Re: [GENERAL] using DROP in a transaction
Ah, of course: the exclusive lock was preventing tty1 from reading test, and when the lock was gone, so was the table. I get it. Thanks a lot.

But what about the "ERROR: tuple concurrently updated" (in TTY3)? What should have happened, I guess, is "ERROR: table test does not exist" upon "drop table test; --4." ... Which tuple was concurrently updated? A pg_catalog entry that administers the table?

WBL

On Fri, Feb 15, 2008 at 5:10 AM, Chris [EMAIL PROTECTED] wrote:
>> ==in TTY1==
>> --11. expect result at last, value 2 only. (concurrent transaction 2
>> (in TTY3) completes after this, and will delete values 2 and 4 (added
>> after select was issued) upon commit)
>> --11. true result: ERROR: relation <large nr> deleted while still in use
>
> The table 'test' which tty1 was looking at (which was dropped in tty2)
> doesn't exist any more. Postgres doesn't look at the name, it looks at
> the id that is created behind the scenes. So in tty1, the id is 'x'.
> Then you recreate the table in tty2, which gives it id 'y'. So tty1,
> looking at id 'x', finds it doesn't exist any more.

--
Postgresql & php tutorials
http://www.designmagick.com/
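The behind-the-scenes id Chris mentions is the relation's OID, and you can watch it change across a drop-and-recreate (a sketch using the thread's example table):

    CREATE TABLE test (value int);
    SELECT 'test'::regclass::oid;   -- e.g. 16393

    DROP TABLE test;
    CREATE TABLE test (value int);
    SELECT 'test'::regclass::oid;   -- a different oid, e.g. 16396

A session that resolved the name to the old OID before the DROP committed is the one that gets "relation <large nr> deleted while still in use".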
Re: [GENERAL] dynamic crosstab
Joe wrote:
> It occurs to me that it shouldn't be terribly difficult to make an
> alternate version of crosstab() that returns an array rather than
> tuples (back when crosstab() was first written, Postgres didn't support
> NULL array elements). Is this worth considering for 8.4?

I think there should be a generic way in Postgres to return a crosstab from an EAV model. Although I have no evidence for it, I keep thinking that the db must be more efficient at that than the application would be.

I was hoping that now, with PG supporting plan invalidation, it would be possible to return a recordset. If there is no generic way to return a recordset, then being able to return an array is much better than nothing.

B.
[GENERAL] Are indexes blown?
Hi,

I have an index on the user_id field in the query below:

    myuser=# delete from clients where user_id like '64.22.91.%';
    DELETE 22
    Time: 220324.975 ms

Is there any reason why it's taking 220 seconds to run this simple query? There are about 3 million rows in this table.

How can I debug this? How can I check if the index is bloated or blown? From the VACUUM ANALYZE output, nothing like this is apparent.

Thanks.
Re: [GENERAL] Are indexes blown?
On Fri, Feb 15, 2008 at 3:55 PM, Phoenix Kiula [EMAIL PROTECTED] wrote:
> Hi, I have an index on the user_id field in the query below:
>
>     myuser=# delete from clients where user_id like '64.22.91.%';
>     DELETE 22
>     Time: 220324.975 ms
>
> Is there any reason why it's taking 220 seconds to run this simple
> query? There are about 3 million rows in this table.

Use the 'pgstattuple' contrib module --
http://www.postgresql.org/docs/current/static/pgstattuple.html

The pgstatindex function from the contrib module should be able to help you there.

--
Shoaib Mir
Fujitsu Australia Software Technology
[EMAIL PROTECTED]
Re: [GENERAL] Are indexes blown?
Phoenix Kiula wrote:
> Hi, I have an index on the user_id field in the query below:
>
>     myuser=# delete from clients where user_id like '64.22.91.%';
>     DELETE 22
>     Time: 220324.975 ms
>
> Is there any reason why it's taking 220 seconds to run this simple
> query? There are about 3 million rows in this table.

First guess is that it's not using the index. What does

    EXPLAIN ANALYSE SELECT user_id FROM clients WHERE user_id LIKE '...'

show? Check the list archives for "locale", "like" and "text_pattern_ops" too - that's a good place to start.

--
Richard Huxton
Archonet Ltd
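The text_pattern_ops point matters because in a non-C locale a plain btree index cannot serve LIKE prefix searches. A sketch of the usual fix (the index name is hypothetical; the table is the one from the thread):

    CREATE INDEX clients_user_id_pattern_idx
        ON clients (user_id text_pattern_ops);

    EXPLAIN ANALYSE
    SELECT user_id FROM clients WHERE user_id LIKE '64.22.91.%';
    -- should now show an Index Scan rather than a Seq Scan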
Re: [GENERAL] Are indexes blown?
On 15/02/2008, Shoaib Mir [EMAIL PROTECTED] wrote:
> Use the 'pgstattuple' contrib module --
> http://www.postgresql.org/docs/current/static/pgstattuple.html
>
> The pgstatindex function from the contrib module should be able to
> help you there.

How can I install a contrib module without bringing down my database, stopping it, or doing ANYTHING to it? It's in production; I can't touch it. Will it be installed on the side so that I can simply start using it?
Re: [GENERAL] Are indexes blown?
On 15/02/2008, Richard Huxton [EMAIL PROTECTED] wrote:
> First guess is that it's not using the index. What does
> EXPLAIN ANALYSE SELECT user_id FROM clients WHERE user_id LIKE '...'
> show? Check the list archives for "locale", "like" and
> "text_pattern_ops" too - that's a good place to start.

It has nothing to do with locale. The same database has been working just fine for 2 years. Why should this be an issue now?

When I ran the EXPLAIN SELECT, the database was hanging. Or taking too much time (I waited for 5 minutes), or whatever. I cancelled it. That's the problem: it works, then it doesn't, then it works again. I am guessing it could be the load, but there's nothing new in terms of load that should be causing this!
Re: [GENERAL] Are indexes blown?
Phoenix Kiula wrote:
> It has nothing to do with locale. The same database has been working
> just fine for 2 years. Why should this be an issue now?

No reason, but you hadn't said this was a change in behaviour, just that it seemed slow.

> When I ran the EXPLAIN SELECT, the database was hanging. Or taking too
> much time (I waited for 5 minutes), or whatever. I cancelled it. That's
> the problem: it works, then it doesn't, then it works again. I am
> guessing it could be the load, but there's nothing new in terms of load
> that should be causing this!

Ah, more new information! This does seem to point to the load, particularly if it's exactly the same query each time. So what do top/vmstat etc. show during these go-slow periods?

--
Richard Huxton
Archonet Ltd
Re: [GENERAL] dynamic crosstab
Erik Jones wrote:
> First, please stop top-posting. It makes it difficult for both me and
> others to know to whom/what you are replying.

Sorry, I don't know much about mailing list customs - I had to look up what top-posting is. I will behave now ...

I would prefer to keep the complications for when I retrieve the data rather than when I store it. I could imagine something like this to create a crosstab as an array, but I am afraid there is no assurance that the resulting array would contain the values in the same order for each focus:

    tbl(eID, aID, value)

    Select eID, array_accum(value)
    from (
        (Select Distinct eID from tbl) e
        CROSS JOIN
        (Select Distinct aID from tbl) a
    ) ea
    LEFT OUTER JOIN tbl USING (eID, aID)
    GROUP BY eID

B.
[GENERAL] a newbie question on table design
Hi all,

I have a large sdf file with many records of molecules and associated data items, and I want to save them in a PostgreSQL database. There are about 40 data items for every molecule (the names of the data items for each molecule are the same, but the values differ). The number of molecules may exceed 20 million.

Now I have come up with two ways to construct the table:

1) A table with about 40 columns: every data item has its corresponding column, and one molecule corresponds to one row in the table. This is direct and simple. The drawback is that if I want to add more data items to the database, I have to modify the structure of the table.

2) A table with just 3 columns:

    CREATE TABLE mytable(
        id serial,
        data_name text,
        data_value text
    );

Then a single molecule corresponds to about 40 rows in the database. If I need to add more data items to the table, I just need to add new rows with new data_name column values. The drawback of this table is that it has too many rows (40 times the former one) and wastes a lot of space.

Which one is better, or are there other, smarter ways?

I have another question. Since the data I need to save is huge, is it appropriate to save the data values in compressed format?

Regards,
Re: [GENERAL] Are indexes blown?
Actually, my host has just told me that I have a number of hung semaphores on my server, and he is relating them to postgresql. I am not surprised, because this is the only service that has issues. All the rest are working (apache, mysql, exim, etc).

Any thoughts on where I should start looking for hung semaphores?
Re: [GENERAL] Are indexes blown?
On Fri, Feb 15, 2008 at 5:18 PM, Phoenix Kiula [EMAIL PROTECTED] wrote:
> How can I install a contrib module without bringing down my database,
> stopping it, or doing ANYTHING to it? It's in production; I can't
> touch it. Will it be installed on the side so that I can simply start
> using it?

You do not need to restart the database server for that purpose, as all you need is the pgstattuple.so file copied to the PG-HOME/lib folder. Do the following (in case you have installed the server from source):

- Go to the PostgreSQL-source/contrib/pgstattuple folder
- Run make and make install (this will copy the pgstattuple.so file to the lib folder of your PostgreSQL installation)
- Now from psql, execute the pgstattuple.sql file for that specific database; the file can be found in the PG-HOME/share/contrib folder
- Once the sql file is executed, you can use the pgstattuple functions

--
Shoaib Mir
Fujitsu Australia Software Technology
[EMAIL PROTECTED]
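Once the module is in place, usage looks like this (a sketch; the table and index names are hypothetical):

    SELECT * FROM pgstattuple('clients');             -- dead-tuple percentage for the table
    SELECT * FROM pgstatindex('clients_user_id_idx'); -- btree density and fragmentation

In the pgstatindex output, a low avg_leaf_density or a high leaf_fragmentation is the usual sign of index bloat.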
Re: [GENERAL] Are indexes blown?
On Fri, Feb 15, 2008 at 6:13 PM, Phoenix Kiula [EMAIL PROTECTED] wrote:
> Thanks. But I had installed from rpm. Can I just download that .so file
> and put it in the lib folder for pgsql and then start using it?

Well, I would say download the source for the same version you have, copy it to your desktop machine, build it, and then build the .so file for the contrib module using 'make' and 'make install'. Once that is done, copy the .so from the lib folder of PG to your production PG box's lib folder.

--
Shoaib Mir
Fujitsu Australia Software Technology
[EMAIL PROTECTED]
Re: [GENERAL] Are indexes blown?
On 15/02/2008, Shoaib Mir [EMAIL PROTECTED] wrote:
> You do not need to restart the database server for that purpose, as all
> you need is the pgstattuple.so file copied to the PG-HOME/lib folder.
> [snip]
> Once the sql file is executed, you can use the pgstattuple functions.

Thanks. But I had installed from rpm. Can I just download that .so file and put it in the lib folder for pgsql and then start using it?
Re: [GENERAL] Are indexes blown?
Phoenix Kiula wrote:
> On 15/02/2008, Richard Huxton [EMAIL PROTECTED] wrote:
>> Ah, more new information! This does seem to point to the load,
>> particularly if it's exactly the same query each time. So what do
>> top/vmstat etc. show during these go-slow periods?
>
> I included top and vmstat info in my other post yesterday, but here it
> is again:

Ah, you had a post yesterday! (goes away, searches for previous post)

http://archives.postgresql.org/pgsql-general/2008-02/msg00689.php
"PG quitting sporadically!!"

Right, OK.

Firstly, stop worrying about index usage and/or bloat. You have unexplained process crashes to deal with first. There's no point in looking at indexes until you figure out what is killing your processes.

Secondly, a single line from vmstat isn't useful; you want to compare what is happening when things are fine with when they aren't. Leave "vmstat 10" logging to a file so you can catch it.

Thirdly, have you upgraded to the latest 8.2 (8.2.6) yet?

I see you've reduced work_mem, that's good. Oh, you might as well lower max_connections from 150 too; there's no way you can support that many concurrent queries anyway.

The fact that you're seeing various strange socket-related problems is odd. As is the fact that logging doesn't seem to work for you. Are you sure the two sets of vmstat/top figures are from when PG was crashing/running queries slowly? Everything seems idle to me in those figures.

--
Richard Huxton
Archonet Ltd
[GENERAL] Trying to understand encoding.
Greetings. I'm currently using 8.3, but I've been coping with this since previous versions.

I'm trying to integrate some LATIN1 and some UTF8 DBs into a single UTF8 one. To avoid the "invalid UNICODE character..." error, I used iconv to convert the LATIN1 dumps to UTF8.

Now I have the data in the UTF8 DB, and using graphical clients everything seems to be great. The thing is, when I query the data via psql with \encoding UTF8, I get weird data ('NeuquÃ©n' instead of 'Neuquén'). However, with \encoding LATIN1, everything looks fine.

So, I have a UTF8 DB, (what I think is) UTF8 data, and I can only see it right by setting \encoding to LATIN1 in psql, or by using a graphical client. If anyone could help me try and understand this mess, I'd really appreciate it.

Ah, these are my locale settings, in case it helps:

    LANG=en_US.UTF-8
    LC_CTYPE=C
    LC_NUMERIC=en_US.UTF-8
    LC_TIME=en_US.UTF-8
    LC_COLLATE=C
    LC_MONETARY=en_US.UTF-8
    LC_MESSAGES=en_US.UTF-8
    LC_PAPER=en_US.UTF-8
    LC_NAME=en_US.UTF-8
    LC_ADDRESS=en_US.UTF-8
    LC_TELEPHONE=en_US.UTF-8
    LC_MEASUREMENT=en_US.UTF-8
    LC_IDENTIFICATION=en_US.UTF-8
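The encoding pair at work here can be inspected directly (a sketch; 'NeuquÃ©n' is exactly what correctly stored UTF-8 looks like when rendered by a Latin-1 display):

    SHOW server_encoding;             -- UTF8: how the data is stored
    SHOW client_encoding;             -- what psql converts to on output
    SET client_encoding = 'LATIN1';   -- equivalent to \encoding LATIN1

If the data only looks right with a LATIN1 client_encoding, the usual suspect is not the data but a terminal displaying Latin-1, so the UTF-8-to-Latin-1 conversion "fixes" the display.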
Re: [GENERAL] Are indexes blown?
In article [EMAIL PROTECTED], Shoaib Mir [EMAIL PROTECTED] writes:

> Well, I would say download the source for the same version you have,
> copy it to your desktop machine, build it, and then build the .so file
> for the contrib module using 'make' and 'make install'. Once that is
> done, copy the .so from the lib folder of PG to your production PG
> box's lib folder.

But you have to ensure that you build PostgreSQL on your desktop machine in exactly the same way as the RPM got built (integer_datetimes etc).
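The build options in question can be checked from a live session rather than by inspecting binaries (a sketch):

    SELECT version();          -- confirms the exact server version to download
    SHOW integer_datetimes;    -- must match between the RPM server and your build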
Re: [GENERAL] Trying to understand encoding.
On 2/15/08, Tomás Di Doménico [EMAIL PROTECTED] wrote:
> Now I have the data in the UTF8 DB, and using graphical clients
> everything seems to be great. The thing is, when I query the data via
> psql with \encoding UTF8, I get weird data ('NeuquÃ©n' instead of
> 'Neuquén'). However, with \encoding LATIN1, everything looks fine.

Maybe your terminal program doesn't support UTF8, or it's misconfigured? If you create a UTF8-encoded file and 'cat' it, is the output correct?

-Doug
Re: [GENERAL] Are indexes blown?
On 2/15/08, Harald Fuchs [EMAIL PROTECTED] wrote:
> But you have to ensure that you build PostgreSQL on your desktop
> machine in exactly the same way as the RPM got built
> (integer_datetimes etc).

It'd probably be much easier to just install the -contrib RPM. :)

--
-Doug
[GENERAL] Pains in upgrading to 8.3
I'm glad I didn't go from 8.2.3 to 8.3 straight!

http://ogasawalrus.com/blog/node/462
Re: [GENERAL] Trying to understand encoding.
Geez. My default terminal didn't support UNICODE. Shame on me :P

Thanks!

Douglas McNaught wrote:
> Maybe your terminal program doesn't support UTF8, or it's
> misconfigured? If you create a UTF8-encoded file and 'cat' it, is the
> output correct?
Re: [GENERAL] the feasibility of sending email from stored procedure in Postgres
On 2/14/08, hewei [EMAIL PROTECTED] wrote:
> Can we send email from a stored procedure in Postgres?

In principle, yes, using one of the untrusted stored function languages: pl/perl, pl/sh, pl/python, and such.

I wouldn't do things that way... I would instead queue messages (or suitable information about them) in a table, and have a process outside PostgreSQL periodically poll for them. There are several benefits to that approach:

1. You're not pushing error handling of email problems inside the PostgreSQL back end. That could be rather risky.

2. You're not spawning an MTA connection every time you submit a message. That could be rather expensive.

In contrast, the "poll a queue" approach lets something completely external deal with email problems. And it should be able to submit multiple messages more or less at once, which should improve efficiency rather a lot; no need to open up sockets to port 25 a whole bunch of times...

--
http://linuxfinances.info/info/linuxdistributions.html
"The definition of insanity is doing the same thing over and over and expecting different results." -- assortedly attributed to Albert Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling
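A minimal sketch of the queue-table approach (all table and column names are hypothetical):

    CREATE TABLE mail_queue (
        id        serial PRIMARY KEY,
        recipient text NOT NULL,
        subject   text NOT NULL,
        body      text NOT NULL,
        queued_at timestamptz NOT NULL DEFAULT now(),
        sent_at   timestamptz          -- NULL until the external mailer sends it
    );

    -- inside whatever transaction needs to send mail:
    INSERT INTO mail_queue (recipient, subject, body)
    VALUES ('user@example.com', 'Welcome', 'Hello there');

The external process then periodically runs SELECT * FROM mail_queue WHERE sent_at IS NULL, pushes the batch through the MTA in one go, and stamps sent_at on success.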
Re: [GENERAL] a newbie question on table design
On Friday 15. February 2008, [EMAIL PROTECTED] wrote:
> I have a large sdf file with many records of molecules and associated
> data items, and I want to save them in a PostgreSQL database.
> [snip]
> 2) A table with just 3 columns:
>
>     CREATE TABLE mytable(
>         id serial,
>         data_name text,
>         data_value text
>     );
>
> Then a single molecule corresponds to about 40 rows in the database.

This is a sound concept, but I'd rather store the data_name in a separate table with an integer key, and replace data_name in mytable with a data_name_fk REFERENCES data_names (data_name_id). That's just Occam's Razor applied to database design, aka first normal form.

You'd probably store the name of the molecule in a third table. Then you have a model very similar to the classic 'book database', where a book can have multiple authors and an author can have multiple books. There are examples of this design all over the place; a sketch of the layout follows below.

> If I need to add more data items to the table, I just need to add new
> rows with new data_name column values. The drawback of this table is
> that it has too many rows (40 times the former one) and wastes a lot
> of space.
>
> I have another question. Since the data I need to save is huge, is it
> appropriate to save the data values in compressed format?

That sounds a lot like premature optimization. Postgres is actually quite good at compacting data natively.

--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE
My Jazz Jukebox: http://www.last.fm/user/leifbk/
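A sketch of the normalized layout Leif describes (all names are hypothetical):

    CREATE TABLE molecules (
        molecule_id serial PRIMARY KEY,
        name        text NOT NULL
    );

    CREATE TABLE data_names (
        data_name_id serial PRIMARY KEY,
        data_name    text NOT NULL UNIQUE
    );

    CREATE TABLE molecule_data (
        molecule_id  integer NOT NULL REFERENCES molecules,
        data_name_id integer NOT NULL REFERENCES data_names,
        data_value   text,
        PRIMARY KEY (molecule_id, data_name_id)
    );

Adding a new kind of data item is then just an INSERT into data_names, with no schema change.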
Re: [GENERAL] a newbie question on table design
If you can separate a stable structure (common columns) from additional columns, then you can:

1. Use a base class (table) with the common columns, plus inherited tables with their own additional columns.

2. We use contrib/hstore as storage for semistructured data - we store additional columns as key-value pairs in the hstore data type. This is a very flexible design.

On Fri, 15 Feb 2008, Leif B. Kristensen wrote:
> This is a sound concept, but I'd rather store the data_name in a
> separate table with an integer key, and replace data_name in mytable
> with a data_name_fk REFERENCES data_names (data_name_id).
> [snip]
> That sounds a lot like premature optimization. Postgres is actually
> quite good at compacting data natively.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
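For reference, a sketch of the contrib/hstore approach Oleg mentions (names and values are hypothetical; the hstore module must be installed first):

    CREATE TABLE molecule (
        id    serial PRIMARY KEY,
        name  text NOT NULL,
        attrs hstore          -- the additional columns, as key-value pairs
    );

    INSERT INTO molecule (name, attrs)
    VALUES ('caffeine', 'formula=>C8H10N4O2, mol_weight=>194.19');

    SELECT name, attrs -> 'mol_weight' AS mol_weight
    FROM molecule
    WHERE attrs -> 'formula' = 'C8H10N4O2';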
Re: [GENERAL] dynamic crosstab
Balázs Klein wrote:
> I was hoping that now, with PG supporting plan invalidation, it would
> be possible to return a recordset.

Plan invalidation has nothing to do with it. In Postgres a returned recordset can be used as a row source in the FROM clause -- this requires the data type information to be known at parse time.

Joe
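Concretely, this is why crosstab() calls must spell out their output columns in the FROM clause (a sketch using contrib/tablefunc and the thread's hypothetical tbl):

    SELECT *
    FROM crosstab('SELECT eID, aID, value FROM tbl ORDER BY 1, 2')
         AS ct(eid int, a1 text, a2 text, a3 text);

The AS ct(...) column definition list is what supplies the type information Joe refers to; it cannot be inferred at run time.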
Re: [GENERAL] a newbie question on table design
On Feb 15, 2008, at 4:49 AM, [EMAIL PROTECTED] wrote:
> I have a large sdf file with many records of molecules and associated
> data items, and I want to save them in a PostgreSQL database.
> [snip]
> 2) A table with just 3 columns:
>
>     CREATE TABLE mytable(
>         id serial,
>         data_name text,
>         data_value text
>     );

That looks kinda like an entity-attribute-value format. I'm guessing that either there'll be another column to define the entity, or the id isn't really a serial.

> Then a single molecule corresponds to about 40 rows in the database.
> If I need to add more data items to the table, I just need to add new
> rows with new data_name column values. The drawback of this table is
> that it has too many rows (40 times the former one) and wastes a lot
> of space.
>
> Which one is better, or are there other, smarter ways?

Somebody will, shortly, leap out and explain at great length why EAV is evil, evil, evil and would never be used by any right-thinking person. Don't take them too seriously; EAV is sometimes appropriate. This is probably not one of those times, though.

You're likely to get much more benefit from the power of SQL by putting one molecule per row of the table. 40 columns isn't excessive, and modifying the structure of the table to add or modify columns isn't really a problem, especially in postgresql, where you can take advantage of DDL being transactional (see the sketch below).

> I have another question. Since the data I need to save is huge, is it
> appropriate to save the data values in compressed format?

Not on your first implementation. Postgresql does some basic compression and out-of-line storage of data automatically. If you have huge blobs of opaque data that you don't plan on querying from within the database, you might consider doing some client-side compression of them. Maybe. By the third or fourth iteration, if benchmarks suggest it'd be worthwhile. Right now, though, don't even consider it.

Cheers,
Steve
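The transactional DDL point, as a sketch (the column name is hypothetical):

    BEGIN;
    ALTER TABLE molecules ADD COLUMN melting_point numeric;
    -- backfill, sanity-check, etc.; if anything fails here,
    -- ROLLBACK leaves the schema exactly as it was
    COMMIT;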
Re: [GENERAL] Are indexes blown?
On 2/15/08, Phoenix Kiula [EMAIL PROTECTED] wrote:
> LOG: could not receive data from client: Connection reset by peer
> LOG: unexpected EOF on client connection
> LOG: could not receive data from client: Connection reset by peer
> LOG: unexpected EOF on client connection

This means your client processes are dying or getting killed (possibly due to memory shortages?). Are these running on the same machine as Postgres? Are there any logs you can look at to see what might be going wrong? If this is Linux, are there any OOM-killer messages in the syslogs?

-Doug
Re: [GENERAL] Pains in upgrading to 8.3
On Fri, Feb 15, 2008 at 8:21 AM, Phoenix Kiula [EMAIL PROTECTED] wrote:
> I'm glad I didn't go from 8.2.3 to 8.3 straight!

Either way, you need to update to 8.2.6.
Re: [GENERAL] dynamic crosstab
On Feb 15, 2008, at 6:29 AM, Balázs Klein wrote:
>> Erik Jones wrote:
>> First, please stop top-posting. It makes it difficult for both me and
>> others to know to whom/what you are replying.
>
> Sorry, I don't know much about mailing list customs - I had to look up
> what top-posting is. I will behave now ...

It's cool, now you know :)

> I would prefer to keep the complications for when I retrieve the data
> rather than when I store it.

Really? When do you think users notice performance hits the most? I'd think, given that answers for a questionnaire are stored as a batch, people running reports will be the ones to notice, i.e. at retrieval time.

> I could imagine something like this to create a crosstab as an array,
> but I am afraid there is no assurance that the resulting array would
> contain the values in the same order for each focus:
>
>     tbl(eID, aID, value)
>
>     Select eID, array_accum(value)
>     from (
>         (Select Distinct eID from tbl) e
>         CROSS JOIN
>         (Select Distinct aID from tbl) a
>     ) ea
>     LEFT OUTER JOIN tbl USING (eID, aID)
>     GROUP BY eID

That's cool. I still don't see why you're so set on an EAV, but it's your setup. Watch out, though: big questionnaires will turn into queries with an inordinate number of joins, and performance on those will suck. If you just used arrays directly, you could pull all of the answers for a given person and/or questionnaire with a pretty simple query.

Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
Re: [GENERAL] dynamic crosstab
> given that answers for a questionnaire are stored as a batch

Not in our setup - for all sorts of reasons (preserving responses on a connection failure or restart, monitoring response latency in real time, creating adaptive/branching questionnaires) we send each response separately.

> people running reports will be the ones to notice, i.e. at retrieval
> time.

I am not sure - different responses are aggregated into different attributes in different ways; those properties need to be retrieved during scoring/report generation, so being able to create a join directly on a response is a good thing for me. And report generation - in our case it must be a DTP-quality PDF - is such a beast anyway that db time is dwarfed by pdf generation.

The problem comes when I need to present the responses themselves in a human-friendly way - as an export or display or report. Do you think there is a way to ensure that the order of the values in the array below is the same for each person?

    tbl(eID, aID, value)

    Select eID, array_accum(value)
    from (
        (Select Distinct eID from tbl) e
        CROSS JOIN
        (Select Distinct aID from tbl) a
    ) ea
    LEFT OUTER JOIN tbl USING (eID, aID)
    GROUP BY eID

Thx for the help.
B.
Re: [GENERAL] dynamic crosstab
On Feb 14, 2008, at 8:19 PM, Joe Conway wrote:
> Erik Jones wrote:
>> See how postgres handles filling in the NULLs for you? What you'd
>> really want to do with this would be to define some functions for
>> setting and getting a person's answers to a given question or set of
>> questions, so that you could implement some kind of data integrity
>> with regards to question ids and indexes into the answers arrays -
>> such as, in the example above, preventing an entry at index 7 when
>> there is no entry in the questions table for question_id=7.
>
> It occurs to me that it shouldn't be terribly difficult to make an
> alternate version of crosstab() that returns an array rather than
> tuples (back when crosstab() was first written, Postgres didn't
> support NULL array elements). Is this worth considering for 8.4?

That's a great idea. At the very least someone (you? me?) could start work on it, and if it doesn't go into the main contrib package it could be made available on pgfoundry.

Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
Re: [GENERAL] dynamic crosstab
Joe Conway wrote:
> Balázs Klein wrote:
>> I was hoping that now, with PG supporting plan invalidation, it would
>> be possible to return a recordset.
>
> Plan invalidation has nothing to do with it. In Postgres a returned
> recordset can be used as a row source in the FROM clause -- this
> requires the data type information to be known at parse time.

I thought it meant that the return type could be changed/redefined at runtime. No luck there, then. Thx.

B.
Re: [GENERAL] dynamic crosstab
On Feb 15, 2008, at 9:56 AM, Balázs Klein wrote:
> The problem comes when I need to present the responses themselves in a
> human-friendly way - as an export or display or report. Do you think
> there is a way to ensure that the order of the values in the array
> below is the same for each person?
> [snip]

The only way to ever guarantee a particular order is via an ORDER BY clause.

Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
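Applied to the query quoted upthread, that means sorting in a subquery that feeds the aggregate (a sketch; array_accum is the example aggregate from the PostgreSQL documentation, and the ordering of a sorted subquery feeding an aggregate is honored in practice, though not guaranteed by the SQL standard):

    SELECT eID, array_accum(value)
    FROM (
        SELECT eID, aID, value
        FROM (SELECT DISTINCT eID FROM tbl) e
             CROSS JOIN (SELECT DISTINCT aID FROM tbl) a
             LEFT OUTER JOIN tbl USING (eID, aID)
        ORDER BY eID, aID
    ) ordered
    GROUP BY eID;

With the input sorted by eID, aID, every person's array lists the answers in the same attribute order.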
Re: [GENERAL] Pains in upgrading to 8.3
Phoenix Kiula wrote:
> I'm glad I didn't go from 8.2.3 to 8.3 straight!
> http://ogasawalrus.com/blog/node/462

Going from 8.2.4 and 8.2.6 to 8.3.0 has been painless for me. However, unlike the blogger you cite, I read the directions before, not after, attempting it.

Paul
Re: [GENERAL] Are indexes blown?
On Fri, Feb 15, 2008 at 8:36 AM, Phoenix Kiula [EMAIL PROTECTED] wrote:
> No. They are the vmstat figures from when I was replying to your
> email. What will vmstat tell me, and how should I set it up to do
> "vmstat 10" logging?

Something like:

    vmstat 10 > vmstat.log

> LOG: could not receive data from client: Connection reset by peer
> LOG: unexpected EOF on client connection
> LOG: could not receive data from client: Connection reset by peer
> LOG: unexpected EOF on client connection
>
> Now I don't know what is wrong or even where I should look. Postgresql
> is often taking quite a bit of memory and CPU resources. I've reduced
> work_mem to 10MB and max_connections to 100. (Anyway, the old values
> were working just fine until recently!)
>
> The biggest problem: when I su into the postgres user and do a psql to
> get into the PG console in my SSH session, it takes a whole lot of
> time to come up! It used to come up in a jiffy earlier!!!

How many pgsql processes are there when this happens? Try something like

    ps axu | grep postgres

to see; use

    ps axu | grep postgres | wc -l

to get a rough count. I'm guessing that your web service layer is keeping old connections open. Could be something as ugly as php's pg_pconnect, or a buggy jdbc driver, etc...
Re: [GENERAL] Are indexes blown?
Phoenix Kiula wrote:
> On 15/02/2008, Richard Huxton [EMAIL PROTECTED] wrote:
>> Are you sure the two sets of vmstat/top figures are from when PG was
>> crashing/running queries slowly? Everything seems idle to me in those
>> figures.
>
> No. They are the vmstat figures from when I was replying to your
> email. What will vmstat tell me, and how should I set it up to do
> "vmstat 10" logging?

I'd write a small script, call it e.g. trackusage.sh, and save it in /tmp/:

    #!/bin/sh
    while (/bin/true)
    do
        date >> /tmp/vmstat_figures.txt
        vmstat 10 60 >> /tmp/vmstat_figures.txt
    done

Then set the execute flag on it and do something like:

    nohup /tmp/trackusage.sh &

That should run even when you disconnect (don't forget to kill it once this is fixed). It will log a timestamp every 10 minutes and vmstat activity in between.

[snip logging fragment]

> Now I don't know what is wrong or even where I should look. Postgresql
> is often taking quite a bit of memory and CPU resources.

Just checking - this is a real machine and not a virtual one, isn't it?

> I've reduced work_mem to 10MB and max_connections to 100. (Anyway, the
> old values were working just fine until recently!)
>
> The biggest problem: when I su into the postgres user and do a psql to
> get into the PG console in my SSH session, it takes a whole lot of
> time to come up! It used to come up in a jiffy earlier!!! It now shows
> me this error:
>
>     ~ psql: could not connect to server: Connection timed out
>     Is the server running on host localhost and accepting
>     TCP/IP connections on port 5432?
>
> Then, five minutes later, I can connect again! In less than a second!
> What gives?

Hopefully vmstat will show us.

> Finally, very simple queries like this one:
>
>     select url, disable_in_statistics, id, user_known from links
>     where alias = '1yqw7' and status = 'Y' limit 1
>
> which used to be served in 5 ms (0.005 seconds) are now taking upwards
> of 200 seconds!

Same symptom. I'd have guessed the machine is running out of memory and swapping, but the vmstat/top stuff all looks fine.

> Your suggestion to Explain Analyze:
>
>     =# explain analyze select url, disable_in_statistics, id, user_known
>        from links where alias = '1yqw7' and status = 'Y' limit 1;
>                              QUERY PLAN
>     Limit (cost=0.00..8.74 rows=1 width=113) (actual time=9.639..9.643 rows=1 loops=1)
>       -> Index Scan using links2_alias_key on links (cost=0.00..8.74 rows=1 width=113) (actual time=9.630..9.630 rows=1 loops=1)
>            Index Cond: ((alias)::text = '1yqw7'::text)
>            Filter: (status = 'Y'::bpchar)
>     Total runtime: 16.425 ms

Fine - it's nothing to do with the planner, indexes or anything else. This is system-related, and vmstat should point us in the right direction.

--
Richard Huxton
Archonet Ltd
[GENERAL] Approaches for Lookup values (codes) in OLTP application
I am considering how best to handle the issue of attribute encoding for an OLTP application conversion. The existing system, which does not employ a relational DBMS in the commonly accepted sense, uses a system_table to validate system codes. This dataset uses concatenated fields to form a unique key. The fields are table_name, table_column_name, and value_as_char.

The conversion project framework is Ruby on Rails, which embeds the practice of arbitrary integer primary keys assigned by sequences rather than so-called natural keys, or predicates that define the unique portion of the table-row.

My questions revolve around how best to implement this in postgresql given the expectations of Rails. Is it best that I create a table with the three key columns and an additional id, then have a unique index on the three values but store the id in the referential row? Do I store the code value in the referential row and use the implied table_name, table_column_name and stored value to perform a lookup on the system_table? Is there another approach that I am not aware of that is superior to both of these?

Comments most welcome.

--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne                mailto:[EMAIL PROTECTED]
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3
Re: [GENERAL] Approaches for Lookup values (codes) in OLTP application
James B. Byrne wrote:
> I am considering how best to handle the issue of attribute encoding
> for an OLTP application conversion.
[snip]
> The conversion project framework is Ruby on Rails, which embeds the
> practice of arbitrary integer primary keys assigned by sequences
> rather than so-called natural keys, or predicates that define the
> unique portion of the table-row.

I'm not a Rails guy, but everything I've read about it suggests that if you're going to gain any advantage from it, then you should follow its way of doing things. That means not converting anything, but rather writing a Rails app that does the same as your current app (if I'm making myself clear).

--
Richard Huxton
Archonet Ltd
Re: [GENERAL] Pains in upgrading to 8.3
On Fri, Feb 15, 2008 at 4:21 PM, Tony Caduto [EMAIL PROTECTED] wrote:
> The blogger has a point about pg_dump and restore. It could be much
> better: for example, the backup process could be part of the server
> core, and instead of having a fat client where most of the process is
> running on the client, an API could be used where the backup is
> generated on the server, with options to leave it on the server or
> transfer it to the client's PC.

Not really an option - the reason it's recommended to use the new pg_dump version with the older server when upgrading is to allow the dump to be made in the way most compatible with the new server, effectively doing some of the upgrade process as part of the dump operation.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Oracle-compatible database company
Re: [GENERAL] Pains in upgrading to 8.3
Tony Caduto wrote:
> paul rivers wrote:
>> Going from 8.2.4 and 8.2.6 to 8.3.0 has been painless for me.
>> However, unlike the blogger you cite, I read the directions before,
>> not after, attempting it.
>
> The blogger has a point about pg_dump and restore. It could be much
> better: for example, the backup process could be part of the server
> core, and instead of having a fat client where most of the process is
> running on the client, an API could be used where the backup is
> generated on the server, with options to leave it on the server or
> transfer it to the client's PC.
>
> Using pg_dump remotely is becoming a pain because it's not really
> backwards compatible with earlier releases, so you end up having to
> have multiple copies lying around to use on different server versions.
>
> While Firebird is mostly inferior, its backup system is much nicer
> than PostgreSQL's. Firebird uses a backup API, so if you back up
> remotely there is no fat client needed, and it eliminates all the
> dependency issues on the client side. The client access library
> implements the API and that's it.
>
> You could of course hack something similar on PGSQL by using SSH and
> remotely executing pg_dump on the server, but that does not really
> help on Windows servers, where SSH is not a common thing. The backup
> data is coming back to the client regardless, so why not just return
> it as a result set?
>
> Just my opinion on the matter, no flames please.

I agree with you 100% that it would be nice if this weren't necessary, so no flames intended! It's just that if the blogger is going to use a software package, it's in his/her best interests to rtfm.

It's no good to write, say, a lot of tricky SQL that depends on transactional control and properties of certain isolation levels, and then be surprised when in MySQL I get odd results, especially when my tables span storage engine types. If I did that, I would blame myself, not MySQL, even if I also thought MySQL should reconsider the behavior. MySQL did warn me, after all, in the docs.

I do agree it would be nice to change this aspect, and no, I've no clue how hard it would be. As a model of ease and flexibility, Microsoft's SQL Server is very good in this respect, probably the easiest I've ever worked with (at least from v2000 to v2005; prior version upgrades were a little rockier). Hot backups of full databases via T-SQL commands, in-place upgrades that convert page structures as necessary, turning archive log mode on/off dynamically, differential vs incremental backups, backups by tablespace, etc. All in all, they got that part of their engine mostly right, excepting problems in 2000 with relocating master database files (and they got a nice head-start in that direction from Sybase).

Paul
Re: [GENERAL] Approaches for Lookup values (codes) in OLTP application
On Fri, February 15, 2008 12:38, Richard Huxton wrote:
> I'm not a Rails guy, but everything I've read about it suggests that
> if you're going to gain any advantage from it, then you should follow
> its way of doing things. That means not converting anything, but
> rather writing a Rails app that does the same as your current app (if
> I'm making myself clear).

Rails is "opinionated software", but it will allow non-arbitrary keys. I realize that I am not expressing myself well, but this is in large measure due to transitioning from a non-RDBMS environment to relational technology while at the same time moving from a host-based application to a web-based n-tier application. So, I tend to get muddled from time to time.

To restate my original query in more straightforward terms: what is considered appropriate RDBMS practice for dealing with encoded information which has to be validated on input? One always has the option of just putting a table of values into the application itself, but I have not found much to recommend in this approach.

I can override Rails' assumptions and force a primary key formed by multiple columns, which will have a unique index automatically created, for the previously described system_values_table. My question still hinges upon what to put into the referential table: a foreign key lookup, or just the encoded value, letting the application do the reference checking?

Consider the example of ISO 3166 country codes. There are at least two ways to handle this:

1. Have a table just for country codes, and make the code the primary key.

2. Have a systems value table with a code prefix column and the code value concatenated into a key (table_prefix = 'country_codes' + table_value = 'CA', for example).

For something externally provided and widely used like country codes, option one is attractive and possibly the most sensible and robust solution. But consider things like transaction status codes. Perhaps an invoice transaction has five possible codes and a credit-note has only three, but one of those three is not valid for invoices. Where does one put such things? What is the generally accepted best practice? Does one construct a separate code table for every transaction type? Is it good practice to have a transaction_type table, a code_table, and a transaction_code_union table, and look up against the union?

This is perhaps a very minor and basic issue for this list's audience, but I am going to live with these decisions a very long time, and I would prefer to have some idea of what is considered appropriate RDBMS treatment for application validation data as opposed to business state data.

--
*** E-Mail is NOT a SECURE channel ***
James B. Byrne                mailto:[EMAIL PROTECTED]
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3
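For reference, a sketch of option 1 combined with the transaction_code_union idea described above (all names are hypothetical):

    CREATE TABLE country_codes (
        code char(2) PRIMARY KEY,   -- ISO 3166-1 alpha-2
        name text NOT NULL
    );

    CREATE TABLE status_codes (
        code        text PRIMARY KEY,
        description text NOT NULL
    );

    -- the "union" table: which codes are valid for which transaction type
    CREATE TABLE transaction_type_codes (
        transaction_type text NOT NULL,
        code             text NOT NULL REFERENCES status_codes,
        PRIMARY KEY (transaction_type, code)
    );

    CREATE TABLE invoices (
        id               serial PRIMARY KEY,
        transaction_type text NOT NULL DEFAULT 'invoice'
                         CHECK (transaction_type = 'invoice'),
        status           text NOT NULL,
        FOREIGN KEY (transaction_type, status)
            REFERENCES transaction_type_codes (transaction_type, code)
    );

The composite foreign key lets the database reject a status code that exists but is not valid for that transaction type, with no application-side checking.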
Re: [GENERAL] Pains in upgrading to 8.3
Dave Page wrote:
> Not really an option - the reason it's recommended to use the new
> pg_dump version with the older server when upgrading is to allow the
> dump to be made in the way most compatible with the new server,
> effectively doing some of the upgrade process as part of the dump
> operation.

For the case of upgrading, it wouldn't work. But there are certainly other cases where it would help. Say, from your central pgadmin console, administering 10 servers from 3 different major release trees :-( It can be done with command-line pg_dump, but it means you have to have three different installs on your management or backup or whatever machine. Those cases would certainly be easier if you could just call a backup API on the server that would feed you the data... (Yes, there are ways to do it with ssh tunneling and whatever, but that's yet another external service that has to be set up and configured.)

I'm not saying it's worth the work and the potential downsides, just that there are clear upsides :-)

//Magnus
Re: [GENERAL] Are indexes blown?
On 15/02/2008, Richard Huxton [EMAIL PROTECTED] wrote:
> Firstly, stop worrying about index usage and/or bloat. You have
> unexplained process crashes to deal with first.
>
> Secondly, a single line from vmstat isn't useful; you want to compare
> what is happening when things are fine with when they aren't. Leave
> "vmstat 10" logging to a file so you can catch it.
> [snip]
> Are you sure the two sets of vmstat/top figures are from when PG was
> crashing/running queries slowly? Everything seems idle to me in those
> figures.

No. They are the vmstat figures from when I was replying to your email. What will vmstat tell me, and how should I set it up to do "vmstat 10" logging?

Btw, postgresql logging is working. But here is the kind of thing I have in there:

    LOG: test message did not get through on socket for statistics collector
    LOG: disabling statistics collector for lack of working socket
    LOG: database system was shut down at 2008-02-15 06:12:10 CST
    LOG: checkpoint record is at 8/E785304C
    LOG: redo record is at 8/E785304C; undo record is at 0/0; shutdown TRUE
    LOG: next transaction ID: 0/296892698; next OID: 97929
    LOG: next MultiXactId: 1; next MultiXactOffset: 0
    LOG: database system is ready
    LOG: could not receive data from client: Connection reset by peer
    LOG: unexpected EOF on client connection
    LOG: could not receive data from client: Connection reset by peer
    LOG: unexpected EOF on client connection

Now I don't know what is wrong or even where I should look. Postgresql is often taking quite a bit of memory and CPU resources. I've reduced work_mem to 10MB and max_connections to 100. (Anyway, the old values were working just fine until recently!)

The biggest problem: when I su into the postgres user and do a psql to get into the PG console in my SSH session, it takes a whole lot of time to come up! It used to come up in a jiffy earlier!!! It now shows me this error:

    ~ psql: could not connect to server: Connection timed out
    Is the server running on host localhost and accepting
    TCP/IP connections on port 5432?

Then, five minutes later, I can connect again! In less than a second! What gives?

Finally, very simple queries like this one:

    select url, disable_in_statistics, id, user_known from links
    where alias = '1yqw7' and status = 'Y' limit 1

which used to be served in 5 ms (0.005 seconds) are now taking upwards of 200 seconds!

Your suggestion to Explain Analyze:

    =# explain analyze select url, disable_in_statistics, id, user_known
       from links where alias = '1yqw7' and status = 'Y' limit 1;
                             QUERY PLAN
    Limit (cost=0.00..8.74 rows=1 width=113) (actual time=9.639..9.643 rows=1 loops=1)
      -> Index Scan using links2_alias_key on links (cost=0.00..8.74 rows=1 width=113) (actual time=9.630..9.630 rows=1 loops=1)
           Index Cond: ((alias)::text = '1yqw7'::text)
           Filter: (status = 'Y'::bpchar)
    Total runtime: 16.425 ms
    (5 rows)

Now this is only once I have connected to the psql console, of course. Still, these queries are intermittently very slow!
Re: [GENERAL] the feasibility of sending email from stored procedure in Postgres
I would instead queue messages (or suitable information about them) in a table, and have a process outside PostgreSQL periodically poll for them Why poll when you can wait? http://www.postgresql.org/docs/8.2/interactive/sql-notify.html ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
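A minimal sketch of the wait-instead-of-poll approach: the table, trigger and channel names below are invented, and before 8.3 a notification carries no payload, so the listener simply re-reads the queue table whenever it is woken.

CREATE TABLE email_queue (
    id serial PRIMARY KEY,
    recipient text NOT NULL,
    body text NOT NULL,
    sent boolean NOT NULL DEFAULT false
);

CREATE OR REPLACE FUNCTION email_queue_notify() RETURNS trigger AS $$
BEGIN
    NOTIFY email_queue;  -- wakes any backend that has run LISTEN email_queue
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER email_queue_notify_trg AFTER INSERT ON email_queue
    FOR EACH ROW EXECUTE PROCEDURE email_queue_notify();

-- The external mailer process connects, runs LISTEN email_queue; and blocks
-- until a notification arrives, then sends everything matching
-- SELECT * FROM email_queue WHERE NOT sent; and flags those rows as sent.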
Re: [GENERAL] Pains in upgrading to 8.3
Tony Caduto [EMAIL PROTECTED] writes: paul rivers wrote: However, unlike the blogger you cite, I read the directions before, not after, attempting it. The blogger has a point about pg_dump and restore, Does he? He claims it didn't work, but there's no details about what went wrong. He also seems entirely misinformed on the difference between portable and PG-specific pg_dump output. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Pains in upgrading to 8.3
On Fri, Feb 15, 2008 at 10:21:16PM +0800, Phoenix Kiula wrote: http://ogasawalrus.com/blog/node/462 Reading more carefully, sounds like it was the first read to me. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Pains in upgrading to 8.3
On Fri, 15 Feb 2008, Tom Lane wrote: He claims it didn't work, but there's no details about what went wrong. He also seems entirely misinformed on the difference between portable and PG-specific pg_dump output. I just left a note on this and related subjects on the blog. If you search for postgresql upgrade 8.3 on Google that comes back as hit #5 already and it would be good to shut down some of the misunderstandings there (PostgreSQL doesn't recreate the databases during the restore process?) before they get any more publicity. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] PG quitting sporadically!!
Phoenix Kiula wrote: Thanks. Comments below. (PS: I am still unable to connect to postgresql even in SSH! I see this message: psql: could not connect to server: Connection timed out Is the server running on host localhost and accepting TCP/IP connections on port 5432? Yes of course the localhost is running the pgsql server and that port is allowed! My educated guess is that your network is behaving funny, or your firewall is crazy. Or your kernel has bugs. The fact that the connections work sometimes is one clue; this is the other: LOG: test message did not get through on socket for statistics collector LOG: disabling statistics collector for lack of working socket This is a bad sign. For one thing, it means autovacuum, if enabled, is not really working at all (which can, in turn, explain slowness). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[GENERAL] returning a resultset from a function
Hi, How do I return a result set? Is there a better way in 'plpgsql' than the one described below? I do not want to make a select from a function (which is pretty useful in many cases): SELECT * FROM getfoo(); , but I want to just call the function with SELECT getfoo();

--DROP FUNCTION getfoo();
--DROP TYPE compfoo;
CREATE TYPE compfoo AS (f1 integer, f2 integer);

CREATE OR REPLACE FUNCTION getfoo() RETURNS SETOF compfoo AS
$BODY$
DECLARE
    ret_row record;
BEGIN
    FOR ret_row IN SELECT id, mun_id FROM mytable LOOP
        RETURN NEXT ret_row;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE 'plpgsql';
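For what it's worth, both call styles already work against a SETOF function like the one above; they just return different shapes. SELECT getfoo() yields one column of the composite type compfoo per result row, rather than separate f1/f2 columns:

SELECT * FROM getfoo();  -- rows with columns f1, f2
SELECT getfoo();         -- rows with a single compfoo column, e.g. (1,2)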
Re: [GENERAL] PostgreSQL 8.3 on Debian, Ubuntu
Greg Smith wrote: I recall a couple of people asking about when 8.3 would be available for Debian and Ubuntu. Here's an update now that some useful packages have come out this week. Thanks for the summary, Greg. Colin ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] Why isn't an index being used when selecting a distinct value?
Version: Postgres 8.1.4 Platform: RHEL

Given this scenario with the indexes in place, when I ask for the distinct field1_id values, why does the optimizer choose a sequential scan instead of just reading from the kda_log_fid_cre_20080213_idx index? The time it takes to perform the sequential scan against 20+ million records is way too slow.

CREATE TABLE kda_log_20080213 (
    field1 character varying(255),
    field character varying(100),
    value bigint,
    period integer DEFAULT 60,
    created timestamp with time zone DEFAULT ('now'::text)::timestamp(6) with time zone,
    customer_id integer,
    field1_id integer
);

CREATE INDEX kda_log_cid_cre_fld_20080213_idx ON kda_log_20080213 USING btree (customer_id, created, field1);
CREATE INDEX kda_log_fid_cre_20080213_idx ON kda_log_20080213 USING btree (field1_id, created);

keaton=# explain select distinct field1_id into temp kda_temp from kda_log_20080213;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Unique (cost=5759201.93..5927827.87 rows=8545 width=4)
   -> Sort (cost=5759201.93..5843514.90 rows=33725188 width=4)
         Sort Key: field1_id
         -> Seq Scan on kda_log_20080213 (cost=0.00..748067.88 rows=33725188 width=4)
(4 rows)

Thanks, Keaton -- End of Forwarded Message
Re: [GENERAL] returning a resultset from a function
On Feb 15, 2008, at 2:56 PM, Anton Andreev wrote: Hi, How do I return a result set? Is there a better way in 'plpgsql' than the one described below? I do not want to make a select from a function (which is pretty useful in many cases): SELECT * FROM getfoo(); , but I want to just call the function with SELECT getfoo(); This article covers a way to do that: http://www.postgresonline.com/journal/index.php?/categories/9-advanced. Erik Jones DBA | Emma® [EMAIL PROTECTED] 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
[GENERAL] pg_restore, search_path and operator class
Hi, I've searched the archives for this issue but I could not find an answer. I apologize if this has been beaten to death already. Postgresql version: 8.1.2 on Linux The issue: === I've got a user defined data type that has been defined in the public schema. I use pg_dump to dump a table that has a column of this type: create table myschema.mytable (id public.mytype primary key, name varchar); pg_dump -U user --schema myschema --table mytable -f mytable.dump mydb When I try to restore this table with psql psql -U user -d mydb -f mytable.dump I get an error SET SET SET SET SET SET CREATE TABLE ALTER TABLE psql:mytable.dump:48: ERROR: data type public.mytype has no default operator class for access method btree HINT: You must specify an operator class for the index or define a default operator class for the data type. This error is not correct because mytype does have a default operator class for btree: CREATE OPERATOR CLASS public.mytype_ops_btree DEFAULT FOR TYPE public.mytype USING btree AS... I've included the content of the dump file at the bottom of this email. Note that, at line 11, there is a SET search_path statement, which does not contain public. If I change the search_path to include public SET search_path = myschema, public, pg_catalog; everything works fine. Is there a way to force pg_dump to include public? How should I change my operator classes or data type to make this work? (Moving the data type to pg_catalog works but we've got a lot of data out there to migrate.) Thanks for the help! Jozsef Szalay

The dump file
==

--
-- PostgreSQL database dump
--

-- Started on 2008-02-15 21:30:48 UTC

SET client_encoding = 'SQL_ASCII';
SET check_function_bodies = false;
SET client_min_messages = warning;

SET search_path = myschema, pg_catalog;

SET default_tablespace = '';
SET default_with_oids = false;

--
-- TOC entry 2970 (class 1259 OID 69852)
-- Dependencies: 45 422
-- Name: mytable; Type: TABLE; Schema: myschema; Owner: user; Tablespace:
--

CREATE TABLE mytable (
    id public.mytype NOT NULL,
    name character varying
);

ALTER TABLE myschema.mytable OWNER TO user;

--
-- TOC entry 3300 (class 0 OID 69852)
-- Dependencies: 2970
-- Data for Name: mytable; Type: TABLE DATA; Schema: myschema; Owner: user
--

COPY mytable (id, name) FROM stdin;
\.

--
-- TOC entry 3299 (class 2606 OID 69858)
-- Dependencies: 2970 2970
-- Name: mytable_pkey; Type: CONSTRAINT; Schema: myschema; Owner: user; Tablespace:
--

ALTER TABLE ONLY mytable
    ADD CONSTRAINT mytable_pkey PRIMARY KEY (id);

-- Completed on 2008-02-15 21:30:48 UTC

--
-- PostgreSQL database dump complete
--

---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Approaches for Lookup values (codes) in OLTP application
On Fri, Feb 15, 2008 at 12:12 PM, James B. Byrne [EMAIL PROTECTED] wrote: I can over-ride Rails assumptions and force a primary key formed by multiple columns which will have a unique index automatically created for the previously described system_values_table. My question still hinges upon what to put into the referential table, a foreign key lookup or just the encoded value and let the application do the reference checking? Consider the example of ISO 3166 country codes. There are at least two ways to handle this: 1. Have a table just for country codes and have the code the primary key 2. Have a systems value table having a code prefix column and the code value concatenated into a key (table_prefix = country_codes + table_value =CA for example) Generally speaking, I tend towards using the real value as the key and foreign key in lookup tables, but occasionally using an artificial numeric key is a better choice. If you'll generally always need to know the actual value, you should use it, because then it will be stored in the main table as well. But, if you access that value only 1 time for every 100 accesses, it will likely be faster to have it be on the other end of an int value, which usually takes up less space. For something externally provided and widely used like country codes then option one is attractive and possibly the most sensible and robust solution. But consider things like transaction status codes. Perhaps an invoice transaction has five possible codes and a credit-note has only three, but one of those three is not valid for invoices. Where does one put such things? You could use a simple multi-part check constraint for that, or, if it needs to be more fluid than that, you could use some kind of multi-key table that points to a valid tx type list on a 1 to many basis, and when you insert you FK check the two values against that table. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
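The "simple multi-part check constraint" variant mentioned above would look roughly like this; a sketch only, with invented table and status names:

create table transactions (
    id serial primary key,
    tx_type text not null,
    status text not null,
    check (
        (tx_type = 'invoice' and status in ('open','sent','paid','void','disputed'))
        or (tx_type = 'credit' and status in ('open','applied','void'))
    )
);

This keeps the valid combinations in the table definition itself, which is fine while the rules are stable; the FK-to-a-lookup-table approach is the one to reach for when the combinations need to change without DDL.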
Re: [GENERAL] Approaches for Lookup values (codes) in OLTP application
On Fri, February 15, 2008 14:43, Scott Marlowe wrote: For something externally provided and widely used like country codes then option one is attractive and possibly the most sensible and robust solution. But consider things like transaction status codes. Perhaps an invoice transaction has five possible codes and a credit-note has only three, but one of those three is not valid for invoices. Where does one put such things? You could use a simple multi-part check constraint for that, or, if it needs to be more fluid than that, you could use some kind of multi-key table that points to a valid tx type list on a 1 to many basis, and when you insert you FK check the two values against that table. Is this to say that one should establish a table with the code as the non-unique index and then have as its dependent values the usage contexts which are applied as filters? I do not comprehend what you mean by a valid tx type list on a 1 to many basis. If employed then an fk check presumably has to resolve to a unique entry in the case of code validation. I should rather think that one should set up a uniqueness constraint for a particular code/context combination. Then one might select by code value and context as where clause parameters. Say:

CREATE TABLE system_values_table (
  value_as_char char(8),
  value_context char(30),
  PRIMARY KEY (value_as_char, value_context)
)

I understand from the PostgreSQL documentation (CREATE TABLE) that PRIMARY KEY implies UNIQUE, NOT NULL, and INDEX. Is this correct? Presuming a table entry having value_as_char = 'ACTV' and value_context = 'INVOICE' then when I do a SELECT I would pass the code value (as char) together with the context thus?

SELECT * FROM system_values WHERE value_as_char = input_code_as_char AND value_context = 'INVOICE'

I presume that the decision to place the code value first or the context value first in the primary key construct depends upon whether one foresees the need to span selects based on the context. So, for example, if I intended to provide the UI with a drop down list populated with the available codes then it would be better to have: ... PRIMARY KEY (value_context, value_as_char) ... and I could then populate the selection list with a select having the form: ... SELECT * FROM system_values WHERE value_context = 'INVOICE' ... The DBMS can then decide how to get the qualifying rows back and the index would be usable in this case, whereas if the code value came first in the composite key then the index would be useless for this query. Have I got this more or less straight? -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:[EMAIL PROTECTED] Harte Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Pains in upgrading to 8.3
paul rivers wrote: Going from 8.2.4 and 8.2.6 to 8.3.0 has been painless for me. However, unlike the blogger you cite, I read the directions before, not after, attempting it. The blogger has a point about pg_dump and restore: it could be much better. For example, the backup process could be part of the server core; instead of having a fat client where most of the process runs on the client, an API could be used where the backup is generated on the server, with options to leave it on the server or transfer it to the client's PC. Using pg_dump remotely is becoming a pain because it's not really backwards compatible with earlier releases, so you end up having to have multiple copies lying around to use on different server versions. While Firebird is mostly inferior, its backup system is much nicer than PostgreSQL's. Firebird uses a backup API, so if you back up remotely there is no fat client needed and it eliminates all the dependency issues on the client side. The client access library implements the API and that's it. You of course could hack something similar on PGSQL by using SSH and remotely executing pg_dump on the server, but that does not really help on windows servers where SSH is not a common thing. The backup data is coming back to the client regardless, so why not just return it as a result set? Just my opinion on the matter, no flames please. Thanks, Tony ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] dynamic crosstab
-Original Message- Do you think there is a way to ensure that the order of the values in the array below is the same for each person? tbl(eID, aID, value)

Select eID, array_accum(value)
from (
    (Select Distinct eID from tbl) e
    CROSS JOIN
    (Select Distinct aID from tbl) a
) ea
LEFT OUTER JOIN tbl USING (eID, aID)
GROUP BY eID

The only way to ever guarantee a particular order is via an ORDER BY clause. Sure. I just didn’t know where to put it - most aggregates don't care about the row order, but for this one it is important. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
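One place to put it on 8.x, where an aggregate call cannot yet take its own ORDER BY, is a subquery that sorts before the aggregation happens. A sketch against the tbl(eID, aID, value) layout above, assuming the array_accum aggregate from the documentation example; strictly speaking the planner is not obliged to preserve the subquery order, but in practice this is the usual workaround:

Select eID, array_accum(value)
from (
    Select eID, aID, value
    from (Select Distinct eID from tbl) e
         CROSS JOIN (Select Distinct aID from tbl) a
         LEFT OUTER JOIN tbl USING (eID, aID)
    ORDER BY eID, aID
) ordered
GROUP BY eID;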
Re: [GENERAL] Pains in upgrading to 8.3
On Fri, Feb 15, 2008 at 8:21 AM, Phoenix Kiula [EMAIL PROTECTED] wrote: I'm glad I didn't go from 8.2.3 to 8.3 straight! http://ogasawalrus.com/blog/node/462 If only he were on debian or ubuntu, he could run pg_upgradecluster and he'd have been done. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
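For anyone who hasn't seen it, the Debian/Ubuntu invocation is roughly this (a sketch; it assumes the default cluster name main and that both the 8.2 and 8.3 packages are installed side by side):

pg_upgradecluster 8.2 main

The wrapper stops the old cluster, creates an 8.3 cluster with the same configuration, dumps and reloads the data into it, and leaves the old cluster parked on another port until you remove it with pg_dropcluster.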
Re: [GENERAL] Why isn't an index being used when selecting a distinct value?
Keaton Adams [EMAIL PROTECTED] writes: Version: Postgres 8.1.4 Platform: RHEL Given this scenario with the indexes in place, when I ask for the distinct field1_id values, why does the optimizer choose a sequential scan instead of just reading from the kda_log_fid_cre_20080213_idx index? The time it takes to perform the sequential scan against 20+ million records is way too slow. Try (temporarily) doing: SET enable_seqscan = off; keaton=# explain select distinct field1_id into temp kda_temp from kda_log_20080213; If the database is right that will be even slower. Using a full index scan requires a lot of random access seeks; generally the larger the table the *more* likely a sequential scan and sort is a better approach than using an index. If it's wrong and it's faster then you have to consider whether it's only faster because you've read the table into cache already. Will it be in cache in production? If so then you might try raising effective_cache_size or lowering random_page_cost. Another thing to try is using GROUP BY instead of DISTINCT. This is one case where the postgres optimizer doesn't handle the two equivalent cases in exactly the same way and there are some plans available in one method that aren't in the other. That's only likely to help if you have relatively few values of field1_id but it's worth trying. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
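Both suggestions in command form, using the table from the original post (a sketch to experiment with, not a guaranteed win):

-- See what the index plan actually costs:
SET enable_seqscan = off;
explain analyze select distinct field1_id from kda_log_20080213;
SET enable_seqscan = on;

-- The GROUP BY formulation, which can use a HashAggregate instead of Sort+Unique:
explain analyze select field1_id from kda_log_20080213 group by field1_id;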
Re: [GENERAL] dynamic crosstab
On Fri, Feb 15, 2008 at 9:56 AM, Balázs Klein [EMAIL PROTECTED] wrote: given that answers for a questionnaire are stored as a batch Not in our setup - for all sorts of reasons (preserving responses on a connection failure or restart, monitoring response latency in real time, creating adaptive/branching questionnaires) we send each response separately. people running reports on will be the ones to notice, i.e. at retrieval time. I am not sure - different responses are aggregated into different attributes in different ways - those properties need to be retrieved during scoring/report generation, so being able to create a join directly on a response is a good thing for me. But report generation - in our case it must be a DTP quality PDF - is such a beast anyway that db times dwarf compared to pdf generation. Also, if you need to you can probably add a slony machine to your setup to run the reports on, and it doesn't matter how many reports you run, your production system will only have to run the user interfacing side. This allows for all kinds of optimizing indexing on the reporting server that you might not want to have on the production server. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Pains in upgrading to 8.3
Magnus Hagander wrote: For the case of upgrading, it wouldn't work. But there are certainly other cases where it would help. Say from your central pgadmin console administering 10 servers from 3 different major release trees :-( It can be done with commandline pg_dump, but it means you have to have three different installs on your management or backup or whatever machine. Those cases would certainly be easier if you could just call a backup API on the server that would feed you the data... (yes, there are ways to do it with ssh tunneling and whatever, but that's yet another external service that has to be set up and configured) I'm not saying it's worth the work and potential downsides, just that there are clear upsides :-) Exactly, I didn't necessarily mean the blogger had a point about upgrades in general, just that pg_dump had room for improvement. Hey maybe a backup API is something for the Google Summer of Code thing, it would be really nice to have, and make general backups much easier from a admin point of view. Later, Tony ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] Strict-typing benefits/costs
Jeff Davis wrote: If postgresql were to revert to 8.2 implicit casting behavior, would that actually improve compatibility with other DBMSs? Every DBMS probably has its own rules for implicit casting, different from every other DBMS. So are you sure it wouldn't just introduce more compatibility problems somewhere else? Or worse, it could hide the problems during migration/testing, and they could surface after you put it into production. In my opinion the autocasting behaviors of the database are probably more consistent and finely tuned than their explicit cast functions. Or at the least, they may actually *prevent* mistakes that (lay) programmers would make (by adding casts for the sake of PG). ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Strict-typing benefits/costs
Tom Lane wrote: It's possible to special-case any particular function you really feel you need this behavior for. We did special-case || (the string concatenation operator), and there was some discussion of also putting in a built-in special case for LIKE, but we desisted from sliding any further down that slippery slope. Since it's possible for users to install such hacks for themselves, as in the example here, http://archives.postgresql.org/pgsql-general/2007-11/msg00538.php there didn't seem to be a convincing case for weakening the type checking for everybody. Tom, is it accurate to assume that newer PG versions will further tighten type-strictness (say, '2008-01-01' presently being comparable to a datetime)? Also, do you know of any other vendors that are heading in this direction (removing by default the autocasts)? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
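For the record, the per-type hack referenced in that message looks roughly like this for integer-to-text (a sketch; it needs superuser rights to create objects in pg_catalog, and weakening the type checking this way can mask real errors, which is exactly why 8.3 dropped the built-in behaviour):

CREATE FUNCTION pg_catalog.text(integer) RETURNS text STRICT IMMUTABLE
    LANGUAGE sql AS 'SELECT textin(int4out($1));';
CREATE CAST (integer AS text) WITH FUNCTION pg_catalog.text(integer) AS IMPLICIT;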
Re: [GENERAL] PG quitting sporadically!!
On Fri, 15 Feb 2008, Phoenix Kiula wrote: I am not sure what checkpoint stuff means. But I added that entry, and now my log has ONLY this: LOG: test message did not get through on socket for statistics collector LOG: disabling statistics collector for lack of working socket If you're getting that, as already suggested you should be chasing down whatever is going on there before you touch anything else. I was throwing out a list of possible things that might cause your problems on a fully working system, but you don't have one of those right now and that's really strange. Have you ever run top and hit the C key to see what all the processes were doing? It labels the major PostgreSQL processes more usefully if you do that. I'd be curious just what is gobbling up resources on your machine, this socket error is somewhat disturbing. Since it sounds like a fairly critical machine you've got going here, if I were you I'd be thinking a bit about whether it might make sense to purchase an hour or two of consulting time from someone who really knows this area. You're doing the right thing asking for help here, but I wonder whether there's something else going on that would be obvious to an expert if they logged into your system and poked around a bit. (This is certainly not an ad for me--I'm not doing consulting right now). I can do that, but the upgrade process is not very simple or automated and will take a backup of the database and all that rigmarole... Is there an easy RPM method of upgrading postgresql without backing up/restoring etc? I am on CentOS 4. First off: I wouldn't want to introduce another variable here until there's a better understanding of what's wrong with your existing system. You certainly should do a true backup here oriented at disaster recovery before upgrading given the weirdness involved. But the upgrade itself doesn't require one, just installing new packages. If you've already installed the PGDG RPMs on your system (I don't know how else you'd have gotten 8.2.3 onto Centos 4 via RPM) you should be able to download the new ones for the latest 8.2, put them all into a directory, and do rpm -Uvh *.rpm to get the new ones replacing the old (with the server shut down!). *Should* only take a few minutes. I wrote a little guide to sorting through more complicated upgrades if it comes to that you can find at http://www.westnet.com/~gsmith/content/postgresql/pgrpm.htm , but if the packages are from the same underlying source it's unlikely you'll have that large of a mess. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[GENERAL] PL/PGSql function within a view definition
Due to limitations (perceived or real) within my client application I am trying to return a complex dataset which I am assembling using an expensive PL/PGSql function, which I would like to wrap in a writeable view. I have written the function as both a row-level function which returns a ROWTYPE and as a table-level function which returns a SETOF ROWTYPES. In both cases I have encountered issues. Please keep in mind that my end goal is to create a writeable view that my client app will treat as a simple table.

OPTION ONE - ROWTYPE

--this works correctly.
Select my_func(1);

--and this works correctly
Select my_table.a, my_func(my_table.a)
From my_table
Where my_table.a in (1,2,3);
--works great.

--however when i create the following view and use the following query...
Create view my_view as select my_table.a as a, my_func(my_table.a) from my_table;
Select * from my_view where a in (1,2,3);
--the function appears to be run on each row of my_table, which is not tolerable due to the size of my_table and the cost of my_func. Any suggestions on how to force the selection of my_table records prior to executing the function?

OPTION TWO - SETOF ROWTYPE

--this works correctly.
Select * from my_func2(1);

--however
Select * from my_table, my_func2(my_table.a) where my_table.a in (1,2,3);
--appears to be an illegal construct within postgres which prevents me from creating the following view.
Create view my_view as select a, b.* from my_table, my_func2(my_table.a) as b;
--to be used in the following manner
Select * from my_view where a in (1,2,3);

Any suggestions on either of these two potential solutions or suggestions as to other methods are greatly appreciated.
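For option one, one workaround worth trying is the OFFSET 0 optimization fence, which forces the filtered rows to be produced before the function is applied (a sketch; whether the planner was really calling my_func on every row should first be confirmed with EXPLAIN ANALYZE, and this does not help inside the view definition itself, since there the filter arrives from outside):

Select a, my_func(a)
From (Select a From my_table Where a in (1,2,3) OFFSET 0) AS sub;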
Re: [GENERAL] Approaches for Lookup values (codes) in OLTP application
On Feb 15, 2008 5:25 PM, Scott Marlowe [EMAIL PROTECTED] wrote: On Feb 15, 2008 3:31 PM, James B. Byrne [EMAIL PROTECTED] wrote: On Fri, February 15, 2008 14:43, Scott Marlowe wrote: For something externally provided and widely used like country codes then option one is attractive and possibly the most sensible and robust solution. But consider things like transaction status codes. Perhaps an invoice transaction has five possible codes and a credit-note has only three, but one of those three is not valid for invoices. Where does one put such things? You could use a simple multi-part check constraint for that, or, if it needs to be more fluid than that, you could use some kind of multi-key table that points to a valid tx type list on a 1 to many basis, and when you insert you FK check the two values against that table. Is this to say that one should establish a table with the code as the non-unique index and then have as its dependent values the usage contexts which are applied as filters? I do not comprehend what you mean by a valid tx type list on a 1 to many basis. If employed then an fk check presumably has to resolve to a unique entry in the case of code validation. No, I was saying you should have a multi-value key in your lookup table that gives the relation of something like: create table tx_type_check (tx_type text, codes text, primary key (tx_type, codes)); You populate it with all your possible value combinations, and then in your master table have a FK to the tx_type_check table. Does that make sense? Here's what I had in mind, a simple example:

-- Create and load the lookup table:
create table tx_type_check (tx_type text, codes text, primary key (tx_type,codes));
insert into tx_type_check values ('invoice','inv1');
insert into tx_type_check values ('invoice','inv2');
insert into tx_type_check values ('invoice','inv3');
insert into tx_type_check values ('invoice','shr1');
insert into tx_type_check values ('invoice','shr2');
insert into tx_type_check values ('credit','shr1');
insert into tx_type_check values ('credit','shr2');
insert into tx_type_check values ('credit','crd1');

-- Create a master table that references this lookup table:
create table txm (id serial primary key, tx_type text, tx_code text, foreign key (tx_type,tx_code) references tx_type_check (tx_type,codes));

-- test it
insert into txm (tx_type, tx_code) values ('invoice','inv1');
INSERT 0 1
insert into txm (tx_type, tx_code) values ('invoice','shr1');
INSERT 0 1
insert into txm (tx_type, tx_code) values ('invoice','crd1');
ERROR: insert or update on table txm violates foreign key constraint txm_tx_type_fkey
DETAIL: Key (tx_type,tx_code)=(invoice,crd1) is not present in table tx_type_check.

and we can't insert invalid combinations of the two. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Approaches for Lookup values (codes) in OLTP application
On Feb 15, 2008 3:31 PM, James B. Byrne [EMAIL PROTECTED] wrote: On Fri, February 15, 2008 14:43, Scott Marlowe wrote: For something externally provided and widely used like country codes then option one is attractive and possibly the most sensible and robust solution. But consider things like transaction status codes. Perhaps an invoice transaction has five possible codes and a credit-note has only three, but one of those three is not valid for invoices. Where does one put such things? You could use a simple multi-part check constraint for that, or, if it needs to be more fluid than that, you could use some kind of multi-key table that points to a valid tx type list on a 1 to many basis, and when you insert you FK check the two values against that table. Is this to say that one should establish a table with the code as the non-unique index and then have as its dependent values the usage contexts which are applied as filters? I do not comprehend what you mean by a valid tx type list on a 1 to many basis. If employed then an fk check presumably has to resolve to a unique entry in the case of code validation. No, I was saying you should have a multi-value key in your lookup table that gives the relation of something like: create table tx_type_check (tx_type text, codes text, primary key (tx_type, codes)); You populate it with all your possible value combinations, and then in your master table have a FK to the tx_type_check table. Does that make sense? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org/
Re: [GENERAL] pg_restore, search_path and operator class
Jozsef Szalay [EMAIL PROTECTED] writes: 8.1.2 on Linux I've got a user defined data type that has been defined in the public schema. I use pg_dump to dump a table that has a column of this type: create table myschema.mytable (id public.mytype primary key, name varchar); ... psql:mytable.dump:48: ERROR: data type public.mytype has no default operator class for access method btree Yeah, this is a known bug; it was fixed in ... umm ... 8.1.3 actually.

2006-02-10 14:01  tgl

	* src/: backend/catalog/namespace.c, backend/commands/indexcmds.c,
	backend/utils/cache/typcache.c, include/catalog/namespace.h,
	include/commands/defrem.h (REL8_1_STABLE): Change search for
	default operator classes so that it examines all opclasses
	regardless of the current schema search path. Since CREATE OPERATOR
	CLASS only allows one default opclass per datatype regardless of
	schemas, this should have minimal impact, and it fixes problems
	with failure to find a desired opclass while restoring dump files.
	Per discussion at
	http://archives.postgresql.org/pgsql-hackers/2006-02/msg00284.php.
	Remove now-redundant-or-unused code in typcache.c and namespace.c,
	and backpatch as far as 8.0.

regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] performance issues on windows with 8.3.0?
Thanks for all the help. Performance is back where I thought it should be, after I fixed our pooling bug. I didn't think that postgres would be released with performance issues like that - it's just too good :) Thanks, Dan ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [GENERAL] Are indexes blown?
On 15/02/2008, Richard Huxton [EMAIL PROTECTED] wrote: Phoenix Kiula wrote: On 15/02/2008, Richard Huxton [EMAIL PROTECTED] wrote: Are you sure the two sets of vmstat/top figures are from when PG was crashing/running queries slow? Everything seems idle to me in those figures. No. They are the vmstat figures from when I was replying to your email. What will vmstat tell me and how should I set it up to do vmstat 10 logging? I'd write a small script and call it e.g. trackusage.sh and save it in /tmp/

#!/bin/sh
while (/bin/true)
do
  date >> /tmp/vmstat_figures.txt
  vmstat 10 60 >> /tmp/vmstat_figures.txt
done

Then, set the execute flag on it and do something like: nohup /tmp/trackusage.sh Thanks Richard! The script you suggested doesn't work: tmp ./trackusage.sh -bash: ./trackusage.sh: /bin/sh: bad interpreter: Permission denied Anyway, I did the vmstat command. I was running it while the system was ok, then not ok, then ok...and so on. So I hope these numbers have captured what the issue is:

tmp vmstat 10 60
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so   bi   bo   in   cs us sy id wa
 1 0 3380 331140 114344 2992304003134 2073 2 1 93 3
 0 0 3380 414412 114352 299229600 0 100 1105 286 1 1 96 2
 0 0 3380 430356 114380 299226800 0 133 1103 280 1 1 95 3
 0 0 3380 418988 114392 299225600 2 120 1098 277 1 2 93 4
 0 0 3380 347996 114408 299224000 085 1081 134 1 0 97 2
 0 0 3380 293236 11 299220400 040 1076 138 0 0 97 2
 0 0 3380 252860 114456 299219200 080 1086 141 0 0 97 2
 0 0 3380 141340 114480 299216800 254 1078 145 1 0 97 2
 0 0 3380 119940 114504 299214400 031 1079 143 1 1 97 1
 0 0 3380 104252 114524 299212400 064 1087 182 1 1 96 2
 0 0 3380 90556 114560 299208800 059 1087 144 1 0 97 2
 0 0 3380 132476 115088 29954600052 174 1130 447 2 1 92 4
 0 1 3380 280628 115124 29956840031 220 1144 479 4 2 91 4
 0 0 3380 361340 115152 299565600 0 147 1135 338 2 1 94 3
 0 0 3380 382028 115180 299562800 2 113 1109 253 1 1 96 2
 0 0 3380 369740 115220 299558800 3 200 1107 260 1 1 93 4
 0 0 3380 323140 115248 299556000 060 1097 153 1 0 97 2
 0 0 3380 280260 115272 299553600 077 1087 133 1 0 98 1
 0 0 3380 200580 115296 299551200 265 1089 140 1 0 97 2
 0 0 3380 81916 115392 2995676001782 1089 188 2 1 94 2
 0 0 3380 16980 98072 29742560048 122 1102 190 2 1 95 3
 1 0 3380 21588 73160 29547080086 274 1128 276 2 2 88 8
 0 0 3380 52692 57860 293204800 1 128 1106 211 2 1 95 3
 0 0 3380 184748 57960 293194800 6 219 1128 451 2 1 92 5
 0 0 3380 342996 58016 293189200 0 140 1122 465 2 1 94 3
 0 0 3380 452020 58068 293210000 1 122 1114 268 1 1 95 2
 0 0 3380 478044 58132 293203600 0 106 1099 294 1 1 95 3
 0 0 3380 447540 58224 293194400 1 238 1098 319 2 2 91 5
 0 0 3380 392524 58284 293188400 071 1078 134 0 1 97 2
 0 0 3380 299684 58340 293182800 188 1079 150 1 0 97 2
 0 0 3380 231652 58388 293178000 040 1076 135 1 1 97 1
 0 0 3380 139012 58432 293173600 042 1076 145 1 0 97 2
 0 0 3380 117884 58472 293169600 167 1092 151 1 0 96 2
 0 0 3380 129460 58528 293164000 059 1097 190 1 1 96 2
 0 0 3380 179892 58584 293158400 042 1100 158 1 1 97 2
 0 0 3380 272900 58648 293152000 0 111 1114 308 1 1 95 3
 0 0 3380 399100 58704 293172400 0 132 1128 352 1 1 95 2
 0 0 3380 484556 58748 293168000 076 269 1 1 96 2
 0 0 3380 501180 58804 293188400 093 1103 249 1 1 96 2
 0 0 3380 492636 58864 293182400 0 138 1094 259 1 1 95 3
 1 1 3380 428380 58912 293203600 044 1088 142 1 0 98 1
 0 0 3380 362340 58996 293195200 145 1085 138 1 0 97 2
 0 0 3380 292708 59072 293187600 071 1082 138 1 1 97 2
 0 0 3380 179292 59172 293177600 065 1089 149 1 0 97 2
 0 0 3380 127292 59236 293171200 038 1090 149 1 0 97 1
 0 0 3380 101940 59304 2931904002272 1097 186 1 1 96 2
 0 0 3380 134068 59340 293186800 074 1100 148
[GENERAL] Query output into a space delimited/location sensitive file
What would the command be to have a query result be put into a location/space sensitive file [position 1 through 5 would be one thing where position 6 through 10 would be the next field, 11 through 16 another, etc]? Is this even possible with Postgres? ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
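One approach that works on 8.2 and later is to build the padded string in the query itself and let COPY write it out. A sketch only: the table, column names and field widths below are invented, and the COPY form shown runs as superuser and writes on the server's filesystem (from a client, \o in psql with the same SELECT gives a similar result):

COPY (
    SELECT rpad(coalesce(field_a, ''), 5)
        || rpad(coalesce(field_b, ''), 5)
        || lpad(coalesce(field_c::text, ''), 6)
    FROM mytable
) TO '/tmp/fixed_width.txt';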
Re: [GENERAL] Are indexes blown?
On 16/02/2008, Greg Smith [EMAIL PROTECTED] wrote: On Sat, 16 Feb 2008, Phoenix Kiula wrote: The script you suggested doesn't work: tmp ./trackusage.sh -bash: ./trackusage.sh: /bin/sh: bad interpreter: Permission denied Try changing the first line to #!/bin/bash Thanks Greg. Same problem with that too. I guess my tmp folder is secured and doesn't allow for executables? I put it in another folder and it's working. ...snip Looks like the worst spot was in the middle here. Something gobbled up over 300MB of memory in 40 seconds, enough to force the OS to blow away almost half its disk buffers just to keep working memory free. Not so bad that it went to swap or invoked the OOM killer but enough to push the I/O block out (bo) up. I would guess the other ugly spots were the later portions where the bo spiked >100. But without knowing more about what the processes using this memory and generating the output I/O are doing it's hard to say why. That's why I suggested you watch top with the command lines turned on for a bit, to see what process(es) are jumping around during the bad periods. Happy to do that, but top keeps changing before I can copy text from it. I think most of the connections seem to be httpd which is Apache 2.2.6. I checked the netstat commands and the server is not under DDOS or anything. My hosting provider tells me that the Postgresql server is taking up a lot of memory but I've been running the same db with the same config for over 2 years. Yes we have been growing but what happened in the last 3 days to warrant a sudden spike in memory consumption??!! Anyway, I want to go back to them with some hard data that postgresql is NOT the one that is causing the load on my server. The indexes are all in place (and I've REINDEXed my big tables anyway) so the performance of pg itself is not an issue. I just don't know where to get this hard data. The top output shows httpd on top, and sometimes postmaster, but I don't know how to repeatedly capture it. Any suggestions? ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [GENERAL] Are indexes blown?
On Sat, 16 Feb 2008, Phoenix Kiula wrote: The script you suggested doesn't work: tmp ./trackusage.sh -bash: ./trackusage.sh: /bin/sh: bad interpreter: Permission denied Try changing the first line to #!/bin/bash Anyway, I did the vmstat command. I was running it while the system was ok, then not ok, then ok...and so on. So I hope these numbers have captured what the issue is:

tmp vmstat 10 60
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so   bi   bo   in   cs us sy id wa
 0 0 3380 323140 115248 299556000 060 1097 153 1 0 97 2
 0 0 3380 280260 115272 299553600 077 1087 133 1 0 98 1
 0 0 3380 200580 115296 299551200 265 1089 140 1 0 97 2
 0 0 3380 81916 115392 2995676001782 1089 188 2 1 94 2
 0 0 3380 16980 98072 29742560048 122 1102 190 2 1 95 3
 1 0 3380 21588 73160 29547080086 274 1128 276 2 2 88 8
 0 0 3380 52692 57860 293204800 1 128 1106 211 2 1 95 3
 0 0 3380 184748 57960 293194800 6 219 1128 451 2 1 92 5
 0 0 3380 342996 58016 293189200 0 140 1122 465 2 1 94 3

Looks like the worst spot was in the middle here. Something gobbled up over 300MB of memory in 40 seconds, enough to force the OS to blow away almost half its disk buffers just to keep working memory free. Not so bad that it went to swap or invoked the OOM killer but enough to push the I/O block out (bo) up. I would guess the other ugly spots were the later portions where the bo spiked >100. But without knowing more about what the processes using this memory and generating the output I/O are doing it's hard to say why. That's why I suggested you watch top with the command lines turned on for a bit, to see what process(es) are jumping around during the bad periods. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] Are indexes blown?
On Feb 15, 2008 10:38 PM, Phoenix Kiula [EMAIL PROTECTED] wrote: My hosting provider tells me that the Postgresql server is taking up a lot of memory but I've been running the same db with the same config for over 2 years. Yes we have been growing but what happened in the last 3 days to warrant a sudden spike in memory consumption??!! OK, you've repeated this a few times. Unless your load has been the same the whole time, this statement means little. If the same config works for 2 years at load x, but fails in 1 day at load 3x then the problem might have been there all along, and you just weren't running the system hard enough to find the problem. Just because PostgreSQL is exhibiting problems doesn't mean it's all postgresql's fault. 150 or 100 connections is a LOT for a postgresql server, but especially so if you went from actually using 5 or 10 to using 98. The setting's the same, but the number is use is vastly different and will have vastly different results on how postgresql runs. Hanging connections could EASILY cause the problem you're seeing. If the network loses your connection from your app tier to your database, your database might have 100 connections open doing nothing but sitting idle in transaction holding data in memory until the tcp_keepalive kicks in and kills them. The earlier host connection errors point to that problem as well. So, do you have mysteriously crashing or disappearing apache child processes? What do the error logs for apache have to say? Can you tell what your load was when the system worked and what it is now by trawling through the logs or something? (apache or pgsql as long as their equivalent for both time periods.) If you start leaving hanging connections to the database then you are in fact DOSing the database server. Not all DOS attacks are intentional, and a crashing apache - php can do it even without persistent connections. I'd say you haven't proven where the problem is yet, and should look at the app tier. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [GENERAL] Are indexes blown?
On Sat, 16 Feb 2008, Phoenix Kiula wrote: The top output shows httpd on top, and sometimes postmaster, but I don't know how to repeatedly capture it. Any suggestions? Try this: top -bc | tee topdata That will save everything to a file called topdata while also letting you watch it scroll by. Not as easy to catch the bad periods that way, the output is going to be a huge data file, but you'll have a log to sort through of everything. Control-C to get out of there when you're bored. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend