Re: [GENERAL] Postgresql segmentation fault at slot_deform_tuple
The table definition is:

    CREATE TABLE opr_track_and_trace (
        id serial NOT NULL,
        item_no character varying(100) NOT NULL,
        carrier_account_code character varying(50) NOT NULL,
        source_id bigint,
        source_desc character varying(50),
        carrier_id character varying(20),
        carrier_desc character varying(50),
        reference_code character varying(50),
        date_processed character varying(50),
        track_and_trace_id bigint,
        consignment_no character varying(100),
        date_assigned timestamp without time zone NOT NULL,
        track_and_trace_type_id bigint,
        track_and_trace_desc character varying(100),
        event_code character varying(100),
        event_description character varying(100),
        work_type character varying(50),
        source_name character varying(50),
        login_name character varying(25),
        user_name character varying(50),
        site_code character varying(50),
        site_id character varying(50),
        site_name character varying(50),
        details character varying(255),
        note character varying(255),
        signatory character varying(100),
        signatory_date character varying(100),
        associated_xml character varying(1000),
        associated_data_id character varying(255),
        associated_data_name character varying(255),
        associated_data_barcode character varying(255),
        associated_data_text character varying(255),
        product_type_name character varying(255),
        event_transaction_id character varying(100),
        created_datetime timestamp without time zone DEFAULT now(),
        modified_datetime timestamp without time zone,
        modified_by character varying(30),
        external_group_id character varying(30),
        consignment_reference character varying(255),
        customer_account_number character varying(255),
        success_failure character varying(20),
        CONSTRAINT opr_track_and_trace_pkey PRIMARY KEY (id)
    ) WITH (OIDS=FALSE);

From: Pham Ngoc Hai pngoc...@yahoo.com
To: pgsql-general@postgresql.org
Sent: Monday, May 14, 2012 12:12 PM
Subject: [GENERAL] Postgresql segmentation fault at slot_deform_tuple

I'm running PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit, on CentOS release 6.2 (Final). Recently we encountered a postmaster segfault. Here is what the core dump gives me:

    $ gdb /usr/pgsql-9.1/bin/postmaster core.27986
    GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
    Copyright (C) 2010 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu".
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>...
    Reading symbols from /usr/pgsql-9.1/bin/postmaster...Reading symbols from
    /usr/lib/debug/usr/pgsql-9.1/bin/postgres.debug...done.
    done.
    [New Thread 27986]
    Missing separate debuginfo for
    Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install
    /usr/lib/debug/.build-id/b4/1d7cd75a11fc1c03b68d7f9a3ef1ea1edbf200
    Reading symbols from /usr/lib64/libxml2.so.2...(no debugging symbols found)...done.
    Loaded symbols for /usr/lib64/libxml2.so.2
    [same "Reading symbols ... (no debugging symbols found)" / "Loaded symbols"
     pairs for /lib64/libpam.so.0, /usr/lib64/libssl.so.10, /usr/lib64/libcrypto.so.10,
     /lib64/libgssapi_krb5.so.2, /lib64/libdl.so.2, /lib64/libm.so.6,
     /lib64/libldap-2.4.so.2, /lib64/libc.so.6, /lib64/libkrb5.so.3,
     /lib64/libcom_err.so.2, /lib64/libz.so.1, /lib64/libaudit.so.1,
     /lib64/libcrypt.so.1, /lib64/libk5crypto.so.3]
Re: [GENERAL] Non inheritable check constraint
2012/3/5 Clodoaldo Neto clodoaldo.pinto.n...@gmail.com:
> I have gone through the issue of making a parent table not writable.
> After discussing it (1), I adopted the trigger solution. But I think
> that a trigger is just an invisible layer over the database model, so
> I'm naively proposing a new syntax for PostgreSQL. It would mark the
> check constraint as not inheritable:
>
>     create table t (c integer check NOT INHERITABLE (false));

It will be in 9.2:

    - Allow CHECK constraints to be declared NO INHERIT (Nikhil Sontakke,
      Alex Hunsaker)
      This makes them enforceable only on the parent table, not on child
      tables.

I don't know if it was already there when I asked, but thank you!

I also found a solution that is visible in the model, but it only prevents non-null values from being inserted (or prevents all inserts if that column is NOT NULL):

    create table tfk (c integer unique check(false));
    create table t (c integer, foreign key (c) references tfk(c));

Regards,
Clodoaldo

(1) http://stackoverflow.com/questions/9545783/how-to-prevent-inserts-in-a-table
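For reference, a minimal sketch of how the 9.2 feature described above addresses the original goal (table names here are made up; this requires 9.2 or later):

```sql
-- The CHECK (false) applies only to the parent because of NO INHERIT,
-- so the parent rejects all inserts while children stay writable.
CREATE TABLE parent (c integer, CHECK (false) NO INHERIT);
CREATE TABLE child () INHERITS (parent);

INSERT INTO child VALUES (1);   -- succeeds
INSERT INTO parent VALUES (1);  -- fails the non-inherited check
```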
Re: [GENERAL] Postgres 9.0 Streaming Replication and Load Balancing?
Hello all!

Still haven't found any solution to this problem. Having Postgres 9.0 with asynchronous streaming replication to a hot-standby slave, both on CentOS 5.6, how can I use both DB instances for query load balancing? I've tried pgPool-II, but its pooling mechanism is disruptive with the existing pool on the application servers.

Has anyone had this issue before? Any suggestions?

Best regards,
Paulo Correia

On 13/04/12 14:38, Paulo Correia wrote:
> Hi!
>
> I have Postgres 9.0 on CentOS 5.6 with a functional asynchronous
> streaming replication to a hot-standby slave (over a TCP connection).
> At the moment the slave only performs the replication, but it accepts
> read-only queries. I need to load-balance the DB requests across both
> servers and was trying pgPool-II, but it seems to have problems if
> connection pooling already exists elsewhere. I have application servers
> that already provide their own connection pooling, and I wonder if
> anyone has found a solution for that. Imagine I have several deploys on
> a Glassfish or Tomcat server, using the server's connection-pooling
> facilities, and from there we would access the database, or the
> balancer. Has anyone managed to do this with pgPool-II? Any other
> options?
>
> Best regards,
> Paulo Correia

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] COPY from CSV, passing in default value?
Assuming I have a table structured like so:

    CREATE TABLE salaries (
        Town varchar(30),
        County varchar(30),
        Supervisor varchar(30),
        StartDate date,
        Salary int,
        Benefits int
    );

If I have a CSV with only three of those fields, I can import it like this:

    COPY salaries (Town, Supervisor, Salary)
    FROM 'C:\salaries.csv'
    WITH (FORMAT CSV);

However, what if I wanted to assign a default value during import to populate the County field? In my dreams it would be something like this (which does not work):

    COPY salaries (Town, 'County Name', Supervisor, Salary)
    FROM 'C:\salaries.csv'
    WITH (FORMAT CSV);

Any ideas? Thanks.
[GENERAL] Performance of PostgreSQL B+-tree algorithm
I spent some time last week staring at the code for the PostgreSQL B+-tree implementation. What I hoped to find, and was not immediately able to determine, was the Knuth order of the PostgreSQL B+-tree implementation. It is entirely possible that I simply got lost in the wrong C file.

My goal is to make an informed assertion about the performance of a PostgreSQL B+-tree index as the quantity of records in our database grows more or less unbounded. To use a common reference, Wikipedia states the following:

    Bayer & McCreight (1972), Comer (1979), and others define the order
    of a B-tree as the minimum number of keys in a non-root node. Folk &
    Zoellick (1992) point out that this terminology is ambiguous because
    the maximum number of keys is not clear. An order 3 B-tree might hold
    a maximum of 6 keys or a maximum of 7 keys. Knuth (1998, p. 483)
    avoids the problem by defining the order to be the maximum number of
    children (which is one more than the maximum number of keys).

    http://en.wikipedia.org/wiki/B-tree

I would be happy to refer to an academic publication if it contains a clear analysis of the PostgreSQL B+-tree implementation.

Thanks much,
--Kyle
Re: [GENERAL] COPY from CSV, passing in default value?
On 2012-05-14, at 13:31, adebarros wrote:
> However, what if I wanted to assign a default value during import to
> populate the County field? In my dreams it would be something like this
> (which does not work):
>
>     COPY salaries (Town, 'County Name', Supervisor, Salary)
>     FROM 'C:\salaries.csv'
>     WITH (FORMAT CSV);
>
> Any ideas?

Import into a temp table, fill in the default value, then copy into the final table; something like this:

    CREATE TEMPORARY TABLE salaries_import (LIKE salaries);

    COPY salaries_import (town, supervisor, salary)
    FROM '...'
    WITH (FORMAT csv);

    INSERT INTO salaries (town, county, supervisor, salary)
    SELECT town, 'County Name', supervisor, salary
    FROM salaries_import;

Hope that helps!
François Beausoleil
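Another approach, not mentioned in the thread (a sketch; the literal default value is from the question): set a DEFAULT on the County column and simply omit it from the COPY column list. COPY FROM inserts the column default for any table column not named in the list, so no staging table is needed:

```sql
-- Give the column a default, load the file, then optionally drop the default.
ALTER TABLE salaries ALTER COLUMN County SET DEFAULT 'County Name';

COPY salaries (Town, Supervisor, Salary)
FROM 'C:\salaries.csv'
WITH (FORMAT CSV);

ALTER TABLE salaries ALTER COLUMN County DROP DEFAULT;
```

This is handy when the whole file shares one value; the temp-table route remains more flexible if the filled-in value varies per row.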
Re: [GENERAL] Performance of PostgreSQL B+-tree algorithm
Kyle Lanclos lanc...@ucolick.org writes:
> I spent some time last week staring at the code for the PostgreSQL
> B+-tree implementation. What I hoped to find, and was not immediately
> able to determine, was the Knuth order for the PostgreSQL B+-tree
> implementation. [...] My goal is to make an informed assertion about
> the performance of a PostgreSQL B+-tree index as the quantity of
> records in our database grows more or less unbounded.

Well, that would depend on the data type being indexed, which you did not specify; and if it's a variable-length type then it's really hard to give a concrete answer. For integer or int8 keys the answer is typically about 400, though, depending on whether you're talking about a 32- or 64-bit platform.

Basically it's 4 bytes for line pointer, plus 8 bytes for index tuple header, plus maxalign'ed size of the index key, divided into page size (less a couple dozen bytes for page header).

You could increase the result by building with a page size of more than the default 8K, though I've seen no recent experiments suggesting that doing so is likely to be a win.

			regards, tom lane
Re: [GENERAL] Performance of PostgreSQL B+-tree algorithm
Tom Lane wrote:
> Well, that would depend on the data type being indexed, which you did
> not specify; and if it's a variable-length type then it's really hard
> to give a concrete answer.

Thanks for the quick reply; I did not appreciate that the Knuth order would vary according to the data being indexed. In my specific case, I have an index on (text, double). There are individual indexes on (text) and on (double) that are of some interest, but the main interest is the two-column index. The text column in question typically does not contain values longer than ten characters.

> Basically it's 4 bytes for line pointer, plus 8 bytes for index tuple
> header, plus maxalign'ed size of the index key, divided into page size
> (less a couple dozen bytes for page header).

So, it is the size of the index key that varies depending on the column type?

> You could increase the result by building with a page size of more
> than the default 8K, though I've seen no recent experiments suggesting
> that doing so is likely to be a win.

I'm thinking it would have to be a very large increase in page size for it to have an impact. I'm guessing you would also pay a fixed cost (log of the Knuth order) to traverse a leaf node once you get there. One could probably produce graphs showing how many records a table needs before a page-size increase starts to make sense.

--Kyle
Re: [GENERAL] Performance of PostgreSQL B+-tree algorithm
Kyle Lanclos lanc...@ucolick.org writes:
> In my specific case, I have an index on (text, double). [...] The text
> column in question typically does not contain values longer than ten
> characters.
>
> So, it is the size of the index key that varies depending on the
> column type?

Yeah. You could probably safely assume that the text column occupies at most 16 bytes (and because of the alignment requirement for the double, it's unlikely to be much less either). So that gives 4+8+16+8 = 36 bytes per index entry for this case, so you could expect to fit at least 220 or so entries per index page.

BTW, I'm unsure that that's a representative number in practice. The traditional wisdom for btree indexes on changing data is that the fill factor averages only around 2/3rds, which would mean you'd really find maybe 150 or so entries on a typical index page. Not sure if the model you're planning to use accounts for index slack space separately or not.

			regards, tom lane
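Not from the thread, but the per-page figures discussed above can be checked empirically with the pageinspect contrib module (9.1-style CREATE EXTENSION syntax; the index name is hypothetical):

```sql
-- Requires the contrib module to be installed on the server.
CREATE EXTENSION pageinspect;

-- B-tree metadata: root block, tree depth ("level"), etc.
SELECT * FROM bt_metap('my_two_col_idx');

-- Stats for one index page: live item count, average item size,
-- and remaining free space -- i.e. the actual fanout and fill factor.
SELECT live_items, avg_item_size, free_size
FROM bt_page_stats('my_two_col_idx', 1);
```

Comparing live_items against the back-of-envelope 220-entries-per-page estimate shows how much slack a real, changing index carries.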
[GENERAL] Re: Move the postgreSQL database from Drive C to Map Network Drive (Called Z)
Thanks.
Re: [GENERAL] Encryption - searching and sorting
On Thu, May 03, 2012 at 15:42:00 +0200, David Welton dav...@dedasys.com wrote:
> Thoughts?

Peter Wayner wrote a book, "Translucent Databases", that describes techniques for solving problems like this. It won't magically solve your problem, but it might give you some more ideas on how to approach it.
[GENERAL] Global Named Prepared Statements
Hi,

Does PostgreSQL support global prepared statements, which are prepared only once per server rather than once per connection? I see a discussion about this in the pgsql-hackers archives, but it does not reach any conclusion; that discussion also deviates a bit from my question by proposing to cache arbitrary statements that are used often, based on statistical analysis. Here is the original thread: http://archives.postgresql.org/pgsql-hackers/2008-03/msg01228.php

I noticed that a similar feature request was made in the MySQL community as well: http://bugs.mysql.com/bug.php?id=38732

The problem with per-connection prepared statements is that the onus of preparing the statements for each connection lies with the client that makes the connections. Ideally, the performance of an application should be determined by the server that hosts the service, not by the client that uses it. Hence, it would be great if, from connection C1, I could prepare the statement:

    PREPARE GLOBAL fooplan (int, text, bool, numeric) AS
        INSERT INTO foo VALUES ($1, $2, $3, $4);

and from connections C2..Cn execute the same statement with bind parameters:

    EXECUTE fooplan(1, 'Hunter Valley', 't', 200.00);

This would let a DBA define the important, costly, fine-tuned queries and expose only the prepared statement names to application developers. It would avoid scenarios where developers forget to prepare all the required statements for each connection, or prepare them again and again, causing errors that, if not handled properly, break functionality in unexpected places. If one could instead prepare named statements globally, once, and reuse them for the entire uptime of the server, wouldn't that be far more beneficial? If a particular prepared statement misbehaves, it can be deallocated, fixed, and prepared again.

I'm not sure whether such a feature is already implemented in Postgres, hence posting to the general mailing list; if folks feel it ought to go to the hackers list, please let me know.

Thanks and Regards,
Samba
[GENERAL] dblink.sql and Linux
I've become a big fan of dblink lately, but I'm curious where it lives on Linux installations. On my Windows 9.0.0 installation, there's a contrib\dblink.sql file that I just run and everything is set up for me. I have a few Linux installations as well (I think they're all 9.1), and there's no dblink.sql file anywhere to be found, nor does the dblink() function exist in any schema out of the box. Is there somewhere to download the installation script, or is there another method to install this support?

Thanks!
Mike
Re: [GENERAL] dblink.sql and Linux
Hi,

On Mon, 2012-05-14 at 18:05 -0700, Mike Christensen wrote:
> I've become a big fan of DBLink lately, but I'm curious where it lives
> on Linux installations.

Which Linux? Which package/installer? It mostly ships with the -contrib package.

--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
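For what it's worth, on 9.1 the run-a-SQL-script step is gone entirely: once the contrib package is installed, modules are loaded per-database with CREATE EXTENSION (a sketch; package names vary by distribution and are given as examples):

```sql
-- First install the contrib package for your distribution, e.g.
--   Debian/Ubuntu:  apt-get install postgresql-contrib-9.1
--   RHEL/CentOS:    yum install postgresql91-contrib
-- then load the module into the current database:
CREATE EXTENSION dblink;

-- The functions are now available, e.g.:
SELECT dblink_connect('myconn', 'dbname=postgres');
```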
Re: [GENERAL] Global Named Prepared Statements
Samba saas...@gmail.com writes:
> Does postgresql support Global Prepared Statements, which are prepared
> only once per server and not per every connection?

No.

			regards, tom lane
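Not from the thread, but one way to get a server-defined, globally visible "statement" today is to wrap the query in a SQL function, which every connection can call by name without preparing anything (a sketch reusing the hypothetical foo/fooplan names from the question):

```sql
CREATE TABLE foo (a int, b text, c boolean, d numeric);

-- Defined once by the DBA; visible to all connections from then on.
CREATE FUNCTION fooplan(int, text, boolean, numeric) RETURNS void AS
$$ INSERT INTO foo VALUES ($1, $2, $3, $4) $$ LANGUAGE sql;

-- Any connection, any time:
SELECT fooplan(1, 'Hunter Valley', true, 200.00);
```

This gives the encapsulation the poster wants (developers see only the function name), though not the plan-caching semantics of a true global prepared statement.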
Re: [GENERAL] dblink.sql and Linux
I just installed Ubuntu 12.04 today. Postgres was not listed in the Ubuntu Software Center, so I downloaded the apt installer from: http://www.openscg.com/se/oscg_home_download.jsp

Mike

On Mon, May 14, 2012 at 6:10 PM, Devrim GÜNDÜZ dev...@gunduz.org wrote:
> Which Linux? Which package/installer? It mostly ships with the -contrib
> package.
[GENERAL] Cannot find installers for 9.2 Beta
Today's announcement of the 9.2 beta said installers were available:

    Get PostgreSQL 9.2 beta, including binaries and installers for
    Windows, Linux and Mac from our download page:
    http://www.postgresql.org/download

But I cannot find any installers at all for the beta. Specifically I'd like Mac ones, but I don't see any. Can someone point me in the right direction?

--Basil Bourque
Re: [GENERAL] Retrieving multiple columns from a subquery
Nothing? Are subqueries just not meant to be used this way?

On Wed, May 9, 2012 at 9:42 AM, Chris Hanks christopher.m.ha...@gmail.com wrote:

Hello - I have two tables:

    CREATE TABLE users (
      id serial NOT NULL,
      created_at timestamp with time zone NOT NULL,
      last_seen_at timestamp with time zone NOT NULL,
      -- some other columns...
    )

    CREATE TABLE emails (
      user_id integer NOT NULL,
      address text NOT NULL,
      created_at timestamp with time zone NOT NULL,
      confirmed_at timestamp with time zone,
      -- some other columns...
      CONSTRAINT emails_user_id_fkey FOREIGN KEY (user_id)
        REFERENCES users (id) MATCH SIMPLE
        ON UPDATE CASCADE ON DELETE CASCADE
    )

The gist is that a single user can be related to multiple emails, and some email addresses are confirmed (they've clicked a link I've sent there, so I know it's valid) and some aren't. Routinely, when I'm fetching users from the db I also want to get the best email address for each user: that is, the email address they've confirmed most recently, or failing that, the one they created most recently. I've been doing this via a subselect:

    SELECT *,
           (SELECT address FROM emails
            WHERE (user_id = id)
            ORDER BY confirmed_at DESC NULLS LAST, created_at DESC
            LIMIT 1) AS best_email
    FROM users

I like the subquery approach because I can use my ORM to easily append it to whatever query I'm running against the users table (whether I'm looking up one user or many), without having to do an explicit join and trim out the unnecessary rows. Also, in the future I'm planning to add further subqueries to get (for example) each user's current subscription status, and I'm afraid the joins will get ungainly. Besides, I find subqueries much easier to reason about than joins. My problem is that now I need to get not only the best email's address, but whether it is confirmed (whether confirmed_at is not null).

My first attempt was to simply repeat the subquery:

    SELECT *,
           (SELECT address FROM emails
            WHERE (user_id = id)
            ORDER BY confirmed_at DESC NULLS LAST, created_at DESC
            LIMIT 1) AS best_email,
           ((SELECT confirmed_at FROM emails
             WHERE (user_id = id)
             ORDER BY confirmed_at DESC NULLS LAST, created_at DESC
             LIMIT 1) IS NOT NULL) AS best_email_confirmed
    FROM users

I had hoped Postgres would recognize that the two subqueries were identical, but judging from the EXPLAIN output on my development database it doesn't, and it's simply running the subquery twice:

    Seq Scan on users  (cost=0.00..333.65 rows=13 width=81)
      SubPlan 1
        ->  Limit  (cost=12.79..12.79 rows=1 width=48)
              ->  Sort  (cost=12.79..12.80 rows=5 width=48)
                    Sort Key: public.emails.confirmed_at, public.emails.created_at
                    ->  Bitmap Heap Scan on emails  (cost=4.29..12.76 rows=5 width=48)
                          Recheck Cond: (user_id = users.id)
                          ->  Bitmap Index Scan on emails_pkey  (cost=0.00..4.29 rows=5 width=0)
                                Index Cond: (user_id = users.id)
      SubPlan 2
        ->  Limit  (cost=12.79..12.79 rows=1 width=16)
              ->  Sort  (cost=12.79..12.80 rows=5 width=16)
                    Sort Key: public.emails.confirmed_at, public.emails.created_at
                    ->  Bitmap Heap Scan on emails  (cost=4.29..12.76 rows=5 width=16)
                          Recheck Cond: (user_id = users.id)
                          ->  Bitmap Index Scan on emails_pkey  (cost=0.00..4.29 rows=5 width=0)
                                Index Cond: (user_id = users.id)

It would be ideal if I could pull both results from the same subquery, something like:

    SELECT *,
           (SELECT address, confirmed_at IS NOT NULL FROM emails
            WHERE (user_id = id)
            ORDER BY confirmed_at DESC NULLS LAST, created_at DESC
            LIMIT 1) AS (best_email, best_email_confirmed)
    FROM users

But this isn't valid syntax. I tried putting the subquery in a FROM clause, but it won't work with my user_id = id condition, and throws ERROR: subquery in FROM cannot refer to other relations of same query level. I think CTEs might be an answer, but I'm stuck on 8.3 for the foreseeable future, which doesn't support them. Does anyone have any suggestions?
Re: [GENERAL] Retrieving multiple columns from a subquery
On Wed, May 9, 2012 at 11:42 AM, Chris Hanks christopher.m.ha...@gmail.com wrote:
> Hello - I have two tables: [...]
>
> It would be ideal if I could pull both results from the same subquery
> [...] I think CTEs might be an answer, but I'm stuck on 8.3 for the
> foreseeable future, which doesn't support them. Does anyone have any
> suggestions?

try this. sometimes, but not always, it will avoid the extra subplans:

    SELECT q.*, (q.emails).*
    FROM (
      SELECT u.*,
             (SELECT e FROM emails e
              WHERE (e.user_id = u.id)
              ORDER BY e.confirmed_at DESC NULLS LAST, e.created_at DESC
              LIMIT 1) AS emails
      FROM users u
    ) q;

merlin
Re: [GENERAL] Retrieving multiple columns from a subquery
Chris Hanks christopher.m.ha...@gmail.com writes:
> Nothing? Are subqueries just not meant to be used this way?

The SQL standard says not ;-). You could approximate it like this:

    select ..., (select row(x,y,z) from ...), ... from ...;

as long as you don't mind pulling the composite-value output syntax apart. This avoids the single-output-column syntactic restriction by cramming all the values into one column.

[ thinks for a bit... ] It seems like you ought to be able to get PG to pull the composite values apart again, with something like

    select ..., (x).*, ...
    from (select ..., (select row(x,y,z) from ...) as x, ...
          from ... offset 0) ss;

but when I try this I get

    ERROR: record type has not been registered

That's a bug, probably, but dunno how hard to fix. In the meantime you could work around it by casting the row() expression to a named composite type; which might be a good idea anyway, since there's no other obvious way to control the column names that will be exposed by the (x).* expansion.

			regards, tom lane
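A sketch of the named-composite-type workaround Tom describes, applied to the tables from this thread (the type name and aliases are made up; untested against 8.3):

```sql
-- Name the composite so (x).* expansion has registered column names.
CREATE TYPE best_email_t AS (address text, confirmed boolean);

SELECT ss.*,
       (ss.be).address   AS best_email,
       (ss.be).confirmed AS best_email_confirmed
FROM (
  SELECT u.*,
         -- One subquery produces both values as a single composite column.
         (SELECT ROW(e.address, e.confirmed_at IS NOT NULL)::best_email_t
          FROM emails e
          WHERE e.user_id = u.id
          ORDER BY e.confirmed_at DESC NULLS LAST, e.created_at DESC
          LIMIT 1) AS be
  FROM users u
  OFFSET 0   -- fence to discourage the planner from duplicating the subplan
) ss;
```

The cast to best_email_t sidesteps the "record type has not been registered" error and fixes the output column names in one step.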