Re: [GENERAL] Postgresql segmentation fault at slot_deform_tuple

2012-05-14 Thread Pham Ngoc Hai
The table definition is:

CREATE TABLE opr_track_and_trace
(
  id serial NOT NULL,
  item_no character varying(100) NOT NULL,
  carrier_account_code character varying(50) NOT NULL,
  source_id bigint,
  source_desc character varying(50),
  carrier_id character varying(20),
  carrier_desc character varying(50),
  reference_code character varying(50),
  date_processed character varying(50),
  track_and_trace_id bigint,
  consignment_no character varying(100),
  date_assigned timestamp without time zone NOT NULL,
  track_and_trace_type_id bigint,
  track_and_trace_desc character varying(100),
  event_code character varying(100),
  event_description character varying(100),
  work_type character varying(50),
  source_name character varying(50),
  login_name character varying(25),
  user_name character varying(50),
  site_code character varying(50),
  site_id character varying(50),
  site_name character varying(50),
  details character varying(255),
  note character varying(255),
  signatory character varying(100),
  signatory_date character varying(100),
  associated_xml character varying(1000),
  associated_data_id character varying(255),
  associated_data_name character varying(255),
  associated_data_barcode character varying(255),
  associated_data_text character varying(255),
  product_type_name character varying(255),
  event_transaction_id character varying(100),
  created_datetime timestamp without time zone DEFAULT now(),
  modified_datetime timestamp without time zone,
  modified_by character varying(30),
  external_group_id character varying(30),
  consignment_reference character varying(255),
  customer_account_number character varying(255),
  success_failure character varying(20),
  CONSTRAINT opr_track_and_trace_pkey PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);









 From: Pham Ngoc Hai pngoc...@yahoo.com
To: pgsql-general@postgresql.org
Sent: Monday, May 14, 2012 12:12 PM
Subject: [GENERAL] Postgresql segmentation fault at slot_deform_tuple
 

I'm running PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc
(GCC) 4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit, on CentOS release 6.2
(Final).
Recently we encountered a postmaster segmentation fault; here is what the core dump gives me:
===
gdb /usr/pgsql-9.1/bin/postmaster core.27986
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-50.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /usr/pgsql-9.1/bin/postmaster...Reading symbols from /usr/lib/debug/usr/pgsql-9.1/bin/postgres.debug...done.
done.
[New Thread 27986]
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debuginfo' install /usr/lib/debug/.build-id/b4/1d7cd75a11fc1c03b68d7f9a3ef1ea1edbf200
Reading symbols from /usr/lib64/libxml2.so.2...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libxml2.so.2
Reading symbols from /lib64/libpam.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libpam.so.0
Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libssl.so.10
Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libcrypto.so.10
Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgssapi_krb5.so.2
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libldap-2.4.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libldap-2.4.so.2
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /lib64/libaudit.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libaudit.so.1
Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols 

Re: [GENERAL] Non inheritable check constraint

2012-05-14 Thread Clodoaldo Neto
2012/3/5 Clodoaldo Neto clodoaldo.pinto.n...@gmail.com

 I have gone through the issue of making a parent table not writable. After
 discussing it (1) I adopted the trigger solution. But I think that a
 trigger is just an invisible layer over the database model and so I'm
 naively proposing a new syntax to postgresql. It would inform that the
 check constraint is not inheritable:

 create table t (c integer check NOT INHERITABLE (false));


It will be in 9.2:

   - Allow CHECK constraints to be declared NO INHERIT (Nikhil Sontakke,
     Alex Hunsaker)

     This makes them enforceable only on the parent table, not on child
     tables.

I don't know if it was already there when I asked, but thank you!

Clodoaldo

I found a solution that is visible in the model, but it only prevents
 non-null values from being inserted; it prevents all inserts only if the
 column is declared NOT NULL:

 create table tfk (c integer unique check(false));
 create table t (c integer, foreign key (c) references tfk(c));

 Regards, Clodoaldo

 (1)
 http://stackoverflow.com/questions/9545783/how-to-prevent-inserts-in-a-table
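
For reference, the 9.2 syntax places NO INHERIT after the check expression.
A minimal sketch of the feature described in the release-note item quoted
above:

-- The constraint blocks inserts into the parent but is not copied to
-- children:
CREATE TABLE t (
    c integer,
    CONSTRAINT t_no_insert CHECK (false) NO INHERIT
);
CREATE TABLE t_child () INHERITS (t);

INSERT INTO t VALUES (1);        -- fails: violates t_no_insert
INSERT INTO t_child VALUES (1);  -- succeeds: constraint is not inherited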



Re: [GENERAL] Postgres 9.0 Streaming Replication and Load Balancing?

2012-05-14 Thread Paulo Correia

Hello all!

Still haven't found any solution to this problem.

Having a Postgres 9.0 with asynchronous streaming replication to a 
hot-standby slave, both on CentOS 5.6, how can I use both DB instances 
for query load balancing?
I've tried pgPool-II, but its pooling mechanism conflicts with the 
existing connection pools on the application servers.


Has anyone had this issue before? Any suggestions?

Best regards,
Paulo Correia


On 13/04/12 14:38, Paulo Correia wrote:

Hi!

I have a Postgres 9.0 on CentOS 5.6 with a functional asynchronous 
streaming replication to a hot-standby slave (over a TCP connection).


At the moment, the slave only makes the replication, but it accepts 
read-only queries.


I need to load-balance the DB requests to both servers and was trying 
pgPool-II but it seems it has some problems if we already have 
connection pooling elsewhere.
I have some application servers that already have their own connection 
pool functionalities and I wonder if anyone found a solution for that.


Imagine I have some different deploys on a Glassfish or a Tomcat 
server, using the server connection pooling facilities and from that 
we would access the database, or the balancer.


Has anyone managed to do this with pgPool-II? Any other options?

Best regards,
Paulo Correia
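
For anyone facing the same setup, below is a minimal pgpool.conf sketch for
streaming-replication load balancing. Host names are hypothetical and the
parameter names are from pgpool-II 3.x, so verify them against your version;
setting connection_cache = off is one way to leave pooling to the
application tier rather than doubling it up in pgpool.

# pgpool.conf sketch: balance reads over a primary and a streaming standby.
listen_addresses = '*'
port = 9999
backend_hostname0 = 'primary.example.com'    # hypothetical host
backend_port0 = 5432
backend_weight0 = 1
backend_hostname1 = 'standby.example.com'    # hypothetical host
backend_port1 = 5432
backend_weight1 = 1
master_slave_mode = on             # writes always go to the primary
master_slave_sub_mode = 'stream'   # built-in streaming replication
load_balance_mode = on             # distribute SELECTs across backends
connection_cache = off             # leave connection pooling to the apps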







[GENERAL] COPY from CSV, passing in default value?

2012-05-14 Thread adebarros
Assuming I have a table structured like so:

CREATE TABLE salaries (
Town varchar(30),
County varchar(30),
Supervisor varchar(30),
StartDate date,
Salary int,
Benefits int
);

If I have a CSV with only three of those fields, I can import like this:

COPY salaries (Town, Supervisor, Salary)
FROM 'C:\salaries.csv'
WITH (FORMAT CSV);

However, what if I wanted to assign a default value during import to
populate the County field? In my dreams it would be something like this
(which does not work):

COPY salaries (Town, 'County Name', Supervisor, Salary)
FROM 'C:\salaries.csv'
WITH (FORMAT CSV);

Any ideas?

Thanks.




[GENERAL] Performance of PostgreSQL B+-tree algorithm

2012-05-14 Thread Kyle Lanclos
I spent some time last week staring at the code for the PostgreSQL
B+-tree implementation. What I hoped to find, and was not immediately
able to determine, was the Knuth order for the PostgreSQL B+-tree
implementation. It is entirely possible that I simply got lost in the
wrong C file.

My goal is to make an informed assertion about the performance of
a PostgreSQL B+-tree index as the quantity of records in our database
grows more or less unbounded.

To use a common reference, wikipedia states the following:

Bayer & McCreight (1972), Comer (1979), and
others define the order of B-tree as the
minimum number of keys in a non-root node.
Folk & Zoellick (1992) points out that terminology
is ambiguous because the maximum number of keys
is not clear. An order 3 B-tree might hold a
maximum of 6 keys or a maximum of 7 keys.
(Knuth 1998, p. 483) avoids the problem by defining
the order to be maximum number of children (which
is one more than the maximum number of keys).

http://en.wikipedia.org/wiki/B-tree

I would be happy to refer to an academic publication if it contains a
clear analysis of the PostgreSQL B+-tree implementation.

Thanks much,

--Kyle



Re: [GENERAL] COPY from CSV, passing in default value?

2012-05-14 Thread François Beausoleil

Le 2012-05-14 à 13:31, adebarros a écrit :

 However, what if I wanted to assign a default value during import to
 populate the County field? In my dreams it would be something like this
 (which does not work):
 
 COPY salaries (Town, 'County Name', Supervisor, Salary)
 FROM 'C:\salaries.csv'
 WITH (FORMAT CSV);
 
 Any ideas?

Import to a temp table, fill in the default value, then copy to the final 
table, something like this:

CREATE TEMPORARY TABLE salaries_import (LIKE salaries);
COPY salaries_import (town, supervisor, salary)
FROM '...'
WITH (FORMAT csv);
INSERT INTO salaries (town, county, supervisor, salary)
  SELECT town, 'County Name', supervisor, salary
  FROM salaries_import;

Hope that helps!
François Beausoleil
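
Another approach that avoids the staging table: COPY fills columns that are
missing from the column list with their defaults, so a temporary column
default also works. A sketch:

ALTER TABLE salaries ALTER COLUMN county SET DEFAULT 'County Name';
COPY salaries (town, supervisor, salary)
FROM 'C:\salaries.csv'
WITH (FORMAT CSV);
-- Remove the default once the import is done:
ALTER TABLE salaries ALTER COLUMN county DROP DEFAULT;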

Re: [GENERAL] Performance of PostgreSQL B+-tree algorithm

2012-05-14 Thread Tom Lane
Kyle Lanclos lanc...@ucolick.org writes:
 I spent some time last week staring at the code for the PostgreSQL
 B+-tree implementation. What I hoped to find, and was not immediately
 able to determine, was the Knuth order for the PostgreSQL B+-tree
 implementation. It is entirely possible that I simply got lost in the
 wrong C file.

 My goal is to make an informed assertion about the performance of
 a PostgreSQL B+-tree index as the quantity of records in our database
 grows more or less unbounded.

 To use a common reference, wikipedia states the following:

   Bayer & McCreight (1972), Comer (1979), and
   others define the order of B-tree as the
   minimum number of keys in a non-root node.
   Folk & Zoellick (1992) points out that terminology
   is ambiguous because the maximum number of keys
   is not clear. An order 3 B-tree might hold a
   maximum of 6 keys or a maximum of 7 keys.
   (Knuth 1998, p. 483) avoids the problem by defining
   the order to be maximum number of children (which
   is one more than the maximum number of keys).

Well, that would depend on the data type being indexed, which you did
not specify; and if it's a variable-length type then it's really hard to
give a concrete answer.  For integer or int8 keys the answer is
typically about 400, though, depending on whether you're talking about
a 32- or 64-bit platform.  Basically it's 4 bytes for line pointer,
plus 8 bytes for index tuple header, plus maxalign'ed size of the index
key, divided into page size (less a couple dozen bytes for page header).
You could increase the result by building with a page size of more than
the default 8K, though I've seen no recent experiments suggesting that
doing so is likely to be a win.

regards, tom lane
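
Worked through for the int8 case, a sketch of the arithmetic above (the
roughly 40 bytes of combined page header and btree special space is an
assumption):

-- 4 (line pointer) + 8 (index tuple header) + 8 (maxaligned int8 key)
-- = 20 bytes per entry, divided into an 8K page less header overhead:
SELECT (8192 - 40) / (4 + 8 + 8) AS approx_entries_per_page;  -- about 407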



Re: [GENERAL] Performance of PostgreSQL B+-tree algorithm

2012-05-14 Thread Kyle Lanclos
Tom Lane wrote:
 Well, that would depend on the data type being indexed, which you did
 not specify; and if it's a variable-length type then it's really hard to
 give a concrete answer.

Thanks for the quick reply; I did not appreciate that the Knuth order
would vary according to the data being indexed.

In my specific case, I have an index on (text, double). There are individual
indexes on (text) and (double) that are of some interest, but the main
interest is the two-column index.

The text column in question typically does not contain values longer than
ten characters.

 Basically it's 4 bytes for line pointer, plus 8 bytes for index tuple
 header, plus maxalign'ed size of the index key, divided into page size
 (less a couple dozen bytes for page header).

So, it is the size of the index key that varies depending on the column
type?

 You could increase the result by building with a page size of more than
 the default 8K, though I've seen no recent experiments suggesting that
 doing so is likely to be a win.

I'm thinking it would have to be a very large increase in page size for
it to have an impact. I'm guessing you would also pay a fixed cost
(log (Knuth order)) to traverse a leaf node once you get there. One can
probably produce graphs that show how many records one needs in a database
table before the page size increase starts to make sense.

--Kyle



Re: [GENERAL] Performance of PostgreSQL B+-tree algorithm

2012-05-14 Thread Tom Lane
Kyle Lanclos lanc...@ucolick.org writes:
 Tom Lane wrote:
 Well, that would depend on the data type being indexed, which you did
 not specify; and if it's a variable-length type then it's really hard to
 give a concrete answer.

 In my specific case, I have an index on (text, double). There are individual
 indexes on (text) and (double) that are of some interest, but the main
 interest is the two-column index.
 The text column in question typically does not contain values longer than
 ten characters.

 Basically it's 4 bytes for line pointer, plus 8 bytes for index tuple
 header, plus maxalign'ed size of the index key, divided into page size
 (less a couple dozen bytes for page header).

 So, it is the size of the index key that varies depending on the column
 type?

Yeah.  You could probably safely assume that the text column occupies at
most 16 bytes (and because of the alignment requirement for the double,
it's unlikely to be much less either).  So that gives 4+8+16+8 = 36
bytes per index entry for this case, so you could expect to fit at least
220 or so entries per index page.

BTW, I'm unsure that that's a representative number in practice.  The
traditional wisdom for btree indexes on changing data is that the fill
factor averages only around 2/3rds, which would mean you'd really find
maybe 150 or so entries on a typical index page.  Not sure if the model
you're planning to use accounts for index slack space separately or not.

regards, tom lane
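
In numbers, a sketch of the two estimates above (again assuming roughly 40
bytes of per-page overhead):

SELECT (8192 - 40) / 36          AS packed_entries,   -- about 226
       (8192 - 40) / 36 * 2 / 3  AS typical_entries;  -- about 150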



[GENERAL] Re: Move the postgreSQL database from Drive C to Map Network Drive (Called Z)

2012-05-14 Thread EllyR
Thanks.



Re: [GENERAL] Encryption - searching and sorting

2012-05-14 Thread Bruno Wolff III

On Thu, May 03, 2012 at 15:42:00 +0200,
  David Welton dav...@dedasys.com wrote:


Thoughts?


Peter Wayner wrote a book, "Translucent Databases", that has some
techniques for helping solve problems like this. It won't magically solve
your problem, but it might give you some more ideas on how to approach it.



[GENERAL] Global Named Prepared Statements

2012-05-14 Thread Samba
Hi,

Does PostgreSQL support Global Prepared Statements, which are prepared only
once per server rather than once per connection?

I see a discussion about this in the pgsql-hackers archives, but it does not
reach any conclusion; further, that discussion also deviates a bit from my
question by proposing to cache arbitrary statements that are used often,
based on statistical analysis.

here is the original thread :
http://archives.postgresql.org/pgsql-hackers/2008-03/msg01228.php

I noticed that a similar feature request is made in mysql community as
well; link here: http://bugs.mysql.com/bug.php?id=38732

The problem with per-connection prepared statements is that the onus of
preparing those statements for each connection lies with the client that
makes the connections. Ideally, the performance of an application should be
determined by the server that hosts the service, not by the client that
uses it.

Hence, it would be great if, from a connection C1, I could prepare the
statement:


PREPARE GLOBAL fooplan (int, text, bool, numeric) AS
INSERT INTO foo VALUES($1, $2, $3, $4);


And From Connections C2-Cn, I can execute the same statement with bind
parameters:


EXECUTE fooplan(1, 'Hunter Valley', 't', 200.00);

This would help the DBA define the important, costly, fine-tuned queries
and expose only the prepared statement names to the application developers.
It would avoid scenarios where developers forget to prepare all the
required statements for each connection, or prepare the same statement
repeatedly, which raises errors and, if not handled properly, may break
functionality in unexpected places.

Rather, if one could prepare named statements globally once and then reuse
them through the entire uptime of the server, would that not be a lot more
beneficial?

If it is observed that a particular prepared statement is not behaving
properly, then it can be deallocated and fixed and then prepared again.


I'm not sure whether such a feature is already implemented in Postgres,
hence posting to the general mailing list; if folks feel it ought to go to
the hackers list, please guide me accordingly.

Thanks and Regards,
Samba
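
The closest existing workaround is to centralize the statement text in a
server-side function: plans are still cached per backend rather than
globally, but the definition lives on the server. A sketch using the
fooplan example above (assuming table foo exists):

CREATE FUNCTION fooplan(int, text, bool, numeric) RETURNS void AS $$
    INSERT INTO foo VALUES ($1, $2, $3, $4);
$$ LANGUAGE sql;

-- From any connection C2..Cn:
SELECT fooplan(1, 'Hunter Valley', 't', 200.00);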





[GENERAL] dblink.sql and Linux

2012-05-14 Thread Mike Christensen
I've become a big fan of DBLink lately, but I'm curious where it lives
on Linux installations.

On my Windows 9.0.0 installation, there's a contrib\dblink.sql file
that I just run and everything is setup for me.

I have a few Linux installations as well (I think they're all 9.1
though) and there's no dblink.sql file anywhere to be found, nor does
the dblink() function exist in any schema out of the box.  Is there
somewhere to download the installation script, or is there another
method to install this support?  Thanks!

Mike



Re: [GENERAL] dblink.sql and Linux

2012-05-14 Thread Devrim GÜNDÜZ
Hi,

On Mon, 2012-05-14 at 18:05 -0700, Mike Christensen wrote:
 I've become a big fan of DBLink lately, but I'm curious where it lives
 on Linux installations.

Which Linux? Which package/installer?

It mostly ships with the -contrib package.

-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz
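
A likely explanation for the missing dblink.sql: in 9.1 the contrib SQL
scripts were replaced by the extension mechanism, so once the contrib
package is installed (for example "postgresql-contrib-9.1" on Debian/Ubuntu;
package names vary by distro) the module is enabled per database. A sketch:

CREATE EXTENSION dblink;
-- Quick check that the functions are now available:
SELECT dblink_connect('conn1', 'host=localhost dbname=postgres');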




Re: [GENERAL] Global Named Prepared Statements

2012-05-14 Thread Tom Lane
Samba saas...@gmail.com writes:
 Does postgresql support Global Prepared Statements, which are prepared only
 once per server and not per every connection?

No.

regards, tom lane



Re: [GENERAL] dblink.sql and Linux

2012-05-14 Thread Mike Christensen
I just installed Ubuntu 12.04 today.  Postgres was not listed in the
Ubuntu Software Center, so I downloaded the apt installer from:

http://www.openscg.com/se/oscg_home_download.jsp

Mike

On Mon, May 14, 2012 at 6:10 PM, Devrim GÜNDÜZ dev...@gunduz.org wrote:
 Hi,

 On Mon, 2012-05-14 at 18:05 -0700, Mike Christensen wrote:
 I've become a big fan of DBLink lately, but I'm curious where it lives
 on Linux installations.

 Which Linux? Which package/installer?

 It mostly ships with the -contrib package.

 --
 Devrim GÜNDÜZ
 Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
 PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
 Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
 http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz



[GENERAL] Cannot find installers for 9.2 Beta

2012-05-14 Thread Basil Bourque
Today’s announcement of 9.2 beta said installers were available…
—
Get PostgreSQL 9.2 beta, including binaries and installers for Windows, Linux 
and Mac from our download page: http://www.postgresql.org/download
—

But I cannot find any installers at all for the beta. Specifically I’d like Mac 
ones, but I don’t see any.

Can someone point me in the right direction?

—Basil Bourque





Re: [GENERAL] Retrieving multiple columns from a subquery

2012-05-14 Thread Chris Hanks
Nothing? Are subqueries just not meant to be used this way?


On Wed, May 9, 2012 at 9:42 AM, Chris Hanks
christopher.m.ha...@gmail.com wrote:

 Hello -

 I have two tables:

 CREATE TABLE users
 (
   id serial NOT NULL,
   created_at timestamp with time zone NOT NULL,
   last_seen_at timestamp with time zone NOT NULL,
   -- some other columns...
 )

 CREATE TABLE emails
 (
   user_id integer NOT NULL,
   address text NOT NULL,
   created_at timestamp with time zone NOT NULL,
   confirmed_at timestamp with time zone,
   -- some other columns...
   CONSTRAINT emails_user_id_fkey FOREIGN KEY (user_id)
   REFERENCES users (id) MATCH SIMPLE
   ON UPDATE CASCADE ON DELETE CASCADE
 )

 The gist is that a single user can be related to multiple emails, and some
 email addresses are confirmed (they've clicked a link I've sent there, so I
 know it's valid) and some aren't.

 Routinely, when I'm fetching users from the db I also want to get the best
 email address for each user. That is, the email address that they've
 confirmed the most recently, or failing that, the one that they created the
 most recently. I've been doing this via a subselect:

 SELECT *,
   (SELECT address FROM emails WHERE (user_id = id) ORDER BY
 confirmed_at DESC NULLS LAST, created_at DESC LIMIT 1) AS best_email
 FROM users

 I like the subquery approach because I can use my ORM to easily append it
 to whatever query I'm running against the users table (whether I'm looking
 up one user or many), without having to do an explicit join and trim out
 the unnecessary rows. Also, in the future I'm planning on adding additional
 subqueries to get (for example) each user's current subscription status,
 and I'm afraid that the joins will get ungainly. Besides, I find subqueries
 much easier to reason about than joins.

 My problem is that now I need to get not only the best email's address,
 but whether it is confirmed (whether confirmed_at is not null). My first
 attempt was to simply repeat the subquery:

 SELECT *,
   (SELECT address FROM emails WHERE (user_id = id) ORDER BY
 confirmed_at DESC NULLS LAST, created_at DESC LIMIT 1) AS best_email,
   ((SELECT confirmed_at FROM emails WHERE (user_id = id) ORDER BY
 confirmed_at DESC NULLS LAST, created_at DESC LIMIT 1) IS NOT NULL) AS
 best_email_confirmed
 FROM users

 I had hoped Postgres would recognize that the two subqueries were
 identical, but judging from the explain output from my development database
 it's not, and it's simply running the subquery twice instead:

 Seq Scan on users  (cost=0.00..333.65 rows=13 width=81)
   SubPlan 1
     ->  Limit  (cost=12.79..12.79 rows=1 width=48)
           ->  Sort  (cost=12.79..12.80 rows=5 width=48)
                 Sort Key: public.emails.confirmed_at, public.emails.created_at
                 ->  Bitmap Heap Scan on emails  (cost=4.29..12.76 rows=5 width=48)
                       Recheck Cond: (user_id = users.id)
                       ->  Bitmap Index Scan on emails_pkey  (cost=0.00..4.29 rows=5 width=0)
                             Index Cond: (user_id = users.id)
   SubPlan 2
     ->  Limit  (cost=12.79..12.79 rows=1 width=16)
           ->  Sort  (cost=12.79..12.80 rows=5 width=16)
                 Sort Key: public.emails.confirmed_at, public.emails.created_at
                 ->  Bitmap Heap Scan on emails  (cost=4.29..12.76 rows=5 width=16)
                       Recheck Cond: (user_id = users.id)
                       ->  Bitmap Index Scan on emails_pkey  (cost=0.00..4.29 rows=5 width=0)
                             Index Cond: (user_id = users.id)

 It would be ideal if I could pull both results from the same subquery,
 something like:

 SELECT *,
   (SELECT address, confirmed_at IS NOT NULL FROM emails WHERE
 (user_id = id) ORDER BY confirmed_at DESC NULLS LAST, created_at
 DESC LIMIT 1) AS (best_email, best_email_confirmed)
 FROM users

 But this isn't valid syntax. I tried putting the subquery under a FROM
 clause, but it won't work with my user_id = id condition, and throws
 ERROR:  subquery in FROM cannot refer to other relations of same query
 level. I think CTEs might be an answer, but I'm stuck on 8.3 for the
 foreseeable future, which doesn't support them.

 Does anyone have any suggestions?



Re: [GENERAL] Retrieving multiple columns from a subquery

2012-05-14 Thread Merlin Moncure
On Wed, May 9, 2012 at 11:42 AM, Chris Hanks
christopher.m.ha...@gmail.com wrote:
 Hello -

 [... long problem statement snipped; it is quoted in full in the original
 message above ...]

 It would be ideal if I could pull both results from the same subquery,
 something like:

 SELECT *,
   (SELECT address, confirmed_at IS NOT NULL FROM emails WHERE
 (user_id = id) ORDER BY confirmed_at DESC NULLS LAST, created_at
 DESC LIMIT 1) AS (best_email, best_email_confirmed)
 FROM users

 But this isn't valid syntax. I tried putting the subquery under a FROM
 clause, but it won't work with my user_id = id condition, and throws
 ERROR:  subquery in FROM cannot refer to other relations of same query
 level. I think CTEs might be an answer, but I'm stuck on 8.3 for the
 foreseeable future, which doesn't support them.

 Does anyone have any suggestions?

try this.  sometimes, but not always, it will avoid the extra subplans:

SELECT u.*, (emails).* FROM
(
   SELECT *,
       (SELECT e FROM emails e WHERE (user_id = id) ORDER BY
           confirmed_at DESC NULLS LAST, created_at DESC LIMIT 1) AS emails
   FROM users u
) q

merlin



Re: [GENERAL] Retrieving multiple columns from a subquery

2012-05-14 Thread Tom Lane
Chris Hanks christopher.m.ha...@gmail.com writes:
 Nothing? Are subqueries just not meant to be used this way?

The SQL standard says not ;-).

You could approximate it like this:

select ..., (select row(x,y,z) from ...), ... from ...;

as long as you don't mind pulling the composite-value output syntax
apart.  This avoids the single-output-column syntactic restriction
by cramming all the values into one column.

[ thinks for a bit... ]  It seems like you ought to be able to get PG
to pull the composite values apart again, with something like

select ..., (x).*, ... from
(select ..., (select row(x,y,z) from ...) as x, ...
 from ... offset 0) ss;

but when I try this I get
ERROR:  record type has not been registered
That's a bug, probably, but dunno how hard to fix.  In the meantime you
could work around it by casting the row() expression to a named
composite type; which might be a good idea anyway since there's no other
obvious way to control the column names that will be exposed by the
(x).* expansion.

regards, tom lane
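
Applied to the schema upthread, the named-composite-type suggestion would
look something like the sketch below (the type name best_email_t is just
for illustration):

-- A named composite type makes the row expandable, avoiding the
-- "record type has not been registered" error for anonymous row() values.
CREATE TYPE best_email_t AS (address text, confirmed boolean);

SELECT ss.id, ss.created_at, ss.last_seen_at,
       (be).address   AS best_email,
       (be).confirmed AS best_email_confirmed
FROM (
    SELECT u.*,
        (SELECT ROW(e.address, e.confirmed_at IS NOT NULL)::best_email_t
         FROM emails e
         WHERE e.user_id = u.id
         ORDER BY e.confirmed_at DESC NULLS LAST, e.created_at DESC
         LIMIT 1) AS be
    FROM users u
    OFFSET 0
) ss;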
