[SQL] is there a distinct function for comma lists ?

2010-09-07 Thread Andreas

 Hi,
is there a distinct function for comma separated lists ?

I sometimes need to update tables where I got a set of IDs, like:

update mytable
   set someattribute = 42
where mytable.id in
(  1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

So there are double entries in the list but in this case its just 
overhead but no problem.


But for calculated values this would not allways be desirable.

update mytable
   set someattribute = someattribute + 1
where mytable.id in
(  1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

How could I get a distinct list? Those lists can have 2000-3000 IDs 
sometimes.


One solution was as follows but perhaps there is something more elegant?

update mytable
   set someattribute = someattribute + 1
where mytable.id in
 ( select distinct id from mytable where id in (  1, 2, 3, 5, 7, 11, 3, 
6, 13, 13, 3, 11 ... ) )



And as bonus ... is there a way to find IDs that are in the list but not 
in the table without creating a temporary table and use a join?


--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] is there a distinct function for comma lists ?

2010-09-07 Thread Andreas Gaab
Hi,

For the problem 1 perhaps something like

select distinct unnest(ARRAY[ 1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ])

Regards,
Andreas

-Ursprüngliche Nachricht-
Von: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] Im 
Auftrag von Andreas
Gesendet: Dienstag, 7. September 2010 13:52
An: pgsql-sql@postgresql.org
Betreff: [SQL] is there a distinct function for comma lists ?

  Hi,
is there a distinct function for comma separated lists ?

I sometimes need to update tables where I got a set of IDs, like:

update mytable
set someattribute = 42
where mytable.id in
(  1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

So there are double entries in the list but in this case its just 
overhead but no problem.

But for calculated values this would not allways be desirable.

update mytable
set someattribute = someattribute + 1
where mytable.id in
(  1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

How could I get a distinct list? Those lists can have 2000-3000 IDs 
sometimes.

One solution was as follows but perhaps there is something more elegant?

update mytable
set someattribute = someattribute + 1
where mytable.id in
  ( select distinct id from mytable where id in (  1, 2, 3, 5, 7, 11, 3, 
6, 13, 13, 3, 11 ... ) )


And as bonus ... is there a way to find IDs that are in the list but not 
in the table without creating a temporary table and use a join?

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


[SQL] Sequential scan evaluating function for each row, seemingly needlessly

2010-09-07 Thread Bryce Nesbitt
On psql 8.3.9, I ran a limited query limited to 5 results.  There was a 
moderately expensive function call
which I expected to be called 5 times, but was apparently called for 
each row of the sequential scan.  Why?




preproduction=> explain analyze select url(context_key) from extractq 
order by add_date desc limit 5;

  QUERY PLAN
---
 Limit  (cost=19654.53..19654.54 rows=5 width=12) (actual 
time=10001.976..10001.990 rows=5 loops=1)
   ->  Sort  (cost=19654.53..19826.16 rows=68651 width=12) (actual 
time=10001.972..10001.976 rows=5 loops=1)

 Sort Key: add_date
 Sort Method:  top-N heapsort  Memory: 25kB
 ->  Seq Scan on extractq  (cost=0.00..18514.26 rows=68651 
width=12) (actual time=19.145..9770.689 rows=73550 loops=1)

 Total runtime: 10002.150 ms
(6 rows)


preproduction=> explain analyze select context_key from extractq order 
by add_date desc limit 5;

 QUERY PLAN

 Limit  (cost=2491.78..2491.79 rows=5 width=12) (actual 
time=250.188..250.203 rows=5 loops=1)
   ->  Sort  (cost=2491.78..2663.41 rows=68651 width=12) (actual 
time=250.184..250.188 rows=5 loops=1)

 Sort Key: add_date
 Sort Method:  top-N heapsort  Memory: 25kB
 ->  Seq Scan on extractq  (cost=0.00..1351.51 rows=68651 
width=12) (actual time=0.015..145.432 rows=73557 loops=1)

 Total runtime: 250.450 ms
(6 rows)




preproduction=> select version();
version

 PostgreSQL 8.3.9 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real 
(Debian 4.3.2-1.1) 4.3.2



--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] Sequential scan evaluating function for each row, seemingly needlessly

2010-09-07 Thread Tom Lane
Bryce Nesbitt  writes:
> On psql 8.3.9, I ran a limited query limited to 5 results.  There was a 
> moderately expensive function call
> which I expected to be called 5 times, but was apparently called for 
> each row of the sequential scan.  Why?

Given the plan:

>   Limit  (cost=19654.53..19654.54 rows=5 width=12) (actual 
> time=10001.976..10001.990 rows=5 loops=1)
> ->  Sort  (cost=19654.53..19826.16 rows=68651 width=12) (actual 
> time=10001.972..10001.976 rows=5 loops=1)
>   Sort Key: add_date
>   Sort Method:  top-N heapsort  Memory: 25kB
>   ->  Seq Scan on extractq  (cost=0.00..18514.26 rows=68651 
> width=12) (actual time=19.145..9770.689 rows=73550 loops=1)
>   Total runtime: 10002.150 ms
> (6 rows)

any interesting work is going to be done at the seqscan level.  Sort
just sorts, and Limit just limits; neither do any user-defined
calculations.  So yeah, your functions got run for every row of the
table.  (This isn't totally a PG aberration, btw: if you read the SQL
spec closely you'll discover that ORDER BY is defined to happen after
any calculations specified in the SELECT list.)

You could try something like

select my_expensive_function(...), etc, etc from
(select * from some-tables order by foo limit n) ss;

where the inner select list just pulls the columns you'll need in
the outer calculations.

regards, tom lane

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql


Re: [SQL] is there a distinct function for comma lists ?

2010-09-07 Thread Lew

On 09/07/2010 07:52 AM, Andreas wrote:

Hi,
is there a distinct function for comma separated lists ?

I sometimes need to update tables where I got a set of IDs, like:

update mytable
set someattribute = 42
where mytable.id in
( 1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

So there are double entries in the list but in this case its just
overhead but no problem.

But for calculated values this would not allways be desirable.

update mytable
set someattribute = someattribute + 1
where mytable.id in
( 1, 2, 3, 5, 7, 11, 3, 6, 13, 13, 3, 11 ... )

How could I get a distinct list? Those lists can have 2000-3000 IDs
sometimes.

One solution was as follows but perhaps there is something more elegant?

update mytable
set someattribute = someattribute + 1
where mytable.id in
( select distinct id from mytable where id in ( 1, 2, 3, 5, 7, 11, 3, 6,
13, 13, 3, 11 ... ) )


I am not clear on what you're asking here.  From what you say, there's nothing 
to do.  The two forms of the SQL you show have the same result.


The fact that 11 or 13 or whatever appear in the IN list more than once 
doesn't affect the result of the query; 13 is in the IN list no matter how 
many times (> 0) that 13 appears in the IN list.  So a row from mytable with 
id=13 is selected regardless.  It's not like the row will be selected more 
than once.


From the manual:
'The result of IN is "true" if any equal subquery row is found.'


It's still true of more than one equal subquery row is found.  It's not true 
multiple times, it's just true.


If mytable.id is not unique, then every row with that value will be selected, 
but adding DISTINCT to the IN list won't change that either.


--
Lew

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql