[SQL] SELECT DISTINCT

2008-02-19 Thread Shavonne Marietta Wijesinghe
Hello

I have 2 records with the same value. Even when i do a select with DISTINCT, it 
returns me both the records instead of one.

SELECT DISTINCT ON (TE_COGNOME) TE_COGNOME, TE_NOME, N_GEN_TEST, TE_SESSO, 
TE_ATTNASC, TE_LUONASC, TE_INDI, TE_DTNASC, TE_PROVSTATO,  TE_PROV, 
TE_PATERNITA, TE_RICHIESTA FROM MOD48_02 WHERE TE_COGNOME LIKE 'WIJ%' AND 
TRIM(DELETED) IS NULL ORDER BY TE_COGNOME, N_GEN_TEST DESC;

What should I do??


Thanks

Shavonne



[SQL] Am I wasting my time with partitions?

2008-02-19 Thread Stuart Brooks
It seems to me that postgresql doesn't use indexes when being asked for 
an ordered result sets from a partitioned table. I have an application 
where this is critical, but I was hoping to use partitions because of 
the ease of rotating out old rows.


Simply put, I have a table called LineItems which I need to be able to 
page from and so I need to be able to ask for N rows ordered on a 
certain index (with possible constraints).


eg. SELECT * FROM T ORDER BY col1,col2 LIMIT 10;

This works fine and is quick on a single table:


>>>
metadb=> \d lineitems
 Table "test2.lineitems"
   Column|  Type  | Modifiers
--++---
lineitem_key | bigint | not null
time | timestamp(6) without time zone | not null
description  | text   | not null
barcode  | text   | not null
amount   | bigint | not null
Indexes:
   "lineitems_amount_index" btree (amount, lineitem_key)

metadb=> explain select * from lineitems order by amount,lineitem_key 
limit 10;

  QUERY PLAN

Limit  (cost=0.00..0.74 rows=10 width=49)
  ->  Index Scan using lineitems_amount_index on lineitems  
(cost=0.00..39791.76 rows=535500 width=49)

(2 rows)
>>>



If I partition the table by creating a top level table L, and inherited 
tables L1, L2 and issue the same request it does sequential scans on all 
the tables and takes orders of magnitude longer (see below).


In the example below I would have hoped that it would have used an index 
scan on each of the tables returning 10 rows each and then done a merge 
on them. Am I asking too much? Should I just use a single table and take 
the hits on deletes and vacuums?


Regards
Stuart



>>>
metadb=> \d L
 Table "test2.l"
   Column|  Type  | Modifiers
--++---
lineitem_key | bigint | not null
time | timestamp(6) without time zone | not null
description  | text   | not null
barcode  | text   | not null
amount   | bigint | not null
Indexes:
   "l_amount_index" btree (amount, lineitem_key)

metadb=> \d L1
Table "test2.l1"
   Column|  Type  | Modifiers
--++---
lineitem_key | bigint | not null
time | timestamp(6) without time zone | not null
description  | text   | not null
barcode  | text   | not null
amount   | bigint | not null
Indexes:
   "l1_amount_index" btree (amount, lineitem_key)
Inherits: l

metadb=> \d L2
Table "test2.l2"
   Column|  Type  | Modifiers
--++---
lineitem_key | bigint | not null
time | timestamp(6) without time zone | not null
description  | text   | not null
barcode  | text   | not null
amount   | bigint | not null
Indexes:
   "l2_amount_index" btree (amount, lineitem_key)
Inherits: l

metadb=> explain select * from l order by amount,lineitem_key limit 10;
QUERY PLAN
-
Limit  (cost=22207.70..22207.72 rows=10 width=88)
  ->  Sort  (cost=22207.70..23548.09 rows=536156 width=88)
Sort Key: test2.l.amount, test2.l.lineitem_key
->  Result  (cost=0.00..10621.56 rows=536156 width=88)
  ->  Append  (cost=0.00..10621.56 rows=536156 width=88)
->  Seq Scan on l  (cost=0.00..16.90 rows=690 width=88)
->  Seq Scan on l1 l  (cost=0.00..4951.00 
rows=25 width=49)
->  Seq Scan on l2 l  (cost=0.00..5653.66 
rows=285466 width=49)

(8 rows)


NB. Just addressing one of the inherited tables works fine.

metadb=> explain select * from l1 order by amount,lineitem_key limit 10;
   QUERY PLAN
--
Limit  (cost=0.00..0.74 rows=10 width=49)
  ->  Index Scan using l1_amount_index on l1  (cost=0.00..18554.20 
rows=25 width=49)

(2 rows)
>>>

---(end of broadca

[SQL] UPDATE with ORDER BY

2008-02-19 Thread Robins Tharakan
Hi,

I know this kind of a question is asked earlier, but I couldn't find an
answer there (in the previous round of posting).

Instead of wanting to update the first record in an UPDATE .. ORDER BY
condition, (because of triggers that act downward) what I want is that all
records be updated, but in a given order, for e.g. ascending on date.

Is a FOR LOOP my best bet or is something like UPDATE  ORDER BY in the
offing ?

Thanks
*Robins*


Re: [SQL] Am I wasting my time with partitions?

2008-02-19 Thread Richard Huxton

Stuart Brooks wrote:
It seems to me that postgresql doesn't use indexes when being asked for 
an ordered result sets from a partitioned table. I have an application 
where this is critical, but I was hoping to use partitions because of 
the ease of rotating out old rows.



metadb=> explain select * from l order by amount,lineitem_key limit 10;
QUERY PLAN
->  Seq Scan on l  (cost=0.00..16.90 rows=690 width=88)
->  Seq Scan on l1 l  (cost=0.00..4951.00 
rows=25 width=49)
->  Seq Scan on l2 l  (cost=0.00..5653.66 
rows=285466 width=49)



NB. Just addressing one of the inherited tables works fine.

metadb=> explain select * from l1 order by amount,lineitem_key limit 10;


Well, you don't have an index it can use to find the smallest 
(amount,lineitem) across all of lX. If PG was smart enough to figure out 
that it only needed to check l1, then you do. Unfortunately it isn't.


If you add the constraint you use to partition by, does that help you?

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [SQL] Am I wasting my time with partitions?

2008-02-19 Thread Stuart Brooks


It seems to me that postgresql doesn't use indexes when being asked 
for an ordered result sets from a partitioned table. I have an 
application where this is critical, but I was hoping to use partitions 
because of the ease of rotating out old rows.



metadb=> explain select * from l order by amount,lineitem_key limit 10;
QUERY PLAN
->  Seq Scan on l  (cost=0.00..16.90 rows=690 
width=88)
->  Seq Scan on l1 l  (cost=0.00..4951.00 
rows=25 width=49)
->  Seq Scan on l2 l  (cost=0.00..5653.66 
rows=285466 width=49)



NB. Just addressing one of the inherited tables works fine.

metadb=> explain select * from l1 order by amount,lineitem_key limit 10;


Well, you don't have an index it can use to find the smallest 
(amount,lineitem) across all of lX. If PG was smart enough to figure 
out that it only needed to check l1, then you do. Unfortunately it isn't.


You're right, it can't determine which of the partitions will have the 
smallest value, but what it could do is pull the smallest value from 
each and compare. In the absence of the LIMIT there wouldn't be much 
which could be done, but the the LIMIT means it only actually needs to 
pull 10 rows from each partition. An alternative way of doing this would be:


(SELECT * FROM L1 ORDER BY amount,lineitem_key LIMIT 10)
UNION
(SELECT * FROM L2 ORDER BY amount,lineitem_key LIMIT 10)
ORDER BY amount,lineitem_key LIMIT 10;

Unfortunately this means one can't just address the parent table, but it 
does essentially what I'd hoped postgres would do for me :) It would be 
quite a long query if there were 100 partitions!

If you add the constraint you use to partition by, does that help you?

I tried to strip the example down to its bare essentials but in this 
case I would be partitioning by lineitem_key and would obviously index 
and add a CONSTRAINT on that as well. I don't think it would help 
though, the query needs to merge from all tables.


Thanks for the response,
Stuart


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [SQL] SELECT DISTINCT

2008-02-19 Thread Richard Huxton

Shavonne Marietta Wijesinghe wrote:

Hello

I have 2 records with the same value. Even when i do a select with
DISTINCT, it returns me both the records instead of one.

SELECT DISTINCT ON (TE_COGNOME) TE_COGNOME, TE_NOME, N_GEN_TEST,
TE_SESSO, TE_ATTNASC, TE_LUONASC, TE_INDI, TE_DTNASC, TE_PROVSTATO,
TE_PROV, TE_PATERNITA, TE_RICHIESTA FROM MOD48_02 WHERE TE_COGNOME
LIKE 'WIJ%' AND TRIM(DELETED) IS NULL ORDER BY TE_COGNOME, N_GEN_TEST
DESC;


Can you provide:
1. The output of your query (just the duplicate rows will be fine)
2. The output of "\d MOD48_02"?
3. The output of:
   SELECT xmin,xmax, ':' || TE_COGNOME || ':' FROM MOD48_02
   WHERE TE_COGNOME = ???
   For the duplicate value of course

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org