[GENERAL] How to match sets?

2009-09-13 Thread Alban Hertroys

Greetings!

I'm having some troubles creating a query, or rather, I can write one  
that works but the approach feels wrong! The problem at hand boils  
down to finding a record in a group where each result of two result- 
sets matches on some columns.


The actual data I need to match isn't directly from tables but both  
sides of the equation are the results of a set-returning function that  
breaks up a unit string into separate tokens (base-unit  exponent).


An example of the two sets I need to join are, at the left hand side:
 unit  | token | exponent
---+---+--
m.s^-1 | m | 1
m.s^-1 | s | -1
m.s^-2 | m | 1
m.s^-2 | s | -2

And at the right hand side:
 token | exponent
---+--
 m | 1
 s | -2

The goal of the query is to find which unit at the left hand side  
matches all the tokens and exponents at the right hand side, which  
would be 'm.s^-2' in the above example. The order in which the tokens  
are returned can be random, there isn't really a defined order as it  
doesn't change the meaning of a unit.


I do have a possible solution using array_accum [1][2] on an ordered  
version (on unit,token,exponent) of these sets. It's not a pretty  
solution though, I'm not happy with it - it's a transformation (from a  
set to an array) where I feel none should be necessary. Isn't there a  
better solution?


To illustrate, I'd prefer to perform a query somewhat like this:

SELECT unit
  FROM unit, tokenize_unit('m.s^-2') AS token
 WHERE each(unit.token) = each(token.token)
 GROUP BY unit;

But I'm pretty sure it's not possible to use aggregates in the WHERE- 
clause.


Definitions for the above are:

CREATE TYPE unit_token AS (
unittext,
exponentint
);

CREATE OR REPLACE FUNCTION tokenize_unit(unit text)
RETURNS SETOF unit_token
AS '@MODULE_PATH@', 'tokenize_unit_text'
LANGUAGE C IMMUTABLE STRICT;

CREATE TABLE token (
unittextNOT NULL REFERENCES unit,
token   unit_token NOT NULL
);

[1] array_accum is an aggregate from the documentation that transforms  
a set into an array.
[2] The SRF's actually return a type unit_token(token text, exponent  
int) which makes using array_accum and comparisons easier.


Regards,
Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4aacebc413788472316367!



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to match sets?

2009-09-13 Thread Harald Fuchs
In article c07f9bfd-5fc1-4b8b-ba87-c8bdc47d0...@solfertje.student.utwente.nl,
Alban Hertroys dal...@solfertje.student.utwente.nl writes:

 An example of the two sets I need to join are, at the left hand side:
  unit  | token | exponent
 ---+---+--
 m.s^-1 | m | 1
 m.s^-1 | s | -1
 m.s^-2 | m | 1
 m.s^-2 | s | -2

 And at the right hand side:
  token | exponent
 ---+--
  m | 1
  s | -2

 The goal of the query is to find which unit at the left hand side
 matches all the tokens and exponents at the right hand side, which
 would be 'm.s^-2' in the above example. The order in which the tokens
 are returned can be random, there isn't really a defined order as it
 doesn't change the meaning of a unit.

 I do have a possible solution using array_accum [1][2] on an ordered
 version (on unit,token,exponent) of these sets. It's not a pretty
 solution though, I'm not happy with it - it's a transformation (from a
 set to an array) where I feel none should be necessary. Isn't there a
 better solution?

Hm, how about a double negation, i.e. return all units except those
with a non-match?  In SQL:

SELECT t1.unit
FROM t1
EXCEPT
SELECT t1.unit
FROM t1
LEFT JOIN t2 ON t2.token = t1.token AND t2.exponent = t1.exponent
WHERE t2.token IS NULL


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to match sets?

2009-09-13 Thread Ludwig Kniprath

Hello,
this query on the two tables you suggested (named test_left and
test_right) returns the correct result without transformations:

select distinct
   t1.unit
from
   test_left as t1 inner join
   test_left as t2 on t1.unit = t2.unit and t1.token != t2.token and
t1.exponent != t2.exponent

   inner join test_right as t3 on t1.token = t3.token and t1.exponent =
t3.exponent
   inner join test_right as t4 on t2.token = t4.token and t2.exponent =
t4.exponent;

Regards
Ludwig Kniprath


Alban Hertroys schrieb:

Greetings!

I'm having some troubles creating a query, or rather, I can write one 
that works but the approach feels wrong! The problem at hand boils 
down to finding a record in a group where each result of two 
result-sets matches on some columns.


The actual data I need to match isn't directly from tables but both 
sides of the equation are the results of a set-returning function that 
breaks up a unit string into separate tokens (base-unit  exponent).


An example of the two sets I need to join are, at the left hand side:
 unit  | token | exponent
---+---+--
m.s^-1 | m | 1
m.s^-1 | s | -1
m.s^-2 | m | 1
m.s^-2 | s | -2

And at the right hand side:
 token | exponent
---+--
 m | 1
 s | -2

The goal of the query is to find which unit at the left hand side 
matches all the tokens and exponents at the right hand side, which 
would be 'm.s^-2' in the above example. The order in which the tokens 
are returned can be random, there isn't really a defined order as it 
doesn't change the meaning of a unit.


I do have a possible solution using array_accum [1][2] on an ordered 
version (on unit,token,exponent) of these sets. It's not a pretty 
solution though, I'm not happy with it - it's a transformation (from a 
set to an array) where I feel none should be necessary. Isn't there a 
better solution?


To illustrate, I'd prefer to perform a query somewhat like this:

SELECT unit
  FROM unit, tokenize_unit('m.s^-2') AS token
 WHERE each(unit.token) = each(token.token)
 GROUP BY unit;

But I'm pretty sure it's not possible to use aggregates in the 
WHERE-clause.


Definitions for the above are:

CREATE TYPE unit_token AS (
unittext,
exponentint
);

CREATE OR REPLACE FUNCTION tokenize_unit(unit text)
RETURNS SETOF unit_token
AS '@MODULE_PATH@', 'tokenize_unit_text'
LANGUAGE C IMMUTABLE STRICT;

CREATE TABLE token (
unittextNOT NULL REFERENCES unit,
tokenunit_token NOT NULL
);

[1] array_accum is an aggregate from the documentation that transforms 
a set into an array.
[2] The SRF's actually return a type unit_token(token text, exponent 
int) which makes using array_accum and comparisons easier.


Regards,
Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4aacebc413788472316367!







--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general