[GENERAL] How to match sets?
Greetings! I'm having some troubles creating a query, or rather, I can write one that works but the approach feels wrong! The problem at hand boils down to finding a record in a group where each result of two result- sets matches on some columns. The actual data I need to match isn't directly from tables but both sides of the equation are the results of a set-returning function that breaks up a unit string into separate tokens (base-unit exponent). An example of the two sets I need to join are, at the left hand side: unit | token | exponent ---+---+-- m.s^-1 | m | 1 m.s^-1 | s | -1 m.s^-2 | m | 1 m.s^-2 | s | -2 And at the right hand side: token | exponent ---+-- m | 1 s | -2 The goal of the query is to find which unit at the left hand side matches all the tokens and exponents at the right hand side, which would be 'm.s^-2' in the above example. The order in which the tokens are returned can be random, there isn't really a defined order as it doesn't change the meaning of a unit. I do have a possible solution using array_accum [1][2] on an ordered version (on unit,token,exponent) of these sets. It's not a pretty solution though, I'm not happy with it - it's a transformation (from a set to an array) where I feel none should be necessary. Isn't there a better solution? To illustrate, I'd prefer to perform a query somewhat like this: SELECT unit FROM unit, tokenize_unit('m.s^-2') AS token WHERE each(unit.token) = each(token.token) GROUP BY unit; But I'm pretty sure it's not possible to use aggregates in the WHERE- clause. Definitions for the above are: CREATE TYPE unit_token AS ( unittext, exponentint ); CREATE OR REPLACE FUNCTION tokenize_unit(unit text) RETURNS SETOF unit_token AS '@MODULE_PATH@', 'tokenize_unit_text' LANGUAGE C IMMUTABLE STRICT; CREATE TABLE token ( unittextNOT NULL REFERENCES unit, token unit_token NOT NULL ); [1] array_accum is an aggregate from the documentation that transforms a set into an array. [2] The SRF's actually return a type unit_token(token text, exponent int) which makes using array_accum and comparisons easier. Regards, Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4aacebc413788472316367! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How to match sets?
In article c07f9bfd-5fc1-4b8b-ba87-c8bdc47d0...@solfertje.student.utwente.nl, Alban Hertroys dal...@solfertje.student.utwente.nl writes: An example of the two sets I need to join are, at the left hand side: unit | token | exponent ---+---+-- m.s^-1 | m | 1 m.s^-1 | s | -1 m.s^-2 | m | 1 m.s^-2 | s | -2 And at the right hand side: token | exponent ---+-- m | 1 s | -2 The goal of the query is to find which unit at the left hand side matches all the tokens and exponents at the right hand side, which would be 'm.s^-2' in the above example. The order in which the tokens are returned can be random, there isn't really a defined order as it doesn't change the meaning of a unit. I do have a possible solution using array_accum [1][2] on an ordered version (on unit,token,exponent) of these sets. It's not a pretty solution though, I'm not happy with it - it's a transformation (from a set to an array) where I feel none should be necessary. Isn't there a better solution? Hm, how about a double negation, i.e. return all units except those with a non-match? In SQL: SELECT t1.unit FROM t1 EXCEPT SELECT t1.unit FROM t1 LEFT JOIN t2 ON t2.token = t1.token AND t2.exponent = t1.exponent WHERE t2.token IS NULL -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] How to match sets?
Hello, this query on the two tables you suggested (named test_left and test_right) returns the correct result without transformations: select distinct t1.unit from test_left as t1 inner join test_left as t2 on t1.unit = t2.unit and t1.token != t2.token and t1.exponent != t2.exponent inner join test_right as t3 on t1.token = t3.token and t1.exponent = t3.exponent inner join test_right as t4 on t2.token = t4.token and t2.exponent = t4.exponent; Regards Ludwig Kniprath Alban Hertroys schrieb: Greetings! I'm having some troubles creating a query, or rather, I can write one that works but the approach feels wrong! The problem at hand boils down to finding a record in a group where each result of two result-sets matches on some columns. The actual data I need to match isn't directly from tables but both sides of the equation are the results of a set-returning function that breaks up a unit string into separate tokens (base-unit exponent). An example of the two sets I need to join are, at the left hand side: unit | token | exponent ---+---+-- m.s^-1 | m | 1 m.s^-1 | s | -1 m.s^-2 | m | 1 m.s^-2 | s | -2 And at the right hand side: token | exponent ---+-- m | 1 s | -2 The goal of the query is to find which unit at the left hand side matches all the tokens and exponents at the right hand side, which would be 'm.s^-2' in the above example. The order in which the tokens are returned can be random, there isn't really a defined order as it doesn't change the meaning of a unit. I do have a possible solution using array_accum [1][2] on an ordered version (on unit,token,exponent) of these sets. It's not a pretty solution though, I'm not happy with it - it's a transformation (from a set to an array) where I feel none should be necessary. Isn't there a better solution? To illustrate, I'd prefer to perform a query somewhat like this: SELECT unit FROM unit, tokenize_unit('m.s^-2') AS token WHERE each(unit.token) = each(token.token) GROUP BY unit; But I'm pretty sure it's not possible to use aggregates in the WHERE-clause. Definitions for the above are: CREATE TYPE unit_token AS ( unittext, exponentint ); CREATE OR REPLACE FUNCTION tokenize_unit(unit text) RETURNS SETOF unit_token AS '@MODULE_PATH@', 'tokenize_unit_text' LANGUAGE C IMMUTABLE STRICT; CREATE TABLE token ( unittextNOT NULL REFERENCES unit, tokenunit_token NOT NULL ); [1] array_accum is an aggregate from the documentation that transforms a set into an array. [2] The SRF's actually return a type unit_token(token text, exponent int) which makes using array_accum and comparisons easier. Regards, Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4aacebc413788472316367! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general