Dan,

Hi, I remember reading about an SQL query type which did something like
select all 'aircraft hangers' which contained exactly (or at least) some
given set of aircraft.

Or did it select the list of pilots qualified to fly all the airplanes in
the hanger...
Anyway, I forget the syntax (and the fancy name for this kind of query).


I think the concept you're after is relational division. The aeroplane hangar was one of Celko's examples. Another from him is at http://www.artfulsoftware.com/queries.php#28, other examples at http://www.artfulsoftware.com/queries.php#22, http://www.artfulsoftware.com/queries.php#33,

PB


Dan Bolser wrote:

Hi, I remember reading about an SQL query type which did something like
select all 'aircraft hangers' which contained exactly (or at least) some
given set of aircraft.

Or did it select the list of pilots qualified to fly all the airplanes in
the hanger...
Anyway, I forget the syntax (and the fancy name for this kind of query).

What I want to do is the following, given this data...


Table: ATTRIBUTE_LIST;

ID      ATTRIBUTE
W       A
W       B
W       C
X       A
X       B
X       C
Y       A
Y       B
Y       C
Y       D
Z       E

--> SQL MAGIC -->

Table: CLUSTERS

G_ID    ID
1       W
1       X
2       Y
3       Z


That is, to group together all ID's with the same 'set' of ATTRIBUTES.

Currently I am doing this using 'GROUP_CONCAT', but my attribute list just
went above the limit for the GROUP_CONCAT column...

+---------+------+--------------------------------------+
| Level   | Code | Message                              |
+---------+------+--------------------------------------+
| Warning | 1260 | 7 line(s) were cut by GROUP_CONCAT() |
+---------+------+--------------------------------------+

My query looks roughly like this...


SET @i:=0, @x:='', @row:='';
#
DROP    TABLE CLUSTERS;
CREATE  TABLE CLUSTERS
 (PRIMARY KEY (ID), INDEX (G_ID))
#
SELECT
 ID, G_ID
 #
FROM
(
 SELECT
   ID,
   #
   @x:= ATTR_LIST               AS HIDDEN1,
   #
   IF(@row = @x, @i, @i:[EMAIL PROTECTED]) AS G_ID,
   #
      @row:= @x                 AS HIDDEN2
   #
FROM ( SELECT ID,
     GROUP_CONCAT(ATTRIBUTE) AS ATTR_LIST,
   FROM
     ATTRIBUTE_LIST
   GROUP BY
     ID
   #
 ) AS vt1
 #
 ORDER BY          -- This is very important for
   ATTR_LIST       -- the overall query.
 #
) AS vt2;

(And thats the highly simplified version!)


I can't shake the feeling that this 'string based' approach (while quite
speedy) is inherently messy, and that a proper 'set based' approach
should exist, and shouldn't have the limitation in the number of
attributes that the above method has.

In general I would really like to (somehow) develop a suite of easy to use
'SQL CLUSTER' commands, as the data mining community needs that kind of
thing in nice general (set based) abundance :)

Anyway, thanks for any feedback on any of the above,

Dan.




--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.323 / Virus Database: 267.8.15/49 - Release Date: 7/14/2005


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to