Bravo! Ken. Great explaination!  -  Shawn

                                                                                       
                                  
                      Michael Stassen                                                  
                                  
                      <[EMAIL PROTECTED]        To:       Ken Easson <[EMAIL 
PROTECTED]>                                   
                      erizon.net>               cc:       [EMAIL PROTECTED]            
                              
                                                Fax to:                                
                                  
                      06/14/2004 11:16          Subject:  Re: using a column value in 
IN() in a join condition.          
                      PM                                                               
                                  
                                                                                       
                                  
                                                                                       
                                  





Ken Easson wrote:

> Hello,
>
> I am trying to retrieve a cross join of two tables. Table one contains an
id
> column, table two contains a column that can list up to three id's from
> table one.
>
> SELECT dix_ondemand_genre.id, dix_ondemand_shows.genre,
>        id IN(dix_ondemand_shows.genre) as test
> FROM dix_ondemand_genre CROSS JOIN dix_ondemand_shows
> ON id IN(dix_ondemand_shows.genre);
>
> This select finds only the first value of the IN() column, however i want
to

That's not how IN works.  IN expects a comma separated list of values to
compare.  One column = one value.  Because id is an INT, mysql expects each

value in IN's list to be an INT.  The only value in the list for the first
row is '1,2,6', which evaluates as 1 when interpreted as an INT, because
everything starting with the first non-numeric character (the comma) is
ignored.

> return every possible result for all genres (Cartesian product):

This isn't really a Cartesian Product, as you don't want every row in one
table joined with every row in the other.  You do have a join condition,
it's just a little complicated.

> to illustrate - here is a sample output of what i do get:
>
> id | genre
> 1  | 1,2,6
> 3  | 3,6
> 4  | 4,6
>
> But this is what i want to have:
> 1 | 1,2,6
> 2 | 1,2,6
> 3 | 3,6
> 4 | 4,6
> 5 | NULL
> 6 | 1,2,6
> 6 | 3,6
> 6 | 4,6

I expect that FIND_IN_SET() will do what you are trying to with IN:

   SELECT genre.id, shows.genre
   FROM dix_ondemand_genre genre, dix_ondemand_shows shows
   WHERE FIND_IN_SET(genre.id, shows.genre);

This will work, but it will be slow, as there is no way mysql can use use
an
index on shows.genre to help do the job.

> actually - what i really want is to be able to group the id column to
count
> how many genres are returned for each id, however the group by isn't the
> problem... I've tried quite a number of Join types, all to no avail.

Wait, do you mean for each show id?  That's the opposite of the above.  For

each genre in the genres table, we will get a row for each show in the
shows
table which has that genre in shows.genre.  Adding COUNT and GROUP BY will
get a count of how many shows are in each genre:

   SELECT genre.id, count(*)
   FROM dix_ondemand_genre genre, dix_ondemand_shows shows
   WHERE FIND_IN_SET(genre.id, shows.genre)
   GROUP BY genre.id;

Again, this will be slow.

If you really want the number of genres per show, you need to count the
commas in shows.genre and add one.  You don't need a join.

> any help would be great.
>
> ken easson

The problems you're having here are a good example of why this isn't the
best way to organize your data.

It seems you have something like

  Table genres
   id    name    ...
    1    comedy
    2    drama

  Table shows
   id    name    ...    genre
    1                   1,2,6
    2                   3,6

(I dropped "dix_ondemand_" to save typing.)

Databases work best when there is one value in each column, but you are
cramming up to three values in the shows.genre column.  You have a
many-to-many relationship between genres and shows.  Each genre is
represented by several shows, and each show may have up to 3 genres.  This
is properly handled by a third, relationship table.

Create a new table show_genres with 2 columns, show_id and genre_id.
Create
one row for each show-genre relationship:

   show_id  genre_id
      1        1
      1        2
      1        6
      2        3
      2        6

This table replaces the genre list column in your shows table.

How many shows are in each genre?

   SELECT g.id, g.name, COUNT(*) FROM genres g, show_genres sg
   WHERE g.id = sg.genre_id
   GROUP BY g.id;

How many genres are in each show?

   SELECT s.id, s.name, COUNT(*) FROM shows s, show_genres sg
   WHERE s.id = sg.show_id
   GROUP BY s.id;

Which shows are comedies?

   SELECT s.id, s.name
   FROM show s, genres g, show_genres sg
   WHERE s.id = sg.show_id
   AND g.id = sg.genre_id
   AND g.name = 'comedy';

Which genres are represented by "The Show"?

   SELECT g.name
   FROM show s, genres g, show_genres sg
   WHERE s.id = sg.show_id
   AND g.id = sg.genre_id
   AND s.name = 'The Show';

And so on.  Best of all, these queries will be quick, because they will be
able to use the indexes you put on the relevant columns.

Michael



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]







-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to