RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-18 Thread hsv
 2013/06/13 23:08 +, Rick James 
FIND_IN_SET might work the cleanest...
WHERE FIND_IN_SET('action', genres) OR/AND [NOT] ...

And have genres look like 'action,drama,foobar', that is comma-separators, and 
no need for leading/trailing comma.
That would also work for genres = '1,3,10,19,38' and FIND_IN_SET('19', genres)

And you seem no fan of named BITs (SET), either. *sigh*


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-13 Thread hsv
 2013/06/11 12:59 -0700, Daevid Vincent 
Also, just for SG this is how we are currently implementing it, but we feel
the REGEXP is killing our queries and while clever is a bit hacky and
nullifies any indexes we have on the genres column as it requires a
file_sort table scan to compare substrings basically...

SELECT * FROM scene_all_genres WHERE scene_id = 17;

scene_id  genres  
  
  17  1|3|10|19|38|53|58|59|  

SELECT * FROM scene_all_genres WHERE scene_id = 11;

scene_id  genres 
  ---
  11  1|10|19|31|32|59|  

Except that, it seems to me, it somehow reflects the reality of assigning 
attributes to the scenes (movies?) that you catalog. In a way, it looks very 
much like using a bitstring wherin each place stands for one attribute. If you 
then have also a bitstring for each user s likes and one for rows (peeves), 
telling howmany 1s are at the same place for the genres and liking (bit-AND, 
MySQL  followed by telling the number of 1s), and same for the genres and 
the row or peeve yields a number howmany match for liking, and how many match 
for becoming peeved. If the liking is enough greater than the becoming peeved, 
the scene and the user match.

Unhappily, although this, using bitstring for set of attributes to match, is an 
old and well understood topic, MySQL s support for bitstrings is poor, limited 
to integers (as C is so limited)--that is, to 64 bits. If you have more, you 
have to use more words. There is, furthermore, no function for telling 
howmany 1s (or 0s) there are in an integer.

Now, if, in a more perfect world, MySQL had bitstring, and, furthermore, MySQL 
s SET were mapped onto bitstring, where it belongs, you could not only use bit 
operations (MySQL s  | ^), but also name the bits as you like.

The problem with writing one s own bit-telling function is, of course, time, 
and hiding useful information from the optimizer. In any case, here is a 
function for it, using an old well worn trick that depends on binary arithmetic:

delimiter ?
create function bittell(B INTEGER) RETURNS INTEGER
DETERMINISTIC
NO SQL
COMMENT 'Howmany 1s in argument?'
begin
declare E integer;
SET E = 0;
WHILE B  0 DO
set B = (B-1)  B, E = E + 1;
end WHILE;
RETURN E;
end ?
delimiter ;

If you stick with the character-string set, with a slight change in 
representation you can use a simpler-looking pattern--not more efficient, if 
MySQL s implementation is good, but of easier reading: separate the decimal 
numerals with a character that is neither a decimal digit nor a REGEXP 
operator, and bound the whole string with it--comma or semicolon (among others) 
are good.
   ',1,10,19,31,32,59,' REGEXP ',10,+.*,38,' is 0
   ',1,10,19,31,32,59,' REGEXP ',10,+.*,32,' is 1
(See also FIND_IN_SET.)

Somewhere I read that for lack of support bitstring has been withdrawn from the 
SQL standard. This is such an obvious use; why is it not supported?


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-13 Thread hsv
 2013/06/11 12:59 -0700, Daevid Vincent 
Also, just for SG this is how we are currently implementing it, but we feel
the REGEXP is killing our queries and while clever is a bit hacky and
nullifies any indexes we have on the genres column as it requires a
file_sort table scan to compare substrings basically...

SELECT * FROM scene_all_genres WHERE scene_id = 17;

scene_id  genres  
  
  17  1|3|10|19|38|53|58|59|  

SELECT * FROM scene_all_genres WHERE scene_id = 11;

scene_id  genres 
  ---
  11  1|10|19|31|32|59|  

Except that, it seems to me, it somehow reflects the reality of assigning 
attributes to the scenes (movies?) that you catalog. In a way, it looks very 
much like using a bitstring wherin each place stands for one attribute. If, 
say, the bitstring for that which the user gladly picks something is called 
glad, and that for which the user is loath to pick something is called 
loath, an expression for fulfilling all attributes is
(glad  genre) = glad AND (loath  genre) = 0,
with no bit-telling. 


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-12 Thread Rick James
Thinking out of the box... (And posting my reply at the 'wrong' end of the 
email.)...

Are there fewer than 64 genres?  Use a SET or BIGINT UNSIGNED.

AND sg.`genre_id` IN (10,38)
AND sg.`genre_id` NOT IN (22,61)

--
AND  genre  ((110) | (138)) != 0
AND  genre  ((122) | (161))  = 0

This would avoid having that extra table, and save a lot of space.

If you have more than 64 genres, then 
Plan A: clump them into some kind of application grouping and use multiple 
INTs/SETs.
Plan B: do mod  div arithmetic to compute which genre field to tackle.

For B, something like:
AND  (genre1  (10)) + (genre3  (18)) != 0
AND  (genre2  (12)) + (genre6  (11))  = 0
(That's assuming 10 bits per genre# field.  I would use 32 and INT UNSIGNED.)



 -Original Message-
 From: Daevid Vincent [mailto:dae...@daevid.com]
 Sent: Tuesday, June 11, 2013 4:17 PM
 To: mysql@lists.mysql.com
 Cc: 'shawn green'
 Subject: RE: How do I select all rows of table that have some rows in
 another table (AND, not OR)
 
 
 
  -Original Message-
  From: shawn green [mailto:shawn.l.gr...@oracle.com]
  Sent: Tuesday, June 11, 2013 2:16 PM
  To: mysql@lists.mysql.com
  Subject: Re: How do I select all rows of table that have some rows in
  another table (AND, not OR)
 
  Hello Daevid,
 
  On 6/11/2013 3:59 PM, Daevid Vincent wrote:
   I am trying to implement a filter so that a user could select
   various
  genres
   they want in or out. Perhaps they like 'action' and 'car chases'
   but don't like 'foreign' and 'drifting' (or whatever...)
  
   So I want something sort of like this, however IN() is using an OR
   comparison when I need it to be an AND
  
   SELECT DISTINCT
s.`scene_id` AS `id`,
GROUP_CONCAT(sg.`genre_id`) FROM
`dvds` AS d
JOIN `scenes_list` AS s
ON s.`dvd_id` = d.`dvd_id`
JOIN `scenes_genres` AS sg
ON sg.`scene_id` = s.`scene_id`
AND sg.`genre_id` IN (10,38)
AND sg.`genre_id` NOT IN (22,61) GROUP BY
   s.`scene_id`;
  
   This is giving me way way too many rows returned.
  
   For example, I would expect this scene_id to be in the result set:
  
   SELECT * FROM scenes_genres WHERE scene_id = 17;
  
   scene_id  genre_id
     --
  17   1
  17   3
  17  10 --
  17  19
  17  38 --
  17  53
  17  58
  17  59
  
   And this scene ID to NOT be in the result set:
  
   SELECT * FROM scenes_genres WHERE scene_id = 11;
  
   scene_id  genre_id
     --
  11   1
  11  10 --
  11  19
  11  31
  11  32
-- but does not have 38
  11  59
  
   I've tried various subselect ideas, but of course this fails b/c
 genre_id
   can't be multiple things at one time (AND)
  
   JOIN `scenes_genres` AS sg
 ON sg.`scene_id` = s.`scene_id`
AND sg.`genre_id` IN (
  SELECT `genre_id` FROM `scenes_genres`
   WHERE `genre_id` = 10
  AND `genre_id` = 38
  AND `genre_id`  22
  AND `genre_id`  61
   )
  
   And straight up like this failure too...
  
  JOIN `scenes_genres` AS sg
ON sg.`scene_id` = s.`scene_id`
   AND (sg.`genre_id` = 10 AND sg.`genre_id` = 38)
   AND NOT (sg.`genre_id` = 22 AND sg.`genre_id` = 61)
  
   So I'm sort of out of ideas at this point and hoping someone has a
   way
 to
  do
   this.
  
   Also, just for SG this is how we are currently implementing it, but
   we
  feel
   the REGEXP is killing our queries and while clever is a bit hacky
   and nullifies any indexes we have on the genres column as it
   requires a file_sort table scan to compare substrings basically...
  
   SELECT * FROM scene_all_genres WHERE scene_id = 17;
  
   scene_id  genres
     
  17  1|3|10|19|38|53|58|59|
  
   SELECT * FROM scene_all_genres WHERE scene_id = 11;
  
   scene_id  genres
     ---
  11  1|10|19|31|32|59|
  
   SELECT DISTINCT
s.`scene_id` AS `id`,
sg.`genres`
   FROM
`scene_all_genres` AS sg,
`dvds` AS d,
`scenes_list` AS s
   WHERE  dvd_id` = d.`dvd_id`
AND sg.`scene_id` = s.`scene_id`
AND sg.`genres` REGEXP '[[::]]10[[::]].*[[::]]38[[::]]'
AND sg.`genres` NOT REGEXP
  '(([[::]]22[[::]])|([[::]]61[[::]]))'
  
   ;
  
   http://dev.mysql.com/doc/refman/5.0/en/regexp.html#operator_regexp
  
   [[::]], [[::]]
  
   These markers stand for word boundaries. They match the beginning
   and
 end
  of
   words, respectively. A word is a sequence of word characters

Re: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-12 Thread shawn green

Hello Daevid,

On 6/11/2013 7:17 PM, Daevid Vincent wrote:




-Original Message-
From: shawn green [mailto:shawn.l.gr...@oracle.com]
Sent: Tuesday, June 11, 2013 2:16 PM
To: mysql@lists.mysql.com
Subject: Re: How do I select all rows of table that have some rows in
another table (AND, not OR)

Hello Daevid,

... snip ...


Shawn, thank you for taking the time to reply. I wasn't expecting the
solution to be so much work with multiple statements like that. I was
thinking it could be done in one (or two, as in split out a portion of it in
PHP and re-insert it to the original SQL to avoid a JOIN or something). Part
of the issue is that we use PHP to generate the $sql string by appending
bits and pieces depending on the search criteria thereby keeping the 'path'
through the SQL statement simple and relatively linear. To implement this
would require significant re-writing and/or special cases where we could
introduce errors or omissions in the future. The frustrating part is that
the REGEXP query we use now only takes about 2 seconds on my DEV VM (same
database as PROD), however when the RDBMS is loaded it then takes up to 30
seconds so in theory it's not even that inefficient given the # rows. We do
use memcached for the results, but since there are so many combinations a
user could choose, our hit ratio is not so great and therefore the cache
isn't doing us much good and this is why the RDBMS can get loaded up easily.

How can an OR be so simple using IN() but AND be so overly complex?
Seems that mysql should have another function for ALL() that works just like
IN() to handle this kind of scenario.




As I said, we could have used a single command but you would have not 
been able to review the 'best' match scenario only those rows that were 
'complete' matches.


And, those three commands can easily be encapsulated within a stored 
procedure. Pass in two strings (one listing the values to find, one 
listing the values to reject) and use PREPARED STATEMENTS within the 
procedure to build the IN() lists within the first and second commands. 
 The value in the 'HAVING' clause in the last command (unless you use 
the other option of reviewing the list of 'closest' matches) can be set 
to the number of items in the list of things to find parameter to your 
procedure.


I was demonstrating a principle you could use and not necessarily giving 
you a full solution.  Some tweaking may be required.


Also,  by encapsulating what I wrote within a stored procedure, this 
changes my 3 statements to a single call that you can easily configure 
from your PHP application.


Regards,
--
Shawn Green
MySQL Principal Technical Support Engineer
Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
Office: Blountville, TN

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-12 Thread Daevid Vincent
This idea is so fancy pants and clever I *wish* it could have worked for me.
I checked and we actually have 65 genres currently (with more to come I'm
sure) LOL *FML*. I'm almost ashamed I didn't think of this. I 3 me some
bitmasks and this solution is so elegant. It's unfortunate there isn't a way
to use more than 64-bits natively.

You're RICK JAMES Bitch! :-p   (please tell me you know the Dave Chappelles
skit I'm referring to)

 -Original Message-
 From: Rick James [mailto:rja...@yahoo-inc.com]
 Sent: Wednesday, June 12, 2013 9:39 AM
 To: Daevid Vincent; mysql@lists.mysql.com
 Cc: 'shawn green'
 Subject: RE: How do I select all rows of table that have some rows in
 another table (AND, not OR)
 
 Thinking out of the box... (And posting my reply at the 'wrong' end of the
 email.)...
 
 Are there fewer than 64 genres?  Use a SET or BIGINT UNSIGNED.
 
 AND sg.`genre_id` IN (10,38)
 AND sg.`genre_id` NOT IN (22,61)
 
 --
 AND  genre  ((110) | (138)) != 0
 AND  genre  ((122) | (161))  = 0
 
 This would avoid having that extra table, and save a lot of space.
 
 If you have more than 64 genres, then
 Plan A: clump them into some kind of application grouping and use multiple
 INTs/SETs.
 Plan B: do mod  div arithmetic to compute which genre field to tackle.
 
 For B, something like:
 AND  (genre1  (10)) + (genre3  (18)) != 0
 AND  (genre2  (12)) + (genre6  (11))  = 0
 (That's assuming 10 bits per genre# field.  I would use 32 and INT
 UNSIGNED.)
 
 
 
  -Original Message-
  From: Daevid Vincent [mailto:dae...@daevid.com]
  Sent: Tuesday, June 11, 2013 4:17 PM
  To: mysql@lists.mysql.com
  Cc: 'shawn green'
  Subject: RE: How do I select all rows of table that have some rows in
  another table (AND, not OR)
 
 
 
   -Original Message-
   From: shawn green [mailto:shawn.l.gr...@oracle.com]
   Sent: Tuesday, June 11, 2013 2:16 PM
   To: mysql@lists.mysql.com
   Subject: Re: How do I select all rows of table that have some rows in
   another table (AND, not OR)
  
   Hello Daevid,
  
   On 6/11/2013 3:59 PM, Daevid Vincent wrote:
I am trying to implement a filter so that a user could select
various
   genres
they want in or out. Perhaps they like 'action' and 'car chases'
but don't like 'foreign' and 'drifting' (or whatever...)
   
So I want something sort of like this, however IN() is using an OR
comparison when I need it to be an AND
   
SELECT DISTINCT
 s.`scene_id` AS `id`,
 GROUP_CONCAT(sg.`genre_id`) FROM
 `dvds` AS d
 JOIN `scenes_list` AS s
 ON s.`dvd_id` = d.`dvd_id`
 JOIN `scenes_genres` AS sg
 ON sg.`scene_id` = s.`scene_id`
 AND sg.`genre_id` IN (10,38)
 AND sg.`genre_id` NOT IN (22,61) GROUP BY
s.`scene_id`;
   
This is giving me way way too many rows returned.
   
For example, I would expect this scene_id to be in the result set:
   
SELECT * FROM scenes_genres WHERE scene_id = 17;
   
scene_id  genre_id
  --
   17   1
   17   3
   17  10 --
   17  19
   17  38 --
   17  53
   17  58
   17  59
   
And this scene ID to NOT be in the result set:
   
SELECT * FROM scenes_genres WHERE scene_id = 11;
   
scene_id  genre_id
  --
   11   1
   11  10 --
   11  19
   11  31
   11  32
   -- but does not have 38
   11  59
   
I've tried various subselect ideas, but of course this fails b/c
  genre_id
can't be multiple things at one time (AND)
   
JOIN `scenes_genres` AS sg
  ON sg.`scene_id` = s.`scene_id`
 AND sg.`genre_id` IN (
   SELECT `genre_id` FROM `scenes_genres`
WHERE `genre_id` = 10
   AND `genre_id` = 38
   AND `genre_id`  22
   AND `genre_id`  61
)
   
And straight up like this failure too...
   
   JOIN `scenes_genres` AS sg
 ON sg.`scene_id` = s.`scene_id`
AND (sg.`genre_id` = 10 AND sg.`genre_id` = 38)
AND NOT (sg.`genre_id` = 22 AND sg.`genre_id` = 61)
   
So I'm sort of out of ideas at this point and hoping someone has a
way
  to
   do
this.
   
Also, just for SG this is how we are currently implementing it, but
we
   feel
the REGEXP is killing our queries and while clever is a bit hacky
and nullifies any indexes we have on the genres column as it
requires a file_sort table scan to compare substrings basically...
   
SELECT * FROM scene_all_genres WHERE scene_id = 17;
   
scene_id  genres

RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-12 Thread Daevid Vincent
Oh! I must have misread. I didn't see how you had a solution for  64 bits.
I may have to experiment with that!

 -Original Message-
 From: Daevid Vincent [mailto:dae...@daevid.com]
 Sent: Wednesday, June 12, 2013 11:26 AM
 To: mysql@lists.mysql.com
 Subject: RE: How do I select all rows of table that have some rows in
 another table (AND, not OR)
 
 This idea is so fancy pants and clever I *wish* it could have worked for
 me. I checked and we actually have 65 genres currently (with more to come
 I'm sure) LOL *FML*. I'm almost ashamed I didn't think of this. I 3 me
 some bitmasks and this solution is so elegant. It's unfortunate there
 isn't a way to use more than 64-bits natively.
 
 You're RICK JAMES Bitch! :-p   (please tell me you know the Dave
 Chappelles skit I'm referring to)
 
  -Original Message-
  From: Rick James [mailto:rja...@yahoo-inc.com]
  Sent: Wednesday, June 12, 2013 9:39 AM
  To: Daevid Vincent; mysql@lists.mysql.com
  Cc: 'shawn green'
  Subject: RE: How do I select all rows of table that have some rows in
  another table (AND, not OR)
 
  Thinking out of the box... (And posting my reply at the 'wrong' end of
 the
  email.)...
 
  Are there fewer than 64 genres?  Use a SET or BIGINT UNSIGNED.
 
  AND sg.`genre_id` IN (10,38)
  AND sg.`genre_id` NOT IN (22,61)
 
  --
  AND  genre  ((110) | (138)) != 0
  AND  genre  ((122) | (161))  = 0
 
  This would avoid having that extra table, and save a lot of space.
 
  If you have more than 64 genres, then
  Plan A: clump them into some kind of application grouping and use
 multiple
  INTs/SETs.
  Plan B: do mod  div arithmetic to compute which genre field to tackle.
 
  For B, something like:
  AND  (genre1  (10)) + (genre3  (18)) != 0
  AND  (genre2  (12)) + (genre6  (11))  = 0
  (That's assuming 10 bits per genre# field.  I would use 32 and INT
  UNSIGNED.)
 
 
 
   -Original Message-
   From: Daevid Vincent [mailto:dae...@daevid.com]
   Sent: Tuesday, June 11, 2013 4:17 PM
   To: mysql@lists.mysql.com
   Cc: 'shawn green'
   Subject: RE: How do I select all rows of table that have some rows in
   another table (AND, not OR)
  
  
  
-Original Message-
From: shawn green [mailto:shawn.l.gr...@oracle.com]
Sent: Tuesday, June 11, 2013 2:16 PM
To: mysql@lists.mysql.com
Subject: Re: How do I select all rows of table that have some rows
 in
another table (AND, not OR)
   
Hello Daevid,
   
On 6/11/2013 3:59 PM, Daevid Vincent wrote:
 I am trying to implement a filter so that a user could select
 various
genres
 they want in or out. Perhaps they like 'action' and 'car
 chases'
 but don't like 'foreign' and 'drifting' (or whatever...)

 So I want something sort of like this, however IN() is using an
 OR
 comparison when I need it to be an AND

 SELECT DISTINCT
  s.`scene_id` AS `id`,
  GROUP_CONCAT(sg.`genre_id`) FROM
  `dvds` AS d
  JOIN `scenes_list` AS s
  ON s.`dvd_id` = d.`dvd_id`
  JOIN `scenes_genres` AS sg
  ON sg.`scene_id` = s.`scene_id`
  AND sg.`genre_id` IN (10,38)
  AND sg.`genre_id` NOT IN (22,61) GROUP BY
 s.`scene_id`;

 This is giving me way way too many rows returned.

 For example, I would expect this scene_id to be in the result set:

 SELECT * FROM scenes_genres WHERE scene_id = 17;

 scene_id  genre_id
   --
17   1
17   3
17  10 --
17  19
17  38 --
17  53
17  58
17  59

 And this scene ID to NOT be in the result set:

 SELECT * FROM scenes_genres WHERE scene_id = 11;

 scene_id  genre_id
   --
11   1
11  10 --
11  19
11  31
11  32
  -- but does not have 38
11  59

 I've tried various subselect ideas, but of course this fails b/c
   genre_id
 can't be multiple things at one time (AND)

 JOIN `scenes_genres` AS sg
   ON sg.`scene_id` = s.`scene_id`
  AND sg.`genre_id` IN (
SELECT `genre_id` FROM `scenes_genres`
 WHERE `genre_id` = 10
AND `genre_id` = 38
AND `genre_id`  22
AND `genre_id`  61
 )

 And straight up like this failure too...

JOIN `scenes_genres` AS sg
  ON sg.`scene_id` = s.`scene_id`
 AND (sg.`genre_id` = 10 AND sg.`genre_id` = 38)
 AND NOT (sg.`genre_id` = 22 AND sg.`genre_id` = 61)

 So I'm sort of out

RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-12 Thread Rick James
I'm the ORIGINAL Rick James, B  (And, I'm still alive.)  LOL

If you are using PHP, you might want to stop at 31 bits per INT/SET.  PHP seems 
not to yet be in the 64-bit world.

 -Original Message-
 From: Daevid Vincent [mailto:dae...@daevid.com]
 Sent: Wednesday, June 12, 2013 11:26 AM
 To: mysql@lists.mysql.com
 Subject: RE: How do I select all rows of table that have some rows in
 another table (AND, not OR)
 
 This idea is so fancy pants and clever I *wish* it could have worked for
 me.
 I checked and we actually have 65 genres currently (with more to come I'm
 sure) LOL *FML*. I'm almost ashamed I didn't think of this. I 3 me some
 bitmasks and this solution is so elegant. It's unfortunate there isn't a
 way to use more than 64-bits natively.
 
 You're RICK JAMES Bitch! :-p   (please tell me you know the Dave
 Chappelles
 skit I'm referring to)
 
  -Original Message-
  From: Rick James [mailto:rja...@yahoo-inc.com]
  Sent: Wednesday, June 12, 2013 9:39 AM
  To: Daevid Vincent; mysql@lists.mysql.com
  Cc: 'shawn green'
  Subject: RE: How do I select all rows of table that have some rows in
  another table (AND, not OR)
 
  Thinking out of the box... (And posting my reply at the 'wrong' end of
  the email.)...
 
  Are there fewer than 64 genres?  Use a SET or BIGINT UNSIGNED.
 
  AND sg.`genre_id` IN (10,38)
  AND sg.`genre_id` NOT IN (22,61)
 
  --
  AND  genre  ((110) | (138)) != 0
  AND  genre  ((122) | (161))  = 0
 
  This would avoid having that extra table, and save a lot of space.
 
  If you have more than 64 genres, then
  Plan A: clump them into some kind of application grouping and use
  multiple INTs/SETs.
  Plan B: do mod  div arithmetic to compute which genre field to tackle.
 
  For B, something like:
  AND  (genre1  (10)) + (genre3  (18)) != 0
  AND  (genre2  (12)) + (genre6  (11))  = 0 (That's assuming
  10 bits per genre# field.  I would use 32 and INT
  UNSIGNED.)
 
 
 
   -Original Message-
   From: Daevid Vincent [mailto:dae...@daevid.com]
   Sent: Tuesday, June 11, 2013 4:17 PM
   To: mysql@lists.mysql.com
   Cc: 'shawn green'
   Subject: RE: How do I select all rows of table that have some rows
   in another table (AND, not OR)
  
  
  
-Original Message-
From: shawn green [mailto:shawn.l.gr...@oracle.com]
Sent: Tuesday, June 11, 2013 2:16 PM
To: mysql@lists.mysql.com
Subject: Re: How do I select all rows of table that have some rows
in another table (AND, not OR)
   
Hello Daevid,
   
On 6/11/2013 3:59 PM, Daevid Vincent wrote:
 I am trying to implement a filter so that a user could select
 various
genres
 they want in or out. Perhaps they like 'action' and 'car
 chases'
 but don't like 'foreign' and 'drifting' (or whatever...)

 So I want something sort of like this, however IN() is using an
 OR
 comparison when I need it to be an AND

 SELECT DISTINCT
  s.`scene_id` AS `id`,
  GROUP_CONCAT(sg.`genre_id`) FROM
  `dvds` AS d
  JOIN `scenes_list` AS s
  ON s.`dvd_id` = d.`dvd_id`
  JOIN `scenes_genres` AS sg
  ON sg.`scene_id` = s.`scene_id`
  AND sg.`genre_id` IN (10,38)
  AND sg.`genre_id` NOT IN (22,61) GROUP BY
 s.`scene_id`;

 This is giving me way way too many rows returned.

 For example, I would expect this scene_id to be in the result set:

 SELECT * FROM scenes_genres WHERE scene_id = 17;

 scene_id  genre_id
   --
17   1
17   3
17  10 --
17  19
17  38 --
17  53
17  58
17  59

 And this scene ID to NOT be in the result set:

 SELECT * FROM scenes_genres WHERE scene_id = 11;

 scene_id  genre_id
   --
11   1
11  10 --
11  19
11  31
11  32
  -- but does not have 38
11  59

 I've tried various subselect ideas, but of course this fails b/c
   genre_id
 can't be multiple things at one time (AND)

 JOIN `scenes_genres` AS sg
   ON sg.`scene_id` = s.`scene_id`
  AND sg.`genre_id` IN (
SELECT `genre_id` FROM `scenes_genres`
 WHERE `genre_id` = 10
AND `genre_id` = 38
AND `genre_id`  22
AND `genre_id`  61
 )

 And straight up like this failure too...

JOIN `scenes_genres` AS sg
  ON sg.`scene_id` = s.`scene_id`
 AND (sg.`genre_id` = 10 AND sg.`genre_id` = 38)
 AND NOT (sg

RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-12 Thread Paul Halliday
I am so, so glad that someone finally said what I think each time I see a 
message from you Mr. James. 

 Original message 
From: Rick James rja...@yahoo-inc.com 
Date: 06-12-2013  8:45 PM  (GMT-04:00) 
To: Daevid Vincent dae...@daevid.com,mysql@lists.mysql.com 
Subject: RE: How do I select all rows of table that have some rows in
  another table (AND, not OR) 
 
I'm the ORIGINAL Rick James, B  (And, I'm still alive.)  LOL

If you are using PHP, you might want to stop at 31 bits per INT/SET.  PHP seems 
not to yet be in the 64-bit world.

 -Original Message-
 From: Daevid Vincent [mailto:dae...@daevid.com]
 Sent: Wednesday, June 12, 2013 11:26 AM
 To: mysql@lists.mysql.com
 Subject: RE: How do I select all rows of table that have some rows in
 another table (AND, not OR)
 
 This idea is so fancy pants and clever I *wish* it could have worked for
 me.
 I checked and we actually have 65 genres currently (with more to come I'm
 sure) LOL *FML*. I'm almost ashamed I didn't think of this. I 3 me some
 bitmasks and this solution is so elegant. It's unfortunate there isn't a
 way to use more than 64-bits natively.
 
 You're RICK JAMES Bitch! :-p   (please tell me you know the Dave
 Chappelles
 skit I'm referring to)
 
  -Original Message-
  From: Rick James [mailto:rja...@yahoo-inc.com]
  Sent: Wednesday, June 12, 2013 9:39 AM
  To: Daevid Vincent; mysql@lists.mysql.com
  Cc: 'shawn green'
  Subject: RE: How do I select all rows of table that have some rows in
  another table (AND, not OR)
 
  Thinking out of the box... (And posting my reply at the 'wrong' end of
  the email.)...
 
  Are there fewer than 64 genres?  Use a SET or BIGINT UNSIGNED.
 
  AND sg.`genre_id` IN (10,38)
  AND sg.`genre_id` NOT IN (22,61)
 
  --
  AND  genre  ((110) | (138)) != 0
  AND  genre  ((122) | (161))  = 0
 
  This would avoid having that extra table, and save a lot of space.
 
  If you have more than 64 genres, then
  Plan A: clump them into some kind of application grouping and use
  multiple INTs/SETs.
  Plan B: do mod  div arithmetic to compute which genre field to tackle.
 
  For B, something like:
  AND  (genre1  (10)) + (genre3  (18)) != 0
  AND  (genre2  (12)) + (genre6  (11))  = 0 (That's assuming
  10 bits per genre# field.  I would use 32 and INT
  UNSIGNED.)
 
 
 
   -Original Message-
   From: Daevid Vincent [mailto:dae...@daevid.com]
   Sent: Tuesday, June 11, 2013 4:17 PM
   To: mysql@lists.mysql.com
   Cc: 'shawn green'
   Subject: RE: How do I select all rows of table that have some rows
   in another table (AND, not OR)
  
  
  
-Original Message-
From: shawn green [mailto:shawn.l.gr...@oracle.com]
Sent: Tuesday, June 11, 2013 2:16 PM
To: mysql@lists.mysql.com
Subject: Re: How do I select all rows of table that have some rows
in another table (AND, not OR)
   
Hello Daevid,
   
On 6/11/2013 3:59 PM, Daevid Vincent wrote:
 I am trying to implement a filter so that a user could select
 various
genres
 they want in or out. Perhaps they like 'action' and 'car
 chases'
 but don't like 'foreign' and 'drifting' (or whatever...)

 So I want something sort of like this, however IN() is using an
 OR
 comparison when I need it to be an AND

 SELECT DISTINCT
  s.`scene_id` AS `id`,
  GROUP_CONCAT(sg.`genre_id`) FROM
  `dvds` AS d
  JOIN `scenes_list` AS s
  ON s.`dvd_id` = d.`dvd_id`
  JOIN `scenes_genres` AS sg
  ON sg.`scene_id` = s.`scene_id`
  AND sg.`genre_id` IN (10,38)
  AND sg.`genre_id` NOT IN (22,61) GROUP BY
 s.`scene_id`;

 This is giving me way way too many rows returned.

 For example, I would expect this scene_id to be in the result set:

 SELECT * FROM scenes_genres WHERE scene_id = 17;

 scene_id  genre_id
   --
    17   1
    17   3
    17  10 --
    17  19
    17  38 --
    17  53
    17  58
    17  59

 And this scene ID to NOT be in the result set:

 SELECT * FROM scenes_genres WHERE scene_id = 11;

 scene_id  genre_id
   --
    11   1
    11  10 --
    11  19
    11  31
    11  32
    -- but does not have 38
    11  59

 I've tried various subselect ideas, but of course this fails b/c
   genre_id
 can't be multiple things at one time (AND)

 JOIN `scenes_genres` AS sg
   ON sg.`scene_id` = s.`scene_id`
  AND sg.`genre_id` IN (
    SELECT `genre_id` FROM `scenes_genres`
 WHERE `genre_id` = 10

Re: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-11 Thread shawn green

Hello Daevid,

On 6/11/2013 3:59 PM, Daevid Vincent wrote:

I am trying to implement a filter so that a user could select various genres
they want in or out. Perhaps they like 'action' and 'car chases' but
don't like 'foreign' and 'drifting' (or whatever...)

So I want something sort of like this, however IN() is using an OR
comparison when I need it to be an AND

SELECT DISTINCT
 s.`scene_id` AS `id`,
 GROUP_CONCAT(sg.`genre_id`)
FROM
 `dvds` AS d
 JOIN `scenes_list` AS s
 ON s.`dvd_id` = d.`dvd_id`
 JOIN `scenes_genres` AS sg
 ON sg.`scene_id` = s.`scene_id`
 AND sg.`genre_id` IN (10,38)
 AND sg.`genre_id` NOT IN (22,61)
GROUP BY s.`scene_id`;

This is giving me way way too many rows returned.

For example, I would expect this scene_id to be in the result set:

SELECT * FROM scenes_genres WHERE scene_id = 17;

scene_id  genre_id
  --
   17   1
   17   3
   17  10 --
   17  19
   17  38 --
   17  53
   17  58
   17  59

And this scene ID to NOT be in the result set:

SELECT * FROM scenes_genres WHERE scene_id = 11;

scene_id  genre_id
  --
   11   1
   11  10 --
   11  19
   11  31
   11  32
   -- but does not have 38
   11  59

I've tried various subselect ideas, but of course this fails b/c genre_id
can't be multiple things at one time (AND)

JOIN `scenes_genres` AS sg
  ON sg.`scene_id` = s.`scene_id`
 AND sg.`genre_id` IN (
   SELECT `genre_id` FROM `scenes_genres`
WHERE `genre_id` = 10
   AND `genre_id` = 38
   AND `genre_id`  22
   AND `genre_id`  61
)

And straight up like this failure too...

   JOIN `scenes_genres` AS sg
 ON sg.`scene_id` = s.`scene_id`
AND (sg.`genre_id` = 10 AND sg.`genre_id` = 38)
AND NOT (sg.`genre_id` = 22 AND sg.`genre_id` = 61)

So I'm sort of out of ideas at this point and hoping someone has a way to do
this.

Also, just for SG this is how we are currently implementing it, but we feel
the REGEXP is killing our queries and while clever is a bit hacky and
nullifies any indexes we have on the genres column as it requires a
file_sort table scan to compare substrings basically...

SELECT * FROM scene_all_genres WHERE scene_id = 17;

scene_id  genres
  
   17  1|3|10|19|38|53|58|59|

SELECT * FROM scene_all_genres WHERE scene_id = 11;

scene_id  genres
  ---
   11  1|10|19|31|32|59|

SELECT DISTINCT
 s.`scene_id` AS `id`,
 sg.`genres`
FROM
 `scene_all_genres` AS sg,
 `dvds` AS d,
 `scenes_list` AS s
WHEREdvd_id` = d.`dvd_id`
 AND sg.`scene_id` = s.`scene_id`
 AND sg.`genres` REGEXP '[[::]]10[[::]].*[[::]]38[[::]]'
 AND sg.`genres` NOT REGEXP '(([[::]]22[[::]])|([[::]]61[[::]]))'

;

http://dev.mysql.com/doc/refman/5.0/en/regexp.html#operator_regexp

[[::]], [[::]]

These markers stand for word boundaries. They match the beginning and end of
words, respectively. A word is a sequence of word characters that is not
preceded by or followed by word characters. A word character is an
alphanumeric character in the alnum class or an underscore (_).



To me it looks like you want to know how to match N for N when looking 
for subset of properties. It's a multi-step process using plain SQL but 
it's fast:


1) Build a list of scenes containing the genres you want to see

CREATE TEMPORARY TABLE tmpList(key(scene_id, genre_id))
SELECT scene_id, genre_id
FROM scenes_genres
WHERE genre_id` IN (10,38)  -- in this case you have 2 terms to mach

2) From that list, determine which of those scenes also contain unwanted 
genres and remove them.


DELETE tmpList
FROM tmpList t
INNER JOIN scenes_genres sg
  on sg.scene_id = t.scene_id
  and sg.genre_id IN (22,61)

# at this point, tmpList contains all scenes that have any of the 
desired genres but none of the unwanted ones.


3) Check to see if any scene has all N matches.

SELECT scene_id, count(genre_id) as matches
FROM tmpList
GROUP BY scene_id
HAVING matches = 2
/* ^--- this is the number of terms you are trying to match. */


Of course you can modify the last query to eliminate the HAVING clause 
and pick the top 5 matching scenes (even if they are partial matches) 
like this


SELECT scene_id, count(genre_id) as matches
FROM tmpList
GROUP BY scene_id
ORDER BY matches DESC
LIMIT 5

Let us know if this is not what you wanted to do.



Note to the rest of the list:

Yes, we could have done most of this with self-joins or subqueries, an 
EXISTS, and a NOT EXISTS inside a single statement but I think the 
execution time would have been miserable. I also think that that 
approach also 

RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-11 Thread Daevid Vincent


 -Original Message-
 From: shawn green [mailto:shawn.l.gr...@oracle.com]
 Sent: Tuesday, June 11, 2013 2:16 PM
 To: mysql@lists.mysql.com
 Subject: Re: How do I select all rows of table that have some rows in
 another table (AND, not OR)
 
 Hello Daevid,
 
 On 6/11/2013 3:59 PM, Daevid Vincent wrote:
  I am trying to implement a filter so that a user could select various
 genres
  they want in or out. Perhaps they like 'action' and 'car chases' but
  don't like 'foreign' and 'drifting' (or whatever...)
 
  So I want something sort of like this, however IN() is using an OR
  comparison when I need it to be an AND
 
  SELECT DISTINCT
   s.`scene_id` AS `id`,
   GROUP_CONCAT(sg.`genre_id`)
  FROM
   `dvds` AS d
   JOIN `scenes_list` AS s
   ON s.`dvd_id` = d.`dvd_id`
   JOIN `scenes_genres` AS sg
   ON sg.`scene_id` = s.`scene_id`
   AND sg.`genre_id` IN (10,38)
   AND sg.`genre_id` NOT IN (22,61)
  GROUP BY s.`scene_id`;
 
  This is giving me way way too many rows returned.
 
  For example, I would expect this scene_id to be in the result set:
 
  SELECT * FROM scenes_genres WHERE scene_id = 17;
 
  scene_id  genre_id
    --
 17   1
 17   3
 17  10 --
 17  19
 17  38 --
 17  53
 17  58
 17  59
 
  And this scene ID to NOT be in the result set:
 
  SELECT * FROM scenes_genres WHERE scene_id = 11;
 
  scene_id  genre_id
    --
 11   1
 11  10 --
 11  19
 11  31
 11  32
 -- but does not have 38
 11  59
 
  I've tried various subselect ideas, but of course this fails b/c
genre_id
  can't be multiple things at one time (AND)
 
  JOIN `scenes_genres` AS sg
ON sg.`scene_id` = s.`scene_id`
   AND sg.`genre_id` IN (
 SELECT `genre_id` FROM `scenes_genres`
  WHERE `genre_id` = 10
 AND `genre_id` = 38
 AND `genre_id`  22
 AND `genre_id`  61
  )
 
  And straight up like this failure too...
 
 JOIN `scenes_genres` AS sg
   ON sg.`scene_id` = s.`scene_id`
  AND (sg.`genre_id` = 10 AND sg.`genre_id` = 38)
  AND NOT (sg.`genre_id` = 22 AND sg.`genre_id` = 61)
 
  So I'm sort of out of ideas at this point and hoping someone has a way
to
 do
  this.
 
  Also, just for SG this is how we are currently implementing it, but we
 feel
  the REGEXP is killing our queries and while clever is a bit hacky and
  nullifies any indexes we have on the genres column as it requires a
  file_sort table scan to compare substrings basically...
 
  SELECT * FROM scene_all_genres WHERE scene_id = 17;
 
  scene_id  genres
    
 17  1|3|10|19|38|53|58|59|
 
  SELECT * FROM scene_all_genres WHERE scene_id = 11;
 
  scene_id  genres
    ---
 11  1|10|19|31|32|59|
 
  SELECT DISTINCT
   s.`scene_id` AS `id`,
   sg.`genres`
  FROM
   `scene_all_genres` AS sg,
   `dvds` AS d,
   `scenes_list` AS s
  WHEREdvd_id` = d.`dvd_id`
   AND sg.`scene_id` = s.`scene_id`
   AND sg.`genres` REGEXP '[[::]]10[[::]].*[[::]]38[[::]]'
   AND sg.`genres` NOT REGEXP
 '(([[::]]22[[::]])|([[::]]61[[::]]))'
 
  ;
 
  http://dev.mysql.com/doc/refman/5.0/en/regexp.html#operator_regexp
 
  [[::]], [[::]]
 
  These markers stand for word boundaries. They match the beginning and
end
 of
  words, respectively. A word is a sequence of word characters that is not
  preceded by or followed by word characters. A word character is an
  alphanumeric character in the alnum class or an underscore (_).
 
 
 To me it looks like you want to know how to match N for N when looking
 for subset of properties. It's a multi-step process using plain SQL but
 it's fast:
 
 1) Build a list of scenes containing the genres you want to see
 
 CREATE TEMPORARY TABLE tmpList(key(scene_id, genre_id))
 SELECT scene_id, genre_id
 FROM scenes_genres
 WHERE genre_id` IN (10,38)  -- in this case you have 2 terms to mach
 
 2) From that list, determine which of those scenes also contain unwanted
 genres and remove them.
 
 DELETE tmpList
 FROM tmpList t
 INNER JOIN scenes_genres sg
on sg.scene_id = t.scene_id
and sg.genre_id IN (22,61)
 
 # at this point, tmpList contains all scenes that have any of the
 desired genres but none of the unwanted ones.
 
 3) Check to see if any scene has all N matches.
 
 SELECT scene_id, count(genre_id) as matches
 FROM tmpList
 GROUP BY scene_id
 HAVING matches = 2
 /* ^--- this is the number of terms you are trying to match. */
 
 
 Of course you can modify the last query to eliminate the HAVING clause
 and pick