>>>> 2013/06/11 12:59 -0700, Daevid Vincent >>>> Also, just for S&G this is how we are currently implementing it, but we feel the REGEXP is killing our queries and while "clever" is a bit hacky and nullifies any indexes we have on the genres column as it requires a file_sort table scan to compare substrings basically...
SELECT * FROM scene_all_genres WHERE scene_id = 17; scene_id genres -------- ------------------------ 17 1|3|10|19|38|53|58|59| SELECT * FROM scene_all_genres WHERE scene_id = 11; scene_id genres -------- ------------------- 11 1|10|19|31|32|59| <<<<<<<< Except that, it seems to me, it somehow reflects the reality of assigning attributes to the scenes (movies?) that you catalog. In a way, it looks very much like using a bitstring wherin each place stands for one attribute. If you then have also a bitstring for each user s likes and one for rows (peeves), telling howmany 1s are at the same place for the "genres" and liking (bit-AND, MySQL "&" followed by telling the number of 1s), and same for the "genres" and the row or peeve yields a number howmany match for liking, and how many match for becoming peeved. If the liking is enough greater than the becoming peeved, the scene and the user match. Unhappily, although this, using bitstring for set of attributes to match, is an old and well understood topic, MySQL s support for bitstrings is poor, limited to integers (as C is so limited)--that is, to 64 bits. If you have more, you have to use more "words". There is, furthermore, no function for telling howmany 1s (or 0s) there are in an integer. Now, if, in a more perfect world, MySQL had bitstring, and, furthermore, MySQL s SET were mapped onto bitstring, where it belongs, you could not only use bit operations (MySQL s & | ^), but also name the bits as you like. The problem with writing one s own bit-telling function is, of course, time, and hiding useful information from the optimizer. In any case, here is a function for it, using an old well worn trick that depends on binary arithmetic: delimiter ? create function bittell(B INTEGER) RETURNS INTEGER DETERMINISTIC NO SQL COMMENT 'Howmany 1s in argument?' begin declare E integer; SET E = 0; WHILE B <> 0 DO set B = (B-1) & B, E = E + 1; end WHILE; RETURN E; end ? delimiter ; If you stick with the character-string set, with a slight change in representation you can use a simpler-looking pattern--not more efficient, if MySQL s implementation is good, but of easier reading: separate the decimal numerals with a character that is neither a decimal digit nor a REGEXP operator, and bound the whole string with it--comma or semicolon (among others) are good. ',1,10,19,31,32,59,' REGEXP ',10,+.*,38,' is 0 ',1,10,19,31,32,59,' REGEXP ',10,+.*,32,' is 1 (See also FIND_IN_SET.) Somewhere I read that for lack of support bitstring has been withdrawn from the SQL standard. This is such an obvious use; why is it not supported? -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql